issues/100: Child processes can't get the parent FDs before they're closed #101

Closed
wants to merge 1 commit into from

NorseGaud
Contributor

@NorseGaud NorseGaud commented Apr 8, 2024

Context: #100

@NorseGaud NorseGaud marked this pull request as draft April 8, 2024 16:04
@NorseGaud NorseGaud marked this pull request as ready for review April 8, 2024 17:15
@@ -115,6 +116,71 @@ func (d *Context) parent() (child *os.Process, err error) {
	d.rpipe.Close()
	encoder := json.NewEncoder(d.wpipe)
	err = encoder.Encode(d)

	// wait for worker to start running or else network calls fail: https://github.com/sevlyar/go-daemon/issues/100
	time.Sleep(2 * time.Second)
Owner

Thank you for the contribution.

The solution is not stable and only makes the bug harder to reproduce. Any fixed delay still leaves a chance that the bug occurs.

We should implement child-parent process synchronization to fix the bug properly. The child process can write an acknowledgment to its parent after initialization, and the parent can wait for that acknowledgment.

Contributor Author

Hey @sevlyar, sorry about the extra changes. Forgot to remove those.

I tried child parent synchronization and it still doesn't work 😞

It seems that some activity has to happen in the worker before the parent exits; otherwise nothing else is able to run.

But! I likely didn't do it right. Are you able to help me with the synchronization? I am unfortunately fairly new to golang and not a pro at this whole os process stuff.

Contributor Author

To clarify a bit: it looks like the child finishes initializing way before the worker starts working. When I first tried parent/child sync, I used prints to track the order of execution and confirm this.
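
One way to read this observation is that the readiness signal should come from the worker itself rather than from the end of initialization. The sketch below illustrates that idea in a single process, in Go. The `readiness` type and `runWorker` function are hypothetical and not part of go-daemon; the goroutine stands in for the worker.

```go
package main

import (
	"fmt"
	"sync"
)

// readiness is a hypothetical helper that lets a worker announce,
// exactly once, that it has started doing real work.
type readiness struct {
	once sync.Once
	ch   chan struct{}
}

func newReadiness() *readiness {
	return &readiness{ch: make(chan struct{})}
}

// Signal marks the worker as running. Repeated calls are harmless.
func (r *readiness) Signal() {
	r.once.Do(func() { close(r.ch) })
}

// Done returns a channel that is closed once Signal has been called.
func (r *readiness) Done() <-chan struct{} {
	return r.ch
}

// runWorker records that it started, then signals readiness.
func runWorker(r *readiness, events chan<- string) {
	events <- "worker started"
	r.Signal()
}

func main() {
	r := newReadiness()
	events := make(chan string, 2)

	// Initialization completes first, well before the worker runs.
	events <- "child initialized"

	go runWorker(r, events)

	// Only at this point would an acknowledgment be sent to the parent.
	<-r.Done()
	close(events)

	for e := range events {
		fmt.Println(e)
	}
}
```

Whether this ordering actually resolves the failing network calls in issue #100 is not established in this thread; the sketch only shows where the signal would be placed.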

Comment on lines +126 to +157
var initialized = false

func (d *Context) child() (err error) {
	if initialized {
		return os.ErrInvalid
	}
	initialized = true

	decoder := json.NewDecoder(os.Stdin)
	if err = decoder.Decode(d); err != nil {
		return
	}

	// create PID file after context decoding to know PID file full path.
	if len(d.PidFileName) > 0 {
		d.pidFile = NewLockFile(os.NewFile(4, d.PidFileName))
		if err = d.pidFile.WritePid(); err != nil {
			return
		}
		defer func() {
			if err != nil {
				d.pidFile.Remove()
			}
		}()
	}

	if err = syscallDup(3, 0); err != nil {
		return
	}

	if d.Umask != 0 {
		syscall.Umask(int(d.Umask))
Owner

It looks like the PR contains a lot of changes, and I would have to review all of them. But really the PR contains a one-line fix, and the rest is just old code that was moved. Please revert the moves, or provide the refactoring (code moves) in a separate PR with an appropriate comment.

@NorseGaud
Contributor Author

Hey @sevlyar, what do you want to do with this? This workaround seems to solve the issue in a production-grade application we built, but I'm sure there is a better way of doing it.

@NorseGaud NorseGaud closed this Jul 22, 2024