Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle exiting threads in diffusion layer #2696

Merged
merged 1 commit into from
Oct 23, 2020
Merged

Conversation

karknu
Copy link
Contributor

@karknu karknu commented Oct 21, 2020

If a thread spawned by diffusion exits for any reason we kill
all other threads and return. This is done to ensure that the node isn't
left running in a semi-functional state, for example with the client
side working but the server has exited due to a configuration error.

Example output:
cardano-node: Network.Socket.bind: unsupported operation (Can't assign requested address)⏎

@karknu
Copy link
Contributor Author

karknu commented Oct 21, 2020

Fixes #2684 .

@karknu karknu marked this pull request as ready for review October 21, 2020 11:51
@karknu karknu requested a review from coot October 21, 2020 11:51
Copy link
Contributor

@coot coot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -319,9 +327,10 @@ runDataDiffusion tracers
sd
(daLocalResponderApplication applications)
localErrorPolicy
return ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this return? There's void applied to previous monadic action.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the return isn't needed. Will remove it.

@@ -350,11 +359,12 @@ runDataDiffusion tracers
sd
(daResponderApplication applications)
remoteErrorPolicy
return ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reply ;)

@@ -369,13 +379,13 @@ runDataDiffusion tracers
, spErrorPolicies = remoteErrorPolicy
, spSubscriptionTarget = daIpProducers
}
(daInitiatorApplication applications)
(daInitiatorApplication applications) >> return ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner to write:

void $ NodeToNode.ipSubscriptionWorker ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -391,7 +401,7 @@ runDataDiffusion tracers
, spErrorPolicies = remoteErrorPolicy
, spSubscriptionTarget = dnsProducer
}
(daInitiatorApplication applications)
(daInitiatorApplication applications) >> return ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@coot coot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments.

If a thread spawned by diffusion exits for any reason we kill
all other threads and return. This is done to ensure that the node isn't
left running in a semi-functional state, for example with the client
side working but the server has exited due to a configuration error.
@karknu
Copy link
Contributor Author

karknu commented Oct 21, 2020

This should fix #2519 too.

@karknu karknu requested a review from coot October 23, 2020 12:44
Copy link
Contributor

@coot coot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@karknu
Copy link
Contributor Author

karknu commented Oct 23, 2020

bors r+

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Oct 23, 2020

@iohk-bors iohk-bors bot merged commit 82d6922 into master Oct 23, 2020
@iohk-bors iohk-bors bot deleted the karknu/diffusion_fail branch October 23, 2020 17:09
coot pushed a commit that referenced this pull request May 16, 2022
2696: Handle exiting threads in diffusion layer r=karknu a=karknu

If a thread spawned by diffusion exits for any reason we kill
all other threads and return. This is done to ensure that the node isn't
left running in a semi-functional state, for example with the client
side working but the server has exited due to a configuration error.

Example output:
`cardano-node: Network.Socket.bind: unsupported operation (Can't assign requested address)⏎`

Co-authored-by: Karl Knutsson <karl.knutsson@iohk.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants