-
Notifications
You must be signed in to change notification settings - Fork 490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tests: close open file descriptors from make full node #5057
tests: close open file descriptors from make full node #5057
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5057 +/- ##
==========================================
+ Coverage 53.56% 53.57% +0.01%
==========================================
Files 430 430
Lines 54091 54096 +5
==========================================
+ Hits 28975 28984 +9
+ Misses 22869 22867 -2
+ Partials 2247 2245 -2
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catches Shant
node/node.go
Outdated
@@ -988,6 +993,7 @@ func (node *AlgorandFullNode) loadParticipationKeys() error { | |||
return err | |||
} | |||
} | |||
part.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is probably not a right place since there are lots of return
statement within the if block above.
I suggest to wrap the if block in a function and defer closing like this:
if err != nil {
func() {
defer part.Close()
handle.Close()
...
}()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
part is used when deleting part keys. So, closing here is not right.
@@ -460,6 +460,11 @@ func (node *AlgorandFullNode) Stop() { | |||
} | |||
} | |||
|
|||
func (node *AlgorandFullNode) Shutdown() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is it different from Stop() ? mayb move these close statements there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
registry, err := ensureParticipationDB(genesisDir, node.log)
opened in MakeFull
and not reopened in Start
, so closing in Shutdown
:
node.accountManager.Registry().Close()`
Same for
crashAccess, err := db.MakeAccessor(crashPathname, false, false)
opened in MakeFull
and not reopened in Start
, so closing in Shutdown
:
node.agreementService.Accessor.Close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then Shutdown should be called in daemon/algod/server.go
's Server::Stop
please rebase on top of master |
* node_test runs multiple nodes in a single process that leads to file descriptors leak (see algorand#5057 for more details). * just closing ledger is not enough because of concurrent operations evaluaiton operations done by transaction pool. * made transaction pool shutdown-able and stop it before ledger termination.
* node_test runs multiple nodes in a single process that leads to file descriptors leak (see algorand#5057 for more details). * just closing ledger is not enough because of concurrent operations evaluation operations done by transaction pool. * made transaction pool shutdown-able and stop it before ledger termination.
Pulled some changes into #6039 - with new p2p tests in node_test this became an issue. |
The tests using MakeFull node leave behind many many open files.
This PR closes these.
Some are closed as part of the test procedure, others needed changes to the node.go code.
The node has Start and Stop procedures, but some file descriptors are opened before Start, and not closed at Atop.
For these, Shutdown procedure is introduced to close them.
Despite these improvements, the test still fails occasionally. The reason I think is the network connections not getting stablished. So, this tests are still flaky, and should not be enabled yet.