Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Too many open files" error when running teos tests #119

Closed
Tracked by #543
seberm opened this issue Sep 8, 2022 · 7 comments · Fixed by #121
Closed
Tracked by #543

"Too many open files" error when running teos tests #119

seberm opened this issue Sep 8, 2022 · 7 comments · Fixed by #121

Comments

@seberm
Copy link

seberm commented Sep 8, 2022

Hello,
when I am running teos tests, I am hitting the standard soft limit of 1024 file descriptors (NOFILE).

Do you really need to exceed this limit or is this just a resource leak which needs a fix?

Due to this problem, we are unable to run teos tests in nix-bitcoin:

Thanks.

CC: @meryacine , @erikarvstedt

$ ulimit -a                                                                                                                                                                                              15:09:35
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         unlimited
-m: resident set size (kbytes)      unlimited
-u: processes                       23029
-n: file descriptors                1024
-l: locked-in-memory size (kbytes)  131072
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 23029
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15: rt cpu time (microseconds)   unlimited

Reproducer:

$ cargo test
...
test responder::tests::test_handle_breach_rejected ... ok                                                              
test responder::tests::test_has_tracker ... FAILED                                                                                                                                                                                            
test responder::tests::test_rebroadcast_accepted ... FAILED                                                            
test responder::tests::test_rebroadcast_rejected ... FAILED                                                                                                                                                                                   
test responder::tests::test_responder_new ... FAILED                                                                                                                                                                                          
test watcher::tests::test_add_appointment ... FAILED                                                                   
test watcher::tests::test_block_disconnected ... FAILED                                                                                                                                                                                       
test watcher::tests::test_cache_fix ... ok                                                                                                                                                                                                    
test watcher::tests::test_cache_new ... ok           
test watcher::tests::test_cache_update ... ok                                                                                                                                                                                                 
test watcher::tests::test_delete_appointments ... FAILED                                                                                                                                                                                      
test watcher::tests::test_delete_appointments_from_memory ... FAILED
test watcher::tests::test_filter_breaches ... FAILED                                                                                                                                                                                          
test watcher::tests::test_filtered_block_connected ... FAILED                                                                                                                                                                                 
test watcher::tests::test_get_appointment ... FAILED    
test watcher::tests::test_get_breaches ... FAILED                                                                                                                                                                                             
test watcher::tests::test_new ... FAILED                                                                                                                                                                                                      
test watcher::tests::test_register ... FAILED            
test watcher::tests::test_store_appointment ... FAILED                                                                                                                                                                                        
test watcher::tests::test_store_triggered_appointment ... FAILED                                                                                                                                                                              
test carrier::tests::test_send_transaction_connection_error ... ok   
                                                                                                                                                                                                                                              
failures:                                                  
                                                           
---- responder::tests::test_has_tracker stdout ----                                                                                                                                                                                           
thread 'responder::tests::test_has_tracker' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
                                                                                                                                                                                                                                              
---- responder::tests::test_rebroadcast_accepted stdout ----
thread 'responder::tests::test_rebroadcast_accepted' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14
                                                                                                                                                                                                                                              
---- responder::tests::test_rebroadcast_rejected stdout ----
thread 'responder::tests::test_rebroadcast_rejected' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14
                                                                                                                                                                                                                                              
---- responder::tests::test_responder_new stdout ----      
thread 'responder::tests::test_responder_new' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14
                                                                                                                                                                                                                                              
---- watcher::tests::test_add_appointment stdout ----      
thread 'watcher::tests::test_add_appointment' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14
                                                                                                                                                                                                                                              
---- watcher::tests::test_block_disconnected stdout ----
thread 'watcher::tests::test_block_disconnected' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_delete_appointments stdout ----
thread 'watcher::tests::test_delete_appointments' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_delete_appointments_from_memory stdout ----
thread 'watcher::tests::test_delete_appointments_from_memory' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_filter_breaches stdout ----
thread 'watcher::tests::test_filter_breaches' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_filtered_block_connected stdout ----
thread 'watcher::tests::test_filtered_block_connected' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_get_appointment stdout ----
thread 'watcher::tests::test_get_appointment' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_get_breaches stdout ----
thread 'watcher::tests::test_get_breaches' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_new stdout ----
thread 'watcher::tests::test_new' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_register stdout ----
thread 'watcher::tests::test_register' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_store_appointment stdout ----
thread 'watcher::tests::test_store_appointment' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14

---- watcher::tests::test_store_triggered_appointment stdout ----
thread 'watcher::tests::test_store_triggered_appointment' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }', teos/src/test_utils.rs:575:14


failures:
    responder::tests::test_has_tracker
    responder::tests::test_rebroadcast_accepted
    responder::tests::test_rebroadcast_rejected
    responder::tests::test_responder_new
    watcher::tests::test_add_appointment
    watcher::tests::test_block_disconnected
    watcher::tests::test_delete_appointments
    watcher::tests::test_delete_appointments_from_memory
    watcher::tests::test_filter_breaches
    watcher::tests::test_filtered_block_connected
    watcher::tests::test_get_appointment
    watcher::tests::test_get_breaches
    watcher::tests::test_new
    watcher::tests::test_register
    watcher::tests::test_store_appointment
    watcher::tests::test_store_triggered_appointment

test result: FAILED. 143 passed; 16 failed; 0 ignored; 0 measured; 0 filtered out; finished in 4.43s

error: test failed, to rerun pass '-p teos --lib'
...
@mariocynicys
Copy link
Collaborator

Can you try to limit the number of test jobs run by cargo and see if this resolves it:

cargo test -j 1

@seberm
Copy link
Author

seberm commented Sep 8, 2022

Can you try to limit the number of test jobs run by cargo and see if this resolves it:

cargo test -j 1

Unfortunately I am getting the same result with no concurent jobs.

BTW:
Are you able to reproduce this issue on your distro? Try to set the ulimit to 1024. You should be able to reproduce it.

For me, the value of ulimit -n 1300 is enough and all tests passed. But still, it seems you have a resource leak in your code and this should not be necessary.

@mariocynicys
Copy link
Collaborator

mariocynicys commented Sep 8, 2022

Are you able to reproduce this issue on your distro? Try to set the ulimit to 1024. You should be able to reproduce it.

I will try.
But one thing that confuses me now is that tests are working fine in CI :/. might have to also see what limits github uses.

Update1:
Yup, it fails after setting the soft limit to 1024 on my system (ubuntu 18.04).

Update2:
Just checked GitHub codespaces (which I think is similar to the CI instances) and the soft limit for open files was 1048576.

@sr-gi
Copy link
Member

sr-gi commented Sep 13, 2022

@seberm I can reproduce too. Will look into this.

@sr-gi
Copy link
Member

sr-gi commented Sep 14, 2022

Ok, I found what is causing the issue, but I haven't found a way to fix it yet.

Turns out this is happening due to our BitcoindMock, which is spawning a different jsonrpc_http_server for each test that needs to interact with a mocked bitcoind. We're talking about around ~100 tests here if I'm not mistaken.

Why does this consume more than a thousand fds I'm not sure.

I guess the alternatives are either re-design the mock so it does not use a real jsonrpc_http server or find why the current approach is taking so many resources. There may be a third option involving spinning a single server for all tests, however, I haven't seen this being a thing in Rust (it looks like there's a lack of proper fixtures / mocks).

@erikarvstedt
Copy link

erikarvstedt commented Sep 14, 2022

Why does this consume more than a thousand fds I'm not sure.

Because these servers are only started and never cleaned up during runtime. (The thread produced in start_server never exits during runtime.)
Either work around spawning many servers (as described in your post) or add a server helper type that stops the server on scope exit at the end of a test.

@sr-gi
Copy link
Member

sr-gi commented Sep 14, 2022

Ohhhhh. I didn't catch that! Really appreciated @erikarvstedt 😄

sr-gi added a commit to sr-gi/rust-teos that referenced this issue Sep 15, 2022
@sr-gi sr-gi linked a pull request Sep 15, 2022 that will close this issue
sr-gi added a commit to sr-gi/rust-teos that referenced this issue Sep 15, 2022
sr-gi added a commit to sr-gi/rust-teos that referenced this issue Sep 17, 2022
sr-gi added a commit to sr-gi/rust-teos that referenced this issue Sep 17, 2022
sr-gi added a commit to sr-gi/rust-teos that referenced this issue Sep 17, 2022
Wraps `CloseHandle` inside `BitcoindStopper` and implements `Drop` for the latter so the server does not need to be manually stopped
sr-gi added a commit that referenced this issue Sep 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants