-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix possible deadlock if max_open_files set too low #908
Conversation
0dfef3f
to
0b3ab20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Bazel Dev / ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, ubuntu-20.04 / stable, vale, and 1 discussions need to be resolved (waiting on @adam)
nativelink-util/src/lib.rs
line 39 at r1 (raw file):
/// Initialize tracing. pub fn init_tracing() -> Result<(), nativelink_error::Error> {
I did this to help debug tests, so figured I'd sneak it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 15 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and 1 discussions need to be resolved
nativelink-macro/src/lib.rs
line 40 at r2 (raw file):
let _ = nativelink_util::init_tracing(); // If already set it's ok. let _ = nativelink_util::fs::set_idle_file_descriptor_timeout(std::time::Duration::from_millis(100));
nit: Seem like something that should be a parameter defaulted to 100ms but configurable, 150-300ms hopefully should be max budget to complete a request in an ideal scenario
nativelink-store/src/filesystem_store.rs
line 805 at r2 (raw file):
}), ); drop(file);
nit: lets comment on why the drop since we are explicit about releasing the reference within a function
nativelink-store/tests/filesystem_store_test.rs
line 1421 at r2 (raw file):
async fn update_with_whole_file_closes_file() -> Result<(), Error> { let mut permits = vec![]; // Grab al permits to ensure only 1 permit is available.
nit: "Grab al" -> "Grab all"
nativelink-store/tests/filesystem_store_test.rs
line 1473 at r2 (raw file):
async fn update_with_whole_file_slow_path_when_low_file_descriptors() -> Result<(), Error> { let mut permits = vec![]; // Grab al permits to ensure only 1 permit is available.
nit: "Grab al" -> "Grab all"
nativelink-store/tests/filesystem_store_test.rs
line 1492 at r2 (raw file):
let store = Box::pin(FastSlowStore::new( // Note: The config is not needed for this test, so use dummy data. &nativelink_config::stores::FastSlowStore {
nit: alias or import the struct (applied to same below, if existing code is like that I guess leave)
nativelink-util/src/fs.rs
line 372 at r2 (raw file):
Ok(( permit, std::fs::File::options()
this code makes me happy and sad at the same time.
nativelink-util/src/lib.rs
line 42 at r2 (raw file):
use tracing_subscriber::prelude::*; static LOGGING_INITIALIZED: std::sync::Mutex<bool> = std::sync::Mutex::new(false);
nit: use std::sync::Mutex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 1 LGTMs obtained, and 1 discussions need to be resolved
We now close the file permit if the write future takes too long. fixes TraceMachina#907
0b3ab20
to
a6c91ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Local / ubuntu-22.04, Publish image, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04
nativelink-macro/src/lib.rs
line 40 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
nit: Seem like something that should be a parameter defaulted to 100ms but configurable, 150-300ms hopefully should be max budget to complete a request in an ideal scenario
I do agree, but also don't want to spend time dealing with macros on this. For now i'll leave it this way. When we need it parameterized lets do it then.
Keep in mind this is test-only code.
nativelink-store/src/filesystem_store.rs
line 805 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
nit: lets comment on why the drop since we are explicit about releasing the reference within a function
Done.
nativelink-store/tests/filesystem_store_test.rs
line 1421 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
nit: "Grab al" -> "Grab all"
Done.
nativelink-store/tests/filesystem_store_test.rs
line 1473 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
nit: "Grab al" -> "Grab all"
Done.
nativelink-store/tests/filesystem_store_test.rs
line 1492 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
nit: alias or import the struct (applied to same below, if existing code is like that I guess leave)
Yeah to reduce the risk of name collision, we give full paths for all config variables. This is common in this entire file and most of the rest of the tests.
nativelink-util/src/fs.rs
line 372 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
this code makes me happy and sad at the same time.
Yeah. I found a case that we are using the same descriptor to write then read, and std::fs::File::create()
opens in write-only mode.
nativelink-util/src/lib.rs
line 42 at r2 (raw file):
Previously, adam-singer (Adam Singer) wrote…
nit: use
std::sync::Mutex
I think you meant parking_lot
. Done.
We now close the file permit if the write future takes too long.
fixes #907
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)