-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make CAS and AC server not to use the same store #1175
Make CAS and AC server not to use the same store #1175
Conversation
Hope you to review this. cc: @allada , @MarcusSorealheis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 2 discussions need to be resolved
a discussion (no related file):
We should also create an integration test for this.
src/bin/nativelink.rs
line 230 at r1 (raw file):
// Check if CAS and AC uses the same store. if let Some(cas_config) = &services.cas {
This is O(N^2)
. Can we instead create a hashmap, of one of them then index into the other to check? This will make it O(max(N, M))
which is literally exponentially more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 2 discussions need to be resolved
a discussion (no related file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
We should also create an integration test for this.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 2 discussions need to be resolved
a discussion (no related file):
Previously, MarcusSorealheis (Marcus Eagan) wrote…
+1
I will add integration test in mins.
src/bin/nativelink.rs
line 230 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
This is
O(N^2)
. Can we instead create a hashmap, of one of them then index into the other to check? This will make itO(max(N, M))
which is literally exponentially more efficient.
My algorithm is already O(N * logM).
I do not agree to this.
Well, fyi, cas_config
and ac_config
are already hashmaps here.
There is only one for loop here for looping through the hashmap cas_config
.
And then ac_config.get(instance_name)
- this part is O(logM) because ac_config
is already HashMap
.
Since we have hashmaps by default there's no need to create a new hashmap or something.
Hope you to check the code again. cc: @allada
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 2 discussions need to be resolved
a discussion (no related file):
Previously, aleksdmladenovic (Aleksandar Mladenovic) wrote…
I will add integration test in mins.
cc: @allada , @MarcusSorealheis
Oh wait, @aleksdmladenovic you're already working on this issue. I'll check it now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a test, for sure. All you need to do is include a config JSON where the AC and CAS use the same store type.
Reviewable status: 0 of 1 LGTMs obtained, and 2 discussions need to be resolved
452ba6b
to
5aba799
Compare
5aba799
to
eb06c5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an integration test for this. cc: @allada , @MarcusSorealheis
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-init, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 2 discussions need to be resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some weird failure not related to the PR, I believe, at first glance. It was caused by some permissions issue with Tekton so I re-ran the commit. We should not be able to fail due to Tekton cc @aaronmondal in case it happens again.
I'll dig into the test now.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Remote / large-ubuntu-22.04, integration-tests (20.04), integration-tests (22.04), pre-commit-checks, and 2 discussions need to be resolved
a discussion (no related file):
Previously, aleksdmladenovic (Aleksandar Mladenovic) wrote…
cc: @allada , @MarcusSorealheis
Needs test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also have some pre-commit issues to work out. Probably something like trailing whitespace.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Remote / large-ubuntu-22.04, integration-tests (20.04), integration-tests (22.04), pre-commit-checks, and 2 discussions need to be resolved
eb06c5f
to
d9e2ce8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. @MarcusSorealheis
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-init, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 2 discussions need to be resolved
a discussion (no related file):
Previously, MarcusSorealheis (Marcus Eagan) wrote…
Needs test.
Done. cc: @allada , @MarcusSorealheis
89c5ba1
to
9d5e661
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made it pass all CI/CD. FYI, I've installed nix environment locally and finally it went through all pre-commit checks.
For my further commits, it won't fall into pre-commit issues again. :)
Waiting for your feedback. cc: @allada, @MarcusSorealheis
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Publish nativelink-worker-init, Remote / large-ubuntu-22.04, and 2 discussions need to be resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 3 discussions need to be resolved
integration_tests/store_conflict_test.sh
line 26 at r4 (raw file):
EXIT_CODE=0 sudo docker compose --file docker-compose.test.yml up -d
@aaronmondal , this feels like the wrong way to use docker for integration tests. Do you see any low hanging fruit on how to make this less hacky?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 3 discussions need to be resolved
integration_tests/store_conflict_test.sh
line 26 at r4 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
@aaronmondal , this feels like the wrong way to use docker for integration tests. Do you see any low hanging fruit on how to make this less hacky?
First I thought no, but actually yes!
For things that only need to test specific config options we can start the executable directly without wrapping it in Docker. The nativelink-is-installable
test does this:
- The test: https://github.com/TraceMachina/nativelink/blob/main/tools/nativelink-is-executable-test.nix (you could create the nativelink config inline in the script).
- Import it in the flake:
Line 142 in 3574149
nativelink-is-executable-test = import ./tools/nativelink-is-executable-test.nix {inherit pkgs nativelink;}; - Export it as flake output:
Line 232 in 3574149
inherit publish-ghcr local-image-test nativelink-is-executable-test nativelink nativelink-debug native-cli lre-cc nativelink-worker-init; - Add it as a github action or in this case just add it as a step above the step here so it can reuse the cache:
nativelink/.github/workflows/nix.yaml
Line 135 in 3574149
- name: Test nix run
Magicians. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 3 discussions need to be resolved
integration_tests/store_conflict_test.sh
line 26 at r4 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
First I thought no, but actually yes!
For things that only need to test specific config options we can start the executable directly without wrapping it in Docker. The
nativelink-is-installable
test does this:
- The test: https://github.com/TraceMachina/nativelink/blob/main/tools/nativelink-is-executable-test.nix (you could create the nativelink config inline in the script).
- Import it in the flake:
Line 142 in 3574149
nativelink-is-executable-test = import ./tools/nativelink-is-executable-test.nix {inherit pkgs nativelink;}; - Export it as flake output:
Line 232 in 3574149
inherit publish-ghcr local-image-test nativelink-is-executable-test nativelink nativelink-debug native-cli lre-cc nativelink-worker-init; - Add it as a github action or in this case just add it as a step above the step here so it can reuse the cache:
nativelink/.github/workflows/nix.yaml
Line 135 in 3574149
- name: Test nix run
Thanks. @aaronmondal.
Let me try that way. :)
f35455b
to
36b0491
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I added an integration test for checking this feature in a non-hacky way using nix and github nix workflow in a way as @aaronmondal has mentioned. :)
I hope you to check this and work on other issues.
cc: @aaronmondal . @allada , @MarcusSorealheis
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 6 files reviewed, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-init, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 3 discussions need to be resolved
36b0491
to
101fbbc
Compare
This feature is something we'd ideally like to get into the next release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 6 files reviewed, and 5 discussions need to be resolved
src/bin/nativelink.rs
line 254 at r5 (raw file):
return Err(make_err!( Code::InvalidArgument, "CAS and AC use the same store '{}' in the config",
The error text here should be:
"CAS and AC cannot use the same store '{}' in the config"
It's open question for @bclark8923 and @adam-singer as to whether or not we should also link to the documentation on this issue.
tools/nativelink-prevent-store-conflict-test.nix
line 62 at r5 (raw file):
print_error_output=$(cat <<EOF Error: Error { code: InvalidArgument, messages: ["CAS and AC use the same store 'FILESYSTEM_STORE' in the config"] }
again, we should be clear that users cannot use
101fbbc
to
cc1d398
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed your changes. cc: @MarcusSorealheis, @allada
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 6 files reviewed, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-init, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 5 discussions need to be resolved
src/bin/nativelink.rs
line 254 at r5 (raw file):
Previously, MarcusSorealheis (Marcus Eagan) wrote…
The error text here should be:
"CAS and AC cannot use the same store '{}' in the config"
It's open question for @bclark8923 and @adam-singer as to whether or not we should also link to the documentation on this issue.
Done.
tools/nativelink-prevent-store-conflict-test.nix
line 62 at r5 (raw file):
Previously, MarcusSorealheis (Marcus Eagan) wrote…
again, we should be clear that users cannot use
Done.
cc1d398
to
22de268
Compare
Rebased. cc: @MarcusSorealheis |
22de268
to
ae50187
Compare
Hope you to merge this. cc: @allada, @MarcusSorealheis, @aaronmondal |
When CAS and AC uses the same store, it might produce unexpected side-effects and wrong behaviors. To prevent system from such situation, it currently raises an error when CAS and AC uses the same store. An integration test is added for this using nix test file and github's nix workflow.
ae50187
to
57b63e8
Compare
Rebased and handled conflicts. Hope you to check this. cc: @MarcusSorealheis, @allada, @aaronmondal |
#1555 obviates this PR |
Description
When CAS and AC uses the same store, it might produce unexpected side-effects and wrong behaviors. To prevent system from such situation, it currently raises an error when CAS and AC uses the same store.
Fixes #768
Type of change
Please delete options that aren't relevant.
How Has This Been Tested?
Please also list any relevant details for your test configuration
Checklist
bazel test //...
passes locallygit amend
see some docsThis change is