Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement get_tree() feature #905

Merged

Conversation

aleksdmladenovic
Copy link
Contributor

@aleksdmladenovic aleksdmladenovic commented May 9, 2024

Description

I handled get_tree() GRPC request in the CAS Server. Used FIFO(First-In-First-Out) traversal method ( BFS Algorithm ) to loop over directories using VecDeque data structure. Paging is supported and page_token in GetTreeRequest is used to point the directory to start traversing with. And next_page_token in GetTreeResponse refers the page_token parameter which will be used in the next request.

Fixes #355

Type of change

Please delete options that aren't relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Still not tested yet.

Checklist

  • Updated documentation if needed
  • Tests added/amended
  • bazel test //... passes locally
  • PR is contained in a single commit, using git amend see some docs

This change is Reviewable

@aleksdmladenovic
Copy link
Contributor Author

+@allada, @MarcusSorealheis

@adam-singer
Copy link
Contributor

@aleksdmladenovic can we add tests ?

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good, but needs some tests.

There's also a few performance improvements that could be made, but it's fine to omit them for now.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and 4 discussions need to be resolved


nativelink-service/src/cas_server.rs line 251 at r1 (raw file):

        let mut deque: VecDeque<DigestInfo> = VecDeque::new();
        let mut directories: Vec<Directory> = Vec::new();
        let page_token = inner_request.page_token;

nit: Lets convert this to a DigestInfo and error otherwise. If it's an empty string consider it ok.

You should not use hash_str(), because it also needs to match the size to be in spec.


nativelink-service/src/cas_server.rs line 257 at r1 (raw file):

        // `next_page_token` will return the `hash_str` of the next request round's first directory digest.
        // It will be an empty string when it reached the end of the directory tree.
        let mut next_page_token: String = String::new();

nit: Use digest and convert it last. This saves on allocations too.


nativelink-service/src/cas_server.rs line 261 at r1 (raw file):

        while !deque.is_empty() {
            let digest: DigestInfo = *deque.front().err_tip(|| "In VecDeque::front")?;

nit: use .pop_front() here.


nativelink-service/src/cas_server.rs line 269 at r1 (raw file):

            }
            if page_token_matched {
                if directories.len() as i32 == page_size + 1 {

nit: call directories.push(directory.clone()) first, so you don't need to add + 1 here.

Copy link
Contributor Author

@aleksdmladenovic aleksdmladenovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added tests and fixed nits you mentioned.

cc: @allada , @adam-singer

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), macos-13, ubuntu-22.04, and 7 discussions need to be resolved


nativelink-service/src/cas_server.rs line 252 at r2 (raw file):

        let mut directories: Vec<Directory> = Vec::new();
        // `page_token` will return the `{hash_str}:{size_bytes}` of the current request's first directory digest.
        let page_token_parts: Vec<&str> = inner_request.page_token.split(":").collect();

nit: This is really inefficient and causes allocations. Instead you should use .split("-") then .next() on the iterator to get the two parts into the variables (with .err_tip()).


nativelink-service/src/cas_server.rs line 252 at r2 (raw file):

        let mut directories: Vec<Directory> = Vec::new();
        // `page_token` will return the `{hash_str}:{size_bytes}` of the current request's first directory digest.
        let page_token_parts: Vec<&str> = inner_request.page_token.split(":").collect();

nit: Don't use : use -, this is standard in the rest of the code.


nativelink-service/src/cas_server.rs line 292 at r2 (raw file):

        // `next_page_token` will return the `{hash_str}:{size_bytes}` of the next request's first directory digest.
        // It will be an empty string when it reached the end of the directory tree.
        let next_page_token: String = if let Some(value) = deque.front() {

nit: Because we are now grabbing the front-item, we are introducing a fence-post case where deque.front() could be None because the for directory in directory.directories { did not run.

We should move the for-loop above the if page_token_matched { branch to protect us.


nativelink-service/tests/cas_server_test.rs line 346 at r2 (raw file):

        let store_pinned = Pin::new(store_owned.as_ref());

        // Set up 5 sub-directories

nit: Period.


nativelink-service/tests/cas_server_test.rs line 361 at r2 (raw file):

                    mtime: Some(Timestamp {
                        seconds: 0,
                        nanos: i,

nit: put this in seconds. nanos might not get converted properly due to percision issues.


nativelink-service/tests/cas_server_test.rs line 393 at r2 (raw file):

        )
        .await?;
        let root_directory_digest: Digest = Digest::try_from(root_directory_digest_info).unwrap();

nit: use root_directory_digest_info.into() (ditto with others).


nativelink-service/tests/cas_server_test.rs line 431 at r2 (raw file):

        );

        // Must work when paging is enabled ( `page_size` is 2 ).

This should be a different test.

I suggest making a setup function that will get our state all prepared, then we can call it from different tests.

Copy link
Contributor Author

@aleksdmladenovic aleksdmladenovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved comments. cc:@allada, @adam-singer

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 1 discussions need to be resolved


nativelink-service/tests/cas_server_test.rs line 431 at r2 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

This should be a different test.

I suggest making a setup function that will get our state all prepared, then we can call it from different tests.

Done.

@steedmicro steedmicro requested a review from allada May 14, 2024 14:32
Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r3.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), macos-13, pre-commit-checks, ubuntu-22.04, and 6 discussions need to be resolved


nativelink-service/src/cas_server.rs line 260 at r3 (raw file):

                .next()
                .err_tip(|| "Failed to parse `size_bytes` in `page_token`")?
                .parse::<i64>()?,

nit: Please .err_tip() or .map_err().


nativelink-service/src/cas_server.rs line 261 at r3 (raw file):

                .err_tip(|| "Failed to parse `size_bytes` in `page_token`")?
                .parse::<i64>()?,
        )?;

ditto.


nativelink-service/src/cas_server.rs line 275 at r3 (raw file):

                page_token_matched = true;
            }
            for directory in directory.clone().directories {

nit: Clone not needed. You probably want to clone directory.digest() instead.


nativelink-service/src/cas_server.rs line 284 at r3 (raw file):

            }
            if page_token_matched {
                directories.push(directory.clone());

.clone() not needed.


nativelink-service/tests/cas_server_test.rs line 343 at r3 (raw file):

    ) -> Result<
        (
            Digest,

nit: Make a struct that represents these to make it more readable.


nativelink-service/tests/cas_server_test.rs line 349 at r3 (raw file):

            Vec<DigestInfo>,
        ),
        Box<dyn std::error::Error>,

nit: Use nativelink's Error.

Copy link
Contributor Author

@aleksdmladenovic aleksdmladenovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, and 2 discussions need to be resolved


nativelink-service/src/cas_server.rs line 261 at r3 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

ditto.

Done.


nativelink-service/src/cas_server.rs line 284 at r3 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

.clone() not needed.

Done.

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 2 files at r4, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Cargo Dev / macos-13, and 1 discussions need to be resolved


nativelink-service/tests/cas_server_test.rs line 341 at r4 (raw file):

    struct SetupDirectoryResult {
        root_directory_digest: Digest,

nit: just use DigestInfo, it's always safe to convert directly to Digest.

@aleksdmladenovic aleksdmladenovic force-pushed the implement-get-tree branch 2 times, most recently from 421103e to 0be5aeb Compare May 14, 2024 18:28
Copy link
Contributor Author

@aleksdmladenovic aleksdmladenovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved. cc: @allada

Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04 / stable, windows-2022 / stable

@MarcusSorealheis
Copy link
Collaborator

This looks good. thank you for the PR aleksdmladenovic!

Copy link
Collaborator

@MarcusSorealheis MarcusSorealheis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 2 of 1 LGTMs obtained, and pending CI: pre-commit-checks

@MarcusSorealheis
Copy link
Collaborator

@allada If this looks good still, please merge it.Even though we don't use it, a lot of tools do use it so maybe it will be helpful to NativeLink users.

Copy link
Contributor

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 2 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: 3 of 1 LGTMs obtained, and pending CI: Cargo Dev / macos-13, macos-13

Handled get_tree() GRPC request in the CAS Server.
Used FIFO(First-In-First-Out) traversal method ( BFS Algorithm ) to
loop over directories using `VecDeque` data structure.
Paging is supported and `page_token` in `GetTreeRequest` is used to
point the directory to start traversing with. And `next_page_token` in
`GetTreeResponse` refers the `page_token` parameter which will be used
in the next request.
@MarcusSorealheis
Copy link
Collaborator

@aleksdmladenovic what was the reason for the most recent push?

@MarcusSorealheis
Copy link
Collaborator

Looks like it may have been a rebase but correct me if I am wrong.

@aleksdmladenovic
Copy link
Contributor Author

Just rebase. cc: @MarcusSorealheis

Copy link
Collaborator

@MarcusSorealheis MarcusSorealheis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: again due to rebase

Reviewed all commit messages.
Reviewable status: :shipit: complete! 3 of 1 LGTMs obtained

@MarcusSorealheis MarcusSorealheis merged commit ae44878 into TraceMachina:main May 16, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement GetTree()
4 participants