Document volatile Action Cache implications #667

Closed
lukts30 opened this issue Feb 18, 2024 · 4 comments · Fixed by #669

lukts30 commented Feb 18, 2024

Currently, after I have built my project, I can run buck2 clean && buck2 build and everything is fetched from nativelink without redoing any build steps.
But if I restart nativelink with basic_cas.json, without deleting any data, then on the subsequent buck2 build the GetActionResultRequest returns NotFound. Inputs to that action that were previously uploaded are reused, so for buck2 I only see re_execute but no re_upload.

initial build:

2024-02-18T11:43:32.049511Z  INFO nativelink_service::ac_server: get_action_result Req: GetActionResultRequest { instance_name: "main", action_digest: Some(Digest { hash: "51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd", size_bytes: 142 }), inline_stdout: false, inline_stderr: false, inline_output_files: [], digest_function: Unknown }
2024-02-18T11:43:32.049554Z  INFO nativelink_service::ac_server: get_action_result Resp: 0.00004321 Some("51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd") Err(Error { code: NotFound, messages: ["Hash 51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd not found"] })

buck2 clean && buck2 build:

2024-02-18T11:56:45.917431Z  INFO nativelink_service::ac_server: get_action_result Req: GetActionResultRequest { instance_name: "main", action_digest: Some(Digest { hash: "51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd", size_bytes: 142 }), inline_stdout: false, inline_stderr: false, inline_output_files: [], digest_function: Unknown }
2024-02-18T11:56:45.917457Z  INFO nativelink_service::ac_server: get_action_result Resp: 0.00002689 Some("51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd") Ok(Response

restart nativelink:

2024-02-18T12:02:57.462135Z  INFO nativelink_service::ac_server: get_action_result Req: GetActionResultRequest { instance_name: "main", action_digest: Some(Digest { hash: "51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd", size_bytes: 142 }), inline_stdout: false, inline_stderr: false, inline_output_files: [], digest_function: Unknown }
2024-02-18T12:02:57.462179Z  INFO nativelink_service::ac_server: get_action_result Resp: 0.000044111 Some("51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd") Err(Error { code: NotFound, messages: ["Hash 51073140fcbccabef2195dca45f8d457fbbf0f3623800c3f506cf7c72ec32cbd not found"] })

Tests were done with commit 2a89ce6 because of #665.
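
The three response lines above show the pattern miss, hit, miss for the same action digest. As a rough aid for spotting this in a longer log, here is a minimal Python sketch that classifies `get_action_result Resp:` lines. The regular expression is inferred only from the samples quoted in this issue, so the log layout is an assumption and may differ between nativelink versions; `classify` is a hypothetical helper, not part of nativelink.

```python
import re

# Pattern for the "get_action_result Resp:" lines quoted above. The layout is
# inferred from those samples only, so treat it as an assumption.
RESP_RE = re.compile(
    r'get_action_result Resp: \S+ Some\("(?P<digest>[0-9a-f]+)"\) (?P<rest>.*)'
)


def classify(line):
    """Return (digest, outcome) for a response line, or None for any other line.

    outcome is "hit" for Ok(...), "miss" for a NotFound error, "error" otherwise.
    """
    match = RESP_RE.search(line)
    if match is None:
        return None
    rest = match["rest"]
    if rest.startswith("Ok("):
        outcome = "hit"
    elif rest.startswith("Err(") and "code: NotFound" in rest:
        outcome = "miss"
    else:
        outcome = "error"
    return match["digest"], outcome
```

Request lines (`get_action_result Req:`) do not match the pattern and come back as `None`, so the function can be applied to every line of a log without pre-filtering.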

allada (Member) commented Feb 18, 2024

Yes, this is because basic_cas.json uses a memory store for the Action Cache, so its contents are lost when nativelink restarts:

    "AC_MAIN_STORE": {
      "memory": {
        "eviction_policy": {
          // 100mb.
          "max_bytes": 100000000,
        }
      }
    },

Try swapping it out for a filesystem store:

    "AC_MAIN_STORE": {
      "filesystem": {
        "content_path": "/tmp/nativelink/data-worker-test/content_path-ac",
        "temp_path": "/tmp/nativelink/data-worker-test/tmp_path-ac",
        "eviction_policy": {
          // 1gb.
          "max_bytes": 1000000000,
        }
      }
    },
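
To make the difference between the two configurations concrete, here is a small Python sketch of a memory-backed and a filesystem-backed key/value store across a simulated restart. It is only an illustration of the behaviour described in this thread, not nativelink's implementation: the class names, the `survives_restart` helper, and the one-file-per-key layout are all assumptions made for the example.

```python
import os
import tempfile


class MemoryStore:
    """Keeps entries in a dict; everything is lost when the process exits."""

    def __init__(self):
        self._entries = {}

    def put(self, key, value):
        self._entries[key] = value

    def get(self, key):
        return self._entries.get(key)


class FilesystemStore:
    """Keeps one file per key under content_path, so entries outlive the process."""

    def __init__(self, content_path):
        self._content_path = content_path
        os.makedirs(content_path, exist_ok=True)

    def put(self, key, value):
        with open(os.path.join(self._content_path, key), "wb") as handle:
            handle.write(value)

    def get(self, key):
        try:
            with open(os.path.join(self._content_path, key), "rb") as handle:
                return handle.read()
        except FileNotFoundError:
            return None


def survives_restart(make_store):
    """Write an entry, build a fresh store (simulating a restart), read it back."""
    make_store().put("action-digest", b"action-result")
    return make_store().get("action-digest") is not None


with tempfile.TemporaryDirectory() as root:
    print(survives_restart(MemoryStore))                    # False
    print(survives_restart(lambda: FilesystemStore(root)))  # True
```

The memory variant corresponds to the NotFound seen after the restart: the action digest is the same, but the fresh store has no record of it.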

@MarcusSorealheis or @bclark8923, could one of you make a PR that either changes basic_cas.json to use a filesystem store or adds some documentation around this?

lukts30 (Author) commented Feb 19, 2024

Oh, that explains it. I naively thought of it as a run-time accelerated store that would still be persisted somehow.
For my use case I changed the action cache to the filesystem store.
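
For anyone who wants to check their own configuration for the same pitfall, a minimal Python sketch follows. It assumes the store definitions have already been loaded into a dict shaped like the snippets in this thread (store name mapped to a single backend key such as `memory` or `filesystem`). `volatile_stores` is a hypothetical helper, and it only inspects the top-level backend key, so it would not notice a memory store nested inside some other store type.

```python
def volatile_stores(stores):
    """Return the names of stores whose top-level backend key is "memory"."""
    return sorted(name for name, spec in stores.items() if "memory" in spec)


# The two AC_MAIN_STORE variants quoted in this thread, as Python dicts.
memory_variant = {
    "AC_MAIN_STORE": {
        "memory": {"eviction_policy": {"max_bytes": 100000000}},
    },
}
filesystem_variant = {
    "AC_MAIN_STORE": {
        "filesystem": {
            "content_path": "/tmp/nativelink/data-worker-test/content_path-ac",
            "temp_path": "/tmp/nativelink/data-worker-test/tmp_path-ac",
            "eviction_policy": {"max_bytes": 1000000000},
        },
    },
}

print(volatile_stores(memory_variant))      # ['AC_MAIN_STORE']
print(volatile_stores(filesystem_variant))  # []
```

Note that the config snippets above contain comments and trailing commas, so Python's standard `json` module would reject the file as written; loading it is left out of the sketch for that reason.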

lukts30 changed the title from "GetActionResultRequest returns NotFound after restarting nativelink" to "Document volatile Action Cache implications" on Feb 19, 2024
MarcusSorealheis self-assigned this on Feb 19, 2024

MarcusSorealheis (Collaborator) commented

I'll take care of it. After thinking about it for a few minutes @allada, I think we need to move entirely away from the memory store.

This issue sort of highlights why. We need to replace the memory store with an in-memory database that has some semblance of durability, among other factors. Targeting a system's internal memory without a database layer in between can lead to lots of problems that should already be solved. Perhaps "replace" is too strong.

allada (Member) commented Feb 20, 2024

I have mixed feelings about moving to an out-of-process memory store. We should have this conversation in another thread though.
