Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remote_execution: implement OSS write_action_result for local_only cache uploads #764

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

thoughtpolice
Copy link
Contributor

@thoughtpolice thoughtpolice commented Sep 3, 2024

Forward port of part of #477, which implements code in the OSS path for putting results of actions marked local_only = True into the RBE ActionCache.

For people actually using the "remote execution" part of RBE and not just a cache, this rather large omission in the system will often be transparently mitigated because RBE services will not only record ActionCache entries on your behalf, they will also implement "EX-to-AC" forwarding where any execution calls will immediately get looked up in the ActionCache anyway as an optimization. Which means you'll never see this for things that properly hit CAS inputs/outputs.

However, a CommandExecutorConfig configured with remote_enabled = False can still enable remote_cache_enabled = allow_cache_uploads = True which will both enable the ActionCache/CAS support, and also allow uploads to those interfaces too.

The most useful example of this type of setup is in a CI system like GitHub Actions: where you run the build inside a container or image that provides something quasi-hermetic (e.g. including your toolchains), while the RBE system is purely a cache storing inputs/outputs/ActionCache entries and never uses remote execution.

The hope is that this functionality will soon be plug-and-play on GitHub using an RBE-to-GHA caching proxy, so Buck2 projects will be able to get a quick and easy cache, but without RE. Therefore this functionality will become highly useful.

Testing

I can submit tests too, though I'm not sure where. But, using the testing example from #477, running ./buck2 clean && BUCK_LOG=buck2_execute_impl=debug ./buck2 build -v3 //: where ./buck2 is the dotslash build from the 2024-09-02 tag gets me:

austin@GANON:~/tmp/buck2-test-477$ ./buck2 clean && BUCK_LOG=buck2_execute_impl=debug ./buck2 build -v3 //:
killing buckd server
Buck2 daemon pid 1862754 has exited
/home/austin/tmp/buck2-test-477/buck-out/v2/CACHEDIR.TAG
/home/austin/tmp/buck2-test-477/buck-out/v2/tmp
/home/austin/tmp/buck2-test-477/buck-out/v2/gen
/home/austin/tmp/buck2-test-477/buck-out/v2/log
/home/austin/tmp/buck2-test-477/buck-out/v2/forkserver
/home/austin/.buck/buckd/home/austin/tmp/buck2-test-477/v2
Starting new buck2 daemon...
Connected to new buck2 daemon.
Running action: root//:tests (<unspecified>) (stage0) (build), local executor: env -- 'TMPDIR=/home/austin/tmp/buck2-test-477/buck-out/v2/tmp/root/99fb5bba051d5fee/stage0' 'BUCK_SCRATCH_PATH=buck-out/v2/tmp/root/99fb5bba051d5fee/stage0' 'BUCK2_DAEMON_UUID=7f7061a9-f018-441c-8dbb-b5b149c0909b' 'BUCK_BUILD_ID=531fa58e-8263-468c-b908-7415b5e87944' /usr/bin/env sh -c '/usr/bin/env yes abcdefghijklmnopqrstuvwxyz | /usr/bin/head -c 65536 && echo done > "$1"' -- buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0
Build ID: 531fa58e-8263-468c-b908-7415b5e87944
Network: (GRPC-SESSION-ID)
Jobs completed: 10. Time elapsed: 0.1s.
Cache hits: 0%. Commands: 1 (cached: 0, remote: 0, local: 1)
[2024-09-03T17:04:15.718-05:00] 2024-09-03T22:04:15.564918Z  INFO buck2_execute_impl::executors::worker: Creating new WorkerPool
[2024-09-03T17:04:15.718-05:00] 2024-09-03T22:04:15.707678Z  INFO buck2_execute_impl::executors::local: Local execution command line:
[2024-09-03T17:04:15.718-05:00] ```
[2024-09-03T17:04:15.718-05:00] $ /usr/bin/env sh -c /usr/bin/env yes abcdefghijklmnopqrstuvwxyz | /usr/bin/head -c 65536 && echo done > "$1" -- buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0
[2024-09-03T17:04:15.718-05:00] ```
[2024-09-03T17:04:15.719-05:00] 2024-09-03T22:04:15.711154Z DEBUG buck2_execute_impl::executors::caching: Uploading action result for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142`
[2024-09-03T17:04:15.719-05:00] 2024-09-03T22:04:15.714243Z  WARN buck2_execute_impl::executors::caching: Cache upload for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142` failed: Upload for permission check: Remote Execution Error on write_action_result for ReSession GRPC-SESSION-ID
[2024-09-03T17:04:15.719-05:00] Error: (Not supported)
[2024-09-03T17:04:15.719-05:00] 2024-09-03T22:04:15.714274Z  INFO buck2_execute_impl::executors::caching: Dep file cache upload for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142` not attempted
[2024-09-03T17:04:15.719-05:00] 2024-09-03T22:04:15.714558Z DEBUG materialize_artifact{event_dispatcher=EventDispatcher(TraceId = 531fa58e-8263-468c-b908-7415b5e87944) path=buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0}: buck2_execute_impl::materializers::deferred: nothing to materialize, adding to access times buffer

The (Not supported) comes directly from the previous code and is the error mentioned by Neil and Cormac in the previous issue.

Executing with this patch (truncated output of the exact same command above):

Cache hits: 0%. Commands: 1 (cached: 0, remote: 0, local: 1)
[2024-09-03T17:06:01.780-05:00] 2024-09-03T22:06:01.594777Z  INFO buck2_execute_impl::executors::worker: Creating new WorkerPool
[2024-09-03T17:06:01.780-05:00] 2024-09-03T22:06:01.759943Z  INFO buck2_execute_impl::executors::local: Local execution command line:
[2024-09-03T17:06:01.780-05:00] ```
[2024-09-03T17:06:01.780-05:00] $ /usr/bin/env sh -c /usr/bin/env yes abcdefghijklmnopqrstuvwxyz | /usr/bin/head -c 65536 && echo done > "$1" -- buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0
[2024-09-03T17:06:01.780-05:00] ```
[2024-09-03T17:06:01.780-05:00] 2024-09-03T22:06:01.764632Z DEBUG buck2_execute_impl::executors::caching: Uploading action result for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142`
[2024-09-03T17:06:01.780-05:00] 2024-09-03T22:06:01.777826Z  INFO buck2_execute_impl::executors::caching: Cache upload for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142` succeeded
[2024-09-03T17:06:01.780-05:00] 2024-09-03T22:06:01.777862Z  INFO buck2_execute_impl::executors::caching: Dep file cache upload for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142` not attempted
[2024-09-03T17:06:01.780-05:00] 2024-09-03T22:06:01.778246Z DEBUG materialize_artifact{event_dispatcher=EventDispatcher(TraceId = 3c7147f3-2c6b-49e6-ba00-cfcf230b2dfc) path=buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0}: buck2_execute_impl::materializers::deferred: nothing to materialize, adding to access times buffer

And then running the command again, showing a cache hit:

Cache hits: 100%. Commands: 1 (cached: 1, remote: 0, local: 0)
[2024-09-03T17:06:04.975-05:00] 2024-09-03T22:06:04.834804Z  INFO buck2_execute_impl::executors::worker: Creating new WorkerPool
[2024-09-03T17:06:04.975-05:00] 2024-09-03T22:06:04.969252Z  INFO buck2_execute_impl::executors::action_cache: Action result is cached, skipping execution of:
[2024-09-03T17:06:04.975-05:00] ```
[2024-09-03T17:06:04.975-05:00] $ /usr/bin/env sh -c /usr/bin/env yes abcdefghijklmnopqrstuvwxyz | /usr/bin/head -c 65536 && echo done > "$1" -- buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0
[2024-09-03T17:06:04.976-05:00] ```
[2024-09-03T17:06:04.976-05:00]  for action `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142`
[2024-09-03T17:06:04.976-05:00] 2024-09-03T22:06:04.969338Z  INFO buck2_execute_impl::executors::caching: Cache upload for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142` not attempted
[2024-09-03T17:06:04.976-05:00] 2024-09-03T22:06:04.969349Z  INFO buck2_execute_impl::executors::caching: Dep file cache upload for `4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142` not attempted
[2024-09-03T17:06:04.976-05:00] 2024-09-03T22:06:04.969721Z DEBUG materialize_artifact{event_dispatcher=EventDispatcher(TraceId = 709c8472-7b45-4e22-af70-d72d789a98c0) path=buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0}: buck2_execute_impl::materializers::deferred: materialize artifact has_entry_and_method=true method=Some(CasDownload { info: CasDownloadInfo { origin: Execution(ActionExecutionOrigin { action_digest: [4d0d66e6fcec174bb91a2173d9dbf46900028215ab714b32cceb4439be742eae:142 expires at 1725401164], action_instant: 2024-09-03T22:06:04.969094718Z, ttl: TimeDelta { secs: 0, nanos: 0 } }), re_use_case: RemoteExecutorUseCase(Intern { pointer: InternedData { data: "buck2-default", hash: 13166193100074524988 } }) } }) has_deps=false version=2 cleaning=false
[2024-09-03T17:06:04.976-05:00] 2024-09-03T22:06:04.974376Z DEBUG materialization_finished{artifact_path=ProjectRelativePathBuf("buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0") timestamp=2024-09-03T22:06:04.969799792Z version=Version(2) path=buck-out/v2/gen/root/6dd044292ff31ae1/__tests__/stage0}: buck2_execute_impl::materializers::deferred: transition to Materialized has_deps=false

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 3, 2024
@thoughtpolice
Copy link
Contributor Author

@JakobDegen @ndmitchell PTAL if you get a chance. I think I've solved every major follow up for this change from the original PR, including Neil's (I know it's been a long time!), so I think this is OK to merge and strictly an improvement.

@thoughtpolice thoughtpolice force-pushed the aseipp/push-lpnpzpoovqpr branch 4 times, most recently from 586bb96 to 762456a Compare September 6, 2024 01:11
@adamhjk
Copy link
Contributor

adamhjk commented Sep 6, 2024

We're very interested in this - it would be a dramatic improvement for the way we use buck2 at System Initiative. Yum.

@thoughtpolice thoughtpolice force-pushed the aseipp/push-lpnpzpoovqpr branch 2 times, most recently from bdad79a to a7dc8a3 Compare September 8, 2024 22:26
@thoughtpolice thoughtpolice force-pushed the aseipp/push-lpnpzpoovqpr branch 2 times, most recently from b9d43ba to 0878208 Compare September 20, 2024 22:15
@thoughtpolice thoughtpolice force-pushed the aseipp/push-lpnpzpoovqpr branch 2 times, most recently from b57bed5 to f95ec50 Compare October 16, 2024 20:35
@thoughtpolice thoughtpolice force-pushed the aseipp/push-lpnpzpoovqpr branch 2 times, most recently from ba3e981 to 5e8cc1d Compare November 12, 2024 22:29
…` cache uploads

Forward port of part of <facebook#477>,
which implements code in the OSS path for putting results of actions marked
`local_only = True` into the RBE `ActionCache`.

For people actually using the "remote execution" part of RBE and not just a
cache, this rather large omission in the system will often be transparently
mitigated because RBE services will not only record `ActionCache` entries
on your behalf, they will also implement "EX-to-AC" forwarding where any
execution calls will immediately get looked up in the `ActionCache` anyway as an
optimization. Which means you'll never see this for things that properly hit CAS
inputs/outputs.

However, a `CommandExecutorConfig` configured with `remote_enabled = False` can
still enable `remote_cache_enabled = allow_cache_uploads = True` which will both
enable the ActionCache/CAS support, and also allow uploads to those interfaces
too.

The most useful example of this type of setup is in a CI system like GitHub
Actions: where you run the build inside a container or image that provides
something quasi-hermetic (e.g. including your toolchains), while the RBE system
is purely a cache storing inputs/outputs/ActionCache entries and never uses
remote execution.

The hope is that this functionality will soon be plug-and-play on GitHub using
an RBE-to-GHA caching proxy, so Buck2 projects will be able to get a quick and
easy cache, but without RE. Therefore this functionality will become highly
useful.

Co-authored-by: Austin Seipp <aseipp@pobox.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants