-
Notifications
You must be signed in to change notification settings - Fork 390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add comprehensive support for remote docker. #785
Conversation
This is a massive PR, so absolutely no rush, and I'll likely be incrementally improving the code quality, but reviews are greatly appreciated, as our beta-testers. |
1951761
to
3812d6d
Compare
I believe this might break private SSH dependency support, but I'm not sure if there's a great solution to this anyway. Maybe copying over the entire registry would work? |
this is great! will have to take some time to take it all in |
Just a few things as considerations:
Basically, a config file or 2 environment variable settings saying:
Thoughts? Good idea? Default both of these to off, but enable their use? |
2c8b1a5
to
f99be56
Compare
Ok I've added the registry support and cache support, which should further speed up execution time. Not setting Setting Both are disabled by default, since on a remote machine it can be quite slow. We use a temporary directory to store the files when copying only partial directories, to avoid too many calls to docker (quite slow). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the files are getting rather large, could you split out the remote support stuff into it's own module, e.g inside docker/remote.rs
4839df9
to
dccae12
Compare
I've refactored the codebase more thoroughly:
This should help isolated the behavior better between local and shared, and help |
470e5ac
to
54f2717
Compare
I've rebased after #794 and updated everything to use the remote options if |
So a few things have been resolved in the latest commit:
Giving the However.... #724 is reproducing, and without a way to run Other than that, I'll test this more but I think it's ready. |
For
Since it's a persistent data volume and we need to copy the data back to the host to make any changes, merely cleaning it in the volume won't be enough on its own, so we should do both. So it's worth adding it as a subcommand, but only if all those conditions are met also running it in the container, and then also falling back to the host. So I'll add this as a separate commit to the PR. |
why do we need to run cargo clean on the host as well? |
Because it copies the data back over to the host, although I guess we could skip this unless the copy cache is set, since a primary use of cross is to distribute cross-compiled binaries. I guess we could always make this a cross-util command? |
was mostly wondering, we copy over the target folder so makes sense to me to clean on the host. |
This is ready for review, and adds support for
A few things we can add to our wishlist later:
It might be safer currently to just delete the project directory, and then always copy over all the files, so it always works? This would also mean if the target directory is a subdirectory of the project, there are no incremental builds are therefore they are significantly slower. |
Create and robustly delete temporary data, which is stored in the user data directory under `${data_dir}/cross-rs/tmp`. Therefore, if program execution is ever halted, deleting the directory will automatically cleanup any remaining data. A termination handler cleans up the temporary data on exit by clearing a stack of temporary files/directories, and we have 2 resource handlers that cleanup the temporary data when the object is dropped. The new installer for the startup hooks is `install_exit_hooks`, which installs both the panic and termination handlers. To get the directory containing all temporary data, use `cross::temp::dir()`. An example of using temporary directories and files is: ```rust { // these are all safe due to single-threaded execution let tmpdir1 = unsafe { temp::TempFile::new() }?; let tmpdir2 = unsafe { temp::TempFile::new() }?; let tmpfile1 = unsafe { temp::TempFile::new() }?; let tmpfile2 = unsafe { temp::TempFile::new() }?; for entry in fs::read_dir(tmpdir1.path()) { ... } for entry in fs::read_dir(tmpdir2.path()) { ... } } // cleanup tmpfile2 -> tmpfile1 -> tmpdir2 -> tmpdir1 ``` Note that only these 2 wrappers are provided, since it guarantees the temporary file and directory stack stays ordered and resources are cleaned up as desired.
This supports the volume-based structure, and uses some nice optimizations to ensure that only the desired toolchain and cargo items are copied over. It also uses drops to ensure scoped deletion of resources, to avoid complex logic ensuring their cleanup. It also supports persistent data volumes, through `cross-util`. In order to setup a persistent data volume, use: ```bash cross-util volumes create ``` This will create a persistent data volume specific for your current toolchain, and will be shared by all targets. This volume is specific for the toolchain version and channel, so a new volume will be created for each toolchain. Make sure you provide your `DOCKER_HOST` or correct engine type to ensure these are being made on the remote host. Then, run your command as before: ```bash CROSS_REMOTE=1 cross build --target arm-unknown-linux-gnueabihf ``` Finally, you can clean up the generated volume using: ```bash cross-util volumes remove ``` A few other utilities are present in `cross-util`: - `volumes list`: list all volumes created by cross. - `volumes remove`: remove all volumes created by cross. - `volumes prune`: prune all volumes unassociated with a container. - `containers list`: list all active containers created by cross. - `containers remove`: remove all active containers created by cross. - `clean`: clean all temporary data (images, containers, volumes, temp data) created by cross. The initial implementation was done by Marc Schreiber, schrieveslaach. A few more environment variables exist to fine-tune performance, as well as handle private dependencies. - `CROSS_REMOTE_COPY_REGISTRY`: copy the cargo registry - `CROSS_REMOTE_COPY_CACHE`: copy cache directories, including the target directory. Both of these generally lead to substantial performance penalties, but can enable the use of private SSH dependencies. In either case, the use of persistent data volumes is highly recommended. Fixes cross-rs#248. Fixes cross-rs#273. Closes cross-rs#449.
Required to patch cross-rs#724 without deleting the entire volume for persistent data volumes. A few changes were required: the entire `/cross` mount prefix must be owned by the user so `/cross/target` can be removed. Next, we use the full path to the mounted target directory, rather than the symlink, since `cargo clean` would just delete the symlink. Finally, we've added `cargo clean` to a list of known subcommands, and it only needs docker if we use a remote host.
Updates the persistent volume using a fingerprint of all files in the project, skipping any cache directories by default. If the file modified date has changed, or the file has been added, copy it to the volume and update it. If the file has been removed, then remove it from the host. To avoid a massive command-line argument, we copy a file containing each changed file on a line to the container, and then remove each file by running a script on the container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
I think this all looks good, one todo is trying to figure out if we're in remote context and spit out a nice hint to use CROSS_REMOTE
.
feel free to r= me
That might be good, it can be quite tricky to unambiguously determine if we in a remote context, but it incurs some seriously overhead (see the comment here), which might matter when normally running on localhost. |
bors r=Emilgardis |
785: Add comprehensive support for remote docker. r=Emilgardis a=Alexhuszagh Add comprehensive support for remote docker. This supports the volume-based structure, and uses some nice optimizations to ensure that only the desired toolchain and cargo items are copied over. It also uses drops to ensure scoped deletion of resources, to avoid complex logic ensuring their cleanup. It also supports persistent data volumes, through `cross-util`. In order to setup a persistent data volume, use: ```bash cross-util volumes create ``` This will create a persistent data volume specific for your current toolchain, and will be shared by all targets. This volume is specific for the toolchain version and channel, so a new volume will be created for each toolchain. Make sure you provide your `DOCKER_HOST` or correct engine type to ensure these are being made on the remote host. Then, run your command as before: ```bash CROSS_REMOTE=1 cross build --target arm-unknown-linux-gnueabihf ``` Finally, you can clean up the generated volume using: ```bash cross-util volumes remove ``` A few other utilities are present in `cross-util`: - `volumes list`: list all volumes created by cross. - `volumes remove`: remove all volumes created by cross. - `volumes prune`: prune all volumes unassociated with a container. - `containers list`: list all active containers created by cross. - `containers remove`: remove all active containers created by cross. - `clean`: clean all temporary data (images, containers, volumes, temp data) created by cross. The initial implementation was done by Marc Schreiber, schrieveslaach. A few more environment variables exist to fine-tune performance, as well as handle private dependencies. - `CROSS_REMOTE_COPY_REGISTRY`: copy the cargo registry - `CROSS_REMOTE_COPY_CACHE`: copy cache directories, including the target directory. Both of these generally lead to substantial performance penalties, but can enable the use of private SSH dependencies. In either case, the use of persistent data volumes is highly recommended. Fixes #248. Fixes #273. Closes #449. Co-authored-by: Alex Huszagh <ahuszagh@gmail.com>
bors r- There's a merge conflict I think, or we might get a bad merge. Going to rebase and merge. |
Canceled. |
Nevermind, my fault, it actually merges fine. |
Build succeeded: |
Add comprehensive support for remote docker.
This supports the volume-based structure, and uses some nice optimizations to ensure that only the desired toolchain and cargo items are copied over. It also uses drops to ensure scoped deletion of resources, to avoid complex logic ensuring their cleanup.
It also supports persistent data volumes, through
cross-util
. In order to setup a persistent data volume, use:This will create a persistent data volume specific for your current toolchain, and will be shared by all targets. This volume is specific for the toolchain version and channel, so a new volume will be created for each toolchain.
Make sure you provide your
DOCKER_HOST
or correct engine type to ensure these are being made on the remote host. Then, run your command as before:Finally, you can clean up the generated volume using:
A few other utilities are present in
cross-util
:volumes list
: list all volumes created by cross.volumes remove
: remove all volumes created by cross.volumes prune
: prune all volumes unassociated with a container.containers list
: list all active containers created by cross.containers remove
: remove all active containers created by cross.clean
: clean all temporary data (images, containers, volumes, temp data) created by cross.The initial implementation was done by Marc Schreiber, schrieveslaach.
A few more environment variables exist to fine-tune performance, as well as handle private dependencies.
CROSS_REMOTE_COPY_REGISTRY
: copy the cargo registryCROSS_REMOTE_COPY_CACHE
: copy cache directories, including the target directory.Both of these generally lead to substantial performance penalties, but can enable the use of private SSH dependencies. In either case, the use of persistent data volumes is highly recommended.
Fixes #248.
Fixes #273.
Closes #449.