Make maven_jar and friends smarter by re-using previously fetched artifacts across different projects #1752
Comments
+1, it's super annoying to have these re-downloaded on every new workspace. This is a little tricky to implement, because we need a way to clear the "master cache", whether that's clearing or updating individual entries that might be corrupt or out of date, or just avoiding taking up all of the user's disk space. We'll also have to be careful about correctness; perhaps using the cache should require a hash. (There were other requests for this a while ago, but now I can't find them. Will link if I come across them.)
I've thought about these problems as well while re-implementing
Aha, related to #1266.
Thanks for the link. There is also a similar feature request mentioned in #1266 on the dev mailing list. As pointed out by @jin, it would be trivial to teach
Initial thoughts: a basic design uses a `maven_local_repository` rule in the WORKSPACE file:

```python
load("@bazel_tools//tools/build_defs/repo:maven_rules.bzl", "maven_local_repository")

maven_local_repository(
    path = "/home/johndoe/.m2",
)
```

This folder will then be symlinked to each
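For illustration only, a repository rule along the lines sketched above might be implemented roughly like this; the names, the generated BUILD content, and the symlink layout are assumptions, not the actual maven_rules.bzl implementation:

```python
# Hypothetical sketch of a rule that exposes a local Maven repository
# (e.g. ~/.m2) inside Bazel's external root via a symlink.
def _maven_local_repository_impl(repository_ctx):
    # Symlink the user-provided local repository into this external repo.
    repository_ctx.symlink(repository_ctx.attr.path, "m2")
    # Generate a BUILD file so the repository contents are addressable.
    repository_ctx.file(
        "BUILD",
        'filegroup(name = "artifacts", srcs = glob(["m2/**"]), visibility = ["//visibility:public"])\n',
    )

maven_local_repository = repository_rule(
    implementation = _maven_local_repository_impl,
    attrs = {"path": attr.string(mandatory = True)},
    local = True,
)
```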
This would need to work for all repository rules, not just Maven. Right now we download/link stuff into
Caching the HttpDownloadValue is probably the easiest way to go forward. But we might want to expose that caching capability a bit more directly (so that execute results can also be cached).
Generally, caching any download with a SHA would be great. This is a major pain point for our developers, as we have a lot of downloads (many external repos: maven_jar + git_repository).
…ted HttpCache skeleton to implement the caching logic of HttpDownloadValues as the first step (more types of caches will come later). Having RepositoryDelegatorFunction initialize the cache in the respective RepositoryFunction handlers decouples the cache implementation from itself. It delegates the choice of Cache classes to the respective RepositoryFunctions, and lets them decide what to do with the PathFragment of the cache location. Continuation of commit 239d995. A follow-up CL will contain the implementation of HttpCache. For now, it's the empty interface of com.google.common.cache.Cache.

GITHUB: #1752 -- MOS_MIGRATED_REVID=135400724
To set and use a RepositoryCache instance in HttpDownloader while parsing the command line options, we can pass an AtomicReference<HttpDownloader> instance from BazelRepositoryModule to the HttpArchiveFunctions. However, we'll need to change HttpDownloader download() calls to be non-static in order to initialize an instance of HttpDownloader in BazelRepositoryModule.

Remaining TODOs:
- RepositoryCache implementation and unit testing
- RepositoryCache lockfiles
- RepositoryCache integration testing

GITHUB: #1752 -- MOS_MIGRATED_REVID=136593517
This is a basic implementation of writing and reading HttpDownloader download artifacts, keyed by the artifact's SHA256 checksum. For an artifact to be cached, its SHA256 value needs to be specified in the rule. Rules supported: http_archive, new_http_archive, http_file, http_jar.

Remaining TODOs:
- Lockfiles for concurrent operations in the cache.
- Integration testing

GITHUB: #1752 -- MOS_MIGRATED_REVID=137289206
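As a concrete illustration of that requirement, a WORKSPACE entry along the following lines declares the checksum that makes the download cacheable. The URL and sha256 below are placeholders, and the load statement reflects current Bazel, where these rules live in @bazel_tools//tools/build_defs/repo:http.bzl; at the time of this thread they were still native rules.

```python
# Placeholder example: an http_jar whose sha256 makes it eligible for the
# repository cache. URL and checksum are not real.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_jar")

http_jar(
    name = "example_lib",
    urls = ["https://example.com/example-lib-1.0.jar"],
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",
)
```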
Remaining TODOs:
- Lockfiles for concurrent operations in the cache.

GITHUB: #1752 -- MOS_MIGRATED_REVID=137296606
0590483 now lets you use
…download_and_execute().

GITHUB: #1752 -- MOS_MIGRATED_REVID=137535936
…line instantiation of HttpDownloader and RepositoryCache in BazelRepositoryModule. There are sufficient similarities between the download flows of HttpDownloader and MavenDownloader that MavenDownloader can extend HttpDownloader and reuse methods such as checkCache and download.

GITHUB: #1752 -- MOS_MIGRATED_REVID=137982375
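The actual implementation lives in Bazel's Java code (RepositoryCache, HttpDownloader, MavenDownloader). Purely to illustrate the check-cache-then-download flow these commits describe, here is a minimal Python sketch of a content-addressed download cache keyed by SHA-256; every name in it is illustrative and none of it is Bazel's real code.

```python
# Illustrative sketch of a content-addressed download cache, keyed by the
# expected SHA-256 of the artifact. Not Bazel's actual implementation.
import hashlib
import os
import shutil
import urllib.request

CACHE_DIR = os.path.expanduser("~/.cache/example-repo-cache")

def fetch(url, expected_sha256, dest):
    cached = os.path.join(CACHE_DIR, expected_sha256)
    if os.path.exists(cached):
        # Cache hit: reuse the previously downloaded artifact.
        shutil.copy(cached, dest)
        return dest
    # Cache miss: download, verify the checksum, then store by hash.
    urllib.request.urlretrieve(url, dest)
    with open(dest, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected_sha256:
        raise ValueError("checksum mismatch for %s" % url)
    os.makedirs(CACHE_DIR, exist_ok=True)
    shutil.copy(dest, cached)
    return dest
```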
GITHUB: #1752 -- MOS_MIGRATED_REVID=138072464
With 38e54ac,
Thanks. This is very much appreciated! I have built from the tip of master and am trying to integrate this great feature into the Gerrit Code Review Bazel build, and have some questions: Neither
Note that
I added this feature to Buck 2 years ago: facebook/buck@a1ba001. For now I tested it with

Question: Can it be that the cached artifacts are copied to the external artifacts and not linked? I cannot see that symbolic links are used. Any particular reason not to use symbolic links for that?
We may change this in the future, but for now we decided to use copies to simplify cache cleanup. Bazel options generally don't support ~ nor $HOME; I filed #2054 to gauge interest and have a discussion.
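(For example, since ~ and $HOME are not expanded, the cache location has to be passed as an absolute path. Assuming the cache directory is configured through the repository cache command-line flag, which in current Bazel is `--repository_cache`, an invocation would look like `bazel build --repository_cache=/home/johndoe/bazel-repo-cache //...`; the exact flag name available in the version discussed in this thread is an assumption here and may have differed.)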
Aside from all the benefits mentioned above, I think that having such a cache makes it easy to use Bazel repositories with package managers that do not allow (by policy) the build process to download anything on its own.
We are deprecating native maven_jar.
What is "native maven_jar"? Is this

Regardless of the answer, the documentation says:
Could someone give an example of how to transform the following example:

to the

My guess is, in WORKSPACE:

but what to put in the antlr.BUILD file? a

and directly use the jar in the targets, similar to how
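(Sketching one possible answer, purely for illustration: if the jar is fetched in the WORKSPACE by a rule that takes a build_file attribute, such as new_http_archive, the referenced antlr.BUILD could simply wrap the jar in a java_import. The target name and glob pattern below are placeholders.)

```python
# Hypothetical antlr.BUILD sketch: expose the fetched jar(s) through a
# java_import so other packages can depend on @antlr//:antlr.
java_import(
    name = "antlr",
    jars = glob(["*.jar"]),  # placeholder; adjust to the actual jar path
    visibility = ["//visibility:public"],
)
```

If the jar is fetched with http_jar instead, no BUILD file is needed at all: that rule already exposes the downloaded jar as @<name>//jar.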
The current `maven_jar()` implementation is limited to using the fetched artifact only for a specific project. It doesn't provide a solution for very basic requirements (that native Maven would provide):

See the Gerrit Code Review `maven_jar` Bucklet [1] for an implementation that gets it right. That implementation puts all fetched artifacts in a project-independent area and links the artifacts into the project output [2]. More context is here: [3, 4]. Bandwidth is just too valuable a resource to throw away (or ignore) previously downloaded artifacts and re-fetch gigabytes of data again.

Make `buck fetch` smarter by keeping previously fetched artifacts: facebook/buck#602