Create a skylark repository rule for maven artifacts #1410
Comments
Added a proposal sorta related to this in #1733. Feel free to close it out and schlep it into this issue. |
This is an initial implementation of the maven_jar rule in Skylark, targeted at the FRs in issue bazelbuild#1410.

Attributes `name`, `artifact`, `repository`, `sha1` have been implemented, but not `server`.

This implementation uses `wget` as the underlying fetch mechanism for remote artifacts to simplify dependencies. My original implementation made use of an underlying call to the `mvn` binary and its `dependency:get` plugin, but it brought along complexities with parsing `pom.xml` files and creating local repositories within Bazel's cache. My personal opinion here is that it's easier to build up from `wget` than to trim down complexities when using `mvn`.

With regard to `server`, there are some limitations with retrieving a maven_server's attribute at Loading Phase without the use of hacky macros (issue bazelbuild#1704), and even if macros are used, the maven_server is not treated as an actual dependency by maven_jar, hence it will not get analyzed during Analysis Phase. There is a test (`test_unimplemented_server_attr`) to ensure that an error message is shown to users if they use the `server` attribute with this rule.

I will have to put more work into implementing maven_server appropriately, and possibly proposing an API review of maven_jar in another change set.

Change-Id: I64166dd251d5d268b525dc219cef424a5b5534a1
This is an initial implementation of the maven_jar rule in Skylark, targeted at the FRs in issue bazelbuild#1410.

Attributes `name`, `artifact`, `repository`, `sha1` have been implemented, but not `server`.

Implemented a wrapper around the maven binary to pull dependencies from remote repositories into a directory under {output_base}/external.

Caveat: this rule assumes that the Maven dependency is installed in the system. Hence, the maven_skylark_test integration tests are tagged with "manual" because the Bazel CI isn't configured with the Maven binary yet.

Added a serve_not_found helper for 404 response tests.

Added a `maven_local_repository` rule to fetch the initial maven dependency plugin for bazel tests.

With regard to `server`, there are some limitations with retrieving a maven_server's attribute at Loading Phase without the use of hacky macros (issue bazelbuild#1704), and even if macros are used, the maven_server is not treated as an actual dependency by maven_jar. There is a test (`test_unimplemented_server_attr`) to ensure that an error message is shown to users if they use the `server` attribute with this rule.

I will have to put more work into implementing maven_server appropriately, and possibly proposing an API review of maven_jar in another change set.

Change-Id: I167f9d13835c30be971928b4cc60167a8e396893
**Experimental**

This is an initial implementation of the maven_jar rule in Skylark, targeted at the FRs in issue #1410.

Implemented a wrapper around the maven binary to pull dependencies from remote repositories into a directory under {output_base}/external.

Attributes `name`, `artifact`, `repository`, `sha1` have been implemented, but not `server`.

Caveat: this rule assumes that the Maven dependency is installed in the system. Hence, the maven_skylark_test integration tests are tagged with "manual" and commented out because the Bazel CI isn't configured with the Maven binary yet.

Added a serve_not_found helper for 404 response tests.

Usage:

```
load("@bazel_tools//tools/build_defs/repo:maven_rules.bzl", "maven_jar")

maven_jar(
    name = "com_google_guava_guava",
    artifact = "com.google.guava:guava:18.0",
    sha1 = "cce0823396aa693798f8882e64213b1772032b09",
    repository = "http://uk.maven.org/maven2",
)
```

With regard to `server`, there are some limitations with retrieving a maven_server's attribute at Loading Phase without the use of hacky macros (issue #1704), and even if macros are used, the maven_server is not treated as an actual dependency by maven_jar. There is a test (`test_unimplemented_server_attr`) to ensure that an error message is shown to users if they use the `server` attribute with this rule.

--
Change-Id: I167f9d13835c30be971928b4cc60167a8e396893
Reviewed-on: https://bazel-review.googlesource.com/c/5770
MOS_MIGRATED_REVID=133971809
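For readers wondering how an `artifact` string such as `com.google.guava:guava:18.0` becomes a download, here is a minimal Python sketch of the standard Maven repository layout plus a SHA-1 check. This is not the code of the Skylark rule from the commits above; the function names `artifact_url` and `sha1_matches` are made up for illustration, and the sketch assumes a plain `group:artifact:version` coordinate with `jar` packaging.

```python
import hashlib


def artifact_url(repository, artifact, packaging="jar"):
    """Build the download URL for a group:artifact:version coordinate.

    Hypothetical helper: follows the standard Maven 2 layout, where dots in
    the group id become path separators.
    """
    group_id, artifact_id, version = artifact.split(":")
    path = "/".join([
        group_id.replace(".", "/"),
        artifact_id,
        version,
        "%s-%s.%s" % (artifact_id, version, packaging),
    ])
    return repository.rstrip("/") + "/" + path


def sha1_matches(data, expected_sha1):
    """Return True if the SHA-1 of the downloaded bytes equals the expected hex digest."""
    return hashlib.sha1(data).hexdigest() == expected_sha1.lower()


url = artifact_url("http://uk.maven.org/maven2", "com.google.guava:guava:18.0")
```

A fetcher would download `url` by whatever means it likes and refuse the file when `sha1_matches` returns `False`.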
👍 for adding transitive deps to this. is there any technical reason that we're aware of that we haven't made transitive deps work yet? |
It depends on what you mean by transitive deps working. The biggest problem right now I feel is that maven_jar doesn't let one define the dependency relationships. I've fixed this in the java_import_external repository rule which I'll be contributing to Bazel shortly. I've also built a web GUI which I'm currently seeking approval to launch which will make it easy for users to generate configurations for this rule. The web GUI will read the pom.xml files from the Maven server, resolve transitive and diamond dependencies, and create code that shows you exactly what's going into your project. I feel like this is the best direction for Bazel. It leads to much faster builds which are actually hermetically sealed without magic. |
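To make "resolve transitive and diamond dependencies" concrete, here is a rough Python sketch of the graph walk involved. It is not the web GUI's code. The dependency map is a hypothetical in-memory stand-in for what would be read from `pom.xml` files, and the sketch ignores version conflicts, scopes and exclusions, which a real resolver has to handle.

```python
from collections import deque


def transitive_closure(root, direct_deps):
    """Return root plus everything reachable from it, each artifact once.

    direct_deps maps an artifact to the list of artifacts it depends on
    directly. A breadth-first walk with a visited set means a diamond
    (two paths to the same artifact) yields that artifact only once.
    """
    order = []
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for dep in direct_deps.get(node, []):
            if dep not in visited:
                visited.add(dep)
                queue.append(dep)
    return order


# Hypothetical diamond: app -> a -> common and app -> b -> common.
GRAPH = {
    "app": ["a", "b"],
    "a": ["common"],
    "b": ["common"],
}
resolved = transitive_closure("app", GRAPH)
```

A generator could then emit one rule per entry in `resolved`, with the `deps` of each rule taken from `GRAPH`, so the relationships stay visible in the checked-in config.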
By transitive deps working, I mean the rule fetching the dependency relationships from the Maven server and not requiring the developer to specify them. |
In order to do that in a repository rule, it would probably be necessary to have the rule shade all the transitive jars into the root jar. That means rewriting the transitive class names, rewriting the byte code, and then the code size increases quadratically. |
Hmmm, I'm not sure I follow. Why would it be necessary to shade the transitive jars and rewrite class names? Perhaps I'm missing something, but the way I would expect it to work would be:
I don't know how to do 1, but I assume there must be a way since other build tools do this. |
Having a single remote repository for all the maven jars required by the project, and each individual jar being its own rule within the repository, would avoid the need for shading. E.g. But doing things that way introduces another problem. What if another Bazel project depends on that Bazel project? It would have to adopt |
There's a lot of value to not fetching transitive dependencies auto-magically. For example, with the web gui I just wrote, I generated the following config for |
@jart The web gui sounds awesome. Are you close to open sourcing it? |
Expect it at some point in the upcoming months. I need to go through the process. I've also got a lot of other stuff on my plate with TensorFlow. |
Only option 2 supports AAR files. It would be great if whatever solution we settled on supported arbitrary artifact packaging types. |
@kchodorow I've mailed you a changelist adding java_import_external to Bazel. The community should be able to expect it soon. I've also added very helpful documentation with examples. |
Speaking as a Bazel newbie, presenting multiple solutions for maven migration is very confusing. A single, well supported, documented and "official" maven migration solution would be really nice, and I think is key for driving bazel adoption for Java projects. |
@jart we (scala people) have a need to be able to turn off ijar creation for some external jars. |
If the Bazel authors add an attribute to |
Thanks! @kchodorow are you the right person to ask? |
This Skylark rule is a replacement for maven_jar. See also #1410 PiperOrigin-RevId: 164642813
@jart did you end up adding |
@cgrushko Indeed I did. It was added to the Bazel codebase 28 days ago in 062fe70. Judging by the baseline, it doesn't look like it made it into 0.5.4, but it's certain to make it into the next one. I hope you enjoy this rule. Usage examples can be found in Closure Rules, Nomulus, and many other places. |
@jart Does that rule support authentication to a private maven repo (Artifactory in our case)? If not, any ETA? |
Hey @jart |
Any news regarding that web tool you've been mentioning in other bugs @jart |
Behold Bazel Maven Config Generator in #3946 and the demo video on YouTube. @or-shachar @StephenAmar |
From a quick glance of the above PR, it looks like this does not support private Maven repos such as Artifactory? |
@wstrange I don't see why it wouldn't. It also depends on what you mean. For example, you can just sed "repo1.maven.org" in index.html to whatever and it'll crawl the POMs. If you want it to be able to crawl multiple POM repos, that might not be a trivial change. Also keep in mind that java_import_external has no awareness of POM metadata. It just grabs jars from whatever URL. I'm also pretty sure Bazel's downloader can do HTTP auth using environment variables. See ProxyHelper.java. It's also probably possible to put the user:pass in the URLs themselves, although you might not want to check that into your codebase. It's also worth mentioning that the Google Drive mirroring feature sort of magically and painlessly creates your own private Maven server on the fly. Although it just mirrors the JARs since that's all java_import_external needs. |
[Disclaimer: I am a Bazel newbie, so the questions I am asking may not make sense ;-) ] The way our Artifactory repo works is that there could be several different repos defined, and each has a potentially different set of credentials. So the http auth credentials used by java_import_external would vary depending on which repo the dependency is coming from. Maven handles all of this by using the credentials defined in ~/.m2/settings.xml. It is not clear to me how to accomplish the same thing with Bazel. |
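For context on the `~/.m2/settings.xml` mechanism mentioned above: Maven keeps credentials in `<server>` entries and matches each entry's `<id>` against the `<id>` of a repository. A small Python sketch of reading those entries follows. The function name `read_servers` and the sample ids and credentials are made up for illustration, and nothing here is something Bazel does on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings.xml content; a real file lives at ~/.m2/settings.xml.
SAMPLE = """<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <servers>
    <server>
      <id>initech-releases</id>
      <username>aladdin</username>
      <password>opensesame</password>
    </server>
  </servers>
</settings>"""


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_servers(xml_text):
    """Return {server id: (username, password)} from settings.xml text."""
    servers = {}
    for elem in ET.fromstring(xml_text).iter():
        if local(elem.tag) != "server":
            continue
        fields = {local(child.tag): (child.text or "").strip() for child in elem}
        if "id" in fields:
            servers[fields["id"]] = (fields.get("username"), fields.get("password"))
    return servers


credentials = read_servers(SAMPLE)
```

Any tool that wanted to mimic Maven's behavior would still need its own mapping from repository URL to server id, since `settings.xml` only stores credentials by id.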
Is Artifactory sort of like a really robust Squid caching proxy? Reading about it, I couldn't help but notice that Artifactory Enterprise Edition offers five-nines availability. I actually have a great deal of respect for the JFrog developers, for having achieved this level of reliability. It's a level of engineering most thought only AT&T and Chubby could master. Even Google Cloud Storage, with its transcontinental redundancy, is only able to promise three-nines. However But it might need improvement when it comes to that private authentication use case. It's one I haven't considered, because I mostly do open source stuff. Also internally at Google we just vendor everything in our monolithic repo.

One thing you could do is put this in your zone:

    $TTL 0
    artifacts IN A 192.168.10.4
              IN A 192.168.10.5
              IN A 192.168.10.6

Put this on your servers:

    # Python 2 script: a forwarding proxy that adds per-host credentials.
    import BaseHTTPServer
    import SocketServer
    import base64
    import httplib
    import shutil
    import urlparse

    basic = lambda u,p: 'Basic %s' % base64.b64encode('%s:%s' % (u,p))

    # Authorization header to inject, keyed by the upstream host[:port].
    AUTHORIZATIONS = {
      'maven.initech.com': basic('aladdin', 'opensesame'),
      'maven.vendoro.com': basic('aladdin', 'opensesame'),
      'localhost:5000': basic('aladdin', 'opensesame'),
    }

    class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
      def go(self):
        # A proxy request carries the absolute URL in the request line.
        ru = urlparse.urlparse(self.path)
        pu = urlparse.ParseResult('', '', ru.path, ru.params, ru.query, ru.fragment)
        auth = AUTHORIZATIONS.get(str(ru.netloc))
        if auth:
          self.headers['Authorization'] = auth
        self.headers['Host'] = ru.netloc
        if ru.scheme == 'https':
          c = httplib.HTTPSConnection(ru.netloc)
        else:
          c = httplib.HTTPConnection(ru.netloc)
        try:
          # Replay the request upstream, then stream the response back.
          c.putrequest(self.command, pu.geturl())
          for k, v in self.headers.items():
            c.putheader(k, v)
          c.endheaders()
          r = c.getresponse()
          self.send_response(r.status)
          for k, v in r.getheaders():
            self.send_header(k, v)
          self.end_headers()
          shutil.copyfileobj(r, self.wfile)
          self.wfile.flush()
        finally:
          c.close()
      do_GET = go
      do_HEAD = go

    class ThreadedHTTPServer(SocketServer.ThreadingMixIn,
                             BaseHTTPServer.HTTPServer):
      daemon_threads = True

    ThreadedHTTPServer(('', 4000), Handler).serve_forever()

Then run Bazel like this:

    $ HTTP_PROXY=http://artifacts:4000 bazel build //...

And you should be good. |
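The proxy script above uses Python 2 modules (`BaseHTTPServer`, `httplib`, `urlparse`). For anyone adapting it, here is a hedged Python 3 sketch of just the credential-selection part, which is where the two versions differ most: `base64.b64encode` needs bytes in Python 3. The helper name `authorization_for` is made up for illustration, and the hosts and credentials are the same placeholder values as in the script.

```python
import base64
from urllib.parse import urlparse


def basic(user, password):
    """Build an HTTP Basic Authorization header value (Python 3: encode to bytes first)."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Placeholder hosts and credentials, keyed by the URL's host[:port].
AUTHORIZATIONS = {
    "maven.initech.com": basic("aladdin", "opensesame"),
    "localhost:5000": basic("aladdin", "opensesame"),
}


def authorization_for(url):
    """Return the header value for the URL's host, or None if no entry matches."""
    return AUTHORIZATIONS.get(urlparse(url).netloc)


header = authorization_for("https://maven.initech.com/repo/a/b/1.0/b-1.0.jar")
```

The rest of a Python 3 port would swap in `http.server`, `socketserver` and `http.client` for the Python 2 modules; that part is not shown here.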
So I think what you are saying is that when you are at 10 nines of availability, you have no place to go. Bazel goes to 11 nines. Artifactory and Nexus are very common in the "enterprise" space. If Bazel is to attract hordes of Java developers (and that may not be a goal ;-) ), having first class support for private maven repositories (with authentication) is essential. The proxy idea is super creative (I really appreciate you taking the time to put together a solution). I'll review it - but I think it will be a non starter in my organization. The solution has to be integrated and out of the box. I return to looking at Bazel every 6 months or so, because we desperately need something like it (maven build and test times are getting absurd). But I have to sell this internally, and the maven migration experience is just not there yet. I'll be back though ;-) |
Hi Warren. I'd encourage you to file an issue on rules_maven. It uses gradle to resolve transitive deps under the hood. As gradle already factors in the |
@wstrange I encourage you to file a feature request asking for the ability to add to say I can't speak for the Bazel team or Google, but I'm sure they want nothing more than the largest number of people to benefit from Bazel as possible. While we're in the business of sharing world-class technology, we can't always be in the business of solutions, and some assembly is required. I think that's OK, because it creates opportunities for entrepreneurs to build those turn-key solutions on top of the work we're sharing. For example, nothing would make me happier than to see someone come along, take that Apps Script I posted a few comments ago, and get rich turning it into a business. If that ends up being one of you, buy me a drink next time you're in the Bay Area. |
@jart Thanks a lot for the config generator. It was very useful. A tricky question for you though. Any ideas? |
I would advise against doing anything nontrivial in |
All such feature requests now belong in https://github.com/bazelbuild/rules_jvm_external |
Some FRs that have come up in the past: