Fully Featured Gradle Support with the Gradle Tooling API #1164

Open
JLLeitschuh opened this issue May 23, 2019 · 55 comments
Labels
F: language-support (Issues specific to a particular language or ecosystem; may be paired with an L: label.)
Keep (Exempt this from being marked by stalebot)
L: java:gradle (Maven packages via Gradle)
T: feature-request (Requests for new features)

Comments

@JLLeitschuh

JLLeitschuh commented May 23, 2019

Foreword: I'm not intimately familiar with how Dependabot works, so please excuse me if I wildly misrepresent something about Dependabot.

Problem

Currently, Dependabot is attempting to parse build.gradle files as text files instead of fully understanding the complexities of Gradle Builds.

This is a problem because the dependencies declared in a build.gradle file don't tell the full story.

Additionally, since Gradle now supports the Gradle Kotlin DSL, there are now build.gradle.kts files. On top of that, users can also change the name of their build files so that they call the files my-project.gradle.kts and Gradle will handle that fine.

This current methodology completely misses any dependencies added by plugins, transitive dependencies, and dependencies added using Gradle in abnormal but valid ways.

Fundamentally, this issue arises because, unlike NPM and Yarn, where dependency declaration files are static, Gradle and Maven both require code execution to fully resolve the dependency graph.
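
To make the limitation concrete, here is a minimal sketch of text-based scanning. It is not Dependabot's actual parser; the build script, the `DECLARATION` pattern, and the `scan` helper are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical build script (not taken from any real project).
BUILD_GRADLE = '''
ext.guavaVersion = "27.1-jre"

def testLibs() { ["junit:junit:4.12"] }

dependencies {
    implementation "org.slf4j:slf4j-api:1.7.26"
    implementation "com.google.guava:guava:$guavaVersion"
    testLibs().each { testImplementation it }
}
'''

# Naive pattern (an assumption for this sketch): a configuration name
# followed by a single quoted "group:name:version" coordinate.
DECLARATION = re.compile(
    r"""^[ \t]*(?P<configuration>\w+)[ \t]+["'](?P<coordinate>[^"'\s]+)["']""",
    re.MULTILINE,
)


def scan(text):
    """Return (configuration, group, name, version) tuples found textually."""
    found = []
    for match in DECLARATION.finditer(text):
        parts = match.group("coordinate").split(":")
        if len(parts) == 3:
            found.append((match.group("configuration"), *parts))
    return found


found = scan(BUILD_GRADLE)
for entry in found:
    print(entry)
```

The scan reports only the two `implementation` lines. The Guava entry carries the literal placeholder `$guavaVersion` rather than a real version, and the JUnit dependency added through the helper function is not seen at all.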

Solution

Gradle offers something called the Gradle Tooling API. This tooling API is already in use in IntelliJ, Eclipse, and the Gradle Test Kit.

The Tooling API exposes the entire dependency graph for the repository's build in a way that can be transformed into a format Dependabot can understand.

The Tooling API executes the Gradle Build and can be used to extract metadata about the build for consumption by external tools.

Risks

There are certain risks that are taken on by moving forward with this plan.

Security

In order to use the Tooling API, Dependabot will be required to execute the user's code in their repository. Fundamentally, this introduces untrusted code execution into the same environment where Dependabot is being executed. I don't know what the security model of Dependabot's execution is, but if executing user code is a non-starter, using the Tooling API won't work.

Performance

Sometimes developers have really, really weird builds. Sometimes they also have really slow builds. Calling the Tooling API will change Dependabot's execution time from being predictable with respect to the number of files being parsed to being non-deterministic.

Environmental

Certain versions of Gradle only work with certain versions of the JVM. Certain builds only work with certain versions of the JVM. Implementing this feature will require maintainers of repositories to explicitly declare what version of the JVM their build depends upon.

Rewards

Keeping the Java Ecosystem Safe.

Using the GitHub search functionality for filename:gradle-wrapper.jar returns 2.55 million results. Additionally, Gradle is the official build tool for the Android Ecosystem.

Having good tooling support around Gradle from GitHub and Dependabot would protect developers, corporations, and Android users around the world.

@JLLeitschuh
Author

Gradle also has a Dependency Lock File, although it's not enabled by default and many projects still don't take advantage of it.

https://docs.gradle.org/current/userguide/dependency_locking.html

@sschuberth

Maybe there's a way for Dependabot to leverage (code from) the OSS Review Toolkit's analyzer, which already uses the Gradle Tooling API to determine dependencies (and it supports several other package managers, too).

@JLLeitschuh
Author

Would this issue be more appropriate if it had been opened against the Dependabot Feedback Repo?

If it is, feel free to move this issue there.

https://github.com/dependabot/feedback

@greysteil
Contributor

Nope, here is the right place for it, I've just been super busy with the GitHub announcement (sorry).

I'm super 👍 on this, but making the switch is going to require some serious work on our side.

On the risks above:

  • Dependabot already evaluates arbitrary code when doing Ruby, Python and Elixir updates. We don't do this at file fetching time, and need to use regex parsing to figure out which dependent files to pull down (we don't clone your repo either, so that doing that code evaluation won't leak it). As long as we can write a Gradle file fetcher that doesn't need to execute code we're in good shape.
  • Performance issues are generally fine, as long as the slowness is at the parsing or updating stage (not at the point when we check whether a new version is resolvable) and isn't too severe. Again, we have this in other languages where we need to shell out to the package manager
  • We've worked around environment requirements in other languages, too. It's tricky but not impossible

Basically I think the above is entirely do-able, but to do it needs someone who has (or wants to acquire) deep knowledge of Gradle. I can't be that person myself.

We haven't done any formal planning of what the team will look like on Dependabot going forward, whether we're hiring, etc., but it was always my intention to eventually have someone dedicated to Maven and Gradle support. If we can't do that internally I'll make sure it's as easy as possible for someone from the community to do it (but I'll push for us to do it internally).

@JLLeitschuh
Author

Hi @greysteil,

I've just been super busy with the GitHub announcement (sorry).

Don't worry about it. Congratulations by the way! This is awesome news!

We don't do this at file fetching time, and need to use regex parsing to figure out which dependent files to pull down (we don't clone your repo either, so that doing that code evaluation won't leak it).

So, it sounds like you only want Dependabot to check out the minimum number of files required to actually do the evaluation, correct? This sounds like it's fraught with peril, as it can be difficult to statically determine which files are relevant for a Gradle build's execution.

A potential workaround would be to allow the user to specify additional custom files that they need dependabot to check out in order for the build to operate successfully.

but it was always my intention to eventually have someone dedicated to Maven and Gradle support. If we can't do that internally I'll make sure it's as easy as possible for someone from the community to do it (but I'll push for us to do it internally).

I'm about to become a formal member of the Gradle team myself. There is a general interest in exploring how we can improve the software supply chain security of the JVM ecosystem. Although I can't commit any time to this project yet, I'm interested in trying to accrue the information we need to make a ballpark estimate of what kind of time/resources would be required to pull this off.


As an aside, what mechanism/project drives the GitHub dependency graph (dependabot example) feature?
I feel like adding proper Gradle support to that is an important first step in proper dependency security vulnerability detection that dependabot needs in order to even consider offering 'update' support.

Would this discussion need to be with an internal GitHub team if we wanted to integrate this feature at that level?

@greysteil
Contributor

greysteil commented Jun 3, 2019

So, it sounds like you only want dependabot to check out the minimum number of files required to actually do the evaluation correct? This sounds like it's fraught with peril

Yep, fraught with difficulty! For now we think the safest way to ensure nothing can be stolen when executing unsafe code is to have nothing worth stealing accessible to that code, though.

I'm interested in trying to accrue the information we need to make a ballpark estimate of what kind of time/resources would be required to pull this off.

Let's talk! Things are hectic at the moment as myself and the Dependabot team integrate with GitHub, but once things settle I'd love to chat.

As an aside, what mechanism/project drives the GitHub dependency graph

It's not Dependabot yet, but we think we have a big role to play there, and support for Gradle is high up the list of priorities for dependency graph. We should definitely chat about that one too - let me get back to you.

@JLLeitschuh
Author

For now we think the safest way to ensure nothing can be stolen when executing unsafe code is to have nothing worth stealing accessible to that code, though.

Sounds like a very sane security model.

What's the downside of just checking out the entire repository and allowing dependabot to operate with the whole codebase? Is there a concern about the size of the repository that is being checked out? I understand that you don't want to checkout a python repository and try to parse it as a Gradle project because that's a waste of CPU cycles.

Things are hectic at the moment as myself and the Dependabot team integrate with GitHub, but once things settle I'd love to chat.

I don't formally start with Gradle until June 17th. But we can have discussions as you become available.

You can also find me and the rest of the Gradle team in the Gradle Community Slack Channel here: https://is.gd/xPCSJh

Looking forward to chatting more about this soon!

@greysteil
Contributor

What's the downside of just checking out the entire repository and allowing dependabot to operate with the whole codebase?

Just that it would require us to have the whole codebase in the insecure environment, and we wouldn't want folks to be able to steal it. That's us being super cautious, though - CI has exactly the same issues and obviously clones the whole repo...

Looking forward to chatting more about this soon!

Yes! I've joined the community channel so I'm always available there, and will get in touch once things settle at GitHub. Feel free to use me as your point of contact for finding the right person at GitHub on anything / everything, too, and I'll do my best.

@JLLeitschuh
Author

JLLeitschuh commented Sep 19, 2019

Some additional details from one of our internal discussion threads. Thanks to @melix who I'm quoting from below:

Gradle source files are code. Dependencies can be added in very different ways (build scripts, plugins, ...) and there's no pattern how to add them (you can add extension methods to declare dependencies, you can rely on ext.something for the version number, ....)

Then even if you had a way to do [reliable parsing], you'd get a wrong answer.
Whatever is declared is not what is resolved. So it's not because you declare a dependency on org:foo:1.0 that you don't have a vulnerability; you could actually resolve to org:foo:1.1 which is vulnerable.
Or you could bring org:bar:1.3 which would also be vulnerable.
This is why such tools should always rely on the resolution result instead of what's declared in the code.


TL;DR: The only correct way to get the true dependency information out of a Gradle build is to use the resolution results. The easiest way to do this is to utilize the Gradle Tooling API.
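
The declared-versus-resolved gap in the quote can be sketched with a toy resolver. This is not Gradle's resolution engine: the `METADATA` table and the `resolve` helper are hypothetical, version comparison is simplified to dotted integers, and the only rule modelled is "the highest requested version wins", which is Gradle's default conflict resolution. Constraints, substitutions, resolution rules, and eviction of an overridden version's own dependencies are ignored.

```python
from collections import deque

# Hypothetical module metadata: (module, version) -> direct dependencies.
METADATA = {
    ("org:app", "1.0"): [("org:foo", "1.0"), ("org:lib", "2.0")],
    ("org:foo", "1.0"): [],
    ("org:foo", "1.1"): [],
    ("org:lib", "2.0"): [("org:foo", "1.1"), ("org:bar", "1.3")],
    ("org:bar", "1.3"): [],
}


def version_key(version):
    """Toy ordering: compare dotted integer versions numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(root):
    """Walk the graph breadth-first, keeping the highest version per module."""
    selected = {}
    queue = deque(METADATA[root])
    while queue:
        module, version = queue.popleft()
        current = selected.get(module)
        if current is not None and version_key(current) >= version_key(version):
            continue  # an equal or higher version was already selected
        selected[module] = version
        queue.extend(METADATA[(module, version)])
    return selected


root = ("org:app", "1.0")
declared = dict(METADATA[root])
resolved = resolve(root)
print("declared:", declared)
print("resolved:", resolved)
```

Here the build declares `org:foo:1.0`, but `org:lib:2.0` requests `org:foo:1.1` and also brings in `org:bar:1.3`, so the resolved set differs from the declared one in exactly the two ways the quote describes.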

CC: @jhutchings1

@apapia

apapia commented Sep 19, 2019

One alternative to consider is supporting updates to lock files using this feature:
https://docs.gradle.org/current/userguide/dependency_locking.html

The lockfiles are already "resolved" and in a machine-readable format, so they would be easier to update automatically without necessarily needing to use the Gradle Tooling API. There may be limitations to this approach, but I think it would be a lot easier to implement. Generally, if you want to keep your dependencies up to date, using dynamic versions plus dependency locking has been a really effective strategy.
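
As a rough illustration of how little parsing a lockfile needs, here is a minimal sketch. It assumes the single-file `gradle.lockfile` layout described in the dependency locking guide linked above (comment lines starting with `#`, then `group:name:version=configurations`, plus an `empty=` line for configurations with no dependencies). Older Gradle versions wrote one file per configuration, which this sketch does not handle. The sample content and the `parse_lockfile` helper are hypothetical.

```python
# Hypothetical lockfile content in the assumed single-file format.
LOCKFILE = """\
# This is a Gradle generated file for dependency locking.
# Manual edits can break the build and are not advised.
# This file is expected to be part of source control.
org.springframework:spring-beans:5.0.5.RELEASE=compileClasspath,runtimeClasspath
org.springframework:spring-core:5.0.5.RELEASE=compileClasspath,runtimeClasspath
empty=annotationProcessor
"""


def parse_lockfile(text):
    """Map 'group:name' to (version, [configurations]) for each locked module."""
    locked = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the generated header comments
        coordinate, _, configurations = line.partition("=")
        if coordinate == "empty":
            continue  # names configurations that resolved to no dependencies
        group, name, version = coordinate.split(":")
        names = [c.strip() for c in configurations.split(",") if c.strip()]
        locked[f"{group}:{name}"] = (version, names)
    return locked


locked = parse_lockfile(LOCKFILE)
for module, (version, configurations) in locked.items():
    print(module, version, configurations)
```

Reading the file is the easy half. Writing a new version back by hand would not re-resolve transitive dependencies, so regenerating the lock state through Gradle itself is probably still needed after an update.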

@JLLeitschuh
Author

@apapia Unfortunately, this would only work for newer versions of Gradle, and we'd have to start enabling dependency locking by default.

Currently, there are over 2 million public Gradle projects on GitHub, and we wouldn't want to leave those projects behind when implementing this feature.

@msridhar

msridhar commented Oct 2, 2019

One question here. Using the Gradle Tooling API seems like a great way to accurately detect dependencies for a Gradle project. But when a dependency is out of date, there is still the problem of generating a pull request to update the dependency. AFAIK, the Tooling API would not help with that, and it seems that it could be quite a challenge. Is there a plan for robustly generating pull requests to update Gradle dependencies?

@JLLeitschuh
Author

Unfortunately, I don't believe that we (Gradle) have a good solution for this right now. I do believe it's a problem we need to figure out how to resolve.

@mnonnenmacher

The Gradle Versions Plugin could help with detecting newer versions. I believe you could apply it through the Tooling API on the fly.

@ben-manes

ben-manes commented Feb 28, 2020

Maybe you can evangelize to the Gradle team to make this a built in task (like Maven's version plugin)? They are too comfortable having a plugin, which was fun to write in a weekend for Gradle 1.x, but I certainly thought they'd have bundled it by now.

@dotCipher

Looks like this might have been solved from #2680 ?

@sschuberth

No, #2680 "just" implements static parsing of .kts files via regexes. But this issue is about using a (build script language agnostic) way of actually semantically understanding the build logic and getting the dependencies, taking into account things like version conflict resolution etc. which you do not get when just parsing build files. If you need correctness on that level, I recommend to take a look at ORT's analyzer (disclaimer, I'm the founder of the ORT project).

@dotCipher

Fair, there are many ways to describe versions outside of the build.gradle.kts file for a dependency, so it wouldn't be trivial to do

@JLLeitschuh
Author

Fair, there are many ways to describe versions outside of the build.gradle.kts file for a dependency, so it wouldn't be trivial to do

This is very true. We, Gradle, have some active internal discussions going on around this and are working towards solutions to make improvements in this area.

@ben-manes

@nsoft as a workaround, I use gradle-dependency-submission to push the dependency graph into GitHub; it parses the output of the dependencies task, as you suggested. This allows for security alerts and perhaps version alerts, but I don't use the latter (since it cannot update the build files). I use (and wrote) the gradle-versions-plugin, so I am used to that flow, and others have written extensions like gradle-update-checker for a GitHub Action and notification. Just an FYI, if helpful, until there is native support from Gradle/GitHub.
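
For readers unfamiliar with that report, here is a minimal sketch of reading resolved versions out of the tree printed by `gradle dependencies`. It is not the implementation used by gradle-dependency-submission; the sample report, the `NODE` pattern, and the `resolved_versions` helper are hypothetical, and entries without an explicit version (project dependencies, BOM-managed modules) are simply skipped.

```python
import re

# Hypothetical excerpt in the tree layout printed by `gradle dependencies`.
REPORT = r"""
runtimeClasspath - Runtime classpath of source set 'main'.
+--- org:foo:1.0 -> 1.1
\--- org:lib:2.0
     +--- org:foo:1.1
     \--- org:bar:1.3
"""

# A tree node: optional indentation, a "+---" or "\---" marker, then
# group:name:requested, optionally followed by " -> selected".
NODE = re.compile(
    r"^[| ]*[+\\]--- (?P<group>[^:\s]+):(?P<name>[^:\s]+):(?P<requested>\S+)"
    r"(?: -> (?P<selected>\S+))?"
)


def resolved_versions(report):
    """Map 'group:name' to the version the report says was selected."""
    resolved = {}
    for line in report.splitlines():
        match = NODE.match(line)
        if match is None:
            continue  # configuration headers, blank lines, unsupported entries
        module = f"{match['group']}:{match['name']}"
        resolved[module] = match["selected"] or match["requested"]
    return resolved


versions = resolved_versions(REPORT)
print(versions)
```

The `-> 1.1` suffix is how the report marks a requested version that was replaced during resolution, so the parsed result reflects the resolved graph rather than what the build script declares.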

@bcmedeiros

Break this in two.

@nsoft they know that, but I'm pretty sure that there is some non-public reason for this delay.

Notifying users about flawed dependencies clearly has the best value/cost ratio, yet for some reason it seems it's not going this way.

@pioterj

pioterj commented Feb 28, 2023

The Gradle Build Tool team is working on this now. See the related issue on our public roadmap: gradle/build-tool-roadmap#42 cc @bigdaz

@jeffwidman
Member

jeffwidman commented Mar 14, 2023

@pioterj @bigdaz if there's anything we can do to help, please let us know.

My email is in my profile if you want to have a call about ways to have a clean interface between native Gradle and Dependabot. The general topic of interfaces between Dependabot and native helpers is something we've spent some time thinking about.

And we're very interested in leveraging native Gradle... so many ways it'd be better than our current approach of re-implementing Gradle in Ruby, but poorly.

@JLLeitschuh
Author

if there's anything we can do to help, please let us know.

Source for the dependency extractor Gradle plugin can be found here:

If you're willing to work with a beta plugin in your build, you can give it a try, but the API is unstable, and it may break your build.

Eventually, it should end up as a native feature built into the https://github.com/gradle/gradle-build-action GitHub action. Once support has been added, as long as your build has the GitHub action, you will get support automatically. I have no idea what the timeline on that will be though.

@JLLeitschuh
Author

Anyone who is interested in giving the plugin a try, it's currently in a pre-release state:

https://github.com/gradle/github-dependency-graph-gradle-plugin/releases/tag/v0.0.2

You can find it on the Gradle Plugin Portal here:

https://plugins.gradle.org/plugin/org.gradle.github-dependency-graph-gradle-plugin

Two things:

  • This plugin must be applied as an init script plugin, not a settings or build script plugin.
  • The upload isn't automated yet. You'll need to configure your build to upload the snapshot manifest file until that gets added to the official Gradle GitHub Action

@3flex

3flex commented Sep 8, 2023

The plugin mentioned above is now integrated with the Gradle build action and the feature is considered stable since v2.7.0: https://github.com/gradle/gradle-build-action/releases/tag/v2.7.0

@jonjanego
Member

Hello to the many watchers. Just amplifying this a bit: the Gradle team has merged this (and other) functionality into a single setup-gradle action.

@huehnerlady

@jonjanego where does this github action check for new dependencies? I couldn't see anything in the readme

@gredler

gredler commented Feb 27, 2024

Note that you may need to disable the Gradle caching (cache-disabled: true) if you're also using codeql-action, otherwise the CodeQL step will fail when there is nothing to compile (because the compile results were cached in the last run).

@jonjanego
Member

@huehnerlady - this section of the docs talks about what I was referring to. The short version is that, after running a Gradle build, if you enable the option to do dependency submission, the action will submit the dependencies to GitHub for us to analyze. This will build a far more complete version of the dependency graph than we'd be able to build without that submission. Its impact on Dependabot usage is that you'll be getting a more complete list of packages with security updates available.

I'm sure that the gradle team would be happy to discuss more! I'll point them to this thread to see if they'd like to chime in.

Projects
Status: Planned