Command Line Application #527

Open
t-rad679 opened this issue Feb 19, 2020 · 9 comments

@t-rad679
Contributor

As discussed in #76, building a binary wrapper for Spotless seems to be a (probably?) necessary step to building a cleanly written Bazel extension. Since much of the work needs to be done anyway, why not make that executable accessible to Spotless users as a command line application?

@nedtwigg has kindly already provided some input on this in this comment

@t-rad679
Contributor Author

t-rad679 commented Feb 19, 2020

Some responses to that comment:

I am fine with adding a new column for "bazel", or a new column for "CLI", but it's even better if "bazel" piggybacks on "CLI" which somehow piggybacks on either gradle or maven. jbang is a clever project which (ab)uses gradle to smash java into single-file-scripts, might have useful lessons for a similar piggybacking.

Bazel piggybacking on CLI is exactly what I want, but I'm wary of piggybacking the CLI on top of Gradle. It feels like Bazel calling Gradle with extra steps. I'm not sure I can articulate why, but my intuition would prefer to have a "native" Spotless CLI. I could maybe be convinced, though.

then the spirit of YAGNI might recommend that this CLI should have a package com.diffplug.spotless.bazel and another package com.diffplug.spotless.cli

If by package you mean java package, then com.diffplug.spotless.cli sure, but I don't expect (or think it's even possible for) there to be any java code in the Bazel extension. That's why the CLI route seems to be our best option. Bazel rules (extensions are sets of rules; the equivalent of Gradle's task is called a "target", which specifies a rule with some attributes) seem to be much simpler in principle than Gradle plugins:
Declare some inputs, declare some outputs, declare a thing (executable) that turns the inputs into the outputs, declare the stuff the thing needs to know to turn the inputs into the outputs. That's it.

If by package you mean directory, then I absolutely agree.

If the CLI gets its own column (which is fine), there are a few questions which have obvious answers in the context of a build-system-plugin (see docs), which are not-at-all-obvious in the context of an independent CLI.

  • does the CLI define just the formatter function (which steps to apply and how they are configured), or does it define both the formatter function and the files to be formatted? Either is fine, but it's an important distinction to make.
  • how does the CLI download artifacts from a central repository / how is that configured / how are they cached locally

I'm pretty sure I understand what you're getting at here, but I'm not sure I understand why these are special concerns. For the first question, I was thinking the latter. I was expecting that the CLI should do everything that Spotless does with Gradle. One should be able to run the CLI with a configuration completely analogous to a Gradle configuration as though they were running a Gradle task and get the same result. For example, I was thinking you could have a configuration like the following (definitely open to your preferences on the format):

java:
    target:
        - pattern: "**/*.java"
    formatterSteps:
        googleJavaFormat:
            version: "1.2"
            aosp: true
format:
    name: "docs"
    target:
        - pattern: "docs/*.md"
    formatterSteps:
        trimTrailingWhitespace: true
        endWithNewLine: true

and this would be equivalent to the following Gradle configuration:

spotless {
    java {
        target '**/*.java'
        googleJavaFormat('1.2').aosp()
    }
    format('docs') {
        target 'docs/*.md'
        trimTrailingWhitespace()
        endWithNewline()
    }
}

I know that this would limit one's ability to use arbitrary logic in defining the targets, but isn't that preferred in some ways? One of the most common complaints I hear about Gradle is that it lets you do whatever you want and things get unnecessarily complicated as a result. Plus, isn't Spotless designed to allow only intended configuration options? As you've seen, I've tried to do weird stuff with Spotless in Gradle and it (seemingly by design) hasn't worked out great :P. Doing things with this configuration would just inhibit its users trying to break it.
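
To make the "formatter function" half of that configuration concrete, here is a minimal Java sketch of the "docs" format as an ordered list of steps applied to file content. FormatterSketch and its two step methods are hypothetical, simplified stand-ins written for illustration; they are not the Spotless implementations of trimTrailingWhitespace and endWithNewline.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a "formatter function": ordered steps applied to file content. */
final class FormatterSketch {
    /** Simplified step: removes spaces and tabs at the end of every line. */
    static String trimTrailingWhitespace(String content) {
        return content.replaceAll("(?m)[ \\t]+$", "");
    }

    /** Simplified step: makes the content end with exactly one "\n". */
    static String endWithNewline(String content) {
        int end = content.length();
        while (end > 0 && (content.charAt(end - 1) == '\n' || content.charAt(end - 1) == '\r')) {
            end--;
        }
        return content.substring(0, end) + "\n";
    }

    /** Applies the steps in order, feeding each step the output of the previous one. */
    static String apply(List<UnaryOperator<String>> steps, String content) {
        String result = content;
        for (UnaryOperator<String> step : steps) {
            result = step.apply(result);
        }
        return result;
    }

    /** The "docs" format from the configuration above, expressed as a step list. */
    static String formatDocs(String content) {
        List<UnaryOperator<String>> steps = List.<UnaryOperator<String>>of(
                FormatterSketch::trimTrailingWhitespace,
                FormatterSketch::endWithNewline);
        return apply(steps, content);
    }

    public static void main(String[] args) {
        System.out.print(formatDocs("# Title  \ntext\t"));
    }
}
```

Which files the function is applied to (the target patterns) is deliberately left out here, since that is the second half of the question.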

For the other question, I guess what I had in mind was to build the Spotless CLI with Bazel or Gradle, and all of the dependencies would be specified that way. I guess it seems suboptimal to have everything packaged in the binary even if you're not going to use it, but I wanted all the dependencies to be part of the Spotless CLI build, rather than the Spotless run. It doesn't have that many optional dependencies, right?

As a side note about that question, Bazel handles dependencies through what it calls repositories. In a file called WORKSPACE, you use repository rules to specify the dependencies the build will use. There are a variety of maven based repository rules, the simplest of which is maven_install. For example, you could write a WORKSPACE target like the following:

maven_install(
    name = "maven",
    artifacts = [
        "junit:junit:4.12",
        "com.google.guava:guava:27.0-android",
    ],
    repositories = [
        "https://maven.google.com",
        "https://repo1.maven.org/maven2",
    ],
)

For maven projects which do not have Bazel build configurations, Bazel will generate them so you can reference their components as Bazel targets from the deps attributes of your own targets.

It might not be practical for a raw CLI to figure out the questions above without having some opinionated build system (e.g. bazel) to draw certain utilities from. In this approach, the CLI could only actually be run within a Bazel build, but there's the option that a future contributor could leverage the same CLI to add e.g. SCons, CMake or some other non-JVM build system.

This I don't understand. We would certainly use a build system to build the CLI binary (we could use any of them; Bazel, Gradle, Maven), but I don't see why it would be especially difficult to get it to run on its own without one of them. Can you elaborate?

@nedtwigg
Member

nedtwigg commented Feb 19, 2020

Dependencies

It doesn't have that many optional dependencies, right?

Definitely hundreds, and possibly thousands :D Spotless supports many versions of many formatters. For example, when you say eclipse('4.10') or eclipse('4.11'), the next thing that Spotless does is load the appropriate lockfile, and then uses Provisioner to grab them dynamically:

import java.io.File;
import java.util.Collection;
import java.util.Set;

public interface Provisioner {
    /**
     * Given a set of Maven coordinates, returns a set of jars which include all
     * of the specified coordinates and optionally their transitive dependencies.
     */
    public Set<File> provisionWithTransitives(boolean withTransitives, Collection<String> mavenCoordinates);
}

This is all rigorously cached, so it is fast, but you have to provide some implementation of this interface (bullet 3). It is not practicable to shadow every single dependency of every single JDT, and even if you did you'd need to invent some kind of packaging / unpackaging system.

This interface is very easy to implement within the context of a Gradle plugin or a Maven plugin, which is a benefit of piggybacking on "bazel calls cli that calls gradle". Jbang is an example of using gradle as a dynamic-dependency-grabber backend.

One bazel-native solution could be descriptive error messages. For example: "Spotless tried to resolve XXX, but it is not present. Add it to the maven_install(artifacts = ...) section." That is painful, but workable.
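
A rough Java sketch of the descriptive-error idea, under the assumption that the host hands the tool the set of Maven coordinates it has already declared. MissingArtifacts and its methods are hypothetical names, and the message wording is only an illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: reports coordinates that the host build has not declared. */
final class MissingArtifacts {
    /** Returns the requested coordinates that are absent from the declared set, in request order. */
    static List<String> missing(Set<String> declared, Collection<String> requested) {
        List<String> result = new ArrayList<>();
        for (String coordinate : requested) {
            if (!declared.contains(coordinate)) {
                result.add(coordinate);
            }
        }
        return result;
    }

    /** Builds an error text that can be pasted into the artifacts list of maven_install. */
    static String describe(List<String> missing) {
        StringBuilder message = new StringBuilder();
        message.append("Spotless tried to resolve the following artifacts, but they are not present.\n");
        message.append("Add them to the artifacts list of maven_install:\n");
        for (String coordinate : missing) {
            message.append("    \"").append(coordinate).append("\",\n");
        }
        return message.toString();
    }

    /** Throws with the descriptive message if anything requested is missing. */
    static void requireAll(Set<String> declared, Collection<String> requested) {
        List<String> missing = missing(declared, requested);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(describe(missing));
        }
    }
}
```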

Files

java:
    target:
        - pattern: "**/*.java"

Resolving this wildcard is expensive. You can build a caching system which can resolve this wildcard much more quickly, and even build it so that only the changed files are passed along. Building this system from scratch within :spotless:cli is probably not a good solution - the other plugins leverage the existing systems within gradle and maven. I don't know Bazel, but it must have facilities for describing a group of files, detecting their up-to-date-status, and passing the files directly - it probably doesn't tell tools "apply to **/*.java", it probably passes file content on stdin, or maybe passes the argument as a path, or something like that.
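
For a sense of what a from-scratch resolver involves, here is a naive Java sketch using only java.nio. GlobResolver is a hypothetical name. It walks the entire tree on every call and does no caching or up-to-date checking, which is exactly the cost described above. Note also that Java's glob syntax is not identical to Gradle's patterns: with "**/*.java", java.nio does not match a file sitting directly in the root directory.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical, uncached wildcard resolution: every call visits every file under root. */
final class GlobResolver {
    static List<Path> resolve(Path root, String glob) throws IOException {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        try (Stream<Path> walk = Files.walk(root)) {
            return walk
                    .filter(Files::isRegularFile)
                    // match against the path relative to root, as build tools usually do
                    .filter(path -> matcher.matches(root.relativize(path)))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        for (Path path : resolve(FileSystems.getDefault().getPath("."), "**/*.java")) {
            System.out.println(path);
        }
    }
}
```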

Minimal implementation (Bazel vs CLI)

Spotless' pitch is that it is a "switchboard" - it is glue code and infrastructure, and the two systems above are the missing pieces that it gets from the host. I think a naked CLI would implement them very differently than a bazel-native tool, which is why I think that either:

  • the naked CLI should embed gradle, so that Gradle infrastructure (including DSL) can be reused with Spotless in non-gradle environments
  • the CLI should not attempt to be a naked CLI, because it will necessarily rely on some build system tooling for dependencies and up-to-date detection, and the CLI should only describe a json/xml/yaml/whatever format for the steps, but focus on the Bazel usecase, with a very light effort towards making it possible to embed into other non-JVM hosts.

The advantage of embedding Gradle is that now you have a truly naked CLI, which can be used anywhere. If you go the non-naked route, where you're still using the host to implement Provisioner, then you're always going to need a maven-friendly host.

Implement Provisioner with no-op.

If you look at the above and say "wow, that's a lot of complexity", you're right! With formatters, fixing a bug means reformatting code. It's a useful feature of Spotless that you can get bugfixes and features in the infrastructure, without also being forced to adopt the newest version of whatever formatters you use. That "switchboard" functionality does come at a high cost, but it's the minimum cost required to achieve that goal.

Another alternative is to consume Spotless as a library, declare the formatter and version you care about as static dependencies, and implement Provisioner as "search the classpath for 'google-java-format-1.7.jar'" and return that. Such an implementation is not "a Spotless", it's just "using Spotless", but that's fine! We've always maintained semver on spotless-lib and spotless-lib-extra explicitly so that people could use our infrastructure (line endings from .gitattributes, padded cell, composable and serializable formatter steps, etc) without paying the full cost of the switchboard abstractions.
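
A minimal sketch of such a classpath-searching Provisioner, assuming plain "group:artifact:version" coordinates and jars named "artifact-version.jar". The Provisioner interface is redeclared locally only so that the example compiles on its own; ClasspathProvisioner is a hypothetical name, and the sketch ignores withTransitives because the static dependencies are assumed to be complete already.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Local copy of the interface quoted earlier, so this sketch is self-contained. */
interface Provisioner {
    Set<File> provisionWithTransitives(boolean withTransitives, Collection<String> mavenCoordinates);
}

/** Hypothetical provisioner that only looks at jars already on a given classpath. */
final class ClasspathProvisioner implements Provisioner {
    private final List<File> entries = new ArrayList<>();

    /** Takes a classpath string such as System.getProperty("java.class.path"). */
    ClasspathProvisioner(String classpath) {
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                entries.add(new File(entry));
            }
        }
    }

    @Override
    public Set<File> provisionWithTransitives(boolean withTransitives, Collection<String> mavenCoordinates) {
        Set<File> jars = new LinkedHashSet<>();
        for (String coordinate : mavenCoordinates) {
            String[] parts = coordinate.split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected group:artifact:version but got " + coordinate);
            }
            // assumption: the jar keeps its conventional Maven file name
            String jarName = parts[1] + "-" + parts[2] + ".jar";
            File match = null;
            for (File entry : entries) {
                if (entry.getName().equals(jarName)) {
                    match = entry;
                    break;
                }
            }
            if (match == null) {
                throw new IllegalStateException(jarName + " is not on the classpath");
            }
            jars.add(match);
        }
        return jars;
    }
}
```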

My instinct is that stealing JBang's dep resolution will be the easiest path forward, and that once you've gotten that far it might turn out to be pretty easy to reuse the Gradle DSL. My other instinct is that building a CLI absent a specific host-build-system usecase is likely to result in a lot of unused code. I would focus on doing what Bazel needs, and then generalize - rather than build a CLI and then figure out how to use it with Bazel.

@t-rad679
Contributor Author

t-rad679 commented Feb 21, 2020

Relevant discussion in #529 (comment)

Hmmm...I see your point. I will definitely take these things into consideration. I feel really weird using a Gradle build script as a configuration for another build tool, but I guess my concerns are not worth all of the other problems not using Gradle would come with.

I also see your point about a CLI not really being necessary. It seems like Spotless is fundamentally intended to aid in aggregating and integrating a variety of formatting functionality into build systems that the linters in question do not support natively or do not support well. The issue is that, based on my investigation so far, it seems like Bazel's simplistic approach makes it prefer to simply provide inputs to an executable that produces outputs, declaring a clear relationship between the two. It doesn't have much support for using a library to perform build steps, no matter the language. Note, however, that Bazel itself is written in Java and runs on the JVM.

Admittedly, most of my knowledge here comes from personal experience and internet searches. It seems worthwhile to contact the Bazel community directly, as they are very knowledgeable and usually quick to respond. I have leaned on them in several cases, and they have been able to help me resolve problems that had me pulling my hair out. I will reach out to them before proceeding.

@jbduncan jbduncan reopened this Feb 21, 2020
@nedtwigg
Member

nedtwigg commented Feb 21, 2020

<censorship>I deleted and edited comments above (including edits to every commenter in this thread), to remove debate about programming methodology</censorship>. https://gitter.im/diffplug/spotless is a good place to chat, I'd like to keep this issue focused on the specifics of the issue, rather than methodology in general. I'll use the term "minimal viable implementation" rather than "YAGNI" for the purposes of this issue.

@diffplug diffplug deleted a comment from jbduncan Feb 21, 2020
@diffplug diffplug deleted a comment from jbduncan Feb 21, 2020
@diffplug diffplug deleted a comment from jbduncan Feb 21, 2020
@jbduncan
Member

I've since learned from a conversation over at openrewrite that jbang can be used as a middleman to download things from Maven Central, and that a CLI app can in turn be built around it, as per this example.

But re-reading this thread, I see that jbang has been mentioned a few times, so maybe this idea for making a Spotless CLI app was already mentioned and I've just missed it?

The downside seems to be that jbang itself has to be installed? I've never used it before, so I'm not entirely sure.

I've also just remembered about a Scala library/tool called Coursier for downloading things from Maven Central. Perhaps Coursier could be used instead?

@maxandersen

fyi - coursier needs more things downloaded than jbang.

If you made a diffplug/jbang-catalog repo and had a spotless cli jar you would be able to do:

curl -Ls https://sh.jbang.dev | bash -s - app setup
jbang spotless@diffplug <arguments>

that curl can be replaced with whatever package manager you like (see https://jbang.dev/download) but curl works on all major platforms, even windows bash based shells.

users can add extra deps if that makes sense by jbang --deps g:a:v,g1:a1:v1 spotless@diffplug

@maxandersen

Jbang is an example of using gradle as a dynamic-dependency-grabber backend.

incorrect. jbang uses shrinkwrap resolver which in turn uses aether (which gradle itself uses too afaik).
jbang just uses gradle-like syntax for dependencies.

@tomwhoiscontrary

Not sure if this is any use, but a while ago I worked out the minimal code needed to drive Maven's dependency resolution machinery using its libraries - see maven-resolver-demo.

I think it would be possible to write a provisioner using code like this. A command-line Spotless tool using this would have some Maven libraries as dependencies (and Guice, and SLF4J, and a few other things), but it would not need Maven installed locally as a tool.

@maxandersen

@tomwhoiscontrary fyi that's what jbang is :) Since this thread was last updated, we've moved to using mima, which is maintained by maven members for this exact usecase.

All we need is a spotless jar with a main method and we are golden.
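
As a rough idea of what such a main method could look like, here is a hedged Java sketch. Everything in it is hypothetical: the class name CliSketch, the "check"/"apply" arguments (loosely modelled on the spotlessCheck and spotlessApply tasks of the build plugins), and the placeholder format method, which stands in for a real Spotless formatter. Files are passed as explicit paths, so the host decides which files to hand over.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical entry point: "check" reports unformatted files, "apply" rewrites them. */
final class CliSketch {
    /** Placeholder for the real formatter; here it only guarantees a final newline. */
    static String format(String content) {
        return content.endsWith("\n") ? content : content + "\n";
    }

    /** Returns the exit code: 0 = clean or fixed, 1 = check found violations, 2 = usage error. */
    static int run(String[] args) throws IOException {
        if (args.length < 2 || !(args[0].equals("check") || args[0].equals("apply"))) {
            System.err.println("usage: <check|apply> <file>...");
            return 2;
        }
        boolean apply = args[0].equals("apply");
        int exitCode = 0;
        for (int i = 1; i < args.length; i++) {
            Path file = Paths.get(args[i]);
            String before = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            String after = format(before);
            if (after.equals(before)) {
                continue; // already formatted
            }
            if (apply) {
                Files.write(file, after.getBytes(StandardCharsets.UTF_8));
            } else {
                System.err.println("needs formatting: " + file);
                exitCode = 1;
            }
        }
        return exitCode;
    }

    public static void main(String[] args) throws IOException {
        System.exit(run(args));
    }
}
```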
