Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI argument to ignore paths #250

Open
silverwind opened this issue Oct 18, 2022 · 19 comments
Open

CLI argument to ignore paths #250

silverwind opened this issue Oct 18, 2022 · 19 comments

Comments

@silverwind
Copy link

In some cases, it's desirable to not format certain files like test fixtures. Ignoring such files can be achieved by passing a explicit list of files to gofump, but then big projects risk into running into the maximum command line length which on Windows is only 8191 characters.

I suggest to support a -exclude argument where users could pass individual files to exclude to the CLI. It should support both file and directory arguments. Another benefit of such a argument is to speed up the invocation by ignoring big directories like node_modules which often contain upwards of 100k files.

go run mvdan.cc/gofumpt@v0.4.0 -exclude test/fixture.go,fixtures,node_modules .
@mvdan
Copy link
Owner

mvdan commented Oct 20, 2022

How about not using the .go file extension for those test fixtures? I have seen that done with Go files inside testdata directories which have bad syntax, bad formatting, or broken typechecking on purpose. Using any tool like gofmt -l -w . or gopls on those files will most likely not work or attempt to fix the code, which is not what you want.

As far as argument size limits, we could support a newline-separated list of filenames fed via a plaintext file, which is the same workaround that gcc or Go's own compiler support. We likely want to do that anyway, but being able to list lots of files and filtering them any way you wish is certainly a valid use case.

@silverwind
Copy link
Author

silverwind commented Oct 20, 2022

One such fixture is auto-generated go code generated by https://github.com/shurcooL/vfsgen, these absolutely must bear the .go extension to be recognized by the compiler. IIRC, gofumpt does support some magic string in files to ignore them individually, probably that's the way to go for now.

Still, from a performance standpoint, ignoring node_modules would be a big win. I'm still hoping that golang would provide a mechanism to ignore that tools like this one could leverage.

Taking file path input from a (temporary) file would also work for us, that would be welcome to have.

@mvdan
Copy link
Owner

mvdan commented Oct 20, 2022

I'm surprised to hear code generators like vfsgen are still being used, now that go:embed has existed for a while :)

Typically, if a project uses gofumpt and also a code generator, since that code generator will usually only abide by gofmt and not gofumpt, you can make sure that all your code remains consistent with gofumpt, no matter whether it's generated or not. For example:

//go:generate some-generator -o out.go
//go:generate gofumpt -w out.go

That is what I would personally do. It does add an automated step to your go generate, but the end result is nice and doesn't require any special cases or custom configuration.

Taking file path input from a (temporary) file would also work for us, that would be welcome to have.

Doing it seems like a good idea in any case. If this issue moves in a different direction, then I'll split that into its own issue.

Still, from a performance standpoint, ignoring node_modules would be a big win. I'm still hoping that golang would provide a mechanism to ignore that tools like this one could leverage.

I'd rather not invent something new in this space. We do already skip vendor directories unless they are explicitly given, so perhaps we could do the same for node_modules, with the assumption that it wouldn't be a regression for existing users or for users who come from gofmt. I'm still not a fan of hard-coding knowledge about other development environments, though.

@weeco
Copy link

weeco commented Dec 11, 2022

I'm facing another issue where an excldue arg would be handy. I have to include license files in my projects and IntelliJ / Goland has the concept of file templates which will be added to Go files. They put these file templates under ./idea/fileTemplates/internal/Go File.go. These template files can't be gofumpted, because they include variables which are not Go syntax compatible.

I agree that they should not use the file extension, but the point is that developers have to integrate with several of these usecases where it's not really up to us to chnage this. An exclude parameter in that case would come in really handy, otherwise it will fail:

task: [backend:fmt] /Users/martinschneppenheim/Documents/xy/z/build/bin/go/gofumpt -l -w .
.idea/fileTemplates/internal/Go File.go:10:9: illegal character U+0024 '$'
task: Failed to run task "backend:fmt": exit status 2

I'm sure there are more legitimate usecases which would benefit from an exclude parameter.

@mvdan
Copy link
Owner

mvdan commented Dec 13, 2022

IIRC, gofumpt does support some magic string in files to ignore them individually, probably that's the way to go for now.

I actually forgot to mention, @silverwind, that generated files are treated in a special way, per the README:

Note that vendor directories are skipped unless given as explicit arguments. Similarly, the added rules do not apply to generated Go files unless they are given as explicit arguments.

The magic string is

var rxCodeGenerated = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

@weeco like you point out, your case is broken by design: it's likely that many other existing Go tools, like gofmt -l . or go build ./..., would similarly get confused by the not-actually-Go file. I'd suggest filing the issue upstream in the project that wants that file to exist.

I'm sure there are more legitimate usecases which would benefit from an exclude parameter.

The only other use case mentioned here is generated files, which is already handled per the above. If Go tools in general come up with a standard in golang/go#42965, I'll be happy to follow. I'm reluctant to make up my own standard in the meantime.

@mvdan
Copy link
Owner

mvdan commented Mar 16, 2023

Thinking about excluding testdata directories by default, like we already do with vendor, per #260 (comment). Even if testdata directories contain Go files, they might be invalid Go syntax or be badly formatted on purpose, and commands like go fmt ./... or go build ./... would skip them. gofmt -l -w . would still format them, whereas gofumpt would not after the change, but I reckon that's likely an improvement.

@silverwind
Copy link
Author

silverwind commented Mar 16, 2023

I wonder why you want to force naming conventions like testdata onto the user. I put such files into a fixtures directory for example. There is no way around to making it configurable.

@silverwind
Copy link
Author

silverwind commented Mar 16, 2023

BTW for generated file detection, there is a common pattern to mark them as such via .gitattributes as invented by GitHub, which is arguably much better than magic string detection and it works on other software as well:

foo/*.go linguist-generated

There are other variants of this name as well, like .gitlab-generated, so for best support, match for regex -?generated$. There is also an git attribute to mark paths as vendored.

@mvdan
Copy link
Owner

mvdan commented Mar 16, 2023

testdata as a directory name and the // Code generated ... comment matching are both documented standards in Go. See https://pkg.go.dev/cmd/go#hdr-Test_packages and https://pkg.go.dev/cmd/go#hdr-Generate_Go_files_by_processing_source. We have to obey those because they are used in Go code rather frequently.

I guess we could also teach gofumpt to look for .gitattributes files and match out linguist-generated and linguist-vendored so that those files are not formatted. Although, is there a real example of a project which uses those for Go files which are not already matched by testdata or // Code generated ... per the above? Those are very common and (as far as I can tell) uncontroversial standards in Go, so I'd be honestly surprised if any Go project explicitly chose to not follow them.

@silverwind
Copy link
Author

See kubernetes as one example that has go files in linguist-generated:

https://github.com/kubernetes/kubernetes/blob/master/.gitattributes

@silverwind
Copy link
Author

Having configurability via .gitattributes would certainly be nice for gofumpt, but what I was really after in the original issues is to have some ability to skip over node_modules and the likes for all golang tooling, purely as a performance optimization for tools like gofumpt or golangci-lint.

@mvdan
Copy link
Owner

mvdan commented Mar 16, 2023

Interesting example with kubernetes. It does seem like they're not following the Go standard, which is concerning. There doesn't seem to be a good reason, so I'm willing to bet they just didn't notice.

I hear you about node_modules, but that's a tricky problem with Go tools in general, not just gofumpt. I'd really prefer not to invent my own way to skip over that directory. If the directory walking really is that slow for you, you could always consider something like gofumpt -w $(git ls-files '*.go') as a temporary workaround.

mvdan added a commit that referenced this issue Apr 9, 2023
And, just like vendor, one can opt into formatting them by explicitly
passing them as arguments.

Updates #250.
Fixes #260.
@silverwind
Copy link
Author

silverwind commented May 29, 2023

I'm surprised to hear code generators like vfsgen are still being used, now that go:embed has existed for a while :)

I see I never answered: vfsgen still brings one major benefit over go:embed in that it does support compression (sadly gzip and not brotli) which is ideal for frontend assets because the compressed byte stream can be pushed as-is to supporting clients as indicated by their accept-encoding header. And with compression, you of course get a significant binary size reduction as well.

@mvdan
Copy link
Owner

mvdan commented Jul 24, 2024

Just to reiterate what the plan is: Go's golang/go#42965 is accepted but not yet implemented. Once it is, we will copy the strategy in gofumpt.

Even though the proposal is already accepted, I'm reluctant to go ahead and implement it before they do, because that would lead to inconsistent behavior between us and upstream.

@silverwind
Copy link
Author

silverwind commented Jul 24, 2024

Do the go ignores always exactly intersect with gofumpt ignores? I think not.

For example consider the following use case: You have a vendored file that you want to preserve in original format for updating it later with an accurate diff preserved. Go tooling should still compile it, but you'd want it to be exempt from formatting.

@mvdan
Copy link
Owner

mvdan commented Jul 24, 2024

I don't find that to be a particularly strong argument. If a package would be tested via go test ./..., checked via go vet ./..., and formatted via go fmt ./..., it should be formatted by gofumpt -w . as well.

@silverwind
Copy link
Author

This kind of "one tooling fits all" mentality is why I dislike the go ecosystem. I prefer tools that are standalone and have their own configs without such dependencies like this go tooling, because ultimately this allows those tools to outgrow the narrow use cases that the go authors intend.

But I guess the go community is too small to see more flexible tools come to life like for example the JavaScript/TypeScript community has.

@mvdan
Copy link
Owner

mvdan commented Jul 24, 2024

If you have specific use cases that wouldn't be satisfied by following golang/go#42965, I'd like to hear them and think about options. Aside from that, vague arguments against Go aren't constructive here.

@cwrau
Copy link

cwrau commented Jul 26, 2024

If you have specific use cases that wouldn't be satisfied by following golang/go#42965, I'd like to hear them and think about options.

I personally don't think having random folders only needed by CI inside the go.mod just to ignore them is a particularly nice option, having an --exclude flag would be nicer imho

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants