Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please provide a way to identify a specific device that should receive color #14

Open
joshtriplett opened this issue Dec 11, 2023 · 6 comments

Comments

@joshtriplett
Copy link
Contributor

As has arisen in discussions of CLICOLOR_FORCE in some projects, setting CLICOLOR_FORCE can break builds where any command that acts on CLICOLOR_FORCE is run with redirected output (pipe, file, etc).

However, CLICOLOR_FORCE still provides a great deal of value for users of IDEs, CI systems, and similar.

Please consider adding a version of this that flags a specific output as being able to receive color, in such a way that subcommands with redirected output cannot accidentally interpret it as applying to them.

One possible way to do this: set CLICOLOR_FORCE_STAT='123 456', where 123 and 456 are the st_dev and st_ino fields returned by fstat. Then, when checking whether to output color, before trying isatty, check if CLICOLOR_FORCE_STAT is set and if fstat on the fd you want to check returns the corresponding st_dev and st_ino. If so, emit color, otherwise don't.

Pipes, sockets, logfiles, and any other kind of file descriptor have a unique pair of st_dev and st_ino. This will automatically work for anything that inherits the file descriptor. But the moment any process gets run with its output redirected, the stat check won't pass, so it won't force color. Conversely, any program running a subprocess can trivially implement this by calling fstat on the file descriptor it's about to give to a subprocess and setting the environment variable accordingly.

(The use of fstat applies to UNIX platforms. A similar mechanism on Windows is GetFileInformationByHandleEx retrieving FILE_ID_INFO, which has a similar pair of unique numbers. On Windows, those numbers are only guaranteed unique while the file is open, but that should likely be true over the course of a build.)

Would you be open to the idea of incorporating this into the standard? I'd be happy to write a pull request documenting it in detail, if so.

@jhasse
Copy link
Owner

jhasse commented Dec 22, 2023

Hi :)

I understand the issue. I'm a bit reluctant to add more complexity though, because I doubt that most tools would bother implementing it (especially when it needs a special handling for Windows). Right now it's a simple getenv call on all platforms.

Also there's the alternative to patch the tools that parse output so that they unset CLICOLOR_FORCE for their children. One could also argue that parsing output should never be done anyway and the tools should rely on using JSON or some other IPC protocol instead.

CLICOLOR_FORCE isn't the only environment variable that might break output parsing btw. So using a clean environment when parsing output is a good idea anyway? Not sure.

There's also the more general solution to actually implement a tty so that isatty works. Some IDEs and CIs already do this like GitHub Actions or Visual Studio Code.

My preferred solution is that all tools should ALWAYS output colors. When someone wants to parse just the text or don't have colors he should explicitly remove them. This would result in no surprises and avoid environment variables all together.

@joshtriplett
Copy link
Contributor Author

@jhasse

Hi :)

Hi! Thanks for your response.

I'm a bit reluctant to add more complexity though, because I doubt that most tools would bother implementing it (especially when it needs a special handling for Windows).

I would expect in practice that it should be implemented in a small library, and the caller just needs to ask "should I use color on this output". I'd certainly be happy to supply PRs implementing support in such a library.

Also there's the alternative to patch the tools that parse output so that they unset CLICOLOR_FORCE for their children.

It's not the responsibility of those tools to un-break something that wasn't broken before. They ran a program that produced text output and parsed that output, and then that program changed its behavior to break the text output and make it not parseable.

Also, we're not just talking about tools that launch specific other tools. We're talking about user-written shell scripts, things like $(prog | grep XYZ | sed ...), which expect to keep working. It is emphatically not the responsibility of every random user-written shell script to un-break something that is currently working correctly.

CLICOLOR_FORCE isn't the only environment variable that might break output parsing btw.

The other cases I'm aware of from the past have been specifically removed. For instance, grep used to have a GREP_OPTIONS, which it removed precisely because it causes such breakage. If you're aware of other such environment variables, please by all means mention them so that they can either get fixed or get documented as things that users should never set in their environment.

So using a clean environment when parsing output is a good idea anyway? Not sure.

It's possible that that might have been a better convention to begin with (which tools could have enforced), but it's not the world we're in. And part of the point of environment variables is precisely that they get inherited, so I'd argue that the mistake would be encouraging program behavior to be configurable via environment variables in a way that causes such breakage.

There's also the more general solution to actually implement a tty so that isatty works.

I'm a fan of doing this in general, and it's certainly the most universally working solution. It may occasionally cause tools to break by assuming they can prompt for user input, but prompting (if conditional at all) should be based on whether stdin is a tty rather than stdout or stderr, and automated processes can set stdout/stderr to a tty and stdin to /dev/null.

That said, this has other limitations, notably that you don't always want animated progress bars / spinners / etc even if you otherwise want colors, and that anything emulating a TTY has to set a terminal size and output prepared for a terminal of one size may not be correct for a terminal of another size.

But I still think in general that it's the right solution.

My preferred solution is that all tools should ALWAYS output colors.

This breaks many things, including not just the ability to parse the output of tools, but also the ability to run tools on the command line with things like grep. This would mean, for instance, that you could run a program, see xyz in the output, run program | grep xyz, and get no output.

Personally, my preferred solution is that most tools should be thin wrappers around libraries, the tools should be primarily for end-users, and most things should link the libraries. But that's not currently the world we're in. :(

I really do appreciate the problem of making CI and higher-level build tools show visually appealing output. I really do; I have a professional need for such a thing. That's part of why I filed this, in the hopes of having a solution that won't break anything to implement, and that I could happily recommend that tools implement, rather than having to recommend against to avoid breakage.

@jhasse
Copy link
Owner

jhasse commented Dec 27, 2023

I would expect in practice that it should be implemented in a small library, and the caller just needs to ask "should I use color on this output". I'd certainly be happy to supply PRs implementing support in such a library.

Hm ... a small library would be too much of a barrier for many in my experience.

It's not the responsibility of those tools to un-break something that wasn't broken before. They ran a program that produced text output and parsed that output, and then that program changed its behavior to break the text output and make it not parseable.

Parsing the output of a program can always break (e.g. version upgrade) and it already is the responsible of the program to watch out for environment variables that might change the output (e.g. LANG).

Also, we're not just talking about tools that launch specific other tools. We're talking about user-written shell scripts, things like $(prog | grep XYZ | sed ...), which expect to keep working.

Well their expectation is wrong already (version upgrades, LANG, ...).

It is emphatically not the responsibility of every random user-written shell script to un-break something that is currently working correctly.

It's only working in their current environment.

The other cases I'm aware of from the past have been specifically removed. For instance, grep used to have a GREP_OPTIONS, which it removed precisely because it causes such breakage. If you're aware of other such environment variables, please by all means mention them so that they can either get fixed or get documented as things that users should never set in their environment.

GREP_OPTIONS might be a good example where it breaks A LOT as grep really is an often used program in shell scripts. But as I mentioned there are still others, e.g. LANG (Git for example started to be localized a few years ago, even when piped!)

That said, this has other limitations, notably that you don't always want animated progress bars / spinners / etc even if you otherwise want colors, and that anything emulating a TTY has to set a terminal size and output prepared for a terminal of one size may not be correct for a terminal of another size.

Ah, good point. On the other hand wouldn't a progress bar / spinner actually make sense in a CI output or IDE?

This breaks many things, including not just the ability to parse the output of tools, but also the ability to run tools on the command line with things like grep. This would mean, for instance, that you could run a program, see xyz in the output, run program | grep xyz, and get no output.

Yes, one would than have to use program | col -b | grep xyz. Also if programs always did this, grep might have in response ignored escape codes by default.

I really do appreciate the problem of making CI and higher-level build tools show visually appealing output. I really do; I have a professional need for such a thing. That's part of why I filed this, in the hopes of having a solution that won't break anything to implement, and that I could happily recommend that tools implement, rather than having to recommend against to avoid breakage.

IMHO breaking scripts is the smaller issue than not having colored output, because the latter is hard to fix, while for the former you could just set -e CLICOLOR_FORCE.

@joshtriplett
Copy link
Contributor Author

joshtriplett commented Jan 2, 2024

Hm ... a small library would be too much of a barrier for many in my experience.

Many programs already use libraries as part of their text output; it's not necessarily an unreasonable ask. I can appreciate that it's not as simple as "just check a pair of environment variables", but the tradeoff there is that a library can implement a more robust check and thus be more widely adopted.

And in a high-level language I'd expect it to just be a few lines to implement even without a library, as well as easier to pull in a library. For C, it'd be doable in a small header-only library with a static inline function.

Parsing the output of a program can always break (e.g. version upgrade)

If the program is one whose output commonly gets parsed (e.g. the output is based exclusively on the input, rather than containing boilerplate), the output changing in such a way would commonly be considered a breaking change, a bug that would need fixing. For instance, the output of grep generally doesn't contain anything that needs localizing, and it'd be a breaking change to change how it emits line numbers or what symbols it uses. Many other tools are similarly parseable.

Ah, good point. On the other hand wouldn't a progress bar / spinner actually make sense in a CI output or IDE?

In an IDE, probably. In CI output, typically no; you don't want the log filled with erased-and-redrawn output. You could potentially post-process the result to eliminate the erase-and-redraw, but that's added complexity and potential breakage compared to normal log capturing, particularly if you want tools to be able to fetch the log.

Also, even if you do want to display a progress bar, using a pty requires setting a window size (e.g. 80x25), and depending on how tools display their output, particularly fancy output may not work when subsequently displayed to a terminal with a different size, without some further post-processing. As an example, some terminal-handling libraries may make assumptions about what will happen when printing to the last column, and those assumptions may not hold when displaying the output to a wider terminal.

IMHO breaking scripts is the smaller issue than not having colored output

I think we're likely to continue disagreeing on that point. Builds that work should continue working, and not spontaneously break because you upgraded a tool and that tool started emitting color output to a pipe because of an environment variable. the new version of the tool pays attention to an environment variable it didn't before. In semver terms, I'd consider a tool that changes its output like that to have had a semver-major breaking change.

The current definition of CLICOLORS_FORCE is one that tools like grep should never use, and only tools whose output is never parsed could ever safely use without risking breakage. (And history shows that tools that think their output is never parsed may find that they've been mistaken about that when they go to change something.) And that then leads to arguments about whether a given tool can safely look at it or not, for which the safe answer is "no", and tools that have a more conservative policy about breaking changes will not adopt it.

I don't want to get into a long argument about the value of not breaking builds versus the value of color. It seems sufficient to note that different tools will have different policies about whether the clicolors proposal is acceptable for them or not, and that's always going to limit adoption; there is no universal policy that every tool will agree with.

I'd love to develop a solution that makes it possible to satisfy both requirements---showing color in CI/IDEs, and not breaking builds---such that every tool could reasonably adopt it, in place of where they currently call isatty. If you'd be open to such a solution as a clicolors v2, I'd be interested in designing and proposing one, and then encouraging its adoption. If you're not interested in such a solution, you should go ahead and close this issue, and I should continue recommending that any tools whose output might ever be parsed should continue to avoid clicolors and CI/IDE tools caring about color should either use PTYs or live without colors.

@jhasse
Copy link
Owner

jhasse commented Jan 31, 2024

You're right that CLICOLORS_FORCE is probably something grep shouldn't use. This standard isn't so much about adding an environment variable to force colors, but rename the existing MY_PROGRAM_ALWAYS_USE_COLORS to CLICOLORS_FORCE - something I think you would also agree on is less worse (when something breaks, one has to only check one variable).

I think your solution is a good idea. I would merge a PR which adds it to the website with your explanation. But before that it might be a good idea to get some feedback of at least one application developer. Do you think you can find a project which currently has no way of forcing color, wouldn't accept the current solution, but would accept your way?

@adamhotep
Copy link
Contributor

I write a lot of wrappers (aliases, shell functions, and scripts) for programs to add or tweak their colors. These wrappers generally compose the when-to-color logic, then, when coloring is preferred, call the real program and instruct it to force colors on, like grep --color=always.

I don't see an easy way to abstract this in a manner that wouldn't itself be a tall order for developers to consider. The GNU grep authors chose to remove GREP_OPTIONS because it never really worked well. Containing the possibilities to just a color toggle (on/off or on/off/auto) is simple enough that it won't trip over all of those complications, but it will indeed have some issues when used improperly. CLICOLOR_FORCE is something that should only be used in scripts. Anybody who thinks it's wise to put export CLICOLOR_FORCE=1 into their ~/.bashrc or whatnot clearly hasn't used bash much since trying it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants