feat: allow user-defined string formats (close #1227) #3230

Open
wants to merge 1 commit into
base: master

itchyny
Contributor

@itchyny itchyny commented Jan 20, 2025

This commit extends the `def` syntax to define string formats.
By writing `def @foo: ...;`, `@foo` string format is available
in the following query. This idea is influenced by jaq. Close #1227.

@itchyny itchyny force-pushed the user-defined-string-formats branch 2 times, most recently from 2148f37 to b86e734 Compare January 20, 2025 11:38
@itchyny itchyny force-pushed the user-defined-string-formats branch 2 times, most recently from 7e6217a to 562cc8a Compare January 22, 2025 12:46
@itchyny
Contributor Author

itchyny commented Jan 28, 2025

Thank you. This is a language change. Any thoughts from other maintainers?

@pkoppstein
Contributor

@itchyny wrote:

> This is a language change. Any thoughts from other maintainers?

Thanks for the opportunity to ask what backward-compatibility issue this change entails.

My main question (posed purely out of ignorance) is whether the ability to use the `@` filters both as ordinary 0-arity filters (`... | @tsv`) and as "prefix filters" (`@tsv "abc \([1,2])"`) would be compromised.
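For reference, both calling styles mentioned above already work with the builtin formats in released jq. A minimal sketch using the stock `@tsv` format only (nothing from this PR is involved; the shell variable names are just for illustration):

```sh
# Standalone use: @tsv formats its input array as one tab-separated row.
standalone=$(jq -nc '[1,2] | @tsv')
echo "$standalone"

# Prefix use: @tsv is applied to each interpolated value, while the
# literal text "abc " is passed through unchanged.
prefixed=$(jq -nc '@tsv "abc \([1,2])"')
echo "$prefixed"
```

The question is whether a user-defined `@foo` would keep supporting both forms in the same way.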

@itchyny itchyny mentioned this pull request Feb 12, 2025
@myaaaaaaaaa
Contributor

> This commit extends the `def` syntax to define string formats. By writing `def @foo: ...;`, `@foo` string format is available in the following query.

How about instead redefining the `@foo` syntax to be syntactic sugar for calling a regular `foo` function? That way, users could do something like this:

$ "world" | @ascii_upcase "hello \(.)"
"hello WORLD"

Note how `ascii_upcase` is a function that is already defined by jq.
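The `@ascii_upcase "..."` form above is the suggested syntax and is not accepted by released jq. As a rough sketch of what it would be sugar for, assuming each interpolated value is simply piped through the regular filter, the same result can be written out by hand today:

```sh
# Stock jq: pipe the interpolated value through ascii_upcase explicitly.
desugared=$(jq -nc '"world" | "hello \(. | ascii_upcase)"')
echo "$desugared"
```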

@itchyny
Contributor Author

itchyny commented Feb 14, 2025

> How about instead redefining the `@foo` syntax to be syntactic sugar for calling a regular `foo` function?

Since existing string formatters are named to represent the resulting string (e.g. HTML string, URI string, etc.), I think it is better to have a different namespace from the 0-arity filters. Also, they are limited in the type they output and jq doesn't check typing statically, so the syntactic sugar for invoking the 0-arity filter would allow queries that wouldn't actually work.

@myaaaaaaaaa
Contributor

> Also, they are limited in the type they output and jq doesn't check typing statically, so the syntactic sugar for invoking the 0-arity filter would allow queries that wouldn't actually work.

I don't think this argument holds up - it's just as possible to define a `@formatter` that wouldn't work, such as `def @foo: {};`.

And if there were some way to statically analyze a `def @foo` body to catch such a mistake, is there anything preventing the same techniques from being used on filters invoked by the `@` operator?

> Since existing string formatters are named to represent the resulting string (e.g. HTML string, URI string, etc.), I think it is better to have a different namespace from the 0-arity filters.

I agree that adding `html`, `uri`, ... as regular builtins would cause namespace pollution. On the other hand, I don't think `def @foo` is the correct solution, given that it essentially introduces a second category of 0-arity filters that are identical in the way they're defined, differing only in the way they're called.

Is this really the best way to solve the namespace pollution problem? For example, one possible alternative would be to have the `@` operator fall back on a `fmt_` prefix - the query `@foo` would first attempt to resolve to `foo`, then to `fmt_foo` if that doesn't exist. This way, we could add the builtin formatters as `fmt_html`, `fmt_uri`, ..., and still use `@html`, `@uri`, ...
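To make the intended equivalence concrete, here is a minimal sketch in stock jq. `fmt_html` is a hypothetical filter defined inline as a wrapper around the builtin `@html`; released jq has no `@`-to-`fmt_` lookup, so the fallback itself is not demonstrated, only the result it would be expected to produce:

```sh
# Hypothetical fmt_html: an ordinary 0-arity filter wrapping builtin @html.
# It is applied by hand to the interpolated value.
via_filter=$(jq -nc 'def fmt_html: @html; "<b>" | "tag: \(. | fmt_html)"')
echo "$via_filter"

# The builtin prefix form that @html would keep providing under the fallback.
via_format=$(jq -nc '"<b>" | @html "tag: \(.)"')
echo "$via_format"
```

Both commands print the same escaped string.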

@itchyny
Contributor Author

itchyny commented Feb 16, 2025

@myaaaaaaaaa String formatters can be called without strings (`jq "@csv"`). How would you like to redefine such syntax?

@myaaaaaaaaa
Contributor

I think the most consistent approach would be to redefine `@foo` to mean something like `@foo "\(.)"`, rather than simply making it an alias for `foo`. For example, this would allow jq to report an error if `foo` returns something other than a string.
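For the builtin formats, the bare form and the explicit `"\(.)"` form already agree in released jq, as this small sketch with `@base64` shows. The type check on a user-level `foo` is part of the suggestion and is not shown here:

```sh
# Bare format applied to the input.
bare=$(jq -nc '"hi" | @base64')
echo "$bare"

# Same format applied through an interpolation of the input.
explicit=$(jq -nc '"hi" | @base64 "\(.)"')
echo "$explicit"
```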

Successfully merging this pull request may close these issues.

Make it possible to define new @foo formats in jq, not just C
4 participants