Semantic version utility #566

Open
nicowilliams opened this issue Aug 31, 2014 · 26 comments

@nicowilliams
Contributor

Consider this jq module:

module semver {version:1.0};

def _semver_obj2obj($req):
  if   . == $req                                       then true
  elif $req.hash   != null and .hash   != $req.hash    then false
  elif $req.commit != null and .commit != $req.commit  then false
  elif $req.major  != null and .major  != $req.major   then false
  elif $req.minor  != null and .minor  < $req.minor    then false
  elif $req.minor  != null and .minor == $req.minor and $req.micro != null and .micro < $req.micro    then false
  else true end;

def _ver_array2obj:
  . as $in |
  [range(length)] |
  map(. as $i|["major","minor","micro"]|.[$i]|{(.):($in|.[$i])}) |
  add;

def _ver2obj:
  if   type == "object" then .
  elif type == "array" then _ver_array2obj
  elif type == "number" then {minor:floor,micro:(.-floor)}
  else {major:.} end;

# Returns true if input version spec semantically matches the requested version
def semver($req):
  if . == $req then true
  elif type == "number" and ($req|type) == "number" and . >= $req then true
  elif type == "number" and ($req|type) == "number" then false
  elif type == "object" and ($req|type) == "object" then .version|_semver_obj2obj($req.version)
  else _ver2obj|_semver_obj2obj($req|_ver2obj) end;

Which means roughly:

  • exact matches are required of "hash" and/or "commit" when present
  • input must be >= $req when the two are numeric or can be represented as numeric
  • major versions must match exactly when requested
  • minor versions must be larger than or equal to the requested one
  • micro versions must be larger than or equal to the requested one when the minor versions match exactly
  • arbitrary version content (e.g., strings) must match exactly

Version numbers of type "number" are compared numerically. Version numbers with major version number components can be given as objects and even as arrays.
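For readers who do not speak jq, the major/minor/micro rules listed above can be sketched in Python. This is only an illustration, not the module's implementation: the name `satisfies` is hypothetical, versions are assumed to be plain dicts, fields missing from the candidate are assumed to count as 0, and the exact matching of "hash" and "commit" is left out.

```python
def satisfies(candidate, required):
    """Return True if `candidate` satisfies `required`.

    Both are dicts with optional "major", "minor", "micro" keys. A key that
    is missing (or None) in `required` acts as a wildcard. Assumption: a key
    missing from `candidate` is treated as 0.
    """
    req_major = required.get("major")
    req_minor = required.get("minor")
    req_micro = required.get("micro")

    # Major must match exactly when requested.
    if req_major is not None and candidate.get("major") != req_major:
        return False
    if req_minor is None:
        return True
    cand_minor = candidate.get("minor", 0)
    # Minor must be at least the requested one.
    if cand_minor < req_minor:
        return False
    # Micro only matters when the minor versions are equal.
    if cand_minor == req_minor and req_micro is not None:
        return candidate.get("micro", 0) >= req_micro
    return True
```

Under these assumptions a candidate 1.2.4 satisfies a request for 1.2.3, while 2.2.4 and 1.2.2 do not.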

It allows all of these as version specifications:

{"version":{"hash":"050e9e5016b529c765542dd0637bcca1019fb41b"}}
{"version":{"commit":"050e9e5"}}
{"version":{"minor":10,"micro":17}}
{"version":{"major":2,"minor":20,"micro":200}}
{"version":1.5}
{"version":[null, 1, 5]}
{"version":[3,1]}
{"version":true}
{"version":false}
{"version":null}

and even:

{"minor":10,"micro":17}
{"major":2,"minor":20,"micro":200}
1.5
[null,1,5]
[3,1]
true
false
null
"anything at all"

and so on.

One could then write

import foo {version:1.5};
import bar {version:{commit:"050e9e5"}};
import baz true;
import foobar {version:[3,5,6]};

and so on and on.

@nicowilliams nicowilliams self-assigned this Aug 31, 2014
@nicowilliams nicowilliams added this to the 1.5 release milestone Aug 31, 2014
@nicowilliams
Contributor Author

I've C code that does this sort of matching using the jv interfaces, but perhaps we should always use a jq program to do this (it'd be less code!).

@nicowilliams
Contributor Author

If we'd need this utility to be available for use in jq programs (rather than buried in the internals), then coding it once in jq seems best. OTOH, evaluating jq code in the process of loading libraries seems likely to slow things down unnecessarily.

@nicowilliams
Contributor Author

I've got a better version in C.

@joelpurra
Contributor

@nicowilliams: great! Quick comment.

  • There's a semantic versioning component in this code, but this augments the semantic versioning standard so much I wouldn't call the module semver - rather versioning or similar. Break it up and separate semver logic?
  • micro is called patch in the standard.
  • There's a BNF: Backus–Naur Form Grammar for Valid SemVer Versions. "It's the law."
  • The semver(...) function is doing things arbitrarily compared to semver range logic - it's a home cooked versioning comparison logic. Comparing 1.2.4 to 1.2.3 should return false since they're not exactly equal. What you're describing is closer to 1.2.4 ^1.2.3, which expands to >=1.2.3-0 <2.0.0-0 "Compatible with 1.2.3". See node-semver#ranges.
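The caret expansion mentioned in the last bullet can be sketched as follows. This is a simplified illustration of the node-semver rule for full major.minor.patch versions given as tuples; pre-release tags are ignored, and the function names are hypothetical.

```python
def caret_bounds(version):
    """Return (lower, upper) for ^version; the upper bound is exclusive.

    The leftmost non-zero component is the one that may not change.
    """
    major, minor, patch = version
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return (major, minor, patch), upper


def satisfies_caret(candidate, version):
    """Return True if the candidate tuple lies inside the ^version range."""
    lower, upper = caret_bounds(version)
    return lower <= candidate < upper
```

With this sketch, `satisfies_caret((1, 2, 4), (1, 2, 3))` holds, whereas a plain equality test on the same pair would not.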

If we'd need this utility to be available for use in jq programs (rather than buried in the internals), then coding it once in jq seems best.

Yes, please. I'd like to be able to parse (for example) the suggested jq.json metadata file I'd use for a package manager. (Wouldn't put version information in the module or import directive, nor have a module directive at all if possible. That is metadata.)

@nicowilliams
Contributor Author

On Tue, Sep 02, 2014 at 02:55:35AM -0700, Joel Purra wrote:

@nicowilliams: great! Quick comment.

I've re-written it rather differently, and in C. When it's closer I'll
post it.

Responding out of order.

If we'd need this utility to be available for use in jq programs
(rather than buried in the internals), then coding it once in jq
seems best.

Yes, please. I'd like to be able to parse (for example) the suggested
jq.json metadata file I'd use for a package manager. (Wouldn't put
version information in the module directive, nor have a module
directive at all if possible.)

I agree that putting version information in a .jq file is a bit awkward!
(for the same reasons that RCS IDs were awkward)

I'll definitely do this. Though I like having the option to put some
metadata in the module directive (for other reasons), so I'll leave that
available.

I also have been meaning to allow for large constant data to be stored
in .json files associated with a module. (Think of Unicode tables.)

  • micro is called patch in the standard.

Well, at Sun Microsystems we called it "micro" (Solaris marketing called
it "update" in latter years). Has the whole industry adopted one
standard?

  • There's a BNF: Backus–Naur Form Grammar for Valid SemVer Versions.

That seems much, much too complex for jq's needs.

  • The semver(...) function is doing things arbitrarily compared to
    semver range logic - it's a home cooked versioning comparison logic.
    Comparing 1.2.4 to 1.2.3 should return false since they're not
    exactly equal. What you're describing is closer to 1.2.4 ^1.2.3,
    which expands to >=1.2.3-0 <2.0.0-0 "Compatible with 1.2.3". See
    node-semver#ranges.

Clearly 1.2.4 satisfies a requirement for 1.2.3. It's just not equal.

There's not much point to having any semantics attached to major, minor,
and micro version numbers if you only ever compare them all for
equality: you might as well use an arbitrary string compared for
equality.

Though I agree that a way to express a strict equality requirement with
major.minor.micro versioning would be nice, so I'll add that too.

I won't implement all of the syntax for node-semver#ranges unless
there's a use case for all of it, and it can be left for the future
anyways. I've never needed more than "same major, larger than or equal
minor, and if equal minor then larger than or equal micro" semantics;
adding strict equality is not a big deal, but adding all the other
options is.

(Wildcards are easy, so that will be supported, and in my new code
already are.)

  • There's a semantic versioning component in this code, but this
    augments the semantic versioning standard so much I wouldn't call the
    module semver - rather versioning or similar. Break it up and
    separate semver logic?

Semantic versioning is nothing new. Sun practiced it for decades, for
example. Details have generally varied. I may call it jqsemver, say.

Perhaps we could even have an option to use alternative versioning forms
(though that really would force the evaluation of jq-coded code during
linking and loading, which I was hoping to avoid). Maybe in the future?

@nicowilliams
Contributor Author

Also, regarding the version string BNF, I'd much rather use a JSON
schema, basically an object with major, minor, micro,
prerelease, and build keys.

If these are absent in the importer's requirements, then they are
wildcarded. If they are absent in the candidate module, then they are
just absent and cannot match any non-wildcard value in the importer's
requirements.

An array representation is also ok ([$major, $minor, ...]).

The importer could have an exact: true key to denote exact matching,
otherwise semantic matching should be done.

I'm not really keen on parsing version strings no matter what the
syntax, especially given that we can bring the full power of JSON and jq
to bear on this. But if someone wants to write a parser that outputs a
normalized JSON object representation of a version string, that's fine
by me :) (it's largely a question of what we have time for).

BTW, the BNF seems underdeveloped. Identifiers can start with '-'?!
Why not encode alpha/beta conventions in the BNF too? Do build numbers
get to be compared semantically (as opposed to exactly)? What about
pre-release identifiers?

@nicowilliams
Contributor Author

My new code does the following:

  • Versions are expressed as objects with major/minor/micro/
    pre_release/build/hash/commit/id keys and values

  • Semantic or exact (optional) matching of major/minor/micro/
    pre-release, and exact matching of hash/commit/id

    All but pre-release can be wild-carded (null) in the importer's
    spec, and all may be absent in the candidate, but if present in the
    importer then it must be present in the candidate.

    Semantic matching means:

    • candidate major must match exactly if not wild-carded
    • candidate minor must be larger than or equal to the requested if
      not wild-carded
    • candidate micro must be larger than or equal to the requested if
      not wild-carded and if the candidate minor matched exactly
    • candidate pre-release must match exactly if present; can't be
      wild-carded
    • build is ignored (even in exact match case)

    Exact matching still permits wild-carding, except for pre-release.

  • Versions given as numbers are normalized into the above object
    representation.

  • Versions given as strings are matched exactly, but eventually should
    get parsed and normalized into the above object representation.

    Parsing of the BNF should go here.

  • Versions given as booleans/nulls should ... behave how??

  • Versions given as arrays should ... behave how??

I should re-cast the code to be a cmp function, returning -1, 0, or 1,
so that it can be used for sorting. That would be super-convenient.
(Currently it returns a boolean.)
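A three-way comparison of that kind might look like the sketch below. It is not the C code referred to above: `version_cmp` is a hypothetical name, versions are assumed to be dicts, missing fields are assumed to count as 0, and pre-release, build, hash and commit are ignored.

```python
from functools import cmp_to_key


def version_cmp(a, b):
    """Compare two version dicts; return -1, 0, or 1."""
    key_a = (a.get("major", 0), a.get("minor", 0), a.get("micro", 0))
    key_b = (b.get("major", 0), b.get("minor", 0), b.get("micro", 0))
    # Tuples compare component by component, so 1.10.0 sorts after 1.2.3.
    return (key_a > key_b) - (key_a < key_b)


versions = [
    {"major": 1, "minor": 10, "micro": 0},
    {"major": 1, "minor": 2, "micro": 3},
    {"major": 0, "minor": 9},
]
# cmp_to_key adapts the -1/0/1 comparator for use with sorted().
ordered = sorted(versions, key=cmp_to_key(version_cmp))
```

A boolean "satisfies" check can be layered on top of such a comparator, which is one reason a cmp-style function is convenient.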

@joelpurra
Contributor

@nicowilliams:

I also have been meaning to allow for large constant data to be stored
in .json files associated with a module. (Think of Unicode tables.)

That would be great!

  • micro is called patch in the standard.
    Well, at Sun Microsystems we called it "micro" (Solaris marketing called
    it "update" in latter years). Has the whole industry adopted one
    standard?

I'm of course referring to the http://semver.org/ standard.

That seems much, much too complex for jq's needs.

Well, no. You're already about to implement all of it, just not standardized. Yes, I know the semver standard is working on strings, but they're not impossible to parse. Also, in this case it would be a good thing to indicate (throw an error?) a malformed semver string so there won't "ever" be any in the wild. (Comparing this to recently writing a URL library that had to swallow a lot of things, where "the wild" is really wild because malformed URLs are silently ignored.)

Clearly 1.2.4 satisfies a requirement for 1.2.3. It's just not equal.

There's not much point to having any semantics attached to major, minor,
and micro version numbers if you only ever compare them all for
equality: you might as well use an arbitrary string compared for
equality.

Having a default (for non-existent) operator that expands to >=1.2.3-0 <2.0.0-0 is more confusing than one that expands to =1.2.3. I understand that ^1.2.3 is a more common use case though, but I'd rather be explicit about creating ranges rather than strict equality.

Though I agree that a way to express a strict equality requirement with
major.minor.micro versioning would be nice, so I'll add that too.

That'd be =1.2.3, or plain 1.2.3 based on node-semver#ranges.

I won't implement all of the syntax for node-semver#ranges unless
there's a use case for all of it, and it can be left for the future
anyways. I've never needed more than "same major, larger than or equal
minor, and if equal minor then larger than or equal micro" semantics;
adding strict equality is not a big deal, but adding all the other
options is.

Whatever subset you implement, I'd still say that it's best to follow an established standard. Again, node-semver#ranges are used in tens of thousands of published NPM packages.

Semantic versioning is nothing new. Sun practiced it for decades, for
example. Details have generally varied. I may call it jqsemver, say.

You're making my point: "Details have generally varied." - and that's why there's now a semver standard that already has lots of momentum after only a few years.


From semver.org:

Why Use Semantic Versioning?

This is not a new or revolutionary idea. In fact, you probably do something close to this already. The problem is that "close" isn't good enough. Without compliance to some sort of formal specification, version numbers are essentially useless for dependency management. By giving a name and clear definition to the above ideas, it becomes easy to communicate your intentions to the users of your software. Once these intentions are clear, flexible (but not too flexible) dependency specifications can finally be made.


Perhaps we could even have an option to use alternative versioning forms
(though that really would force the evaluation of jq-coded code during
linking and loading, which I was hoping to avoid). Maybe in the future?

If you decide to deviate from the semver standard and the node-semver#ranges de facto standard, then yes please. But bear in mind that I still don't want to see any module nor import directive with version ranges in jq code - awkward. It is package metadata - both when it comes to package (be it one module per package, or several) versioning and dependency selection. (I'm also seeing a "jq project" as a "jq package", but it might be private/not meant for distribution.)

Note that it'd be easy for you to use the semver command line utility to compare results in a test set. Plus node-semver has hundreds of tests written already.

@joelpurra
Contributor

If these are absent in the importer's requirements, then they are
wildcarded. If they are absent in the candidate module, then they are
just absent and cannot match any non-wildcard value in the importer's
requirements.

Please see the semver standard before implementing this.

The importer could have an exact: true key to denote exact matching,
otherwise semantic matching should be done.

Then you'd end up with a property per node-semver#ranges logic. I'd rather see comparison: equal-to, something like:

  • = equal-to (also default for empty comparison type)
  • ^ compatible-with
  • ~ close-to
  • > greater-than
  • < less-than

I'm not really keen on parsing version strings no matter what the
syntax, especially given that we can bring the full power of JSON and jq
to bear on this. But if someone wants to write a parser that outputs a
normalized JSON object representation of a version string, that's fine
by me :) (it's largely a question of what we have time for).

That's a good idea! Internally converting semver strings to "proper" objects, which can be used for comparisons. (Again, I'd love to see strict parsing without any margin for inconsistencies/errors.)
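A strict string-to-object conversion of that kind could be sketched as below. The regular expression is a simplified reading of the semver.org grammar (for instance, it does not reject leading zeros in numeric pre-release identifiers), and `parse_version` and the output key names are hypothetical.

```python
import re

# Simplified pattern: MAJOR.MINOR.PATCH with optional -prerelease and +build.
_SEMVER_RE = re.compile(
    r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"
    r"(?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
    r"(?:\+([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
)


def parse_version(text):
    """Parse a version string into a dict, raising ValueError if malformed."""
    match = _SEMVER_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"malformed version string: {text!r}")
    major, minor, patch, prerelease, build = match.groups()
    return {
        "major": int(major),
        "minor": int(minor),
        "patch": int(patch),
        "prerelease": prerelease.split(".") if prerelease else [],
        "build": build.split(".") if build else [],
    }
```

Raising on anything that does not match, rather than guessing, is the point of the "no margin for inconsistencies" remark: inputs such as `1.2` or `01.2.3` are rejected outright.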

BTW, the BNF seems underdeveloped. Identifiers can start with '-'?!
Why not encode alpha/beta conventions in the BNF too? Do build numbers
get to be compared semantically (as opposed to exactly)? What about
pre-release identifiers?

You might be right about these; I don't use semver outside of 1.2.3 much - but I'd rather patch the semver standard than deviate before the patches have been accepted.

Guess some of it is related to how git can be used to generate version numbers.
git describe --tags --match 'v[0-9]*' --always --dirty='-SNAPSHOT' => v0.2.3-38-g7a6fade-SNAPSHOT if running the command in a dirty working tree based on the commit 7a6fade, 38 commits after the most recent tag v0.2.3

@nicowilliams
Contributor Author

but they're not impossible to parse. Also, in this case it would be a
good thing to indicate (throw an error?) a malformed semver string so
there won't "ever" be any in the wild. (Comparing this to recently

Sure. But for now I'm working with objects -- parsed versions. Parsing
should yield such objects.

writing a URL library that had to swallow a lot of things, where "the
wild" is really wild because malformed URLs are silently ignored.)

Oh I know :(

Clearly 1.2.4 satisfies a requirement for 1.2.3. It's just not equal.

There's not much point to having any semantics attached to major, minor,
and micro version numbers if you only ever compare them all for
equality: you might as well use an arbitrary string compared for
equality.

Having a default (for non-existent) operator that expands to
>=1.2.3-0 <2.0.0-0 is more confusing than one that expands to
=1.2.3. I understand that ^1.2.3 is a more common use case though,
but I'd rather be explicit about creating ranges rather than strict
equality.

Using equality as the default makes me wonder why bother with semantic
versioning: it will push users to use pkg managers that manage all
versioning themselves and avoid all conflicts by making all dependencies
local to dependents. That would save me a lot of trouble: I'd not have
to implement semver at all in jq and I'd be this close to being done
with modules :)

Seriously, why shouldn't we do that? Why wouldn't that be good enough?
The external pkg manager could use whatever semantic versioning it
wants.

Besides, to me semantic versioning is all about determining
compatibility, otherwise there's little point. Sure, you could use it
only for sorting, but if you only ever do equality matching the order is
not terribly interesting.

If you decide to deviate from the semver standard and
node-semver#ranges de facto standard, then yes please. But bear in mind
that I still don't want to see any module nor import directive
with version ranges in jq code - awkward. It is package metadata -
both when it comes to package (be it one module per package, or
several) versioning and dependency selection.

Version metadata on import directives strikes me as OK, but it would
be awkward indeed on module directives (as discussed earlier).

I'll need a schema for the metadata JSON text, something like:

{
  "module": { ..., "version": ...},
  "dependencies": { "<module-name>": { "version": ... }, ... }
}

  • micro is called patch in the standard.
    Well, at Sun Microsystems we called it "micro" (Solaris marketing called
    it "update" in latter years). Has the whole industry adopted one
    standard?

I'm of course referring to the http://semver.org/ standard.

I know, it's "a" standard. I'm asking if that really is "the" standard :)

Note that it'd be easy for you to use the semver command line
utility to compare results in a test set.

That does add appeal to Node semver, but I'm still not loving all of it.

@nicowilliams
Contributor Author

I'm not really keen on parsing version strings no matter what the
syntax, especially given that we can bring the full power of JSON and jq
to bear on this. But if someone wants to write a parser that outputs a
normalized JSON object representation of a version string, that's fine
by me :) (it's largely a question of what we have time for).

That's a good idea! Internally converting semver strings to "proper"
objects, which can be used for comparisons. (Again, I'd love to see
strict parsing without any margin for inconsistency/errors.)

The problem I have is that if we're going to use equality by default we
might as well match commit hashes and so on instead of
major.minor.micro-prerelease+build, and in general I really want an
option for using commit/file hashes (e.g., so I can make a proper Merkle
hash tree out of a complete installation).

Wildcarding is -I agree- not a terribly good idea for major.minor.micro,
but making major optional is (for the parsed form anyways) (recall that
I think major versioning should really be part of the module/product
name; I don't mind letting others have and use major version numbers
though).

So I'm definitely adding something (exact matching of commit IDs and
such). This should be like using said additions as pre-release strings
(which must always match exactly). So perhaps I'm not even adding
anything so far (after removing wildcarding).

BTW, the BNF seems underdeveloped. Identifiers can start with '-'?!
Why not encode alpha/beta conventions in the BNF too? Do build numbers
get to be compared semantically (as opposed to exactly)? What about
pre-release identifiers?

You might be right about these; I don't use semver outside of 1.2.3
much - but I'd rather patch the semver standard than deviate before
the patches have been accepted.

Is there a forum for discussing that standard?

Guess some of it is related to how git can be used to generate version
numbers. git describe --tags --match 'v[0-9]*' --always --dirty='-SNAPSHOT' => v0.2.3-38-g7a6fade-SNAPSHOT if running the
command in a dirty working tree based on the commit 7a6fade, 38
commits after the most recent tag v0.2.3

It's traditional, really, not git-specific. People tend to do
pre-releases with version strings like 1.2-alpha1, 1.2-beta1, 1.2-rc1,
... and so on. And anything goes as for backwards compatibility in
pre-releases because, after all, a backwards incompatible change might
have snuck in, and alpha/beta testing is all about finding bugs.

@joelpurra
Contributor

Using equality as the default makes me wonder why bother with semantic
versioning:

In a first iteration, I'd say stick to the very basics, and make sure not to deviate from the semver standard and the node-semver#ranges style of strings (albeit parsed to objects before used internally). Sticking with the basics means that a lookup for shazam version 1.2.3 would only look for that exact version.

(Stuff like avoiding a known bad version 1.3.6 with >=1.2.3 <1.3.6 || >1.3.6 <2.0.0 comes much later.)
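As an illustration of how such a range reads, here is a minimal evaluator, assuming the node-semver convention that space-separated comparators are ANDed and `||` separates alternatives. It handles only plain major.minor.patch versions; the function names are hypothetical.

```python
import operator

_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
        "<": operator.lt, "=": operator.eq}


def _parse(text):
    """Turn "1.3.6" into the tuple (1, 3, 6)."""
    return tuple(int(part) for part in text.split("."))


def _match_comparator(candidate, comparator):
    # Try two-character operators first so ">=" is not read as ">".
    for symbol in (">=", "<=", ">", "<", "="):
        if comparator.startswith(symbol):
            return _OPS[symbol](candidate, _parse(comparator[len(symbol):]))
    # A bare version means exact equality.
    return candidate == _parse(comparator)


def in_range(version, range_text):
    """Return True if the version matches any "||" alternative in full."""
    candidate = _parse(version)
    return any(
        all(_match_comparator(candidate, c) for c in alternative.split())
        for alternative in range_text.split("||")
    )
```

For the range quoted above, 1.3.5 and 1.3.7 are accepted while the known-bad 1.3.6 is rejected.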

it will push users to use pkg managers that manage all
versioning themselves

Yes! That is what package managers do, and that is what fuels a healthy language/system/code ecosystem where people can create and then share lego blocks.

and avoid all conflicts by making all dependencies
local to dependents.

In the first iteration, yes. jq's import "shazam" (note: I use a string, but no version selection metadata or anything) could then directly refer to the "jq project" or "jq package" base folder (the first folder/parent with a .jq/ subfolder) and look for ./.jq/packages/shazam/. (Further on, other logic might enter here, as .jq/packages/shazam/ could define a main jq file, but it could be .jq/packages/shazam/main.jq by default.)

(Further iterations could rely more on user-global/symlinked packages - but the import lookup wouldn't differ too much. Symlinking relieves import from doing runtime directory semver matching/lookups, as it can just follow the symlink the package manager has set up.)
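The first-iteration lookup described above can be sketched in Python. The `.jq/packages/<name>/main.jq` layout is the one proposed in this comment, not something jq implements, and the function names are hypothetical.

```python
from pathlib import Path


def find_project_root(start):
    """Return the nearest directory at or above `start` containing `.jq/`."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        if (directory / ".jq").is_dir():
            return directory
    return None


def resolve_import(name, start):
    """Return the path to the package's entry file, or None if not found."""
    root = find_project_root(start)
    if root is None:
        return None
    entry = root / ".jq" / "packages" / name / "main.jq"
    return entry if entry.is_file() else None
```

Note that no version appears anywhere in the lookup: whatever the package manager placed (or symlinked) under `.jq/packages/` is what gets imported.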

Please read the smarts summed up here:
http://nodejs.org/api/modules.html

This would be the import "shazam" algorithm, require("shazam") in node:
http://nodejs.org/api/modules.html#modules_all_together

That would save me a lot of trouble: I'd not have
to implement semver at all in jq and I'd be this close to being done
with modules :)

Great! If you ask me, you could make import work on strings to accommodate relative same/subfolder file imports too (useful to split up logic within a "jq project/package" and to load ./unicode-table.json, at no lookup cost) instead of the :: separated mess. You could also drop module, as that is handled by the package manager and just means redundancy with package metadata in, say, jq.json.

Seriously, why shouldn't we do that? Why wouldn't that be good enough?
The external pkg manager could use whatever semantic versioning it
wants.

Great! Note that the node module lookup algorithm doesn't mention versions - the package manager part does.
http://nodejs.org/api/modules.html#modules_addenda_package_manager_tips

Besides, to me semantic versioning is all about determining
compatibility, otherwise there's little point. Sure, you could use it
only for sorting, but if you only ever do equality matching the order is
not terribly interesting.

Yes, following the semver.org standard is great for determining compatibility. "Only ever" isn't what I said though - but equality is a sane default. (Do I need to remind you that people who release packages might not be sane enough for kosher semver?) Either way, a package manager will take care of the details for you.

I'll need a schema for the metadata JSON text, something like:

I suggest looking at package.json as inspiration for jq.json. Just take it slow and implement only one part at a time. Use strings for package version and dependency selection though.
https://www.npmjs.org/doc/files/package.json.html

jq's import directive would only ever look at the .main string to get a same/subfolder relative path (or fallback to ./main.jq). Again, see the import/require(...) algorithm.

{
  "name": "shazam",
  "version": "1.0.2",
  "main": "./lib/i-call-this-what-i-want.jq",
  "dependencies": {
      "url-utils":  "^1.6.1",
      "advanced-statistics":  "^7.1.0"
    }
}

Other metadata such as author, license, webpage can be added later - and it's of no interest to the jq binary either way.
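Resolving the entry file from such a metadata file could look like the sketch below. The `jq.json` file name and the `./main.jq` fallback come from the suggestion above; neither is an existing jq feature, and `entry_point` is a hypothetical name.

```python
import json
from pathlib import Path


def entry_point(package_dir):
    """Return the package's entry file: the "main" key, else ./main.jq."""
    package_dir = Path(package_dir)
    metadata_file = package_dir / "jq.json"
    main = "./main.jq"  # fallback when there is no metadata or no "main" key
    if metadata_file.is_file():
        metadata = json.loads(metadata_file.read_text(encoding="utf-8"))
        main = metadata.get("main", main)
    return package_dir / main
```

Only the "main" string is consulted here; "version" and "dependencies" are left entirely to the package manager.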

I'm of course referring to the http://semver.org/ standard.

I know, it's "a" standard. I'm asking if that really is "the" standard :)

Yes, I consider it the standard. It has spread to several languages and systems I use, and people understand it well enough for it to work on a larger scale. Tried and true, well defined enough.

@joelpurra
Contributor

The problem I have is that if we're going to use equality by default we
might as well match commit hashes and so on instead of
major.minor.micro-prerelease+build

No, not the same thing. Semver equality as the sane semver default is not the same as hash/string equality. Commit hashes are random, unsortable as strings without (complete) history.

and in general I really want an
option for using commit/file hashes (e.g., so I can make a proper Merkle
hash tree out of a complete installation).

I agree on a package manager utilizing a D/CVS that uses hashes as a backend for loading code/versions from. But it's not an accident that these hash based D/CVS's use tags as a human-parseable named pointer. If you really want to build a Merkle tree (cool!) then you can read the underlying D/CVS hash separately. It doesn't sound like something the jq binary would do - it does sound like something an external verification tool (quite possibly built into a package manager) could do (as it would require git, hg, bzr, possibly svn, cvs etc).

Wildcarding is -I agree- not a terribly good idea for major.minor.micro,
but making major optional is (for the parsed form anyways)

Not semver, no thanks. 1.2.3 (see complete BNF) form required.

(recall that
I think major versioning should really be part of the module/product
name; I don't mind letting others have and use major version numbers
though).

Sure, put it in your package name - but keep it in your semver version string as well. You wouldn't release the package shazam4 with the version 1.0.0, right? It would be 4.0.0. (I still think git would be perfectly able to handle tagged major versions 1.0.0, 2.0.0, 3.0.0 and 4.0.0 of a library in the same repo, but I understand people sometimes are not able to make the distinction.)

So I'm definitely adding something (exact matching of commit IDs and
such). This should be like using said additions as pre-release strings
(which must always match exactly). So perhaps I'm not even adding
anything so far (after removing wildcarding).

I think the package manager should take care of it. If you want commit 7840b33f (no other semver data) of shazam, let the package manager check the right version out and put it in ./.jq/packages/shazam/ for you. jq doesn't have to know. No metadata in code.

Again, see https://www.npmjs.org/doc/files/package.json.html

"repository": {
    "type" : "git",
    "url" : "http://github.com/npm/npm.git"
}

Is there a forum for discussing that standard?

Yes, it's on github.
https://github.com/mojombo/semver/issues

It's traditional, really, not git-specific. People tend to do
pre-releases with version strings like 1.2-alpha1, 1.2-beta1, 1.2-rc1,
... and so on. And anything goes as for backwards compatibility in
pre-releases because, after all, a backwards incompatible change might
have snuck in, and alpha/beta testing is all about finding bugs.

True. Since semver formalizes how prerelease versions etcetera are compared to other semvers, that adds to the appeal.
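The pre-release ordering that semver.org formalizes can be sketched as a sort key. This follows the precedence rules of the Semantic Versioning 2.0.0 text as I read them (numeric identifiers compare as integers and rank below alphanumeric ones, a longer identifier list wins on a tie, and a release outranks its own pre-releases); the function names are hypothetical.

```python
def _identifier_key(identifier):
    # Numeric identifiers sort below alphanumeric ones and compare as integers.
    if identifier.isascii() and identifier.isdigit():
        return (0, int(identifier), "")
    return (1, 0, identifier)


def precedence_key(major, minor, patch, prerelease=""):
    """Return a tuple that sorts versions in semver precedence order."""
    if prerelease:
        ids = tuple(_identifier_key(part) for part in prerelease.split("."))
        return (major, minor, patch, 0, ids)
    # A release outranks any pre-release of the same major.minor.patch.
    return (major, minor, patch, 1, ())
```

Build metadata is deliberately absent from the key, since the specification excludes it from precedence.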


Sorry for making this thread about package managers by the way =)

@nicowilliams
Contributor Author

I'll be pushing the latest module system soon. There will be no versioning, but modules will be searched first locally to the modules importing them, thus allowing an external component to do all versioning.

@nicowilliams
Contributor Author

@joelpurra @pkoppstein @wtlangford @stedolan

I've pushed a revamp of import ... to master.

@wtlangford doesn't like the idea that import ... without an as <name> clause imports the module's symbols into the current namespace, suggesting instead something more like Python's from foo import .... We'd not support a list of symbols to import, at least not at first -- if you don't trust a module, give it a namespace. Thoughts before I start a 1.5 release? Let's give this a couple of weeks, and then let's make a 1.5 release.

@pkoppstein
Contributor

something more like Python's from foo import ...

So is the proposal to support:

from MODULE import *;

?

Would "import MODULE as null" not be simpler? It wouldn't preclude subsequent developments along the Python path.

@nicowilliams
Contributor Author

@pkoppstein I have no specific proposals. I'm open to them. I could live with from MODULE import *;, and I can live with import MODULE as null.

I can also live with import MODULE; # into this namespace (what I pushed) because either way the programmer gets the control they want, and I'd rather not add more ceremony. But I'm open-minded.

@pkoppstein
Contributor

@nicowilliams -- In the previous posting in this thread, you confirmed what you had said elsewhere, e.g. on Aug 12, 2014 you wrote:

Alright, here's the plan for importing into the importer's namespace:

  • the NAME in the module directive will be optional, if given that
    will be the name prefixed to the module's symbols when import
    directives don't specify an "as" clause

At present, however, it does not seem to be possible to import into the importer's namespace.
In particular, although 'import "library" as null;' does not in itself fail, it does
not seem to have the desired effect, e.g.

jq -n 'import "library" as L; 3 | L::factorial'
6

but

jq -n 'import "library" as null; 3 | factorial'
jq: error: factorial/0 is not defined at <top-level>, line 1:

At present, therefore, there does not seem to be a way to "include" multiple libraries. This had been supported before, and I believe that as things stand, ensuring that the "import" command can be used to do so would be an appropriate path forward.

Thanks!

@nicowilliams nicowilliams modified the milestones: 1.6 release, 1.7 release Nov 29, 2017
@nichtich

I asked on Stack Overflow how to manage module dependencies in jq, and user peak came up with a useful workaround. It can be extended to full semantic versioning (the official standard) and recursive dependencies, which would give us a solution that works with jq >= 1.5:

module {
  name: "shazam",
  version: "1.0.2",
  dependencies: {
      "url-utils":  "^1.6.1",
      "advanced-statistics":  "^7.1.0"
    }
}
jq 'include "shazam"; include "dependencies"; "shazam"|dependencies,...' ...

Sure, built-in support in jq would be better, since it would simplify usage to

jq 'include "shazam"; ,...' ...
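As a rough illustration of what matching a range such as "^1.6.1" from the metadata above could involve, here is a minimal sketch. It is written in Python purely for illustration (the workaround itself lives in jq), and it assumes the caret convention used by npm-style tooling, where the upper bound bumps the left-most non-zero component. The comment above does not say which range syntax is intended, so treat that as an assumption. The names `parse_version` and `satisfies_caret` are hypothetical, and pre-release and build tags are not handled.

```python
def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three ints.

    Pre-release and build metadata are deliberately not handled in this sketch.
    """
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def satisfies_caret(version, spec):
    """Return True if `version` falls inside the caret range `spec`, e.g. '^1.6.1'."""
    if not spec.startswith("^"):
        raise ValueError("only caret ranges are handled in this sketch")
    low = parse_version(spec[1:])
    have = parse_version(version)
    # Assumed npm-style rule: the exclusive upper bound bumps the left-most
    # non-zero component of the lower bound and zeroes everything after it.
    if low[0] > 0:
        high = (low[0] + 1, 0, 0)
    elif low[1] > 0:
        high = (0, low[1] + 1, 0)
    else:
        high = (0, 0, low[2] + 1)
    # Tuples compare component by component, which matches version ordering here.
    return low <= have < high
```

Under these assumptions, `satisfies_caret("1.7.0", "^1.6.1")` is true, while both `"1.6.0"` and `"2.0.0"` fall outside the range.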

@itchyny itchyny removed this from the 1.7 release milestone Jun 25, 2023
@JaneX8

JaneX8 commented Dec 3, 2024

I was searching for a BNF specification of the JQ language as well. For the same reasons as mentioned here: https://softwarerecs.stackexchange.com/questions/91500/does-any-jmespath-jq-query-linter-validator-tool-exist-at-all.

A clear syntax specification, in whatever format, would allow JQ query syntax validation and linting/feedback in IDEs such as VSCode.

JMESPath, for example, wrote their grammar using ABNF (as described in RFC4234); it can be found here: https://jmespath.org/specification.html.

An alternative to BNF or ABNF, to my understanding, is Parsing Expression Grammars (PEG), which could be tested with, for example, https://peggyjs.org/online.html.

@pkoppstein
Contributor

There's a TextMate grammar at https://github.com/wooorm/starry-night/blob/main/lang/source.jq.js

The paper at https://arxiv.org/pdf/2403.20132 has a formal grammar for the syntax of a large subset of jq.

@wader
Member

wader commented Dec 3, 2024

There is also https://github.com/wader/vscode-jq/tree/master/syntaxes which seems to be what the above TextMate grammar is based on, which itself was partially based on https://github.com/fadado/JBOL/blob/master/doc/JQ-language-grammar.md

There are also at least two tree-sitter grammars for jq, https://github.com/flurie/tree-sitter-jq and https://github.com/nverno/tree-sitter-jq, but I'm not sure how complete they are.

There is also https://github.com/wader/jq-lsp, which is my LSP for jq; it uses a modified version of gojq's parser, which is yacc-based.

@wader
Member

wader commented Dec 4, 2024

@JaneX8 please let us know how it goes! It could maybe end up being something to add to https://github.com/fiatjaf/awesome-jq

@JaneX8

JaneX8 commented Dec 7, 2024

I will once I get to this but right now I'm breaking my head over it. Here are some thoughts, any suggestions are welcome.

I have a very custom situation: I dynamically read a bunch of files, to which I gave the jq extension, that contain valid JQ, which is basically a semi-finished (and so invalid) JSON structure. It only becomes fully valid JSON after JQ has run it. Afterwards I use JSONSchema on the output, to validate that it is actually valid JSON and in the JSON format I expected.

I basically enclosed the jq queries inside a static JSON structure that is passed to and successfully used by jq to transform one JSON format into another JSON format, while also filtering and querying with the power of jq. It absolutely works and performs great for my use case. But my problem is that I really want to validate one step earlier: the 'semi-finished JSON' (read: JQ query), even before it is passed to JQ. Preferably even while I type in VSC, by means of syntax highlighting, autocomplete, and linter warnings about the syntax, in a static code analysis way.

I considered writing at least some specific DevSkim (JSON) rules for this, and I still might for my use case, but those regular expressions are going to be insanely complicated, and I can't use JSON path inside those DevSkim rules either because, well, it isn't valid JSON yet. So I'm not looking forward to that without some kind of language implementation first.

I'm not even sure the JQ language definition describes this behaviour, i.e. having JQ enclosed in a JSON structure: valid JSON == valid JQ, but valid JQ != valid JSON. While I use JSONSchema on the input and the output, I have nothing to validate the JQ step in between.

I thought of using comments as placeholders to make it slightly easier, but I still need preprocessing then. An example:

{
  "name": "Example",
  "data": [
    # __JQ_START__
    .input | map({ key: .key, value: .value })
    # __JQ_END__
  ]
}
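The preprocessing step for the marker idea above could look roughly like the following sketch. It is in Python only for illustration and assumes each marker sits on its own line exactly as in the example; `extract_jq_fragments` is a hypothetical helper name, not part of jq or any existing tool. Note that `#` starts a comment in jq, so the marker lines themselves do not change how jq reads the template.

```python
START = "# __JQ_START__"  # marker text taken from the example above
END = "# __JQ_END__"


def extract_jq_fragments(template):
    """Return the jq fragments found between marker lines, in document order.

    Assumes each marker occupies a line of its own and that marked
    regions are not nested.
    """
    fragments = []
    current = None  # None means we are outside a marked region
    for line in template.splitlines():
        stripped = line.strip()
        if stripped == START:
            if current is not None:
                raise ValueError("start marker inside an open region")
            current = []
        elif stripped == END:
            if current is None:
                raise ValueError("end marker without a start marker")
            fragments.append("\n".join(current))
            current = None
        elif current is not None:
            current.append(stripped)
    if current is not None:
        raise ValueError("start marker without an end marker")
    return fragments
```

Each returned fragment could then be handed to whatever jq syntax checker is available, while the surrounding static structure is checked separately.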

If this is getting slightly off-topic, I'm fine with opening a new issue. Any (alternative) thoughts, anyone? I'm kind of stuck here now.

Moved into a discussion here: #3218

@wader
Member

wader commented Dec 7, 2024

OK, I see. The closest thing I can think of that can do that is https://github.com/wader/vscode-jq (or some other LSP-capable editor) combined with https://github.com/wader/jq-lsp. With that you can have jq expressions inside JSON, and it will do syntax checking, scope checking, highlighting and completion to some extent.

With "validate one step earlier", do you mean using some CLI tool to validate/lint? I've thought about making jq-lsp capable of this.

Yep, maybe a bit off-topic for this issue :) Maybe continue in discussions?

@wader
Member

wader commented Dec 7, 2024

👍 I think anyone should be able to create a new discussion at https://github.com/jqlang/jq/discussions

Yep, give them a try; I'm happy to help or improve them.


7 participants