-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split output array length inconsistency #552
Comments
@joelpurra - Hi, Joel. You'll be glad to know, I hope, that the current (regex-based) version of
Another interesting case is: '"x" | "split("x")` -- should that produce an array of length 0 (as does ruby), 1 (jq 1.4) or 2 (regex-based jq)? The approach currently taken in jq is not to make a "special case" out of any of these edge cases -- i.e. to let the chips fall where they may. By the way, I hope you will reconsider using a recent version of jq 1.4+. There are so many really useful enhancements, and there's no downside that I know of except having to build it. At worst, you might be tempted to "git pull" at some point :-) Note to @nicowilliams: We ought to document the changed behavior of split/1 if we retain it as is. |
@pkoppstein: good to know about this change, thanks! I'd run more recent versions if I wasn't using at least one server system that's out of my control. Given time constraints on finishing my degree, I don't think I'll get to it just yet. Let's hope I'll find a job where I can work with more json data processing, and hopefully help out with some jq development =) And yes, updated documentation, please. Perhaps running a test suite from 1.4 on the 1.5 code base and vice versa, to spot differences? |
I need to revisit the split thing, yes. Also, I do think I screwed up by If we're going to match modules to jq version numbers (and if we add a (I'm likely to introduce an |
@nicowilliams - Did you know that somehow some parts of v1.4+ have escaped into the wild -- http://stedolan.github.io/jq/manual/ includes sections on match, test, sub, gsub ... On the one hand, this is a bit scary. On the other, it would be great if jq 1.5 could be released sooner rather than later, even if it did not contain everything we'd want. As for streams, arrays, and backwards compatibility - personally, I'm comfortable with having split and splits, explode and explodes, etc, indefinitely, but having these pairs also provides a path forward that is consistent with stedolan's recommendations regarding the introduction of major changes. |
I'll revisit this, as I just now experienced problems with
|
It seems @pkoppstein's confirmed fix doesn't work - but guess it's because of #576. |
Thanks for the bug report! keep banging on it. I'll fix this sometime |
Released a bugfix package, which will alleviate problems for now. import "joelpurra/jq-bugfix-jq-552" as BugfixJq552;
# These tests fail for vanilla split/join, but they are detected and shows debug output.
#
# BugfixJq552::warningSplit: Single string to two empty parts. SHOWS DEBUG OUTPUT.
"a" | BugfixJq552::warningSplit("a") # Expected [ "", "" ], Actually [ "" ]
# BugfixJq552::warningSplit: Single string to two parts z. SHOWS DEBUG OUTPUT.
"xyz" | BugfixJq552::warningSplit("z") # [ Expected "xy", "" ], Actually [ "xy" ]
# BugfixJq552::warningJoin: Single string to two empty parts. SHOWS DEBUG OUTPUT.
[ "", "" ] | BugfixJq552::warningJoin("a") # Expected "a", Actually ""
# BugfixJq552::warningJoin: Two parts x. SHOWS DEBUG OUTPUT.
[ "", "yz" ] | BugfixJq552::warningJoin("x") # Expected "xyz", Actually "yz"
# Instead use the functions which attempt to fix these inconsistencies.
#
# BugfixJq552::attemptFixSplit: Single string to two empty parts.
"a" | BugfixJq552::attemptFixSplit("a") # [ "", "" ]
# BugfixJq552::attemptFixSplit: Single string to two parts z.
"xyz" | BugfixJq552::attemptFixSplit("z") # [ "xy", "" ]
# BugfixJq552::attemptFixJoin: Single string to two empty parts.
[ "", "" ] | BugfixJq552::attemptFixJoin("a") # "a"
# BugfixJq552::attemptFixJoin: Two parts x.
[ "", "yz" ] | BugfixJq552::attemptFixJoin("x") # "xyz" |
@nicowilliams: this fix broke another case, as detected by joelpurra/jq-bugfix-jq-552. Updated the package with additional tests (including plain These # split: Single string to two empty parts.
jq -n '"a" | split("a")' # Expected [ "", "" ]
# split: Single string to two parts z.
jq -n '"xyz" | split("z")' # [ Expected "xy", "" ] This # split: Single string split with empty string.
jq -n '"a" | split("")' # Expected [ "a" ], Actually [ "a", "" ] These # join: Single string to two empty parts.
jq -n '[ "", "" ] | join("a")' # Expected "a", Actually ""
# join: Two parts x.
jq -n '[ "", "yz" ] | join("x")' # Expected "xyz", Actually "yz" |
On Tue, Jan 13, 2015 at 06:56:58AM -0800, Joel Purra wrote:
Thanks. Shouldn't splitting on the empty string yield an array of single-character (well, single-codepoint; jq doesn't know about characters) strings, with an empty string in the front and an empty one at the end? "Obviously" in this case we can just drop the leading and trailing empty strings. But it's not so obvious to me.
Why shouldn't we expect
I didn't touch |
@nicowilliams: |
@nicowilliams wrote:
Yes - forgot this was the standard thing for split on an empty string, and didn't get to writing multi-character test cases. |
Splitting a string where the string to split on matches the last character in the input string returns an array that's shorter than other results. Compare with a split on the first character.
Shouldn't it return an empty string instead of omitting the last result?
The text was updated successfully, but these errors were encountered: