User Guide: jtc. Examples and Use-cases
- Displaying JSON
- Walking JSON (-w)
  - Walking with subscripts ([..])
  - Searching JSON (<..>, >..<)
    - Searching JSON with RE (<..>R, <..>L, <..>D)
    - Search suffixes (rRPdDNbnlLaoicewjstqQgG)
    - Directives (vkzfFuIZWS)
    - Setting a custom JSON value into a namespace
    - Fail-safe and Forward-Stop directives (<..>f, <..>F)
    - RE generated namespaces ($0, $1, etc)
    - Search quantifiers (n, +n, n:m:s)
    - Scoped search [..]:<..>
    - Non-recursive search (>..<)
  - Addressing parents ([-n])
  - Walking multiple walk-paths
    - Sequential walk processing (-n)
    - Displaying walks with labels (-l)
    - Wrapping resulted walks to JSON array (-j)
    - Interleaved walk processing
    - Aggregating walks (-nn)
    - Wrapping walked entries into a JSON object (-jj)
    - Extracting labeled values (-ll)
    - Succinct walk-path syntax (-x, -y)
    - Controlling displayed walks (-xn/N)
  - Summary table of search lexemes
  - Summary table of directives
  - Summary of walk lexemes
  - Summary of walk options
- Interpolation
- Modifying JSON
  - In-place JSON modification (-f)
  - Purging JSON (-p, -pp)
  - Swapping JSON elements (-s)
  - Insert operations (-i)
  - Update operations (-u)
  - Insert, Update with move semantic (-i/-u, -p)
  - Insert, Update: argument shell evaluation (-e, -i/-u)
  - Mixing argument types for -i, -u, -c (e.g.: jtc -u<JSON> -u<walk-path>)
  - Mixing argument types with -e
  - Cross-referenced insert, update
  - Summary of modification options
- Comparing JSONs (-c)
- Processing input JSONs
- Some Examples
If no argument is given, jtc will expect an input JSON from the <stdin>, otherwise JSON is read from the file(s) pointed by the argument(s). jtc will parse and validate the input JSON and, upon a successful validation, will output it:
bash $ <ab.json jtc
{
   "Directory": [
      {
         "address": {
            "city": "New York",
            "postal code": 10012,
            "state": "NY",
            "street address": "599 Lafayette St"
         },
         "age": 25,
         "children": [
            "Olivia"
         ],
         "name": "John",
         "phone": [
            {
               "number": "112-555-1234",
               "type": "mobile"
            },
            {
               "number": "113-123-2368",
               "type": "mobile"
            }
         ],
         "spouse": "Martha"
      },
      {
         "address": {
            "city": "Seattle",
            "postal code": 98104,
            "state": "WA",
            "street address": "5423 Madison St"
         },
         "age": 31,
         "children": [],
         "name": "Ivan",
         "phone": [
            {
               "number": "273-923-6483",
               "type": "home"
            },
            {
               "number": "223-283-0372",
               "type": "mobile"
            }
         ],
         "spouse": null
      },
      {
         "address": {
            "city": "Denver",
            "postal code": 80206,
            "state": "CO",
            "street address": "6213 E Colfax Ave"
         },
         "age": 25,
         "children": [
            "Robert",
            "Lila"
         ],
         "name": "Jane",
         "phone": [
            {
               "number": "358-303-0373",
               "type": "office"
            },
            {
               "number": "333-638-0238",
               "type": "home"
            }
         ],
         "spouse": "Chuck"
      }
   ]
}
bash $
Option -t controls the indentation of the pretty-printing format (default is 3 white spaces):
bash $ <ab.json jtc -t10
{
          "Directory": [
                    {
                              "address": {
                                        "city": "New York",
                                        "postal code": 10012,
                                        "state": "NY",
                                        "street address": "599 Lafayette St"
                              },
                              "age": 25,
                              "children": [
                                        "Olivia"
                              ],
...
The majority of the examples and explanations in this document are based on the above simplified address book JSON model.
Option -r will instruct to display JSON in a compact (single row) format:
bash $ <ab.json jtc -r
{ "Directory": [ { "address": { "city": "New York", "postal code": 10012, "state": "NY", "street address": "599 Lafayette St" }, "age": 25, "children": [ "Olivia" ], "name": "John", "phone": [ { "number": "112-555-1234", "type": "mobile" }, { "number": "113-123-2368", "type": "mobile" } ], "spouse": "Martha" }, { "address": { "city": "Seattle", "postal code": 98104, "state": "WA", "street address": "5423 Madison St" }, "age": 31, "children": [], "name": "Ivan", "phone": [ { "number": "273-923-6483", "type": "home" }, { "number": "223-283-0372", "type": "mobile" } ], "spouse": null }, { "address": { "city": "Denver", "postal code": 80206, "state": "CO", "street address": "6213 E Colfax Ave" }, "age": 25, "children": [ "Robert", "Lila" ], "name": "Jane", "phone": [ { "number": "358-303-0373", "type": "office" }, { "number": "333-638-0238", "type": "home" } ], "spouse": "Chuck" } ] }
bash $
By default, the compact view uses a single space as a spacer between all tokens; the spacer can also be controlled when -r and -t are used together, e.g., to print the above JSON without a spacer:
bash $ <ab.json jtc -rt0
{"Directory":[{"address":{"city":"New York","postal code":10012,"state":"NY","street address":"599 Lafayette St"},"age":25,"children":["Olivia"],"name":"John","phone":[{"number":"112-555-1234","type":"mobile"},{"number":"113-123-2368","type":"mobile"}],"spouse":"Martha"},{"address":{"city":"Seattle","postal code":98104,"state":"WA","street address":"5423 Madison St"},"age":31,"children":[],"name":"Ivan","phone":[{"number":"273-923-6483","type":"home"},{"number":"223-283-0372","type":"mobile"}],"spouse":null},{"address":{"city":"Denver","postal code":80206,"state":"CO","street address":"6213 E Colfax Ave"},"age":25,"children":["Robert","Lila"],"name":"Jane","phone":[{"number":"358-303-0373","type":"office"},{"number":"333-638-0238","type":"home"}],"spouse":"Chuck"}]}
bash $
A semi-compact view is a middle ground between the pretty and compact views. The semi-compact view is engaged with the suffix c appended to the indent value in the -t option (e.g.: -t5c). In the semi-compact view all JSON iterables made of only atomic values and/or empty iterables ([], {}) will be printed in a single line, the rest is pretty-printed, compare:
bash $ <ab.json jtc -tc
{
   "Directory": [
      {
         "address": { "city": "New York", "postal code": 10012, "state": "NY", "street address": "599 Lafayette St" },
         "age": 25,
         "children": [ "Olivia" ],
         "name": "John",
         "phone": [
            { "number": "112-555-1234", "type": "mobile" },
            { "number": "113-123-2368", "type": "mobile" }
         ],
         "spouse": "Martha"
      },
      {
         "address": { "city": "Seattle", "postal code": 98104, "state": "WA", "street address": "5423 Madison St" },
         "age": 31,
         "children": [],
         "name": "Ivan",
         "phone": [
            { "number": "273-923-6483", "type": "home" },
            { "number": "223-283-0372", "type": "mobile" }
         ],
         "spouse": null
      },
      {
         "address": { "city": "Denver", "postal code": 80206, "state": "CO", "street address": "6213 E Colfax Ave" },
         "age": 25,
         "children": [ "Robert", "Lila" ],
         "name": "Jane",
         "phone": [
            { "number": "358-303-0373", "type": "office" },
            { "number": "333-638-0238", "type": "home" }
         ],
         "spouse": "Chuck"
      }
   ]
}
bash $
JSON size is the total number of JSON elements found within a JSON; it can be printed using -z. The size appears after the input JSON is printed (starting from version 1.75b the size is printed in a JSON format):
bash $ <ab.json jtc -rz
{ "Directory": [ { "address": { "city": "New York", "postal code": 10012, "state": "NY", "street address": "599 Lafayette St" }, "age": 25, "children": [ "Olivia" ], "name": "John", "phone": [ { "number": "112-555-1234", "type": "mobile" }, { "number": "113-123-2368", "type": "mobile" } ], "spouse": "Martha" }, { "address": { "city": "Seattle", "postal code": 98104, "state": "WA", "street address": "5423 Madison St" }, "age": 31, "children": [], "name": "Ivan", "phone": [ { "number": "273-923-6483", "type": "home" }, { "number": "223-283-0372", "type": "mobile" } ], "spouse": null }, { "address": { "city": "Denver", "postal code": 80206, "state": "CO", "street address": "6213 E Colfax Ave" }, "age": 25, "children": [ "Robert", "Lila" ], "name": "Jane", "phone": [ { "number": "358-303-0373", "type": "office" }, { "number": "333-638-0238", "type": "home" } ], "spouse": "Chuck" } ] }
{ "size": 56 }
bash $
If only the size is required (i.e., without printing the input JSON), then use the -zz option:
bash $ <ab.json jtc -zz
56
bash $
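The reported size is consistent with counting every JSON element, containers and atomic values alike. A minimal Python sketch of that counting rule follows; json_size is a hypothetical helper written for illustration (it is not part of jtc), and the sample is a trimmed stand-in for one address book record:

```python
import json

def json_size(node):
    """Count the node itself plus every element nested inside it."""
    if isinstance(node, dict):
        return 1 + sum(json_size(child) for child in node.values())
    if isinstance(node, list):
        return 1 + sum(json_size(child) for child in node)
    return 1  # atomic value: string, number, boolean or null

# Trimmed, illustrative record (not the full address book)
sample = json.loads('{"children": ["Robert", "Lila"], "name": "Jane", "spouse": "Chuck"}')
print(json_size(sample))  # 6: the object, the array and 4 atomic values
```

Counting the full address book by hand with the same rule gives 56, which agrees with the -z output above.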
When JSON is read (from a file, or from stdin), it gets parsed and validated. If an invalid JSON is detected, a short exception message will be displayed, e.g.:
bash $ <ab.json jtc
jtc json parsing exception (<stdin>:1214): unexpected_end_of_line
bash $
and though the message lets us know that there's a problem with the input JSON, it's not very informative with regard to the whereabouts of the problem. To visualize the spot where the problem is, as well as its locus, pass a single debug option (-d):
bash $ <ab.json jtc -d
.display_opts(), option set[0]: -d (internally imposed: )
.init_inputs(), reading json from <stdin>
.exception_locus_(), ...e": 80206,| "state": "CO,| "street address": "6213...
.exception_spot_(), --------------------------------------->| (offset: 1214)
jtc json parsing exception (<stdin>:1214): unexpected_end_of_line
bash $
The vertical pipe symbol | in the debug output showing the JSON locus replaces new lines, thus it becomes easy to spot the problem. The offset (1214 in the example) is given in Unicode UTF-8 characters from the beginning of the input/file/stream. In that particular failure instance, jtc found the end of a line while the JSON string "CO, was still open (JSON standard does not permit multi-line strings). To fix that, the missing quotation mark must be added.
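The same idea (report an offset, then show the surrounding text with new lines replaced by |) can be sketched with Python's standard json module. This only mimics the debug output above: show_locus and the width parameter are hypothetical names, and the offset Python reports need not coincide with the one jtc prints.

```python
import json

def show_locus(text, width=20):
    """Parse text; on failure return (offset, locus around the failing spot)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - width, 0)
        locus = text[start:err.pos + width].replace("\n", "|")
        return err.pos, locus
    return None  # the text is valid JSON

# The string "CO, is never closed before the end of the line
broken = '{ "postal code": 80206,\n   "state": "CO,\n   "street address": "6213 E Colfax Ave" }'
result = show_locus(broken)
if result is not None:
    offset, locus = result
    print(f"offset {offset}: ...{locus}...")
```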
JSON specification allows escaping solidus (/) optionally. By default, jtc is relaxed w.r.t. parsing solidus notation: it admits both unescaped and escaped appearances:
bash $ <<<'{ "escaped": "\/", "unescaped": "/" }' jtc
{
"escaped": "/",
"unescaped": "/"
}
bash $
If there's a need for strict solidus parsing, option -q facilitates it: jtc will then throw an exception upon facing a non-escaped notation:
bash $ <<<'{ "escaped": "\/", "unescaped": "/" }' jtc -q -d
.display_opts(), option set[0]: -q -d (internally imposed: )
.init_inputs(), reading json from <stdin>
.exception_locus_(), { "escaped": "\/", "unescaped": "/" }
.exception_spot_(), --------------------------------->| (offset: 33)
jtc json parsing exception (<stdin>:33): unquoted_character
bash $
If a JSON itself (or a result from walking JSON) is a single JSON string, then sometimes there's a need to unquote it (this comes in especially handy if the string itself is an embedded JSON). -qq allows unquoting it, here are a few examples:
bash $ jsn='"{ \"JSON\": \"example of an embedded JSON\" }"'
bash $ <<<$jsn jtc
"{ \"JSON\": \"example of an embedded JSON\" }"
bash $
# unquote (jsonize) embedded json:
bash $ <<<$jsn jtc -qq
{ "JSON": "example of an embedded JSON" }
bash $
# unquote (jsonize) embedded json and re-parse it:
bash $ <<<$jsn jtc -qq | jtc
{
"JSON": "example of an embedded JSON"
}
bash $
When unquoting empty JSON strings ("") the resulting blank lines are not even printed:
bash $ <<<'[null, "", true]' jtc -w[:] -qq
null
true
bash $
If the source string contains Unicode code points, those will be correctly translated into respective UTF-8 characters:
bash $ <<<'"Unicode char: \u1234"' jtc -qq
Unicode char: ሴ
bash $
bash $ <<<'"Surrogate pair: \uD834\uDD1E"' jtc -qq
Surrogate pair: 𝄞
bash $
bash $ <<<'"Invalid surrogate: \uDD1E"' jtc -qq
jtc json exception: invalid_surrogate_code_pair
bash $
NOTE: the option notation -qq does not imply -q; if both behaviors are required then both variants have to be spelled (e.g. jtc -q -qq, or jtc -qqq).
Also, -qq is incompatible with the -j, -J options, because of a risk of forming an ill-formed JSON.
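For readers more familiar with a programming language than with the CLI, unquoting is roughly what happens when a JSON string is decoded once. A small Python sketch of the examples above, using only the standard json module (this illustrates the concept and is not a description of how jtc implements -qq):

```python
import json

quoted = r'"{ \"JSON\": \"example of an embedded JSON\" }"'

inner_text = json.loads(quoted)       # drop outer quotes, unescape the content
print(inner_text)                     # { "JSON": "example of an embedded JSON" }

reparsed = json.loads(inner_text)     # the unquoted text is itself valid JSON
print(json.dumps(reparsed, indent=3))

clef = json.loads(r'"\uD834\uDD1E"')  # a surrogate pair decodes to one character
print(hex(ord(clef)))                 # 0x1d11e
```

Unlike the jtc example above, Python's decoder does not reject a lone low surrogate, so the invalid_surrogate_code_pair case has no direct counterpart in this sketch.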
An opposite request is to string-quote a JSON itself (e.g. if you'd like to embed JSON as a string into another JSON). This is achieved with the option notation -rr:
bash $ jsn='[ "JSON", "example" ]'
bash $ <<<$jsn jtc
[
"JSON",
"example"
]
bash $
bash $ <<<$jsn jtc -rr
"[ \"JSON\", \"example\" ]"
bash $
The spacer for stringification is also controlled via -t, e.g., to stringify the above JSON without a spacer use:
bash $ <<<$jsn jtc -rr -t0
"[\"JSON\",\"example\"]"
bash $
B.t.w., both string unquoting and JSON stringification also could be achieved via template operations.
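Stringification is the reverse step: serialize the JSON to text, then encode that text as a JSON string. A minimal Python sketch of the spacer-less case shown above, assuming the standard json module as a stand-in for -rr -t0:

```python
import json

value = ["JSON", "example"]

compact = json.dumps(value, separators=(",", ":"))  # serialize without spacers
embedded = json.dumps(compact)                      # quote the text as a JSON string
print(embedded)                                     # "[\"JSON\",\"example\"]"

# Round trip: unquote, then parse the embedded JSON again
restored = json.loads(json.loads(embedded))
```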
- -tN: pretty printing using indentation with N spaces (by default N is 3)
- -r: compact (single row) printing with a default spacer (1 white space)
- -rtN: compact printing using N white spaces as a spacer
- -tNc: semi-compact printing with indentation of N white spaces (default is 3)
- -z: additionally print a size of an entire JSON (or of each walked element) at the bottom of a printed JSON (each walked element)
- -zz: print a size instead of a JSON (walked elements)
- -q: force parsing of a solidus / as strictly quoted (i.e., \/)
- -qq: unquote printed (walked) JSON string (drop the outer quotation marks and unquote all embedded quoted characters)
- -rr: inquote (embed) JSON (walked elements) into JSON string values
- -rrtN: inquote (embed) JSON (walked elements) into JSON string values with a spacer N
Whenever there's a requirement to select (extract) only one or multiple JSON elements, a walk-path (-w) tells how to do it.
A walk-path (an argument of the -w option) can be made of an arbitrary number of lexemes, though there are only 2 kinds of lexemes:
- offset lexeme ([..])
- search lexeme (<..>, >..<)
Offsets are always enclosed into square brackets [..]. Selecting JSON elements always begins from a JSON root.
Both arrays and objects can be subscripted using numerical offsets, though it's best to utilize literal offsets to subscript objects.
E.g., let's select the address of the 2nd record (all the indices in the walk-path are zero-based) in the above JSON:
bash $ <ab.json jtc -w'[Directory][1][address]'
{
"city": "Seattle",
"postal code": 98104,
"state": "WA",
"street address": "5423 Madison St"
}
bash $
or equally, it could be done like in the example below, but the former syntax is preferable (for your own good - when giving indices you'd need to guess the index of a labeled entry, which might be prone to mistakes):
bash $ <ab.json jtc -w'[0][1][0]'
{
"city": "Seattle",
"postal code": 98104,
"state": "WA",
"street address": "5423 Madison St"
}
bash $
If a numerical subscript index is prepended with +, then all the subsequent subscripted elements will be selected as well (it makes the lexeme iterable), e.g., the following example prints all names out of the address book, starting from the 2nd record:
bash $ <ab.json jtc -w'[Directory][+1][name]'
"Ivan"
"Jane"
bash $
Any JSON iterable could be subscripted that way, and any number of such lexemes could appear in the walk-path, e.g.:
bash $ <ab.json jtc -w'[Directory][+0][phone][+1][number]'
"113-123-2368"
"223-283-0372"
"333-638-0238"
bash $
There, all records ([+0]) from the Directory have been selected and then in every record all phone sub-records are selected starting from the 2nd entry ([+1]).
Object elements could be subscripted the same way; here's an example where all address entries starting from the 2nd record are printed, each one starting from the 3rd entry:
bash $ <ab.json jtc -w'[Directory][+1][address][+2]'
"WA"
"5423 Madison St"
"CO"
"6213 E Colfax Ave"
bash $
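The behavior of literal, numerical and iterable (+N) offsets can be modeled in a few lines of Python. This is only a sketch of the semantics described above: walk and children are hypothetical helpers, lexemes are passed as a pre-split list rather than parsed from a walk-path string, and objects are assumed to be indexed in the order their labels are printed.

```python
def children(node):
    """Child values of a container, in stored order; atomic values have none."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []

def walk(node, lexemes):
    """Yield every value selected by a sequence of offset lexemes."""
    if not lexemes:
        yield node
        return
    head, rest = lexemes[0], lexemes[1:]
    if head.startswith("+"):                       # iterable offset, like [+N]
        for child in children(node)[int(head):]:
            yield from walk(child, rest)
    elif head.isdigit():                           # numerical offset, like [N]
        kids = children(node)
        if int(head) < len(kids):
            yield from walk(kids[int(head)], rest)
    elif isinstance(node, dict) and head in node:  # literal offset, like [label]
        yield from walk(node[head], rest)

# Trimmed copy of the address book (addresses and names only)
book = {"Directory": [
    {"address": {"city": "New York", "postal code": 10012, "state": "NY",
                 "street address": "599 Lafayette St"}, "name": "John"},
    {"address": {"city": "Seattle", "postal code": 98104, "state": "WA",
                 "street address": "5423 Madison St"}, "name": "Ivan"},
    {"address": {"city": "Denver", "postal code": 80206, "state": "CO",
                 "street address": "6213 E Colfax Ave"}, "name": "Jane"},
]}

print(list(walk(book, ["Directory", "+1", "name"])))           # ['Ivan', 'Jane']
print(list(walk(book, ["Directory", "+1", "address", "+2"])))
```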
Another way to select multiple JSON elements is to use a slice notation [N:M:S]. In that notation N and M could be either positive or negative, while S must be strictly positive. Any of the positions (as well as the position separator :) may be omitted.
The first position (N) designates the beginning of the slice selection, the second position (M) designates the end of the slice exclusively (i.e., not including the indexed element M itself), the last position (S) designates a step.
- positive N (and M) refers to the Nth element offset from the beginning of a collection (an array or an object)
- negative N (and M) refers to the Nth element offset from the end of the collection
- empty (missed N and M) index tells to address either from the beginning of the collection (in the first position), or from the end (last position)
- S position indicates a step value when iterating over the selected slice; the default value is 1
Thus, multiple notations with the same semantics are possible, e.g.:
- [:], [0:], [0::] will select all the elements in the collection and are equivalent to the [+0] notation
- [0:1], [:1] will select only the first element and are the same as [0]
- [:-1], [:-1:] will select all the elements except the last one
- [-2:], [-2::1] will select at most 2 last elements in the collection
- [::2] will select every other element in the collection
E.g., let's print all phone numbers for the last 2 records in the Directory:
bash $ <ab.json jtc -w'[Directory][-2:][phone][:][number]' -l
"number": "273-923-6483"
"number": "223-283-0372"
"number": "358-303-0373"
"number": "333-638-0238"
bash $
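The slice rules above closely resemble Python's own slicing with a positive step, so a short Python sketch can serve as a mental model. The select helper is hypothetical, and the equivalence with jtc is an assumption based on the rules listed above rather than a statement about its implementation.

```python
def select(items, n=None, m=None, s=1):
    """Model of [N:M:S]: start N, exclusive end M, strictly positive step S."""
    if s < 1:
        raise ValueError("step must be strictly positive")
    return items[slice(n, m, s)]

records = ["John", "Ivan", "Jane"]

print(select(records))           # [:]    all elements
print(select(records, 0, 1))     # [0:1]  only the first element
print(select(records, m=-1))     # [:-1]  all but the last element
print(select(records, n=-2))     # [-2:]  at most the 2 last elements
print(select(records, s=2))      # [::2]  every other element
```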
Walk-path lexemes enclosed into <..> braces instruct to perform a recursive search off the value under the currently selected JSON node. I.e., if a search lexeme appears as the first one in the walk-path, then the search will be performed off the root, otherwise off the node in JSON where the prior lexeme has stopped.
By default (if no suffix is given), a search lexeme will perform a search among JSON string values only (i.e., it won't match JSON numerical, JSON boolean or JSON null values). E.g., the following search produces a match:
bash $ <ab.json jtc -w'<New York>'
"New York"
bash $
while this one doesn't produce a match (the string value "New York" is found only in the first Directory record):
bash $ <ab.json jtc -w'[Directory][1:]<New York>'
bash $
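A recursive search over string values can be pictured as a depth-first traversal that ignores labels and non-string values. The Python sketch below models the two queries above; search_strings is a hypothetical helper and says nothing about how jtc actually traverses or caches.

```python
def search_strings(node, wanted):
    """Depth-first search yielding every JSON string value equal to wanted."""
    if isinstance(node, str):
        if node == wanted:
            yield node
    elif isinstance(node, dict):
        for child in node.values():      # labels themselves are not searched
            yield from search_strings(child, wanted)
    elif isinstance(node, list):
        for child in node:
            yield from search_strings(child, wanted)

# Trimmed copy of the address book (cities and names only)
book = {"Directory": [
    {"address": {"city": "New York", "postal code": 10012}, "name": "John"},
    {"address": {"city": "Seattle", "postal code": 98104}, "name": "Ivan"},
    {"address": {"city": "Denver", "postal code": 80206}, "name": "Jane"},
]}

# Search off the root: a match is found
print(next(search_strings(book, "New York"), None))
# Search off the records starting from the 2nd one: no match
print(next(search_strings(book["Directory"][1:], "New York"), None))
```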
Optionally, search lexemes may accept one-letter suffixes: a single character following the lexeme's closing bracket. The suffixes define the lexeme's search behavior.
For example, these 3 suffixes facilitate REGEX types of search:
- R: performs a REGEX search only among JSON string values
- D: performs a REGEX search only among JSON numerical values
- L: performs a REGEX search only among JSON labels
E.g., the following query recursively finds the first string entry matching the RE:
bash $ <ab.json jtc -w'<^N>R'
"New York"
bash $
REGEX lexemes may optionally have flags altering RE search behavior (cppreference):
- I (icase): character matching should be performed without regard to case
- N (nosubs): sub-expressions (expr) are treated as non-marking sub-expressions (?:expr)
- O (optimize): make matching faster, with the potential cost of making construction slower
- C (collate): character ranges of the form "[a-b]" will be locale sensitive
The multiline option is unsupported because JSON does not permit multiline strings.
Also, the following REGEX grammars are supported:
- E: modified ECMAScript regular expression grammar (default)
- S: basic POSIX regular expression grammar
- X: extended POSIX regular expression grammar
- A: grammar used by the awk utility in POSIX
- G: grammar used by the grep utility in POSIX
- P: regular expression grammar used by the egrep utility
All of the above flags may be passed as quoted (backslash-prefixed) trailing characters in the lexeme:
bash $ <ab.json jtc -w'<^new york\I>R'
"New York"
bash $
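The two REGEX queries above can be approximated in Python as a recursive search that applies a regular expression to string values only. This is a sketch under assumptions: search_re is a hypothetical helper, and Python's re module is used in place of the C++ regex grammars listed above, so pattern details may differ.

```python
import re

def search_re(node, pattern, flags=0):
    """Yield JSON string values that match the regular expression."""
    if isinstance(node, str):
        if re.search(pattern, node, flags):
            yield node
    elif isinstance(node, (dict, list)):
        kids = node.values() if isinstance(node, dict) else node
        for child in kids:
            yield from search_re(child, pattern, flags)

# Trimmed first record of the address book
book = {"Directory": [
    {"address": {"city": "New York", "state": "NY"}, "name": "John"},
]}

print(next(search_re(book, r"^N")))                        # New York
print(next(search_re(book, r"^new york", re.IGNORECASE)))  # New York
```

Taking only the first result with next mirrors the fact that the queries above print a single match, although more values may satisfy the expression.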
Multiple options could be passed within the lexeme; however, if multiple grammars are specified, only the first one will take effect, e.g.:
- <...\G\A>R: between the awk and grep grammars, grep wins because it's given first
All REGEX lexemes also support templates/namespace interpolation. The interpolation is applied before the regex search is performed:
bash $ <ab.json jtc -tc -w'<postal code>l:<PC>v[-1][street address]<[{PC}]>R'
"5423 Madison St"
"6213 E Colfax Ave"
bash $
- in the last lexeme (<[{PC}]>R) the namespace PC is interpolated first (it was set in the second lexeme, <PC>v) and then the REGEX search is applied.
NOTE: the namespace tokens usage in REGEX lexemes is restricted to alphabetical names only (e.g.: {abc}):
- numerical namespaces (e.g., {123}) might clash with REGEX quantifiers and hence are not supported,
- the auto-tokens (e.g.: '$abc') are also unsupported, because at the time of walking the iterator is yet unresolved
Be extremely careful with all search lexemes supporting interpolation (namely: <..>R, <..>L, <..>D, <..>j): reckless interpolation may render walking quite slow.
This is the complete list of suffixes that control search behavior:
- r: default (could be omitted), fully matches JSON string values (e.g.: <CO>, <CO>r)
- R: the lexeme is a search RE, only JSON string values searched (e.g.: <^N.*>R)
- P: matches any string value, same as <.*>R, just faster (e.g.: <>P)
- d: matches JSON numericals (e.g.: <3.14>d)
- D: the lexeme is an RE, only JSON numerical values searched (e.g.: <^3\.+*>D)
- N: matches any JSON numerical value, same as <.*>D, just faster (e.g.: <>N)
- b: matches JSON boolean values; to match the exact boolean, it must be spelled as <true>b, <false>b; when the lexeme is empty (<>b) it matches any boolean
- n: matches JSON null value (e.g.: <>n)
- l: fully matches JSON labels (e.g.: <address>l)
- L: the lexeme is a search RE, only JSON labels searched (e.g.: <^[a-z]>L)
- a: matches any JSON atomic value, i.e., strings, numerical, boolean, null (e.g.: <>a)
- o: matches any JSON object {..} (e.g.: <>o)
- i: matches any JSON array [..] (e.g.: <>i)
- c: matches any container type - either arrays or objects (e.g.: <>c)
- e: matches end-nodes only, which are either atomic values, or empty iterables [], {} (e.g.: <>e)
- w: matches any JSON value (wide range match): atomic values, objects, arrays (e.g.: <>w)
- j: matches a JSON value; the lexeme can be either a valid JSON (e.g.: <[]>j - finds an empty JSON array), or a template resulting in a valid JSON after interpolation (e.g.: <"{str}">j - finds a JSON string whose value is in namespace str)
- s: matches a JSON value previously stored in the namespace (e.g.: <Val>v ... <Val>s)
- t: matches a tag (label/index) previously stored in the namespace (e.g.: <Lbl>k ... <Lbl>t)
- q: matches only original JSON values, i.e. selects non-duplicated values only (e.g.: <>q)
- Q: matches only repetitive (duplicating) JSON values (e.g.: <>Q)
- g: matches all JSON values in the ascending order (e.g.: <>g)
- G: matches all JSON values in the descending order (e.g.: <>G)
Some search lexemes (and directives) require their content to be set and non-empty (R, d, D, L, j, s, t, v, z, u, I, Z, W, S), otherwise an exception (walk_empty_lexeme) will be thrown, e.g.:
bash $ <ab.json jtc -w'<name>L'
"John"
bash $ <ab.json jtc -w'<>L'
jtc json exception: walk_empty_lexeme
bash $
The walk-path might be quite long, containing multiple lexemes, and it might not be obvious which one throws the exception. To figure that out, run the command with -ddd: the throwing lexeme will be displayed:
bash $ <ab.json jtc -w'<>L' -ddd
.display_opts(), option set[0]: -w'<>L' -d -d -d (internally imposed: )
.init_inputs(), reading json from <stdin>
..ss_init_(), initializing mode: buffered_cin
..ss_init_(), buffer (from <stdin>) size after initialization: 1674
..run_decomposed_optsets(), pass for set[0]
...parse(), finished parsing json
..demux_opt(), option: '-w', hits: 1
.walk_json(), copying input json for integrity check (debug only)
..walk(), walk string: '<>L'
...parse_lexemes_(), walked string: <>L
...parse_lexemes_(), parsing here: >|
...extract_lexeme_(), parsed lexeme: <>
...parse_lexemes_(), walked string: <>L
...parse_lexemes_(), parsing here: -->|
...parse_suffix_(), search type sfx: Label_RE_search
..main(), exception raised by: file: './lib/Json.hpp', func: 'parse_suffix_()', line: 3573
jtc json exception: walk_empty_lexeme
bash $
A few search lexemes might be left empty, but then they carry a semantic of an empty match (r, l):
- <>r (same as <>): will match an empty JSON string
- <>l: will match an entry with the empty JSON label
E.g.:
bash $ <<<'{"":"empty label"}' jtc -w'<>l'
"empty label"
bash $
The rest of the lexemes (search: P, N, b, n, a, o, i, c, e, w, q, Q, g, G and directives: k, f, F) also might be left empty - all those search lexemes carry a semantic of any match. However, if those lexemes are non-empty, then their content points to a namespace where the found value (the result of a match for search lexemes, or the currently walked JSON for directives) will be stored, e.g.:
bash $ <ab.json jtc -w'<array>i1' -T'[{array}, "Sophia"]'
[
"Olivia",
"Sophia"
]
bash $
bash $ <ab.json jtc -w'<array:["Mia"]>i1' -T'[{array}, "Sophia"]'
[
"Mia",
"Sophia"
]
bash $
jtc is super efficient at searching recursively even huge JSON structures: normally no exponential search decay will be observed (which is very typical for such kind of operations). The decay is avoided because jtc builds a cache for all searches (whenever caching is required, both recursive and non-recursive) and thus all subsequent matches are taken from the cache.
There are a few lexemes that look like searches, though they do not perform any matching; rather, they apply certain actions to the currently walked JSON elements. These are directives:
- v: saves the currently walked JSON value into a namespace under the name specified by the lexeme (e.g.: <Jsn>v)
- k: instructs to reinterpret the key (label/index) of the currently walked JSON and treat it as a value (thus a label/index can be updated/extracted programmatically) - e.g.: <>k; if the lexeme is non-empty then it saves a found key (label/index) into the corresponding namespace and cancels reinterpretation of the label as a value (e.g.: <Lbl>k)
- z: erases the namespace pointed by the lexeme; the lexeme must not be empty (e.g.: <Jsn>z)
- f: fail-safe (branching): if walking past the fail-safe lexeme fails then, instead of progressing to the next iteration (a typical behavior), the walk for the lexeme immediately preceding the fail-safe will be reinstated; walking (of the same walk-path) may continue if the ><F directive is present (past the failing point)
- F: Forward-Stop: behavior of the directive depends on spelling:
  - <>F: when the directive is reached, the currently walked path is skipped and the walk silently proceeds to the next walk iteration without ending (like the continue loop operator in some programming languages)
  - <>FN, where N is a non-zero quantifier: implements a "jump" logic: it will skip over N lexemes starting with the F lexeme itself; thus <>F1 just continues the walk with the next lexeme, <>F2 will skip over the next lexeme, etc.
  - ><F: when the directive is reached, the walk successfully ends for the output processing (similar to the break loop operator)
  - ><FN, where N is a non-zero quantifier: implements a "repeat" logic: repeats the walked path N times, e.g.: ><F1 will produce 2 identical walk results (the original one, and one repeated)
- u: user evaluation of the walk-path: the lexeme is the shell cli sequence which affects walking: if the returned result of the shell evaluation is 0 (success) then the walk continues, otherwise the walk fails; the lexeme is subject to template interpolation
- I: increment/multiply lexeme; the lexeme lets incrementing and/or multiplying the namespace value pointed by the lexeme, e.g.: <val>I3:2 - a JSON numerical stored in the namespace val will be incremented by 3 and then multiplied by 2
- Z: saves into the provided namespace the size of the currently walked JSON (recursive and non-recursive notations produce different effects: the former calculates the entire JSON size, while the latter counts only the number of children); with the quantifier of 1 (i.e., <StrSize>Z1) saves into the namespace the size of the currently walked JSON string, otherwise (if not a string) -1
- W: saves into the provided namespace the currently walked JSON's walk-path as a JSON array (e.g.: <wp>W)
- S: restores the point of walk (if it can be restored) previously saved by the W lexeme (e.g.: <wp>S)
There's a set of lexemes (search and directives) which may reference a name in the namespace for capturing the currently walked JSON elements: P, N, b, n, a, o, i, c, e, w, q, Q, g, G, v, k, f, F.
All of those lexemes also allow capturing a custom JSON value in lieu of the currently walked JSON if the lexeme's value is given in this format:
- <name:JSON_value>v
Upon walking such syntax, the JSON_value will be preserved in the namespace name instead of the currently walked JSON.
Normally, JSON_value must be a valid JSON, otherwise it'll be promoted to a JSON string. If it still fails then an exception will be thrown (json_lexeme_invalid).
All the lexemes in the walk-path are bound by a logical AND: only once all of them succeed does the walk-path succeed too (and get printed or passed for a respective operation). The fail-safe and Forward-Stop directives make it possible to introduce branching logic into the walk-path. Let's break it down:
When directive F is paired with <>f, together they cover all cases of walk-path branching:
- ... <>f {if this path fails, then the walk ends, reinstating the walk at the <>f point}
- ... <>f {if this path succeeds, then skip the result} <>F {otherwise keep walking this path starting from the <>f point} ...
- ... <>f {if this path succeeds, then end walking} ><F {otherwise walk this path} ...
- ... <>f {if this path succeeds, then end walking} ><F <>F # otherwise skip it (i.e., skip the failed path/result)
- etc. (there's no limit on the number of <>f and <>F / ><F pairs that could be present in the walk)
Say, we want to list all mobile phone records, let's do it along with names of phone holders:
bash $ <ab.json jtc -w'[0][:][name]' -w'[0][:][phone]<mobile>[-1]' -r
"John"
{ "number": "112-555-1234", "type": "mobile" }
"Ivan"
{ "number": "223-283-0372", "type": "mobile" }
"Jane"
bash $
As it appears, Jane has no mobile phone, but then our requirement is enhanced: for those who do not have a mobile, let's list the first available phone from the records. There the <>f directive comes to the rescue:
bash $ <ab.json jtc -w'[0][:][name]' -w'[0][:][phone][0]<>f [-1]<mobile>[-1]' -r
"John"
{ "number": "112-555-1234", "type": "mobile" }
"Ivan"
{ "number": "223-283-0372", "type": "mobile" }
"Jane"
{ "number": "358-303-0373", "type": "office" }
bash $
As the path is walked, as soon as the <>f directive is faced, jtc memorizes the currently walked path point and will reinstate it should further walking fail. There:
- we resolve the first entry in the phone records and memorize its path location ([phone][0] <>f)
- then step back up and look for a mobile type of the record ([-1]<mobile>), then:
  - if it's found, we step back up ([-1]) again to finish walking and display the whole record
  - if not found (i.e., walking indeed fails), a fail-safe is engaged and the preserved location is recalled and displayed
A walk-path may contain multiple fail-safes; only the respective fail-safe will be engaged (the more specific one and the one closest to the failing point).
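The fail-safe logic of that walk reads much like a "try the preferred branch, else fall back to the saved point" pattern. A rough Python model of the phone query is sketched below; preferred_phone is a hypothetical helper, the records are trimmed copies of the address book, and the comments map each step to the lexeme it stands for.

```python
def preferred_phone(record):
    """Return the first mobile phone, or fall back to the first phone listed."""
    phones = record["phone"]
    fallback = phones[0]                 # [phone][0]<>f : memorize this point
    for phone in phones:                 # [-1]<mobile>  : search the phone records
        if phone["type"] == "mobile":
            return phone                 # [-1]          : the whole matching record
    return fallback                      # search failed : reinstate the saved point

directory = [
    {"name": "John", "phone": [{"number": "112-555-1234", "type": "mobile"},
                               {"number": "113-123-2368", "type": "mobile"}]},
    {"name": "Ivan", "phone": [{"number": "273-923-6483", "type": "home"},
                               {"number": "223-283-0372", "type": "mobile"}]},
    {"name": "Jane", "phone": [{"number": "358-303-0373", "type": "office"},
                               {"number": "333-638-0238", "type": "home"}]},
]

for person in directory:
    print(person["name"], preferred_phone(person))
```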
Say we want to list from the address book all the record holders and indicate whether they have any children or not in this format:
<Name> has children: true/false
Thus, we need to build a single path, which will find the name, then inspect the children record and transform it into true if there is at least 1 child, or false otherwise.
We can do it in steps:
- let's get to the names first and memorize those:
bash $ <ab.json jtc -w'<name>l:<N>v'
"John"
"Ivan"
"Jane"
bash $
- Now let's inspect a sibling record children:
bash $ <ab.json jtc -w'<name>l:<N>v [-1][children]' -r
[ "Olivia" ]
[]
[ "Robert", "Lila" ]
bash $
- so far so good; now we need to engage the fail-safe to facilitate the requirement to classify those records as true / false:
bash $ <ab.json jtc -w'<name>l:<N>v[-1][children]<C:false>f[0]<C:true>v'
"Olivia"
[]
"Robert"
bash $
- there, namespace C will be populated first with the JSON value false and will stay put should further walking fail;
- otherwise (i.e., upon a successful walk - addressing a first child [0]) the namespace C will be overwritten with the value true
- lastly, we need to interpolate the preserved namespaces for our final / required output using a template:
bash $ <ab.json jtc -w'<name>l:<N>v[-1][children]<C:false>f[0]<C:true>v' -T'"{N} has children: {C}"' -qq
John has children: true
Ivan has children: false
Jane has children: true
bash $
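The steps above amount to "assume false, overwrite with true if the first child can be reached". A minimal Python model of the final query follows; has_children_line is a hypothetical helper, the records are trimmed copies of the address book, and the variable names n and c mirror the namespaces N and C.

```python
def has_children_line(record):
    """Build the '<Name> has children: true/false' line for one record."""
    n = record["name"]          # <name>l:<N>v   : memorize the name
    c = False                   # <C:false>f     : default kept if walking fails
    if record["children"]:      # [0]            : succeeds only if a child exists
        c = True                # <C:true>v      : overwrite the default
    return f"{n} has children: {str(c).lower()}"

directory = [
    {"name": "John", "children": ["Olivia"]},
    {"name": "Ivan", "children": []},
    {"name": "Jane", "children": ["Robert", "Lila"]},
]

for person in directory:
    print(has_children_line(person))
```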
Let's consider another example, say, we have a following JSON:
bash $ jsn='[{"ip":"1.1.1.1", "name":"server"}, {"ip":"1.1.1.100"}, {"ip":"1.1.1.2", "name":"printer"}, {"ip":"1.1.1.101"}]'
bash $ <<<$jsn jtc -tc
[
{ "ip": "1.1.1.1", "name": "server" },
{ "ip": "1.1.1.100" },
{ "ip": "1.1.1.2", "name": "printer" },
{ "ip": "1.1.1.101" }
]
bash $
How do we list only those records which don't have name and skip those which do? Well, one apparent solution would be to walk all those entries which do have name labels and purge them:
bash $ <<<$jsn jtc -pw'<name>l:[-1]' -tc
[
{ "ip": "1.1.1.100" },
{ "ip": "1.1.1.101" }
]
bash $
But what if we want to walk entries rather than purge (e.g., for the reason of template-interpolating the entries at the output)? The prior solution would require chaining the output to the next option set (which is quite a reasonable solution too, e.g.: <<<$jsn jtc -pw'<name>l:[-1]' / -w[:] -tc), however, it's possible to achieve the same using this simple query:
bash $ <<<$jsn jtc -w'[:]<>f[name]<>F' -tc
{ "ip": "1.1.1.100" }
{ "ip": "1.1.1.101" }
bash $
Without the <>F directive at the end, the walk would look like this:
bash $ <<<$jsn jtc -w'[:]<>f[name]' -tc
"server"
{ "ip": "1.1.1.100" }
"printer"
{ "ip": "1.1.1.101" }
bash $
Thus, <>F skips those (successfully) matched entries, leaving only the ones which fail - that's what we need in this query (the records which do not have name in them).
Now, what if in the example above (the one with the <>F directive) we want to process the failed JSON further, say, to display ip only, rather than the whole record? That is easy - walking of the failed path continues past the F directive:
bash $ <<<$jsn jtc -w'[:]<>f[name]<>F[ip]' -qq
1.1.1.100
1.1.1.101
bash $
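The interplay of <>f and <>F in that query can be pictured with the following Python sketch. It only models the behavior described above (it is not jtc code), and `unnamed_ips` is a hypothetical helper name.

```python
import json

jsn = json.loads(
    '[{"ip":"1.1.1.1", "name":"server"}, {"ip":"1.1.1.100"},'
    ' {"ip":"1.1.1.2", "name":"printer"}, {"ip":"1.1.1.101"}]'
)


def unnamed_ips(entries):
    """Mimic the walk-path [:]<>f[name]<>F[ip]."""
    result = []
    for entry in entries:           # [:] - iterate over every record
        # <>f - remember this record as the point to fall back to
        if "name" in entry:         # [name] succeeded ...
            continue                # ... so <>F drops this walk entirely
        # [name] failed: resume from the fail-safe point, past <>F
        result.append(entry["ip"])  # [ip]
    return result


print(unnamed_ips(jsn))
```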
There are a couple of other uses for the F lexeme with non-zero (non-default) quantifiers:
- <>Fn - this variant of the lexeme implements a jump logic for the walk-path, i.e., once walked, it will jump to the nth lexeme (counting from the lexeme <>F itself) and continue walking from there. E.g.: <>F1 does not do anything - it continues walking from the 1st lexeme after <>F1; <>F2 will jump over the very next lexeme and continue walking from the 2nd one, and so on and so forth.
- ><Fn - this variant will repeat the same walk up to the lexeme an additional n times - that is useful when there's a need to repeat the path an additional n times
For example, to duplicate all the found address records, use ><Fn:
# find all names whose spouse is not set to null:
bash $ <ab.json jtc -w'<spouse>l:<>f<>n<>F[-1][name]'
"John"
"Jane"
bash $
# duplicate the same logic once:
bash $ <ab.json jtc -w'<spouse>l:<>f<>n<>F[-1][name]><F1'
"John"
"Jane"
"John"
"Jane"
bash $
RE search lexemes (R, L, D) also auto-populate the namespace with the following names: $0 is auto-generated for an entire RE match, $1 for the first RE captured subgroup, $2 for the second RE subgroup, and so on (provided that no \N flag was given):
bash $ <ab.json jtc -w'<^J(.*)>R:'
"John"
"Jane"
bash $ <ab.json jtc -w'<^J(.*)>R:' -T'"j{$1}"'
"john"
"jane"
bash $
(coverage of REGEX is entirely out of scope of this document, rather refer to this external link: Regular Expression)
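To make the mapping between capture groups and the $0, $1, ... names concrete, here is a small Python sketch. It uses Python's own `re` module, whose syntax agrees with jtc's REGEX for this simple pattern but is not guaranteed to agree in general; `interpolate_re` is a hypothetical helper, and only the plain {name} substitution is modeled.

```python
import re


def interpolate_re(pattern, value, template):
    """Match value against pattern, then fill {$0}, {$1}, ... in template."""
    match = re.search(pattern, value)
    if match is None:
        return None  # no match: nothing is walked, nothing is interpolated
    namespace = {"$0": match.group(0)}             # the entire RE match
    for i, group in enumerate(match.groups(), 1):  # captured subgroups
        namespace[f"${i}"] = "" if group is None else group
    result = template
    for name, stored in namespace.items():
        result = result.replace("{" + name + "}", stored)
    return result


for name in ("John", "Ivan", "Jane"):
    print(name, "->", interpolate_re(r"^J(.*)", name, "j{$1}"))
```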
Optionally, a quantifier may follow the search lexeme (if a lexeme has a suffix, then the quantifier must come after the suffix). Quantifiers in search lexemes allow selecting a match instance (i.e., select the first match, the second one, etc., or a range of matches). Quantifiers exist in the following formats:
- n - a non-negative number - tells which instance of a match to pick. By default, a quantifier 0 is applied (i.e., the first match is selected)
- +n - selects all match instances starting from the nth (zero-based)
- n:m:s - slice select: the notation rules for this quantifier are the same as for subscript slices ([n:m:s]), with just one understandable caveat: n, m here cannot go negative (there's no way of knowing upfront how many matches would be produced, so it's impossible to select a range/slice based off the last match); the rest of the notation rules apply
To illustrate the quantifiers (with suffixes), let's dump all the JSON arrays in the Directory, except the top one:
bash $ <ab.json jtc -w'<>i1:' -tc
[ "Olivia" ]
[
{ "number": "112-555-1234", "type": "mobile" },
{ "number": "113-123-2368", "type": "mobile" }
]
[]
[
{ "number": "273-923-6483", "type": "home" },
{ "number": "223-283-0372", "type": "mobile" }
]
[ "Robert", "Lila" ]
[
{ "number": "358-303-0373", "type": "office" },
{ "number": "333-638-0238", "type": "home" }
]
bash $
- the trailing 1: in the walk-path is the slice quantifier, which selects (prints) all the matches (we are matching JSON arrays here - suffix i) starting from the second one (all quantifiers and indices are zero-based)
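The three quantifier formats can be thought of as different ways to pick from the ordered list of matches. The sketch below is a simplified Python model under that assumption (in reality the matches are produced lazily while walking); `apply_quantifier` is a hypothetical name, and the quantifier is passed as the text that would follow the lexeme's suffix.

```python
def apply_quantifier(matches, quantifier=""):
    """Select match instances the way the n, +n and n:m:s quantifiers do."""
    if quantifier == "":
        return matches[:1]                 # default quantifier 0: first match
    if quantifier.startswith("+"):
        return matches[int(quantifier[1:]):]  # +n: all matches from the nth on
    if ":" in quantifier:                  # n:m:s: slice select
        parts = [int(p) if p else None for p in quantifier.split(":")]
        if any(p is not None and p < 0 for p in parts[:2]):
            raise ValueError("n and m cannot be negative in a search quantifier")
        return matches[slice(*parts)]
    n = int(quantifier)                    # n: exactly the nth match, if any
    return matches[n:n + 1]


found = ["top", "a1", "a2", "a3", "a4", "a5", "a6"]  # placeholder matches
print(apply_quantifier(found, "1:"))  # everything except the first match
```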
There are two search lexeme types where matching a non-first instance does not make sense, namely: >..<l and >..<t. Those non-recursive searches will uniquely match a label or an index. Indeed, in a plain JSON array or object it's possible to address only one single label or index, there could not be any other, e.g., in this JSON:
bash $ jsn='{ "a": 1, "b":2, "c":3, "d":4, "e":6 }'
bash $ <<<$jsn jtc -r
{ "a": 1, "b": 2, "c": 3, "d": 4, "e": 6 }
bash $
there could be only one label "b", thus normally trying to match a second, third, etc. instance of the label "b" would not make much sense: <<<'{ "a": 1, "b":2, "c":3, "d":4, "e":6 }' jtc -w'>b<l2'
Thus, the semantic of quantifiers only in those searches was enhanced (to extend use cases) - there, the quantifiers provide a relative offset from a found label/index. So, for a notation like the above: '>b<l2', the label "b" will be matched and then its second (successive) neighbor value will be selected:
bash $ <<<$jsn jtc -w'>b<l2' -l
"d": 4
bash $
Because of the change in semantic, those are the only search quantifiers (in the non-recursive lexemes >..<l, >..<t) which allow negative values. Positive quantifiers select next (successive) neighbors, while negative quantifiers select preceding neighbors:
bash $ <<<$jsn jtc -w'>d<l' -l
"d": 4
bash $ <<<$jsn jtc -w'>d<l-2:1' -l
"b": 2
"c": 3
"d": 4
bash $
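The relative-offset reading of quantifiers in >..<l can be sketched in Python as follows. This is a model only: `neighbors` is a hypothetical helper, the label order of the Python dict happens to coincide with the order jtc shows for this JSON, and silently skipping offsets that fall outside the object is an assumption rather than documented jtc behavior.

```python
import json

jsn = json.loads('{ "a": 1, "b":2, "c":3, "d":4, "e":6 }')


def neighbors(obj, label, start=0, stop=None):
    """Mimic >label<l with quantifier `start` (or the range `start:stop`)."""
    labels = list(obj)
    anchor = labels.index(label)       # the uniquely matched label
    if stop is None:
        stop = start + 1               # a single offset, e.g. >b<l2
    picked = []
    for offset in range(start, stop):  # offsets are relative to the anchor
        i = anchor + offset
        if 0 <= i < len(labels):       # assumption: out-of-range offsets are skipped
            picked.append((labels[i], obj[labels[i]]))
    return picked


print(neighbors(jsn, "b", 2))      # >b<l2
print(neighbors(jsn, "d", -2, 1))  # >d<l-2:1
```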
Search lexemes perform a recursive search across the entire JSON tree off the point where it's invoked (i.e., the JSON node selected by walking all the prior lexemes). However, sometimes there's a need to limit searching scope only to the specific label. Here is the dump of all the JSON strings containing at least one digit:
bash $ <ab.json jtc -w'<\d>R:' -l
"street address": "599 Lafayette St"
"number": "112-555-1234"
"number": "113-123-2368"
"street address": "5423 Madison St"
"number": "273-923-6483"
"number": "223-283-0372"
"street address": "6213 E Colfax Ave"
"number": "358-303-0373"
"number": "333-638-0238"
bash $
Some of the values are street addresses, some are the phone numbers. Say, we want to dump only the phone records using the same search criteria (lexeme). Knowing the label of the phone numbers ("number"), it's achievable via this notation:
bash $ <ab.json jtc -w'[number]:<\d>R:' -l
"number": "112-555-1234"
"number": "113-123-2368"
"number": "273-923-6483"
"number": "223-283-0372"
"number": "358-303-0373"
"number": "333-638-0238"
bash $
I.e., once the literal subscript lexeme is attached to the search lexeme over :, it makes a scoped search ([..]:<..>).
Sometimes there's a requirement to apply a non-recursive search onto iterable JSON nodes (arrays, objects) - i.e., find a value within the immediate children of the node and do not descend recursively. The notation facilitating such a search is the same one, but the angular brackets are placed inside-out: >..<.
To illustrate that, say, we want to find all string values in the 1st Directory record containing the letter o. If we do this using a recursive search, then all the following entries will be found:
bash $ <ab.json jtc -w'[Directory][0]<o>R:'
"New York"
"John"
"mobile"
"mobile"
bash $
To facilitate our ask (find all such entries within the immediate values of the 1st record only), apply a non-recursive search notation:
bash $ <ab.json jtc -w'[Directory][0]>o<R:'
"John"
bash $
One subtle but crucial difference between recursive <..> and non-recursive >..< searches:
- the former starts a recursive search from the currently walked/selected element itself, i.e., if the currently selected JSON is "pi", then this walk-path still matches: <pi><pi><pi><pi> and so on. In the latter case, the non-recursive search performs matching strictly among the children of the currently walked JSON iterable
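The difference can be illustrated with two small Python functions. They are a sketch of the described behavior only (string matching by regular expression, as with the R suffix); the `john` record is a hand-trimmed subset of the first Directory entry and both function names are hypothetical.

```python
import re

# Trimmed subset of the 1st Directory record of ab.json.
john = {
    "address": {"city": "New York", "postal code": 10012,
                "street address": "599 Lafayette St"},
    "children": ["Olivia"],
    "name": "John",
    "phone": [
        {"number": "112-555-1234", "type": "mobile"},
        {"number": "113-123-2368", "type": "mobile"},
    ],
}


def recursive_search(node, pattern):
    """<..>R: - test the node itself, then descend into all of its children."""
    if isinstance(node, str):
        if re.search(pattern, node):
            yield node
    elif isinstance(node, dict):
        for child in node.values():
            yield from recursive_search(child, pattern)
    elif isinstance(node, list):
        for child in node:
            yield from recursive_search(child, pattern)


def non_recursive_search(node, pattern):
    """>..<R: - test only the immediate children of an iterable node."""
    if isinstance(node, dict):
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        children = []  # an atomic value has no children to match among
    return [c for c in children if isinstance(c, str) and re.search(pattern, c)]


print(list(recursive_search(john, "o")))
print(non_recursive_search(john, "o"))
```

Note how `recursive_search("pi", "pi")` still yields a match (the search starts at the node itself), while `non_recursive_search("pi", "pi")` yields nothing.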
One of the charming features of jtc is the ability to address parents (any ancestors up till the root) off the found JSON nodes. Typically, addressing parents would be required after search lexemes (but may occur anywhere in the walk-path). Parents can be addressed using the notation [-n]. This feature allows building walk-paths that answer quite complex queries.
Let's dump all the names from the Directory whose records have a home phone entry:
bash $ <ab.json jtc -w'[type]:<home>:[-3][name]'
"Ivan"
"Jane"
bash $
The magic which happens here (let's break down the walk-path into the lexemes):
- [type]:<home>: - this lexeme instructs to find each (ending quantifier :) string home scoped by the label "type" ([type]: is the attached scope), thus all such phone record values will be selected:
bash $ <ab.json jtc -w'[type]:<home>:'
"home"
"home"
bash $
- [-3] - starting off each found JSON element, the 3rd ancestor will be selected. Let's see the parent selection in slow-mo, one by one:
# select 1st immediate parent:
bash $ <ab.json jtc -w'[type]:<home>: [-1]' -tc
{ "number": "273-923-6483", "type": "home" }
{ "number": "333-638-0238", "type": "home" }
bash $
# select 2nd parent:
bash $ <ab.json jtc -w'[type]:<home>: [-2]' -tc
[
{ "number": "273-923-6483", "type": "home" },
{ "number": "223-283-0372", "type": "mobile" }
]
[
{ "number": "358-303-0373", "type": "office" },
{ "number": "333-638-0238", "type": "home" }
]
bash $
# select 3rd parent:
bash $ <ab.json jtc -w'[type]:<home>: [-3]' -tc
{
"address": { "city": "Seattle", "postal code": 98104, "state": "WA", "street address": "5423 Madison St" },
"age": 31,
"children": [],
"name": "Ivan",
"phone": [
{ "number": "273-923-6483", "type": "home" },
{ "number": "223-283-0372", "type": "mobile" }
],
"spouse": null
}
{
"address": { "city": "Denver", "postal code": 80206, "state": "CO", "street address": "6213 E Colfax Ave" },
"age": 25,
"children": [ "Robert", "Lila" ],
"name": "Jane",
"phone": [
{ "number": "358-303-0373", "type": "office" },
{ "number": "333-638-0238", "type": "home" }
],
"spouse": "Chuck"
}
bash $
- [name] - now we can select (subscript) [name] out of those selected JSON nodes
Another example: who is the parent of a child Lila?
bash $ <ab.json jtc -w'<children>l:<Lila>[-2][name]'
"Jane"
bash $
Explanation:
- <children>l: - finds each record with the label children
- <Lila> - in each found record, find a string value Lila, and once/if found
- [-2][name] - go 2 levels (parents) up from the found entry "Lila" and then subscript/offset by the label name
An even more complex query: which of the parents who have children have mobile numbers?
bash $ <ab.json jtc -w'<children>l:[0][-2][type]:<mobile>[-3][name]'
"John"
bash $
- that is the only correct answer (because Ivan has no children and Jane has no mobile phone)
The walk-path breakdown:
- <children>l: - find each record by the label "children"
- [0] - try addressing the first element in the found records (that'll ensure that children is non-empty)
- [-2] - go 2 parents up for those records which survived the prior step - that'll bring us to the person's record level
- [type]:<mobile> - recursively find the mobile string scoped by type (already only for those records which have children)
- [-3] - go 3 levels (parents) up (for those records which have children and have mobile types of phone records) - it'll bring us again up to the person's record level
- [name] - finally select the name
There's another way to address parents - through the [^n] notation. Compare: the following walk-path achieves exactly the same ask:
bash $ <ab.json jtc -w'<children>l:[0][^2][type]:<mobile>[^2][name]'
"John"
bash $
Note [^2] - this notation, like [-n], also selects a certain parent; however, while [-n] selects the parent off the leaf (i.e., from the currently selected node), the [^n] notation does it off the root.
When jtc walks lexemes (traverses the JSON tree), internally it maintains a path of the walked steps (it's visible via debugs -dddd). E.g., when the first lexeme's match is found (for <children>l:), the internal walked steps path would look like: root -> [Directory] -> [0] -> [children], then when the next lexeme is successfully applied, the internal path becomes: root -> [Directory] -> [0] -> [children] -> [0]
The meaning of the [-n] and [^n] notations then is easy to observe on this diagram:
                                                                               etc.
                                                                               [^5]
to address a parent from the root: [^0]        [^1]       [^2]      [^3]       [^4]
                                    |           |          |         |          |
                                    v           v          v         v          v
            internally built path: root -> [Directory] -> [0] -> [children] -> [0]
                                    ^           ^          ^         ^          ^
                                    |           |          |         |          |
to address a parent from the leaf: [-4]        [-3]       [-2]      [-1]       [-0]
                                   [-5]
                                   etc.
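The diagram translates into simple list slicing if the internally built path is pictured as a list of steps. The Python sketch below rests on that assumption; `from_leaf` and `from_root` are hypothetical names, and offsets that would go past either end of the path (the "etc." cases) are deliberately not modeled.

```python
# The internally built path from the diagram, as a list of walked steps.
path = ["root", "Directory", 0, "children", 0]


def from_leaf(path, n):
    """[-n]: go n steps up from the currently selected node ([-0] is the node)."""
    if not 0 <= n < len(path):
        raise ValueError("offsets past the root are not modeled in this sketch")
    return path[:len(path) - n]


def from_root(path, n):
    """[^n]: go n steps down from the root ([^0] is the root itself)."""
    if not 0 <= n < len(path):
        raise ValueError("offsets past the leaf are not modeled in this sketch")
    return path[:n + 1]


print(from_leaf(path, 2))  # the person's record level
print(from_root(path, 2))  # the very same node, addressed from the root
```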
jtc allows a virtually unlimited number of walk-paths (-w); it is limited only by the max size of the accepted string in your shell. There are, though, a few tweaks in jtc which let you control the order in which the resulting walks are displayed. By default jtc will display the resulting successful walks in an interleaved manner, but first, let's take a look at sequential walk processing.
Option -n ensures that all given walk-paths (-w) will be processed (and printed) sequentially in the order they are given:
# dump names first and then phone numbers:
bash $ <ab.json jtc -w'<name>l:' -w'<number>l:' -n
"John"
"Ivan"
"Jane"
"112-555-1234"
"113-123-2368"
"273-923-6483"
"223-283-0372"
"358-303-0373"
"333-638-0238"
bash $
# dump the phone numbers first and then the names:
bash $ <ab.json jtc -w'<number>l:' -w'<name>l:' -n
"112-555-1234"
"113-123-2368"
"273-923-6483"
"223-283-0372"
"358-303-0373"
"333-638-0238"
"John"
"Ivan"
"Jane"
bash $
If the resulting walks have labels in the input JSON (i.e., they were inside JSON objects), then -l lets you dump their labels too:
bash $ <ab.json jtc -w'<name>l:' -w'<number>l:' -nl
"name": "John"
"name": "Ivan"
"name": "Jane"
"number": "112-555-1234"
"number": "113-123-2368"
"number": "273-923-6483"
"number": "223-283-0372"
"number": "358-303-0373"
"number": "333-638-0238"
bash $
-j does quite a simple thing - it wraps all walked entries back into a JSON array; however, depending on the -l and -n options, the result will vary:
- -j without -l will just arrange walked entries into a JSON array:
bash $ <ab.json jtc -w'<name>l:' -w'<number>l:' -nj
[
"John",
"Ivan",
"Jane",
"112-555-1234",
"113-123-2368",
"273-923-6483",
"223-283-0372",
"358-303-0373",
"333-638-0238"
]
bash $
- once -n, -j and -l are given together, the entries which have labels will be wrapped into their own objects (array items won't be wrapped into objects):
bash $ <ab.json jtc -w'<name>l:' -w'<number>l:' -tc -njl
[
{ "name": "John" },
{ "name": "Ivan" },
{ "name": "Jane" },
{ "number": "112-555-1234" },
{ "number": "113-123-2368" },
{ "number": "273-923-6483" },
{ "number": "223-283-0372" },
{ "number": "358-303-0373" },
{ "number": "333-638-0238" }
]
bash $
Though even that behavior is influenced by the -n option, the above output looks dull and will hardly have many use-cases; a lot more often it's required to group relevant walks together and then place them into respective JSON structures. For that, let's review interleaved walk processing.
Interleaved walk processing (and outputting) occurs by default, though there's a certain way to control it. Let's take a look at the above outputs dropping the option -n (i.e., print walks interleaved):
bash $ <ab.json jtc -w'<name>l:' -w'<number>l:'
"John"
"112-555-1234"
"Ivan"
"113-123-2368"
"Jane"
"273-923-6483"
"223-283-0372"
"358-303-0373"
"333-638-0238"
bash $
Those look interleaved, though it does not appear that they relate to each other properly: e.g., the number "113-123-2368" belongs to "John" and preferably should be displayed before "Ivan", and the same applies to the others. jtc is capable of processing/printing relevant entries together, though it needs a little hint from the walk-paths: the latter are supposed to express the relation between themselves.
Right now both paths (<name>l: and <number>l:) do not have common base lexemes, thus it's unclear how to relate the resulting walks (hence they are just interleaved one by one). Though if we provide walk-paths relating each of those searches to their own record, then the magic happens:
bash $ <ab.json jtc -w '[Directory][:] <name>l:' -w'[Directory][:] <number>l:'
"John"
"112-555-1234"
"113-123-2368"
"Ivan"
"273-923-6483"
"223-283-0372"
"Jane"
"358-303-0373"
"333-638-0238"
bash $
And now, applying the option -j together with -l gives a much better result (we achieve grouping of relevant walks):
bash $ <ab.json jtc -w'[Directory][:]<name>l:' -w'[Directory][:]<number>l:' -jl -tc
[
{
"name": "John",
"number": [ "112-555-1234", "113-123-2368" ]
},
{
"name": "Ivan",
"number": [ "273-923-6483", "223-283-0372" ]
},
{
"name": "Jane",
"number": [ "358-303-0373", "333-638-0238" ]
}
]
bash $
B.t.w., the relation between walks could be also expressed in a relative (rather than via absolute, as above) path way:
bash $ <ab.json jtc -w '<name>l:' -w'<name>l:[-1]<number>l:'
"John"
"112-555-1234"
"113-123-2368"
"Ivan"
"273-923-6483"
"223-283-0372"
"Jane"
"358-303-0373"
"333-638-0238"
bash $
# and now grouping relevant walks:
bash $ <ab.json jtc -w '<name>l:' -w'<name>l:[-1]<number>l:' -jl -tc
[
{
"name": "John",
"number": [ "112-555-1234", "113-123-2368" ]
},
{
"name": "Ivan",
"number": [ "273-923-6483", "223-283-0372" ]
},
{
"name": "Jane",
"number": [ "358-303-0373", "333-638-0238" ]
}
]
bash $
Note: such grouping is only possible with labeled values (obviously); it won't be possible to group array elements that easily, e.g., let's break an array into pairs:
bash $ array='[0,1,2,3,4,5,6,7,8,9]'
bash $ <<<$array jtc -w[::2] -w[1::2] -j -tc
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
bash $
it won't work even if we try relating walks:
bash $ <<<$array jtc -w[::2] -w'[::2]<I>k[-1]>I<t1' -j -tc
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
bash $
Thus grouping here should be achieved differently. One way is to use only a single walk, collecting the required elements of the group into the namespaces and then interpolating the latter with a template:
bash $ <<<$array jtc -w'[::2]<I>k<V>v[-1]>I<t1' -T'[{{V}},{{}}]' -j -tc
[
[ 0, 1 ],
[ 2, 3 ],
[ 4, 5 ],
[ 6, 7 ],
[ 8, 9 ]
]
bash $
Another way is to transform the walks into objects, assigning labels from the first walk's index:
bash $ <<<$array jtc -w'[::2]<I>k' -w[1::2] -T'{"{I}":{{}}}' -ll / -tc
{
"0": [ 0, 1 ],
"2": [ 2, 3 ],
"4": [ 4, 5 ],
"6": [ 6, 7 ],
"8": [ 8, 9 ]
}
bash $
and then re-walk, dropping labels and encapsulating into the outer array:
bash $ <<<$array jtc -w'[::2]<I>k' -w[1::2] -T'{"{I}":{{}}}' -ll / -jw[:] -tc
[
[ 0, 1 ],
[ 2, 3 ],
[ 4, 5 ],
[ 6, 7 ],
[ 8, 9 ]
]
bash $
The walk results that have labels could also be aggregated (per label); the option -nn facilitates the ask:
bash $ <ab.json jtc -w'[Directory][:]<name>l:' -w'[Directory][:]<number>l:' -jlnn
[
{
"name": [
"John",
"Ivan",
"Jane"
],
"number": [
"112-555-1234",
"113-123-2368",
"273-923-6483",
"223-283-0372",
"358-303-0373",
"333-638-0238"
]
}
]
bash $
While -j wraps walks into a JSON array, -jj wraps them into a JSON object. All the elements in a JSON object must have labels, thus any walked elements which do not have labels (i.e., elements in a JSON array and the root) will be ignored.
E.g., let's dump all values from Jane's record and wrap them all into an object:
bash $ <ab.json jtc -w'<Jane>[-1]<>a:' -jj
{
"age": 25,
"city": "Denver",
"name": "Jane",
"number": "333-638-0238",
"postal code": 80206,
"spouse": "Chuck",
"state": "CO",
"street address": "6213 E Colfax Ave",
"type": "home"
}
bash $
Well, the above output though does not keep all the atomic entries from Jane's record (e.g., Jane has 2 phone numbers, while only one is displayed). That is because clashing labels will override each other (as of version 1.75). To ensure aggregation of clashing labels, use the -m option:
bash $ <ab.json jtc -w'<Jane>[-1]<>a:' -jjm
{
"age": 25,
"city": "Denver",
"name": "Jane",
"number": [
"358-303-0373",
"333-638-0238"
],
"postal code": 80206,
"spouse": "Chuck",
"state": "CO",
"street address": "6213 E Colfax Ave",
"type": [
"office",
"home"
]
}
bash $
As one can see, even though Jane has 2 lovely children (Robert and Lila), they were not listed in the resulting output; that is because they are listed in a JSON array and therefore have no labels (and hence are ignored by the -jj option, which considers values with labels only).
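The difference between -jj alone and -jj with -m boils down to what happens when a label is seen a second time. The following Python sketch models just that, under the assumption that walks arrive as (label, value) pairs with `None` standing for "no label"; `wrap_into_object` is a hypothetical name and label ordering in the result is not modeled.

```python
def wrap_into_object(walks, merge=False):
    """Model -jj (merge=False) and -jj with -m (merge=True)."""
    result = {}
    merged = set()                 # labels already turned into arrays
    for label, value in walks:
        if label is None:          # array elements and the root have no label
            continue
        if merge and label in result:
            if label not in merged:
                result[label] = [result[label]]  # first clash: start an array
                merged.add(label)
            result[label].append(value)
        else:
            result[label] = value  # without -m a clashing label overrides
    return result


# A few atomic walks from Jane's record (values as shown in the outputs above).
jane_walks = [
    ("age", 25), (None, "Robert"), (None, "Lila"), ("name", "Jane"),
    ("number", "358-303-0373"), ("type", "office"),
    ("number", "333-638-0238"), ("type", "home"),
]
print(wrap_into_object(jane_walks))
print(wrap_into_object(jane_walks, merge=True))
```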
Sometimes, when displaying outputs wrapped into an object, it's desirable to extract the labeled value from the object (i.e., reach inside the object and use the inner label rather than the outer). This becomes especially handy when dealing with templates.
Let's consider the following query:
Say, the ask here is to extract all names of all the people from ab.json and group them with a newly crafted record indicating if a person has children or not, like this:
[
{ "name": "John", "has children": "yes" },
...
]
1. Extracting names is easy:
bash $ <ab.json jtc -w'<name>l:'
"John"
"Ivan"
"Jane"
bash $
2. Crafting a new record would require knowing templates, namespaces and interpolation; for now let's just construct a walk which creates the required namespace:
bash $ <ab.json jtc -w'<name>l:' -w'<children>l: <C:no>f[0]<C:yes>v'
"John"
"Olivia"
"Ivan"
[]
"Jane"
"Robert"
bash $
- the second walk above features a couple of concepts:
  - branching (<..>f), the fail-safe lexeme: ensures that the walk is reinstated at the placement of the lexeme if/once the subsequent walk fails
  - namespaces (<C:no>f, <C:yes>v): both lexemes set up the namespace C, initially to the value "no", then to the value "yes"; the latter value will override the former only if walking [0] was successful (i.e., if a person indeed has at least one child, b/c if the array children were empty, that walk would fail)
3. by now, each time the second walk finishes an iteration, the namespace C should be correctly populated with the respective value reflecting if a person has children or not, but to see that, we'd need to interpolate that namespace using a template:
bash $ <ab.json jtc -w'<name>l:' -w'<children>l: <C:no>f[0]<C:yes>v' -TT -T'{"has children": {{C}}}' -tc
"John"
{ "has children": "yes" }
"Ivan"
{ "has children": "no" }
"Jane"
{ "has children": "yes" }
bash $
- okay, we're getting closer, but now we want to display all records with labels:
bash $ <ab.json jtc -w'<name>l:' -w'<children>l:<C:no>f[0]<C:yes>v' -T'{"has children": {{C}}}' -l
"name": "John"
jtc jnode exception: label_accessed_not_via_iterator
bash $
Bummer! The exception (rightfully) occurs here because trying to find an outer label (with -l) of an interpolated JSON { "has children": "yes" } fails - indeed it's a standalone JSON, and the root does not have any label attached - hence the exception. In situations like this, we'd rather want to reach inside the object for a labeled value than find an outer label. The option -ll facilitates that requirement:
bash $ <ab.json jtc -w'<name>l:' -TT -w'<children>l: <C:no>f[0]<C:yes>v' -T'{"has children": {{C}}}' -llj -tc
[
{ "has children": "yes", "name": "John" },
{ "has children": "no", "name": "Ivan" },
{ "has children": "yes", "name": "Jane" }
]
bash $
Finally, what's -TT in there? That's a dummy template (one which surely fails). It's needed because if it wasn't there, then a single template would apply to both walks (and we don't want our template to apply to the first walk). So, we'd rather provide a dummy one so that each template relates to its own walk. If a template fails (and -TT surely does), then no interpolation is applied and the walk iteration result is used as it is.
If that dummy template discomforts you, there's a way to go without one: keep in mind that a template gets applied only once it results in a legit JSON after the interpolation occurs (should one occur). Let's see what the result would be w/o that dummy template:
bash $ <ab.json jtc -w'<name>l:' -w'<children>l: <C:no>f[0]<C:yes>v' -T'{"has children": {{C}}}' -tc
"John"
{ "has children": "yes" }
{ "has children": "yes" }
{ "has children": "no" }
{ "has children": "no" }
{ "has children": "yes" }
bash $
Now, the same template (-T'{"has children": {{C}}}') gets applied to each walk, and while for the 1st walk the template interpolation definitely fails (while walking the 1st walk iteration for the 1st time, the name C does not exist yet in the namespace) and therefore the very 1st walk is displayed as it is, for all the subsequent iterations of the 1st walk the template interpolation would succeed (because the namespace C now holds the value from the 2nd walk) and therefore an unwanted template substitution occurs.
To fix that problem easily, we can erase the namespace C right at the beginning of the 1st walk:
bash $ <ab.json jtc -w'<C>z<name>l:' -w'<children>l: <C:no>f[0]<C:yes>v' -T'{"has children": {{C}}}' -tc
"John"
{ "has children": "yes" }
"Ivan"
{ "has children": "no" }
"Jane"
{ "has children": "yes" }
bash $
And adding -llj now provides the desired effect:
bash $ <ab.json jtc -w'<C>z<name>l:' -w'<children>l: <C:no>f[0]<C:yes>v' -T'{"has children": {{C}}}' -tc -llj
[
{ "has children": "yes", "name": "John" },
{ "has children": "no", "name": "Ivan" },
{ "has children": "yes", "name": "Jane" }
]
bash $
All the above examples just illustrate the capabilities of the options for instructional purposes. Practically, the same ask would be easier to achieve using just a single walk:
bash $ <ab.json jtc -w'<name>l:<N>v[-1][children]<C:no>f[0]<C:yes>v' -T'{"name":{{N}}, "has children": {{C}}}' -jtc
[
{ "has children": "yes", "name": "John" },
{ "has children": "no", "name": "Ivan" },
{ "has children": "yes", "name": "Jane" }
]
bash $
If you look at the prior example (Aggregating walks), you may notice that a common part of both walk-paths ([Directory][:]) had been given twice. There's a way to express it in a more succinct syntax: options -x and -y allow rearranging walk-paths so that -x takes an initial common part of the walk-path, whereas -y takes care of the varying trailing parts. Thus the same example could have been written like this:
bash $ <ab.json jtc -x'[Directory][:]' -y'<name>l:' -y'<number>l:' -jlnn
[
{
"name": [
"John",
"Ivan",
"Jane"
],
"number": [
"112-555-1234",
"113-123-2368",
"273-923-6483",
"223-283-0372",
"358-303-0373",
"333-638-0238"
]
}
]
- each occurrence of -x will be reconciled with all subsequent -y (until the next -x is faced). Options -x, -y are merely syntactic sugar and do not apply any walk-path parsing or validation; instead they just reconcile into respective -w options created internally, then the latter get processed. Thus, it's even possible to write it with what seems at first a broken syntax:
bash $ <ab.json jtc -x'[Directory][:' -y']<name>l:' -y']<number>l:' -jlnn
...
However, as long as the reinstatement of the options results in a valid walk-path - that's all that matters.
It's possible to combine both syntaxes (e.g.: -w with -x and -y), however, given that the processing of -x and -y internally reinstates respective -w options, the former will be appended after any of the given -w options (which will affect the order of processing/outputting) even though the order of their appearance is different:
bash $ <ab.json jtc -x'[Directory][:]' -y'<name>l:' -y'<number>l:' -w '<children>l:' -rnl
"children": [ "Olivia" ]
"children": []
"children": [ "Robert", "Lila" ]
"name": "John"
"name": "Ivan"
"name": "Jane"
"number": "112-555-1234"
"number": "113-123-2368"
"number": "273-923-6483"
"number": "223-283-0372"
"number": "358-303-0373"
"number": "333-638-0238"
bash $
- here children is walked first, because the name and number walks undergo reconciliation (internally) and are inserted after all the given -w options
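The reconciliation of -x and -y into -w options is plain string concatenation, which the following Python sketch tries to capture. It is a model of the description above rather than jtc's actual code: `reconcile` is a hypothetical name, the options are passed as (letter, argument) pairs, treating a -y given before any -x as having an empty prefix is an assumption, and the numeric form of -x is not considered.

```python
def reconcile(options):
    """Turn a mix of -w, -x and -y options into the final list of walk-paths."""
    # Explicitly given -w options keep their relative order and come first.
    walks = [arg for opt, arg in options if opt == "w"]
    prefix = ""                      # assumption: no -x yet means empty prefix
    for opt, arg in options:
        if opt == "x":
            prefix = arg             # applies to all subsequent -y
        elif opt == "y":
            walks.append(prefix + arg)  # no parsing, no validation: just glue
    return walks


print(reconcile([("x", "[Directory][:]"), ("y", "<name>l:"),
                 ("y", "<number>l:"), ("w", "<children>l:")]))
# the seemingly broken syntax reconciles into the very same walk-paths:
print(reconcile([("x", "[Directory][:"), ("y", "]<name>l:"), ("y", "]<number>l:")]))
```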
By default all walks (-w) will be displayed (unless jtc carries any of the modification operations like insert/update/swap/purge, then the entire JSON will be displayed). However, there's a way to control which ones are displayed - the option -x is overloaded to provide such capability.
If the argument of the option -x is given in any of the notations: -xn, -xn/N, -x/N - where n and N are numbers, then it controls the frequency and offset of the displayed walks (and then -x does not represent a common portion of a walk-path).
The first number n in that notation tells to display every nth walk. If n is 0, it tells to display the Nth walk once (and in such case 0 - a default value - can be omitted, resulting in the syntax -x/N).
The second (optional) number N tells to begin displaying walks starting from the Nth one (N is an index and thus is zero-based; the default value is 0).
Both n and N are generally positive numbers, though there's a special notation -x/-1 (or equally -x0) - in such case the last walk is ensured to be displayed (only if it wasn't displayed yet).
Say, we want to display every 4th walk of the below JSON:
bash $ jsn='[1,2,3,4,5,6,7,8,9]'
bash $ <<<$jsn jtc -w[:]
1
2
3
4
5
6
7
8
9
bash $
One way to achieve that would be to use templates (the trick is shown in the Multiple templates and walks section), but it's quite impractical. It's much easier to use the -x option here:
bash $ <<<$jsn jtc -w[:] -x4
4
8
bash $
To display every 4th walk starting from 3rd one, use this notation:
bash $ <<<$jsn jtc -w[:] -x4/2
3
7
bash $
remember: the second number in the option is an index and thus is zero-based
Let's add to the output the very first walk and the last one:
bash $ <<<$jsn jtc -w[:] -x4/2 -x/0 -x/-1
1
3
7
9
bash $
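The way several -xn/N options combine can be pictured as building a set of walk indices to display. The Python sketch below models only the explicit forms -xn/N and -x/N (the form with N omitted is left out on purpose); `displayed_walks` is a hypothetical name, each option is passed as an (n, N) pair, and displaying the selected walks in their original order is an assumption consistent with the outputs above.

```python
def displayed_walks(walks, specs):
    """Model -xn/N and -x/N: return the walks that would be displayed."""
    last = len(walks) - 1
    shown = set()                       # a set: no walk is displayed twice
    for n, offset in specs:
        if n == 0:                      # -x/N: display the Nth walk once
            index = last if offset == -1 else offset  # -x/-1: the last walk
            if 0 <= index <= last:
                shown.add(index)
        elif n > 0 and offset >= 0:     # -xn/N: every nth walk from index N
            shown.update(range(offset, len(walks), n))
        else:
            raise ValueError("this combination is not modeled in the sketch")
    return [walks[i] for i in sorted(shown)]


jsn = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(displayed_walks(jsn, [(4, 2)]))                    # -x4/2
print(displayed_walks(jsn, [(4, 2), (0, 0), (0, -1)]))   # -x4/2 -x/0 -x/-1
```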
sfx | Empty lexeme semantic (<>) | Non-empty semantic (<..>) | Semantic with namespace assignment (<ns:..>) |
---|---|---|---|
r | <> matches empty string ("") | <str> matches given string "str" | - |
R | - | <RE>R matches string value using REGEX RE | - |
P | <>P matches any string | <val>P stores (any) found string value in the namespace val | <v:...>P upon any string match stores in the namespace v a user's value |
d | - | <3.14>d matches given numerical values | - |
D | - | <RE>D matches numerical value using REGEX RE | - |
N | <>N matches any numerical value | <val>N stores (any) found numerical value in the namespace val | <v:...>N upon any numerical match stores in the namespace v a user's value |
b | <>b matches any boolean value | <true>b, <false>b matches given boolean value; <v>b stores any found boolean value in v | <v:...>b upon any boolean match stores in the namespace v a user's value |
n | <>n matches null value | <val>n stores found null value in the namespace val | <v:...>n upon any null match stores in the namespace v a user's value |
l | <>l matches a value with an empty label ("": ...) | <lbl>l matches a value by the given label "lbl": ... | - |
L | - | <RE>L matches a value by the label using REGEX RE | - |
a | <>a matches any atomic value | <val>a stores (any) found atomic value in the namespace val | <v:...>a upon any atomic match stores in the namespace v a user's value |
o | <>o matches any object value | <val>o stores (any) found object value in the namespace val | <v:...>o upon any object match stores in the namespace v a user's value |
i | <>i matches any array value | <val>i stores (any) found array value in the namespace val | <v:...>i upon any array match stores in the namespace v a user's value |
c | <>c matches any container (object or array) value | <val>c stores (any) found container value in the namespace val | <v:...>c upon any container match stores in the namespace v a user's value |
e | <>e matches any end node (atomic or [] or {}) value | <val>e stores (any) found end node value in the namespace val | <v:...>e upon any end node match stores in the namespace v a user's value |
w | <>w matches any JSON value | <val>w stores any found value in the namespace val | <v:...>w upon any value match stores in the namespace v a user's value |
j | - | <JSN>j matches literal JSON (JSN could be also a template) | - |
s | - | <val>s matches a JSON value previously stored in the namespace val | - |
t | - | <val>t matches a label or index previously stored in the namespace val | - |
q | <>q matches an original (non-duplicating) value | <val>q stores (any) found original value in the namespace val | <v:...>q upon an original value match stores in the namespace v a user's value |
Q | <>Q matches a duplicating value | <val>Q stores (any) found duplicating value in the namespace val | <v:...>Q upon a duplicating value match stores in the namespace v a user's value |
g | <>g matches values in the ascending order | <val>g stores (any) found ascending value in the namespace val | <v:...>g upon an ascending value match stores in the namespace v a user's value |
G | <>G matches values in the descending order | <val>G stores (any) found descending value in the namespace val | <v:...>G upon a descending value match stores in the namespace v a user's value |
sfx | Empty lexeme semantic (<>) | Non-empty semantic (<..>) | Semantic with namespace assignment (<ns:..>) |
---|---|---|---|
v | - | <val>v stores a currently walked JSON value in the namespace val | <val:...>v stores in the namespace val a user's JSON value |
k | <>k reinterprets a currently walked JSON's label/index as a value | <val>k stores a currently walked JSON value's label/index in the namespace val | <val:...>k stores in the namespace val a user's string value |
z | - | <val>z removes from the namespace the value stored in val | - |
f | <>f sets a branching point | <val>f sets a branching point and stores a currently walked JSON value in the namespace val | <v:...>f sets a branching point and stores in the namespace v a user's JSON value |
F | <>F, ><F, <>Fn, ><Fn facilitate various actions as per notation | <val>F stores a currently walked JSON value in the namespace val and facilitates walk's action | <v:...>F stores in the namespace v a user's JSON value and facilitates walk's action |
u | - | <cli>u evaluates a shell cli and gates further walking by the result | - |
I | - | <val>I increments/multiplies (as per quantifier(s)) a value stored in the namespace val | <val:..>I initializes the namespace val by a given value and then applies increment/multiplication (as per given quantifier(s)) |
W | - | <val>W preserves in the namespace val a currently walked JSON path | - |
S | - | <val>S walks from the root a path previously preserved in the namespace val by the lexeme W, or hand-crafted | - |
- []: matches an empty label in a currently walked JSON element (e.g.: like in { "": "empty label" })
- [text]: matches an exact label in a currently walked JSON element (e.g.: like in { "text": "is a label" })
- [N]: matches Nth child (all indices are zero based in jtc) in a currently walked JSON iterable
- [-N]: selects Nth parent for a currently walked JSON node/element referring from the node itself
- [^N]: selects Nth parent for a currently walked JSON node though referring the parent from the JSON root (rather than from a node)
- [+N]: matches every child in a currently walked JSON iterable starting from Nth child
- [N:M:S]: matches every child in a currently walked JSON iterable starting from Nth child up till, but not including Mth with the step of S (N, M could be negative defining the start/end of the slice from the end of the iterable rather than from the start, while S can only be positive); either of positional parameters could be omitted (e.g.: [:], [::], [1:], [:-2], [::3], etc)
- <>: finds recursively the first empty string (e.g.: like in { "empty string": "" })
- <text>: finds recursively the first string "text" (e.g.: like in [ "text" ])
- <..>X: X is an optional one-letter suffix altering the behavior of the lexeme:
  - if X is any of rRPdDNbnlLaoicewjstqQgG - then it's a search matching a first occurrence of the lexeme, as per the suffix description
  - if X is any of vkzfFuIZWS then it's a directive and applies the respective behavior as per the suffix description
- <text>n: finds recursively nth occurrence of "text" in a currently walked JSON element
- <text>+n: finds recursively each occurrence of "text" in a currently walked JSON element starting from nth occurrence
- <text>n:m:s: finds recursively each occurrence of "text" in a currently walked JSON element for the selected slice, where n, m, s parameters comply with all [N:M:S] rules with an additional limitation: n and m cannot go negative and one additional liberation: either of parameters n, m, s could be interpolated from the namespace
- <..>Xn, <..>X+n, <..>Xn:m:s:
  - if X is a search lexeme suffix, then quantifier notations apply exactly the same way as in the above text search quantifiers
  - if X is a directive lexeme suffix, then the quantifier behavior is either ignored (like in directives vkzW), or is specific for the given directive (refer to the relevant description of directives: fFuIZ)
- [label]:<..>: scoped recursive search - the search and match is performed in the currently selected JSON element only among values with the specified label; directive lexemes ignore scoping; all search suffixes and quantifiers are applicable in the scoped searches
- >..<: a non-recursive search notation - the search is performed strictly among children of the currently selected JSON iterable without any descent into children's children; all search suffixes and quantifiers apply here the same way, with the following suffix exemptions:
  - >NS<sn: NS is the referred namespace being searched and matched against in the currently selected JSON iterable
  - >NS<tn: the same definition applies as above, but the match is performed against a label (for objects), or index (for arrays), plus this lexeme cannot have a scoped notation (obviously), however, a quantifier n here refers to a relative offset from the found match (hence n may be negative, effectively allowing selecting a sibling of the found element)
  - >..<l: the non-recursive exact label match cannot be scoped as well, however n also may go negative here
  - >..<L: the non-recursive REGEX label match cannot be scoped as well
- <NS>X: some search lexemes and some directives allow capturing a currently walked/matched JSON into a namespace NS:
  - if X is a suffix any of PNbnaoicewqQgGvfF, then the namespace NS will be populated upon a successful match (for searches) or upon walking (for directives)
  - for the rest of the searches (rRdDlLjst), the lexeme defines a search context (rather than the namespace reference),
  - for the rest of directives (kzuIZWS) the behavior varies - refer to a respective directive description
- <NS:JSON_value>X: the same searches and directives allowing capturing JSON values, allow setting custom JSON_values instead of capturing (the same rules apply)
- -w: defines a walk-path (multiple walks supported), by default the results produced from multiple walks are interleaved
- -l: when used w/o -j just prints labels for those walked JSONs which have one
- -ll: when used w/o -j prints all JSON objects "naked", i.e. removes JSON object encasement and then uses inner labels; for the rest of JSON types the behavior is the same as with the single -l
- -n: turns on a sequential behavior for walk-paths (process first all the results from the first walk, then from the second, etc),
- -nn: when used together with -j and -l allows aggregated behavior for values with clashing labels, see below
- -nn: also triggers a round-robin templates application when a number of templates (-T) and walks (-w) is the same but the round-robin template application must be favored over the per-walk's in this situation
- -j: arranges all walked elements (from all walk-paths) into a JSON array
- -jl: arranges all walked elements (from all walk-paths) into a JSON array, puts labeled nodes into separate JSON objects, any clashing labels will be aggregated (into JSON arrays) within those objects
- -jln: arranges all walked elements into a JSON array, each labeled element will be placed into own JSON object, thus dodging possible label clashing
- -jlnn: arranges all walked elements into a JSON array, all labeled values placed into a single JSON object, all clashing labels are aggregated
- -jj: arranges all walked elements into a JSON object (i.e. all walked elements which do not have labels will be ignored), the values with clashing labels will override the prior values (note: -l, -n and -nn with -jj have no effect)
- -jjm: alters behavior of -jj by enabling the aggregation of the values with the clashing labels into JSON arrays
- -jjll: combined behavior of -jj and -ll
- -jjllm: combined behavior of -jj and -ll and -m
- -x, -y: facilitates breaking walk-paths (-w) into a common (-x) and variable parts (-y); i.e. an argument of each -x will be combined with every subsequent argument of -y (if any), e.g.: -xA -yB -yC -xD -yE -yF result in 4 walk-paths: -wAB -wAC -wDE -wDF - thus -x/-y notation provides more succinct notation (than -w) for multiple walk-paths with a common head (and varying tails)
- -xN/M: controls which walk result(s) get(s) printed: print each Nth walk starting with Mth (M is an index and hence is zero based); -xN prints every Nth walk result (same as -xN/M, where M = N-1), -x/M prints a single Mth walk (same as -x0/M); special handling for -x0 or -x/-1 - it ensures that the very last walk result always gets printed (but not duplicated)
Interpolation may occur in any of the following cases:
- in templates (-T ...) for walk/update/insert/compare type of operations
- for the argument undergoing a shell evaluation (-e + insert/update operation, e.g.: -eu ... \;),
- in the lexeme <..>u applying a shell evaluation on its content
- in the lexeme <..>j where lexeme is expressed as a template
- for any search lexeme where a quantifier is expressed through interpolation, e.g.: <>a{N}
Interpolation occurs either from the namespace, or from a currently walked JSON element. Every occurrence of tokens {..} or {{..}} will trigger the interpolation attempt:
- if the content under braces is empty ({}, {{}}) then the currently walked JSON element is getting substituted
- if the content is non-empty (e.g.: {val}, {{val}}) then interpolation is attempted from the referred namespace
Whenever template interpolation (-T ..., <..>j) occurs, the result of interpolation must always be a valid JSON, otherwise the template will fail to get applied.
The difference between single {} and double {{}} notations:
- double notation (a.k.a. dressed notation) {{..}} interpolates JSON elements exactly, so it's always a safe type
- a single notation (a.k.a. naked notation) {..} "strips" the interpolated object and interpolates JSON strings, JSON arrays, JSON objects differently:
  - when interpolating JSON string, the outer quotation marks are dropped, e.g., instead of "blah", it will be interpolated as blah. Thus, it makes sense to use this interpolation inside double quotation marks
  - when interpolating JSON array, then enclosing brackets [, ] are dropped (allows extending arrays); e.g., [1,"2",{}] will be interpolated as 1,"2",{} (which is invalid JSON), thus, to keep it valid the outer brackets must be provided, e.g.: -T'[ 0, {}, 4, [] ]' after interpolation becomes [ 0, 1, "2", {}, 4, [] ]
  - when interpolating JSON object, then enclosing braces {, } are dropped (allows extending objects), e.g., {"pi":3.14} will be interpolated as "pi": 3.14, so to keep it valid the outer braces must be provided, e.g., -T'{ {}, "type": "irrational" }'
  - if you meant to insert in a template an empty object {} rather than a naked token (which has the same notation), then spell it over space: { }
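The difference between the dressed and naked notations can be modeled outside of jtc. The following is a minimal Python sketch, not jtc's actual implementation: the function name `interpolate` and the regex-based token matching are assumptions made for illustration, and only the empty tokens `{}` and `{{}}` are handled.

```python
import json
import re

# Try the dressed token {{}} first, then the naked token {}.
TOKEN = re.compile(r"\{\{\}\}|\{\}")

def interpolate(template, value):
    """Hypothetical model of interpolating one walked value into a template."""
    dressed = json.dumps(value)
    # Naked form: strings lose quotes, arrays lose [ ], objects lose { }.
    naked = dressed[1:-1] if isinstance(value, (str, list, dict)) else dressed

    def substitute(match):
        return dressed if match.group(0) == "{{}}" else naked

    # The interpolated text must be valid JSON, otherwise json.loads raises.
    return json.loads(TOKEN.sub(substitute, template))

print(interpolate('"new {}"', "label1"))
print(interpolate('[ 0, {}, 4, [] ]', [1, "2", {}]))
print(interpolate('{ {}, "type": "irrational" }', {"pi": 3.14}))
```

Note that a template token spelled over space (`{ }`) is not matched by the pattern, which mirrors the rule for inserting a literal empty object.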
A string interpolation using a naked notation is handy when there's a requirement to alter/extend an existing string; here's an example of altering JSON labels:
bash $ jsn='{ "label1": "value1", "label2": "value2" }'
bash $ <<<$jsn jtc -w'<label>L:<>k' -u'<label>L:<>k' -T'"new {}"'
{
"new label1": "value1",
"new label2": "value2"
}
bash $
There, in the example above, the template token {} refers to the result of walking -u rather than -w (the same holds true for insert -i and compare -c operators: the template operation (-T) refers to the option argument rather than to the walk).
The walk -w points to the destination location(s) for the update (which is the label, as per description of the lexeme <>k), while the source is pointed by the -u walk.
Alternatively, the same could be achieved like this, in a bit more succinct way:
bash $ <<<$jsn jtc -w'<label>L:<T>k<>k' -u0 -T'"new {T}"'
{
"new label1": "value1",
"new label2": "value2"
}
bash $
The update argument (-u0) in this case is a dummy JSON (0 is taken just as one of the shortest JSON values), its value will be unused in the template and hence is irrelevant to the operation. The template now will refer to the value T from the namespace, which will be populated when destination -w is walked (which, btw, always occurs before any other walks, as per design logic).
Why is the <..>k lexeme used twice? As per the lexeme design, when it's empty (and only then) it lets reinterpreting the currently walked element's label as a value (which is being updated in this case). Thus, the first lexeme <T>k only preserves the label in the namespace, while the second lexeme allows pointing to the label as the destination point for the update - which will be the template-interpolated JSON value '"new label"'.
Starting from jtc v1.176, it's possible to achieve the same interpolation in an even more efficient and laconic way:
bash $ <<<$jsn jtc -w'<label>L:<>k' -u'"new {}";'
{
"new label1": "value1",
"new label2": "value2"
}
bash $
Options -u/-i/-c may now accept an argument in various forms: JSON, walk, template. In the latter case the template token {} will refer to the walk (-w) result.
- Why is the trailing ; there?
- It boils down to how argument disambiguation for the options occurs: among various forms of an argument (JSON, walk, template) it's possible that the argument semantic can be ambiguous. In our case, if the template string was like "new {}" it would be impossible to tell if it's a JSON argument, or a template (indeed, in your great design {} might represent either a token, or an empty object, or just a literal value - there's no way for jtc to know that). So, jtc needs a little hint from you how to treat that argument (in case there's an ambiguity). JSON argument does not tolerate any trailing symbols, while template argument disregards any trailing symbols after interpolation and parsing. Thus symbol ; only ensures that the argument is treated as a template rather than a JSON (in fact, it could be any combination of any trailing symbols).
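The disambiguation rule above (a JSON argument tolerates no trailing symbols, a template argument ignores them) can be illustrated with a small Python sketch. This is only a model under stated assumptions: the function name `classify` is hypothetical, the walk form of the argument is not modeled, and the order of checks inside jtc is not claimed to be the same.

```python
import json

def classify(argument):
    """Hypothetical model: decide between a JSON and a template argument."""
    try:
        json.loads(argument)  # strict parsing: trailing symbols are rejected
        return "json"
    except json.JSONDecodeError:
        pass
    try:
        # Lenient parsing: read the leading JSON value, ignore what trails it.
        json.JSONDecoder().raw_decode(argument.lstrip())
        return "template"
    except json.JSONDecodeError:
        return "neither"

print(classify('"new {}"'))   # parses strictly, so it is taken as JSON
print(classify('"new {}";'))  # trailing ';' breaks strict parsing
```

In this model the trailing `;` carries no meaning of its own: any trailing symbol that breaks strict JSON parsing has the same effect.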
Here's an illustration when a naked notation is required:
bash $ jsn='{ "pi": 3.14, "type": "irrational" }'
bash $ <<<$jsn jtc
{
"pi": 3.14,
"type": "irrational"
}
bash $
# swap around values and labels:
bash $ <<<$jsn jtc -i'[:]<Key>k<Val>v' -T'{ "{Val}": {{Key}} }' -p
{
"3.14": "pi",
"irrational": "type"
}
bash $
There, values getting into the namespace Val will be of different types in each pass: the first time it's a numeric value 3.14, in the second pass it'll be a string "irrational". Therefore, in the template, where Val is used with the label semantic, we have to ensure that the naked value is interpolated.
Let's consider both scenarios where interpolation of the namespace Val uses a dressed notation:
bash $ <<<$jsn jtc -rpi'[:]<Key>k<Val>v' -T'{ {{Val}}: {{Key}} }'
{ "irrational": "type" }
- the interpolation of {{Val}} for the first record fails here (template becomes { 3.14: "pi" }, which is invalid JSON) and hence insertion happens of the same existing value (which is moot) and then getting purged
bash $ <<<$jsn jtc -rpi'[:]<Key>k<Val>v' -T'{ "{{Val}}": {{Key}} }'
{ "3.14": "pi" }
- here, the interpolation of "{{Val}}" for the second record fails: template becomes { ""irrational"": "type" }, which is also an invalid JSON.
The Key token notation could have been spelled either way, e.g.: "{Key}" - would work as well.
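The three template spellings discussed above can be compared side by side in a short Python sketch. It is a hedged model rather than jtc's code: the helper names `render` and `try_template` are hypothetical, and the namespace is represented by a plain dictionary holding the values from the two passes.

```python
import json
import re

# {{name}} is a dressed token, {name} is a naked token.
TOKEN = re.compile(r"\{\{(\w+)\}\}|\{(\w+)\}")

def render(template, namespace):
    """Substitute namespace tokens into the template text."""
    def substitute(match):
        dressed_name, naked_name = match.groups()
        if dressed_name is not None:
            return json.dumps(namespace[dressed_name])
        value = namespace[naked_name]
        text = json.dumps(value)
        return text[1:-1] if isinstance(value, (str, list, dict)) else text
    return TOKEN.sub(substitute, template)

def try_template(template, namespace):
    """Return the parsed JSON, or None when the result is invalid JSON."""
    try:
        return json.loads(render(template, namespace))
    except json.JSONDecodeError:
        return None

passes = [{"Key": "pi", "Val": 3.14}, {"Key": "type", "Val": "irrational"}]
for template in ('{ "{Val}": {{Key}} }', '{ {{Val}}: {{Key}} }', '{ "{{Val}}": {{Key}} }'):
    print(template, [try_template(template, ns) for ns in passes])
```

Only the naked spelling inside quotation marks yields valid JSON for both the numeric and the string value; each dressed spelling fails for exactly one of the two passes.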
An array interpolation using a naked notation ({..}) is handy when there's a requirement to extend the array during template interpolation.
There's a special case though - template-extending of an empty array. Let's consider the following example:
bash $ jsn='[ {"args": [123],"Func": "x + y"}, { "args":[], "Func":"a * b" } ]'
bash $ <<<$jsn jtc
[
{
"Func": "x + y",
"args": [
123
]
},
{
"Func": "a * b",
"args": []
}
]
bash $
And the ask here would be to extend all arrays in each args
with the arguments from the respective Func
:
bash $ <<<$jsn jtc -w'<args>l:' -u'<Func>l:<(.+)[ +*]+(.+)>R[-1][args]' -T'[{}, {{$1}}, {{$2}}]' -tc
[
{
"Func": "x + y",
"args": [ 123, "x", "y" ]
},
{
"Func": "a * b",
"args": [ "a", "b" ]
}
]
bash $
When interpolating the second record (the one with "Func": "a * b"), the interpolation, in fact, would result in an invalid JSON: the array in args is empty initially ("args": []), thus when it gets interpolated via template [{}, {{$1}}, {{$2}}] it becomes [, "a", "b"] - which is an invalid JSON array. However, jtc is aware of such empty iterables and handles them properly, allowing extending even empty arrays and objects without producing failures.
All the same applies when interpolating JSON objects and JSON strings.
An array could be converted into a string literally using a dressed token notation as long as it does not hold any string literals, e.g.:
bash $ <<<'[null, true, 123, {}, []]' jtc -T'"array stringified: {{}}"'
"array stringified: [ null, true, 123, {}, [] ]"
bash $
- presence of any string value (or a label) would fail the interpolation (b/c of quotation marks)
Any iterable can be interpolated into a string template using the naked token notation: then all its atomic values get recursively enumerated within the string:
# array:
bash $ jsn='{"array":[null,1,true,{"root":["five"]}]}'
bash $
bash $ <<<$jsn jtc -w[array] -T'"stringified array: {}"'
"stringified array: null, 1, true, five"
bash $
# object:
bash $ <<<$jsn jtc -T'"stringified object: {}"'
"stringified object: null, 1, true, five"
bash $
Note, string values ("five") also get naked (because during such kind of interpolation a naked notation token {} gets applied onto each value of the iterable one by one).
By default, for such kind of interpolations (stringifying iterables) the enumeration separator used is held in the namespace $# (default value ", "), which means it could be altered by a user:
bash $ <<<'[1,2,3,4,5]' jtc -w'<$#:\t>v' -qqT'"good for TSV conversion:\n{}"'
good for TSV conversion:
1 2 3 4 5
bash $
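The stringification of iterables described above (each atomic value enumerated recursively and joined by the separator held in `$#`) can be approximated with the following Python sketch. The helper names `atoms` and `stringify` are hypothetical, and the handling of empty containers is not modeled here (they simply contribute nothing).

```python
import json

def atoms(value):
    """Yield the atomic values of a JSON value in recursive walk order."""
    if isinstance(value, dict):
        for child in value.values():
            yield from atoms(child)
    elif isinstance(value, list):
        for child in value:
            yield from atoms(child)
    else:
        yield value

def stringify(value, separator=", "):
    """Join naked atomic values; `separator` plays the role of the $# namespace."""
    parts = [v if isinstance(v, str) else json.dumps(v) for v in atoms(value)]
    return separator.join(parts)

data = {"array": [None, 1, True, {"root": ["five"]}]}
print(stringify(data["array"]))
print(stringify(data))
print(stringify([1, 2, 3, 4, 5], separator="\t"))
```

Passing a tab as the separator corresponds to setting `$#` to `\t` in the walk-path, as in the TSV example.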
A namespace is a container within jtc, which allows storing JSON elements (nodes) programmatically while walking JSON. Values stored in the namespace could be reused later in the same or different walk-paths and interpolated in templates and arguments for a shell evaluation.
Besides user provided names, jtc caters a number of internally generated/supported tokens and names that have various applications:
- $N namespace, where N is a number - typically that would be a reference to the result of a REGEX matched group (however, could be (re)used by a user as well)
- $a, $b, $c, etc - auto generated tokens when the interpolated value is an iterable (an array or an object), where $a refers to the first top element in the iterable, $b to the second top, etc
- $A, $B, $C, etc - auto generated tokens when the interpolated value is an iterable, where $A refers to the first top element's label/index, $B to the second top's label/index, etc
- $PATH - an auto generated token, used in templates when it's required to interpolate a path (set of indices/labels) to the walked point as a JSON array
- $path - same as $PATH but interpolation occurs as a JSON string
- $file - a namespace, holding currently processed filename (if one given, otherwise empty)
- $_ - a namespace, holding a string value that is used when the elements during {$path} interpolation are getting concatenated (default value "_")
- $# - a namespace, holding a string value that is used as a separator when a JSON array or object is getting template-interpolated into a string (default value ", ")
- $? - a token referring to the result of a prior successful walk, thus it's used to expand multiple walks into a string or an array
- $$? - a namespace holding a string separator considered when expanding walks using {$?} token into a string (default value is ",")
Directives <>v, <>k (as well as all other lexemes allowing capturing and setting namespace) and search lexemes <>s, <>t facilitate cross-lookups. Say, we have the following JSON:
bash $ jsn='{ "item": "bread", "list":{ "milk": 0.90, "bread": 1.20 } }'
bash $ <<<$jsn jtc -tc
{
"item": "bread",
"list": { "bread": 1.20, "milk": 0.90 }
}
bash $
The ask here would be to retrieve a value from list given the label is in item - that would require a cross lookup.
Using namespaces it becomes a trivial task:
bash $ <<<$jsn jtc -w'[item]<Itm>v[^0]<Itm>t' -l
"bread": 1.20
bash $
- [item] - selects a value by label item
- <Itm>v - stores the currently walked (selected) value (bread) in the namespace Itm
- [^0] - resets the walk path back to the root
- <Itm>t - searches (recursively) for a (first) label matching the value stored in the namespace Itm (which is bread)
In a similar way (like in <Itm>v), labels/indices could be stored and accessed in the namespace using directive <>k.
The empty directive lets reinterpreting the label/index of the currently walked JSON element and treating it as a JSON string / JSON number value respectively.
E.g., say, we want to list all labels in the address record:
bash $ <ab.json jtc -w'<address>l[:]<>k'
"city"
"postal code"
"state"
"street address"
bash $
and we want to remap some of the labels, e.g.: postal code -> zip, street address -> street.
Here's a way to do it:
bash $ map='{"postal code":"zip","street address":"street"}'
bash $
bash $ <ab.json jtc -w'<address>l[:]<Lbl>k<>k' -u"$map" -u'>Lbl<t' / -w'<address>l'
{
"city": "New York",
"state": "NY",
"street": "599 Lafayette St",
"zip": 10012
}
bash $
- it was already explained why there are 2 x k-lexemes in the walk-path, but here's once again: when k-lexeme stores the label into a namespace it does not re-interpret the label (it only does storing), while the empty k-lexeme does.
Here are both of the path tokens demonstrated:
bash $ <ab.json jtc -w'<Jane>' -T'{{$PATH}}' -r
[ "Directory", 2, "name" ]
bash $ <ab.json jtc -w'<NY>' -T'{{$path}}'
"Directory_0_address_state"
bash $
To play safe with the templates, always surround them with single quotes (to dodge shell interpolation).
Here's an example of how to join path tokens using a custom separator:
bash $ <ab.json jtc -w'<$_:\t>v<NY>' -qqT'{{$path}}'
Directory 0 address state
bash $
Equally, the same could be achieved with the $PATH token:
bash $ <ab.json jtc -w'<$#:\t>v<NY>' -qqT'"{$PATH}"'
Directory 0 address state
bash $
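The relation between the two path tokens can be sketched in Python: `$PATH` corresponds to the list of labels/indices leading to the walked value, and `$path` to the same list joined by the separator kept in `$_`. The helper `find_path` and the reduced sample JSON below are assumptions for illustration (the sample keeps only a small part of the structure of ab.json).

```python
import json

def find_path(node, target, path=()):
    """Return labels/indices leading to the first value equal to `target`."""
    if node == target:
        return list(path)
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return None
    for key, child in children:
        found = find_path(child, target, path + (key,))
        if found is not None:
            return found
    return None

# Hypothetical reduced stand-in for ab.json.
sample = {"Directory": [{"address": {"city": "New York", "state": "NY"}}]}

path = find_path(sample, "NY")
print(json.dumps(path))                  # the path as a JSON array, like {{$PATH}}
print("_".join(str(p) for p in path))    # the path as a string, like {{$path}}
print("\t".join(str(p) for p in path))   # a custom separator, like <$_:\t>v
```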
A prior walk could be referred (during interpolation) using an auto-generated token $?
. It comes handy when it's required
to build/join up JSON records:
bash $ <<<'["a","b","c"]' jtc -w[:]
"a"
"b"
"c"
bash $ <<<'["a","b","c"]' jtc -w[:] -T'[{$?}, {{}}]' -r
[ "a" ]
[ "a", "b" ]
[ "a", "b", "c" ]
bash $
When expanding values into a string (rather than into an array), the separator used by a user is arbitrary, e.g.:
bash $ <<<'["a","b","c"]' jtc -w[:] -T'"{$?} | {}"'
" | a"
" | a | b"
" | a | b | c"
bash $
The first separator, appearing as an artifact of the first interpolation, is undesirable and superfluous. To get rid of this artifact, the namespace $$? holds the value which jtc considers as a separator (default is ,):
bash $ <<<'["a","b","c"]' jtc -w'<$$?:|>v[:]' -T'"{$?} | {}"'
"a"
"a | b"
"a | b | c"
bash $
When interpolation of the token $? occurs for the first time (i.e. when there was no prior walk) then the value of this token is a default empty string (""). The reset of the token back to the default value may occur in 2 ways:
a) if template interpolation fails
b) if user sets the namespace $? to any (even empty) value - this is a user's way to control token behavior
Say, we want to build an object mapping each parent's name to all of their children (like: "Parent": "..list of children..") using the $? token.
The query could be achieved this way:
bash $ <ab.json jtc -w'<name>l:<N>v[-1][children][:]' -T'{"{N}": "{$?}, {}"}' -lljj
{
"Jane": "Olivia, Robert, Lila",
"John": "Olivia"
}
bash $
See what happens there? Jane's record holds the child's name from John's record because token $? did not get reset in between the steps walking parents' names. To rectify it, we need to add resetting of the $? token to a default value in the respective position in the walk-path (anywhere in between the lexemes iterating over names and children):
bash $ <ab.json jtc -w'<name>l:<$?:>v<N>v[-1][children][:]' -T'{"{N}": "{$?}, {}"}' -lljj
{
"Jane": "Robert, Lila",
"John": "Olivia"
}
bash $
The example above is shown for instructive purposes. Probably the easier (and more efficient) way of achieving the same result would be this one:
bash $ <ab.json jtc -w'<name>l:<N>v[-1][children]' -T'{"{N}": "{}"}' -jjll / -pw'<>'
{
"Jane": "Robert, Lila",
"John": "Olivia"
}
bash $
When a JSON iterable is being interpolated, it generates auto-tokens which could be reused in a template-interpolation.
Each value in the iterable could be referred by a respective token: the first value is referred by $a
, the second is by $b
, and so on.
In the unlikely event of running out of all letters (a - z), the next tokens would be $aa
, $ab
, and so on.
Labels and/or indices of the interpolatable iterable also could be referred using capital letters notations:
$A
, $B
, ... $Z
, $AA
, $AB
, etc.:
bash $ <<<'["This", "is", "example"]' jtc -T'"{$a} {$b} an {$c}!"'
"This is an example!"
bash $
The auto-generated tokens for interpolated iterables can go as far as 3 characters in size. Say, there's a big flat JSON array (for the simplicity of explanation) like this: [1, 2, 3, 4, 5 ... ]. The first token ($a, $A) will refer to the number 1 (and its index 0 respectively), $b will refer to 2, $aa will do 27, $aaa will do 703, etc.
The highest element (index) in such array could be referred by auto-token $zzz ($ZZZ), which is 18278 (index 18277).
Here's a proof:
bash $
bash $ <<<0 jtc -jw'<c>I1><F19999' -T{c} / -T'{$a}'
1
bash $ <<<0 jtc -jw'<c>I1><F19999' -T{c} / -T'{$b}'
2
bash $ <<<0 jtc -jw'<c>I1><F20999' -T{c} / -T'{$aa}'
27
bash $ <<<0 jtc -jw'<c>I1><F19999' -T{c} / -T'{$aaa}'
703
bash $ <<<0 jtc -jw'<c>I1><F19999' -T{c} / -T'{$zzz}'
18278
bash $
- the first option chain-set (-jw'<c>I1><F19999' -T{c}
) generates such a big flat array: [1, 2, 3, ... 19999, 20000]
,
the second part (-T'{$..}'
) picks the respective element from the array
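The numbers quoted above follow from treating the letters of a token as digits of a bijective base-26 numeral (a = 1 ... z = 26). A small Python sketch of that arithmetic, with a hypothetical helper name `token_ordinal`, reproduces them:

```python
def token_ordinal(token):
    """Map '$a' -> 1, '$z' -> 26, '$aa' -> 27, ... (bijective base-26)."""
    letters = token.lstrip("$").lower()
    ordinal = 0
    for ch in letters:
        ordinal = ordinal * 26 + (ord(ch) - ord("a") + 1)
    return ordinal

for token in ("$a", "$b", "$aa", "$aaa", "$zzz"):
    ordinal = token_ordinal(token)
    # The zero-based index of the referred element is one less than the ordinal.
    print(token, ordinal, ordinal - 1)
```

For three-letter tokens the largest ordinal is 26 + 26^2 + 26^3 = 18278, which matches the value shown for `$zzz`.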
There are 2 ranges for auto-generated tokens for iterables:
- the first range refers each top element of the iterable,
- second range of auto-token refers to each atomic value in the iterable as if it was walked recursively.
It's easier to understand auto-token ranges on the example. Consider this simple JSON:
bash $ jsn='{"item": "spoon", "props": ["steel", "dessert"]}'
bash $ <<<$jsn jtc
{
"item": "spoon",
"props": [
"steel",
"dessert"
]
}
bash $
The first-range tokens ($a
and $b
) will refer top values:
bash $ <<<$jsn jtc -T'{{$a}}'
"spoon"
bash $
bash $ <<<$jsn jtc -T'{{$b}}'
[
"steel",
"dessert"
]
bash $
While the second-range tokens (for this JSON the second range starts with token $c
) will refer each atomic value as
if JSON was walked recursively:
bash $ <<<$jsn jtc -T'{{$c}}'
"spoon"
bash $
bash $ <<<$jsn jtc -T'{{$d}}'
"steel"
bash $
bash $ <<<$jsn jtc -T'{{$e}}'
"dessert"
bash $
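The two ranges can be reproduced with a short Python sketch that first enumerates the top values and then the atomic values in recursive walk order. This is a model of the described behavior only: the names `letters`, `atoms` and `auto_tokens` are hypothetical, and object children are taken in the dictionary's own order.

```python
def letters(n):
    """Inverse of the token numbering: 1 -> 'a', 26 -> 'z', 27 -> 'aa'."""
    text = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        text = chr(ord("a") + rem) + text
    return text

def atoms(value):
    """Yield atomic values in recursive walk order."""
    if isinstance(value, dict):
        for child in value.values():
            yield from atoms(child)
    elif isinstance(value, list):
        for child in value:
            yield from atoms(child)
    else:
        yield value

def auto_tokens(iterable):
    """First range: top values; second range: every atomic value."""
    top = list(iterable.values()) if isinstance(iterable, dict) else list(iterable)
    values = top + list(atoms(iterable))
    return {"$" + letters(i): v for i, v in enumerate(values, start=1)}

jsn = {"item": "spoon", "props": ["steel", "dessert"]}
for token, value in auto_tokens(jsn).items():
    print(token, value)
```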
In fact, atomic values also generate auto-tokens ($a
and $A
for the value and the label/index respectively):
bash $ <<<'[null]' jtc -w[0] -T'"atomic: {$a} with index: {$A}"'
"atomic: null with index: 0"
bash $
When multiple interleaved walks (-w
) present (obviously there must be multiple walks - a single one cannot be interleaved),
they populate namespaces in the order the walks appear:
bash $ <ab.json jtc -x[0][:] -y'[name]<pnt>v' -y'[children][:]<chld>v' -T'{ "Parent": {{pnt}}, "child": {{chld}} }' -r
"John"
{ "Parent": "John", "child": "Olivia" }
{ "Parent": "Ivan", "child": "Olivia" }
{ "Parent": "Jane", "child": "Olivia" }
{ "Parent": "Jane", "child": "Robert" }
{ "Parent": "Jane", "child": "Lila" }
bash $
That is a correct result (though it might not reflect what possibly was intended), let's review it:
- the first line contains only the result "John" - because template interpolation fails here (namespace chld does not exist yet, thus the resulting template is invalid JSON), hence the source walk is used / printed w/o interpolation
- upon the next (interleaved) walk, we see a correct result of a template interpolation: Parent's and child's records are filled right (template is a valid JSON here)
- in the third line, the result is also correct, albeit it might not be the expected one - upon the next interleaved walk, the namespace pnt is populated with "Ivan", but the namespace chld still carries the old result.
- etc.
By now it should be clear why the result is such.
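The effect of a shared namespace across interleaved walks can be replayed with a small Python simulation. It is a sketch under assumptions: the `directory` list is a reduced, hypothetical stand-in for the relevant part of ab.json, and the function names are invented for the illustration.

```python
import json

# Reduced, hypothetical stand-in for the relevant records of ab.json.
directory = [
    {"name": "John", "children": ["Olivia"]},
    {"name": "Ivan", "children": []},
    {"name": "Jane", "children": ["Robert", "Lila"]},
]

def apply_template(namespace, walked):
    """Return the interpolated template, or the walked value if a token is unset."""
    if "pnt" not in namespace or "chld" not in namespace:
        return walked  # interpolation fails, so the source walk is used as is
    return {"Parent": namespace["pnt"], "child": namespace["chld"]}

def simulate(records):
    namespace, output = {}, []
    for record in records:
        namespace["pnt"] = record["name"]      # first walk: [name]<pnt>v
        output.append(apply_template(namespace, record["name"]))
        for child in record["children"]:       # second walk: [children][:]<chld>v
            namespace["chld"] = child
            output.append(apply_template(namespace, child))
    return output

for line in simulate(directory):
    print(json.dumps(line))
```

The stale value of `chld` is what produces the records pairing Ivan and Jane with Olivia.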
Going by the notion of the provided template, apparently, the expected result was to have record pairs for each person with each of their own children. That way, for example, Ivan should not even be listed (he has no children), John's record should appear only once and Jane should have 2 records (she has 2 kids).
The situation could be easily rectified if for each walk we use its own template and assign a dummy one for the first walk:
bash $ <ab.json jtc -x[0][:] -y'[name]<pnt>v' -T'""' -y'[children][:]<chld>v' -T'{ "Parent": {{pnt}}, "child": {{chld}} }' -r
""
{ "Parent": "John", "child": "Olivia" }
""
""
{ "Parent": "Jane", "child": "Robert" }
{ "Parent": "Jane", "child": "Lila" }
bash $
Now the result looks closer to the intended one (no records for Ivan
, one for John
and 2 for Jane
, as expected). But what about
those annoying empty JSON strings ""
? Those will be gone if -qq
option is thrown in:
bash $ <ab.json jtc -x[0][:] -y'[name]<pnt>v' -T'""' -y'[children][:]<chld>v' -T'{ "Parent": {{pnt}}, "child": {{chld}} }' -rqq
{ "Parent": "John", "child": "Olivia" }
{ "Parent": "Jane", "child": "Robert" }
{ "Parent": "Jane", "child": "Lila" }
bash $
- that's a neat, though a documented trick
Yet again, the same could have been achieved in an even simpler way (using just one walk):
bash $ <ab.json jtc -w'<name>l:<pnt>v[-1][children][:]' -T'{ "Parent": {{pnt}}, "child": {{}} }' -r
{ "Parent": "John", "child": "Olivia" }
{ "Parent": "Jane", "child": "Robert" }
{ "Parent": "Jane", "child": "Lila" }
bash $
Interpolation may also occur in quantifiers. Say, we have the following JSON, where we need to select an item from list by the index value stored in item:
bash $ jsn='{ "item": 2, "list": { "milk": 0.90, "bread": 1.20, "cheese": 2.90 } }'
bash $ <<<$jsn jtc
{
"item": 2,
"list": {
"bread": 1.20,
"cheese": 2.90,
"milk": 0.90
}
}
bash $
To achieve that, we need to memorize the value of item
in the namespace first, then select a value from the list by the index:
bash $ <<<$jsn jtc -w'[item]<idx>v[-1][list]><a{idx}' -l
"milk": 0.90
bash $
It should be quite easy to read/understand such a walk path (provided one is familiar with suffixes / directives). Let's see how the walk-path works in a slow-mo:
[item]: selects the value by label item:
bash $ <<<$jsn jtc -w'[item]'
2
bash $
<idx>v: the directive memorizes the selected value (2) in the namespace idx:
bash $ <<<$jsn jtc -w'[item]<idx>v'
2
bash $
[-1]: steps up 1 level in the JSON tree off the current position (i.e., addresses the first parent of the item value) which is the root of the input JSON:
bash $ <<<$jsn jtc -w'[item]<idx>v[-1]'
{
"item": 2,
"list": {
"bread": 1.20,
"cheese": 2.90,
"milk": 0.90
}
}
bash $
[list]: selects the object value by label list:
bash $ <<<$jsn jtc -w'[item]<idx>v[-1][list]'
{
"bread": 1.20,
"cheese": 2.90,
"milk": 0.90
}
bash $
><a{idx}: a non-recursive search of atomic values (><a) indexed by a quantifier with the value stored in the namespace idx (which is 2) gives us the required value.
Alternatively, the same ask could have been achieved using a slightly different query:
bash $ <<<$jsn jtc -w'[item]<idx>v[-1][list]>idx<t' -l
"milk": 0.90
bash $
- the >idx<t lexeme here will use the namespace idx to produce the offset (index).
There's a subtle difference in how the lexeme t treats and uses the referred namespace:
- in a recursive <..>t notation, the lexeme always treats the value in the namespace as a JSON string and will search (recursively) for a respective label. I.e., even if the value in the namespace is the numerical value 0, it will search for the label "0"
- in a non-recursive >..<t notation, if the namespace holds a literal (i.e., a JSON string) value, then the lexeme will try matching the label (as expected); however, if the namespace holds a numerical value (JSON number), then the value (its integral part) is used as a direct offset in the searched JSON node
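The difference between the two notations can be modeled outside of jtc. Below is a minimal Python sketch (it is not jtc's code): the function names are hypothetical, and it assumes that object members are addressed in sorted-label order, as the outputs above suggest.

```python
import json

def children(node):
    """(label, value) pairs of an object in sorted-label order, or (index, value) pairs of an array."""
    if isinstance(node, dict):
        return sorted(node.items(), key=lambda pair: pair[0])
    if isinstance(node, list):
        return list(enumerate(node))
    return []                                   # atomic values have no children

def nonrecursive_t(node, ns_value):
    """Model of >NS<t applied to the immediate children of `node`."""
    pairs = children(node)
    if isinstance(ns_value, str):               # JSON string: match the label
        return [pair for pair in pairs if pair[0] == ns_value]
    offset = int(ns_value)                      # JSON number: its integral part is an offset
    return pairs[offset:offset + 1]

def recursive_t(node, ns_value):
    """Model of <NS>t: the namespace value is always treated as a label."""
    label = ns_value if isinstance(ns_value, str) else json.dumps(ns_value)
    found = []
    for key, value in children(node):
        if key == label:
            found.append((key, value))
        found.extend(recursive_t(value, label))
    return found

doc = json.loads('{ "item": 2, "list": { "milk": 0.90, "bread": 1.20, "cheese": 2.90 } }')
print(nonrecursive_t(doc["list"], doc["item"]))   # [('milk', 0.9)]
print(recursive_t(doc, doc["item"]))              # []  (there is no label "2")
```

In this model the number 2 picks the third member of list when used non-recursively, while the recursive form looks for a label spelled "2" and finds nothing.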
A template (an argument to -T, -u, -i, -c, or in the lexeme <..>j) is a literal JSON optionally containing tokens for interpolation.
Templates can be used upon walking, insertion, updates and when comparing. The result of template interpolation still must be a valid JSON. If a template (-T) is given, then the template value (after interpolation) will be used for the operation, not the source walk (unless the resulting template is invalid JSON, in which case the source walk will be used).
Remember, a template always refers to a source (walk/JSON), e.g.:
- if jtc executes walking only (-w), then such walks refer to the source (the walked source JSON)
- if jtc executes any of -u/-i/-c, then the arguments of such operations are the source (while any -w walks point to the destination points of the update/insertion/comparison operation)
When walking is a standalone operation, template interpolation occurs from the walk (-w) results:
bash $ <ab.json jtc -w'[0][0]<number>l:'
"112-555-1234"
"113-123-2368"
bash $ <ab.json jtc -w'[0][0]<number>l:' -T'"+1 {}"'
"+1 112-555-1234"
"+1 113-123-2368"
bash $
For the operations -i, -u, -c the namespaces resulting from walking the destination (-w) are shared with the source walks in operations - that way cross-referenced insertions and updates are possible. Logically, for each destination walk (-w) there will be a respective subsequent source walk (e.g.: -u <src-walk>), thus the source walk may utilize the namespaces populated during the destination walk (-w).
Template interpolation will be attempted only once the source walk is successful. If an attempt at template interpolation fails (resulting in an invalid JSON), then the source result will be used for the operation.
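The fallback rule can be illustrated with a small Python model. This is only a sketch under stated assumptions, not jtc's interpolation engine: the names naked and interpolate are hypothetical, only the {}, {{}}, {NS} and {{NS}} tokens are handled, and (unlike jtc) trailing characters after the JSON are not tolerated.

```python
import json

def naked(value):
    """'Naked' rendering: strings lose quotes, objects lose braces, arrays lose brackets."""
    text = json.dumps(value)
    return text[1:-1] if isinstance(value, (str, dict, list)) else text

def interpolate(template, walked, namespaces=None):
    """Expand the tokens; return the parsed JSON, or fall back to the walked value."""
    scope = {"": walked}                 # the empty name stands for the walked element
    scope.update(namespaces or {})
    text = template
    for name, value in scope.items():
        text = text.replace("{{" + name + "}}", json.dumps(value))   # "dressed" first
        text = text.replace("{" + name + "}", naked(value))          # then "naked"
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return walked                    # invalid result: the source is used instead

print(interpolate('"+1 {}"', "112-555-1234"))     # +1 112-555-1234
print(interpolate('[{}]', "John"))                # John  ([John] is not a valid JSON)
```

The second call shows the fallback: the naked string turns the template into [John], which does not parse, so the walked value itself is returned.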
Below is an example of updating the phone records for the first entry in the Directory (prepending a country code and altering the number label at the same time via a template):
# dump phones of the 1st record in the Directory
bash $ <ab.json jtc -w'<phone>l'
[
{
"number": "112-555-1234",
"type": "mobile"
},
{
"number": "113-123-2368",
"type": "mobile"
}
]
bash $
# let's transform phone records: append country-code and update labels at the same time
bash $ <ab.json jtc -w'<phone>l[:]' -pi'<phone>l[:][number]<V>v' -T'{ "phone number": "+1 {V}" }' / -w'<phone>l'
[
{
"phone number": "+1 112-555-1234",
"type": "mobile"
},
{
"phone number": "+1 113-123-2368",
"type": "mobile"
}
]
bash $
Explanations:
- -w'<phone>l[:]' - those are our destinations, which we will be updating (i.e., all phone records in the first Directory entry)
- -pi'<phone>l[:][number]<V>v' - we'll walk (logically, synchronously with -w) all the number records in the first phone record and memorize the number values in the namespace V; option -p turns the insert operation into a move
- -T'{ "phone number": "+1 {V}" }' - after each source walk (the argument of -i) a template interpolation occurs: a new JSON entry is generated from the template and the namespace V, and the new entry is then used for insertion into the respective destination walk (-w). Thus, using templates, it becomes easy to transmute an existing JSON entry into a new one.
By the way, in the above example the usage of the namespace <V>v was redundant and served only an instructional purpose - the same JSON query could be achieved without using the namespace, compare:
<ab.json jtc -w'<phone>l[:]' -pi'<phone>l[:][number]' -T'{ "phone number": "+1 {}" }' / -w'<phone>l'
There might be confusion about how purge (-p) is applied when used together with -i, -u, -c:
- when the argument of the options is a walk and not a JSON (i.e., when options -i, -u, -c are walking the input JSON), then the purge is applied to the source walked elements
- when the argument of the options is JSON, then the purge is applied to the destination walked (-w) elements
jtc operations -i/-u/-c also support a template (as of v1.76) as an argument, which extends the capabilities and use-cases of such operations:
- template-interpolate operation directly from source walks:
bash $ <ab.json jtc -w'<name>l' -u'["Smith", {{}}];' / -lrw'<name>l'
"name": [ "Smith", "John" ]
bash $
- template-interpolation of the operation's template argument (double templating):
bash $ <ab.json jtc -w'<name>l' -u'["Smith", {{}}];' -T'"Jones-{}"' / -lrw'<name>l'
"name": "Jones-Smith, John"
bash $
- template-interpolation of the operation's template argument walks:
bash $ <ab.json jtc -w'<name>l' -u'["Smith", {{}}];' -u'[0] ' -T'"{} & Wesson"' / -lrw'<name>l'
"name": "Smith & Wesson"
bash $
See argument disambiguation for an explanation of why there are trailing symbols (a semicolon ; and a space) after the template and walk arguments.
When multiple templates are given and the number of walks (-w<walk>, or -u<walk>, -i<walk> to which the templates are applied) is the same, then each template pertains to its respective walk. In all other cases templates are applied in a round-robin fashion.
In the case where round-robin behavior is required while the number of templates and walks matches, use the -nn notation - it will ensure round-robin template application onto sequential walks.
Compare:
# templates pertain to walks:
bash $ <ab.json jtc -x[0][:] -y[name] -T'{"Person":{{}}}' -y[age] -T'{"Age":{{}}}' -r
{ "Person": "John" }
{ "Age": 25 }
{ "Person": "Ivan" }
{ "Age": 31 }
{ "Person": "Jane" }
{ "Age": 25 }
bash $
# templates pertain to walks, while walks go sequential (non-interleaved):
bash $ <ab.json jtc -x[0][:] -y[name] -T'{"Person":{{}}}' -y[age] -T'{"Age":{{}}}' -rn
{ "Person": "John" }
{ "Person": "Ivan" }
{ "Person": "Jane" }
{ "Age": 25 }
{ "Age": 31 }
{ "Age": 25 }
bash $
# templates applied round-robin, while walks go sequential (non-interleaved):
bash $ <ab.json jtc -x[0][:] -y[name] -T'{"Person":{{}}}' -y[age] -T'{"Age":{{}}}' -rnnn
{ "Person": "John" }
{ "Age": "Ivan" }
{ "Person": "Jane" }
{ "Age": 25 }
{ "Person": 31 }
{ "Age": 25 }
bash $
The mess in the last example above is explained by the -nn usage: templates are forced to be applied in a round-robin fashion while the walks are sequential (-n).
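The three behaviors compared above can be reproduced with a small Python model. It is a sketch only: apply_templates and its parameters are hypothetical names, templates are represented as plain functions, and the interleaving is simplified to "one result from each walk in turn", which is enough for this example but is not a description of how jtc interleaves walks in general.

```python
from itertools import cycle

def apply_templates(walks, templates, sequential=False, round_robin=False):
    """walks: one list of results per walk; templates: a non-empty list of functions."""
    ordered = []                                   # (walk number, value) in output order
    if sequential:                                 # -n: walk after walk
        for walk_no, results in enumerate(walks):
            ordered.extend((walk_no, value) for value in results)
    else:                                          # default: interleaved
        longest = max((len(results) for results in walks), default=0)
        for position in range(longest):
            for walk_no, results in enumerate(walks):
                if position < len(results):
                    ordered.append((walk_no, results[position]))
    # templates pertain to walks only when the counts match and round-robin is not forced
    pertain = len(templates) == len(walks) and not round_robin
    rotation = cycle(templates)
    return [(templates[walk_no] if pertain else next(rotation))(value)
            for walk_no, value in ordered]

person = lambda value: {"Person": value}
age = lambda value: {"Age": value}
names, ages = ["John", "Ivan", "Jane"], [25, 31, 25]

print(apply_templates([names, ages], [person, age]))
print(apply_templates([names, ages], [person, age], sequential=True))
print(apply_templates([names, ages], [person, age], sequential=True, round_robin=True))
```

The last call mirrors the -nn case: the walks are emitted sequentially, but the templates alternate regardless of which walk produced the value.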
One use-case of multiple round-robin templates would be this example:
bash $ <<<'[1,2,3,4,5,6,7,8,9,10]' jtc -w[:] -T'""' -T{} -T'""' -qq
2
5
8
bash $
- the above example prints every 3rd element of the source JSON, starting from the 2nd one (recall: when unquoting an empty JSON string (""), the resulting blank lines are not printed). Though a much handier way to achieve the same is to use the -xn/N option.
There's one more token combination in templates allowing stringification and jsonization of values:
- <..>, <<..>> will attempt "expanding" a string value into a JSON
- >..<, >>..<< will take the current JSON value and stringify it (compress the JSON into a string)
The token notation follows the same rule as for regular tokens ({}, {{}}):
- single angular bracket notation (<..>, >..<) will result in a "naked" JSON value (without quotation marks, curly braces or square brackets for strings, objects and arrays respectively)
- double token notation (<<..>>, >>..<<) is always a safe ("dressed") type and the result of the operation will be a complete JSON type.
This little demo illustrates the tokens usage:
bash $ <<<'{"atomics": ["abc", 123, null, true]}' jtc -T'">{{}}<"'
"{ \"atomics\": [ \"abc\", 123, null, true ] }"
bash $
- because the single form of angular token notation was used, the outer quotation marks were necessary for a successful interpolation.
The same could have been achieved with the template: -T'>>{{}}<<'
That was the example of stringification of a JSON value, now let's do a reverse thing - jsonize previously stringified value:
bash $ <<<'{"atomics": ["abc", 123, null, true]}' jtc -T'">{{}}<"' / -T'{<{{}}>, "empty":[{ }, []]}' -tc
{
"atomics": [ "abc", 123, null, true ],
"empty": [ {}, [] ]
}
bash $
- the above example shows jsonization of the previously stringified JSON while extending the resulting JSON object at the same time
Note how the empty object was spelled there: { } - with a space inside. It's done intentionally, because if it were spelled without one ({}), it would be taken for an interpolation token and the interpolation would fail.
That is the suggested way of spelling empty objects in templates.
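The stringify/jsonize round trip can be modeled in a few lines of Python. This is an illustrative sketch, not jtc's code: the helper names are hypothetical, and the spacing produced by Python's json module differs from jtc's output format.

```python
import json

def stringify(value):
    """Model of >>{{}}<<: compress a JSON value into a JSON string (returned as JSON text)."""
    return json.dumps(json.dumps(value))

def jsonize(string_json):
    """Model of <<{{}}>>: expand a JSON string holding JSON text back into a value."""
    return json.loads(json.loads(string_json))

def naked(value):
    """'Naked' form: drop the outer quotes, braces or brackets."""
    text = json.dumps(value)
    return text[1:-1] if isinstance(value, (str, dict, list)) else text

source = {"atomics": ["abc", 123, None, True]}
as_string = stringify(source)
restored = jsonize(as_string)

# a naked object is just its members, so it can be spliced into a larger object;
# note the empty object spelled "{ }" next to it
extended = json.loads("{" + naked(restored) + ', "empty": [{ }, []]}')
print(as_string)
print(extended)
```

Here the naked form plays the role of <{{}}> in the template above: the members of the restored object are spliced directly into the surrounding braces.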
- {}: a token used in templates resulting in a "naked" type of interpolation from the currently walked JSON
- {NS}: same as {}, but interpolation occurs from the namespace NS
- {{}}: a token used in templates resulting in a "dressed" type of interpolation from the currently walked JSON
- {{NS}}: same as {{}}, but interpolation occurs from the namespace NS
- <json_str>: a token notation for a jsonization request of the stringified JSON json_str; the result is a "naked" JSON value
- <<json_str>>: same as <json_str>, however the result is a complete ("dressed") JSON value
- >json<: a token notation for a stringification request of a JSON json; the result is a "naked" JSON string value
- >>json<<: same as >json<, however the result is a complete ("dressed") JSON string value
- Namespace: a user can use any spelling (including spaces) to define a name within the namespace, as long as it's compatible with the JSON string definition
- $..: all internally generated names and tokens always begin with the symbol $
- $N: where N is a number - these names are generated by matching subgroups in REGEX lexemes (<..>R, <..>D, <..>L)
- $a, $b .. $z, $aa, $ab, ..: auto-generated tokens when the interpolated JSON is an iterable; each token corresponds to a respective ordinal value in the iterable
- $A, $B .. $Z, $AA, $AB, ..: auto-generated tokens when the interpolated JSON is an object; each token corresponds to the label of each respective ordinal value in the object
- $PATH: a token expanding into a JSON array holding a path (set of labels and indices) towards the currently walked JSON element
- $path: a token expanding into a JSON string representing a path towards the currently walked JSON element
- $_: a name holding the separator used when expanding the token $path, default value is "_"
- $#: a name holding the separator used when a JSON iterable is expanded into a JSON string, default value is ", "
- $?: a token referring to the result of the last successful walk
- $$?: a namespace holding a string separator considered when expanding walks using the $? token, default value is ","
There are a few options that let you modify the input JSON:
- -i - insert (copy-insert, copy-merge, move) new elements to JSON
- -u - update (rewrite, rewrite-merge, move) elements to JSON
- -s - swap around pair(s) of JSON elements
- -p - purge (remove) elements from JSON
Typically, those options are mutually exclusive, and if given together only one operation will be executed (the above list is given in the priority order of operation selection). However, there is a combination of options -i/-u and -p which facilitates the move semantic; those cases are reviewed in the respective chapters.
Each of the options (except -s, which requires at least a pair of walks) requires one or multiple walks (-w) to operate on, a.k.a. destination walks (however, the destination walks may be omitted altogether, in which case a default walk pointing to the JSON root, -w[^0], is assumed).
Options -i and -u require an argument, which comes in different flavors: JSON/walk/template.
jtc will execute any of those operations only once; if multiple operations are required, then those could be combined in multiple option chain sets, daisy-chained through the separator /.
Once the modification operation is complete, the entire resulting JSON is displayed.
By default jtc expects the input from stdin. If the standalone argument(s) args is given, then jtc will read input from the file (ignoring stdin), see below:
# show content of the file:
bash $ cat file.json
[ "JSON", "in", "file" ]
bash $
# both input sources present: stdin and file
bash $ <<<'[ "<stdin>", "JSON" ]' jtc file.json
[
"JSON",
"in",
"file"
]
bash $
The option -f (together with a single argument) redirects (forces) the output of the operation into the file (instead of stdout):
bash $ <<<'[ "<stdin>", "JSON" ]' jtc -f file.json
bash $ cat file.json
[
"JSON",
"in",
"file"
]
bash $
In the above example, JSON is read from file.json and output back into the file (stdin input is ignored) - note the altered format of the file.
The bare hyphen (-) overrides file input and ensures that the input is read from stdin:
bash $ <<<'[ "<stdin>", "JSON" ]' jtc -f - file.json
bash $ cat file.json
[
"<stdin>",
"JSON"
]
bash $
-p removes from JSON all walked elements (pointed to by -w walks). E.g.: let's remove from the address book (for the sake of an example) all the home and office phone records (effectively leaving only mobile phone records):
bash $ <ab.json jtc -w'[type]:<home>:[-1]' -w'[type]:<office>:[-1]' -p / -w'<phone>l:' -l -tc
"phone": [
{ "number": "112-555-1234", "type": "mobile" },
{ "number": "113-123-2368", "type": "mobile" }
]
"phone": [
{ "number": "223-283-0372", "type": "mobile" }
]
"phone": []
bash $
Of course there's a bit more succinct syntax:
bash $ <ab.json jtc -x[type]: -y'<home>:[-1]' -y'<office>:[-1]' -p / -w'<phone>l:' -ltc
or, using even a single walk-path:
bash $ <ab.json jtc -w'[type]:<home|office>R:[-1]' -p / -w'<phone>l:' -ltc
Another use-case example: remove all the JSON elements except the walked ones, while preserving the original JSON structure - that's a feat for the plural option notation: -pp. E.g.: let's drop all the entries (in all records) but name and spouse:
bash $ <ab.json jtc -w'<name|spouse>L:' -pp -tc
{
"Directory": [
{ "name": "John", "spouse": "Martha" },
{ "name": "Ivan", "spouse": null },
{ "name": "Jane", "spouse": "Chuck" }
]
}
bash $
Here, name|spouse is the RE (indicated by the RE label search suffix L) matching labels containing either "name" or "spouse".
-s requires walk-paths (-w) to be given in pairs. Paired walk-paths will be walked concurrently (so, ensure they are consistent) and the resulting JSON elements will be swapped around.
E.g., here's the way of swapping around name and spouse for all records in the address book:
bash $ <ab.json jtc -w'<name>l:' -w'<spouse>l:' -s / -w'<name|spouse>L:' -l
"name": "Martha"
"spouse": "John"
"name": null
"spouse": "Ivan"
"name": "Chuck"
"spouse": "Jane"
bash $
- for the sake of output brevity, only the swapped elements are displayed
Probably, a more frequent use-case for -s is when it's required to remove some extra/redundant nestedness in a JSON structure.
E.g., let's remove the array encapsulation from phone records, leaving only the last phone record in it:
bash $ <ab.json jtc -w'<phone>l:' -w'<phone>l:[-1:]' -s / -w'<phone>l:' -l
"phone": {
"number": "113-123-2368",
"type": "mobile"
}
"phone": {
"number": "223-283-0372",
"type": "mobile"
}
"phone": {
"number": "333-638-0238",
"type": "home"
}
bash $
- again, for brevity, only phone records are displayed
Finally, more than just one pair of walks (-w) could be swapped. In fact, all of the given pairs of walks will be swapped (provided the walks did not become invalid during prior swap operations).
When either an insert (-i) or an update (-u) operation is carried out, two types of options/arguments are required:
- destination points: facilitated with -w, a.k.a. destination walks
- source points: facilitated with -i (-u)
The destination points of insertion are given using -w option(s) walking the input JSON, while the argument under -i designates the source of the insertions (multiple -i options could be given).
The source of the insertion must always be a valid JSON.
Insert operations never result in overwriting destination JSON elements (though the destination could be extended via a merge). There are a few different flavors of insertion arguments (i.e., sources of insertion):
1. -i <arg_JSON>: an arg_JSON could be either read from a file or spelled literally, or given as a template (which gets interpolated and must result in a valid JSON before insertion occurs)
2. -i <arg_JSON> -i <walk-path>: here the walk-path actually walks arg_JSON rather than the input JSON; only one option with an arg_JSON argument is allowed, while multiple options with walk-path may be given
3. -i <walk-path>: the argument walk-path walks the input JSON
4. -e -i <shell_cli> \;: shell_cli is the shell command sequence, terminated with \;, to be shell evaluated for the destination walks (-w); it may optionally contain interpolation tokens; tokens {}, {{}} will be referring to JSONs pointed to by the -w (destination) walk; the returned value (provided the evaluation was a success) has to be a valid JSON, otherwise it'll be promoted to a JSON string
5. -ei <shell_cli> \; -i<walk-path>: walk-path here walks the input JSON (which is different from case 2, where such an argument walks the source of insertion) and tokens {}, {{}} will be referring to -i<walk-path>; the shell evaluation occurs for each of the -i<walk-path> rather than for the destination walks (-w)
How does jtc know which argument is supplied? The disambiguation path is like this:
- initially a file argument is assumed and attempted to be opened/read; if that fails (i.e., file not found), then
- a literally spelled JSON is assumed and attempted to be parsed; if JSON parsing fails, then
- a walk-path is assumed and compiled; if that fails too, then
- the argument is considered to be a template - template interpolation is deferred until the actual operation (insertion/update) occurs.
There's a problem (usually associated with short argument notations) that different semantics of an argument may clash. For example, let's consider this argument: -i'[{}]' - actually it could be anything, even a file.
- Alright, the 1st type of argument can be sorted out by jtc automatically: if such a file exists, then JSON will be read from it. If it doesn't, then jtc will parse it as a JSON: indeed, [{}] is a valid JSON - it's an array made of an empty object.
- But what if a user meant to pass it as a walk-path? [{}] is also a perfect walk-path - it tries addressing a value by the label "{}" at root. There are a couple of ways to dodge such clashing:
  - make the walk-path look more explicitly like a walk and not like a JSON (e.g., the same walk semantic could be expressed as >{}<l)
  - or allow a trailing space past the walk: '[{}] '. The literally spelled JSON argument does not tolerate any trailing characters (even spaces), while a walk-path accepts spaces before/between/past lexemes
- Fine, but what if it's meant to be passed as a template? [{}] is also an array template with a "naked" interpolation token {}. To ensure that such an argument is treated as a template, pass any (non-space) trailing characters, e.g.: [{}];. The template policy in the argument w.r.t. any trailing characters is relaxed - it attempts parsing what it can and discards any trailing symbols.
If insert/update occurs from a file and the file holds multiple JSONs (a.k.a. a stream of JSONs), then the stream is automatically converted into an array of JSONs:
bash $ cat file.json
[ "first", "JSON" ]
{ "second": "JSON" }
"third JSON"
bash $
bash $ <<<'{"update here": null}' jtc -w'[update here]' -u file.json -tc
{
"update here": [
[ "first", "JSON" ],
{ "second": "JSON" },
"third JSON"
]
}
bash $
The destination insertion point(s) (-w) controls how insertion is done:
- if a given destination insertion point (-w) is a single walk and non-iterable - i.e., if it's a single point location - then all the supplied sources are attempted to get inserted into that single destination location:
# list all children records
bash $ <ab.json jtc -w'<children>l:' -lr
"children": [ "Olivia" ]
"children": []
"children": [ "Robert", "Lila" ]
bash $
# make couple insertions in a single destination point
bash $ <ab.json jtc -w'[name]:<Ivan>[-1][children]' -i'"Maggie"' -i'"Bruce"' / -w'<children>l:' -lr
"children": [ "Olivia" ]
"children": [ "Maggie", "Bruce" ]
"children": [ "Robert", "Lila" ]
bash $
- if a given destination insertion point is iterable, or multiple are given, then all sources (-i arguments) are inserted one by one in a round-robin fashion (if the source runs out of JSON elements, but the destination has more room to iterate, then the source is wrapped to the first element):
# make insertion in a round-robin fashion
bash $ <ab.json jtc -w'<children>l:' -i'"Maggie"' -i'"Bruce"' / -w'<children>l:' -lr
"children": [ "Olivia", "Maggie" ]
"children": [ "Bruce" ]
"children": [ "Robert", "Lila", "Maggie" ]
bash $
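The two distribution rules above can be sketched in Python. This is a model for illustration only, with hypothetical function names; destinations are assumed to be arrays, so each insertion is a plain append.

```python
import copy
from itertools import cycle

def insert_into_single(destination, sources):
    """Single destination point: every source is inserted into it."""
    for source in sources:
        destination.append(copy.deepcopy(source))
    return destination

def insert_round_robin(destinations, sources):
    """Several destination points: one source per destination, wrapping around."""
    if not sources:
        return destinations
    supply = cycle(sources)              # wraps to the first source when exhausted
    for destination in destinations:
        destination.append(copy.deepcopy(next(supply)))
    return destinations

children = [["Olivia"], [], ["Robert", "Lila"]]
print(insert_into_single([], ["Maggie", "Bruce"]))          # ['Maggie', 'Bruce']
print(insert_round_robin(children, ["Maggie", "Bruce"]))
# [['Olivia', 'Maggie'], ['Bruce'], ['Robert', 'Lila', 'Maggie']]
```

The third destination receives "Maggie" again because the two sources are exhausted after the second destination and the supply wraps around.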
While insertion into arrays is obvious (well, so far), insertion into objects requires clarification:
- objects are always merged recursively
- in case of clashing labels, by default, the destination is preserved while the source of insertion is discarded
To illustrate, let's insert a JSON structure: { "PO box": null, "street address": null } into the last record's address:
# dump the address of the last record in Directory
bash $ <ab.json jtc -w'[0][-1:][address]' -l
"address": {
"city": "Denver",
"postal code": 80206,
"state": "CO",
"street address": "6213 E Colfax Ave"
}
bash $
# insert custom entries into the address of the last record
bash $ <ab.json jtc -w'[0][-1:][address]' -i'{ "PO box": null, "street address": null }' / -w'[0][-1:][address]' -l
"address": {
"PO box": null,
"city": "Denver",
"postal code": 80206,
"state": "CO",
"street address": "6213 E Colfax Ave"
}
bash $
- the "PO box" got inserted, but the destination object's value in the "street address" has been preserved
The source (a JSON being inserted) and the destination (a JSON point where insertion occurs) elements might represent different types: JSON array, JSON object, JSON atomic. Thus there's a number of variants of insertions of one type of elements into others. All such variants are shown in the below matrix table:
  to \ from  |        [3,4]        |     {"a":3,"c":4}     |      "a":3,"c":4      |      3
-------------+---------------------+-----------------------+-----------------------+-------------
    [1,2]    |     [1,2,[3,4]]     |  [1,2,{"a":3,"c":4}]  | [1,2,{"a":3},{"c":4}] |   [1,2,3]
{"a":1,"b":2}|    {"a":1,"b":2}    |  {"a":1,"b":2,"c":4}  |  {"a":1,"b":2,"c":4}  |{"a":1,"b":2}
     "a"     |         "a"         |          "a"          |          "a"          |     "a"
- the values in the 4th column header (namely "a":3,"c":4) do not look like valid JSON - those are JSON object's elements when pointed to by the -i <walk-path>, i.e., they are JSON values in objects (the values with labels)
As follows from the table:
- insertion cannot occur into the atomic JSON elements
- when inserting into an array, the whole JSON value is getting inserted (no array expansion occurs)
- labeled values are getting inserted into arrays as standalone JSON objects
- when inserting objects into objects, upon label clashing the destination's label is preserved (source's ignored)
If insertion of an array into another array happens without merging the arrays, how then can a merged result be achieved upon insertion?
Option -m (merge) alters the behavior of the insert operation as follows:
  to \ from  |        [3,4]        |     {"a":3,"c":4}     |      "a":3,"c":4      |      3
-------------+---------------------+-----------------------+-----------------------+-------------
    [1,2]    |      [1,2,3,4]      |       [1,2,3,4]       |       [1,2,3,4]       |   [1,2,3]
{"a":1,"b":2}|{"a":[1,3],"b":[2,4]}|{"a":[1,3],"b":2,"c":4}|{"a":[1,3],"b":2,"c":4}|{"a":1,"b":2}
     "a"     |      ["a",3,4]      |       ["a",3,4]       |       ["a",3,4]       |   ["a",3]
- the merging option allows insertion into atomic values, but the atomic value gets converted into a JSON array first
- arrays are merged now
- clashing labels (for merged objects/object members) are also converted into arrays (if not yet) and merged
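Both insertion tables can be reproduced by a compact Python model. The sketch below is an assumption-based illustration, not jtc's implementation: the function name insert is hypothetical, the labeled-value column ("a":3,"c":4) is left out, object members are paired with array elements in sorted-label order, and surplus array elements are simply ignored.

```python
import copy

def insert(dest, src, merge=False):
    """Return the result of inserting `src` into `dest` (the inputs are not modified)."""
    dest, src = copy.deepcopy(dest), copy.deepcopy(src)
    if not isinstance(dest, (dict, list)):          # atomic destination
        if not merge:
            return dest                             # no insertion into atomics
        return insert([dest], src, merge=True)      # -m: convert into an array first
    if isinstance(dest, list):
        if not merge:
            return dest + [src]                     # the whole value is inserted
        if isinstance(src, list):
            return dest + src                       # -m: arrays are merged
        if isinstance(src, dict):
            return dest + list(src.values())
        return dest + [src]
    # object destination
    if isinstance(src, dict):
        for label, value in src.items():
            if label not in dest:
                dest[label] = value
            elif merge or (isinstance(dest[label], dict) and isinstance(value, dict)):
                dest[label] = insert(dest[label], value, merge)
            # otherwise the labels clash and the destination is preserved
        return dest
    if merge and isinstance(src, list):
        for label, value in zip(sorted(dest), src):
            dest[label] = insert(dest[label], value, merge=True)
    return dest

print(insert([1, 2], [3, 4]))                                   # [1, 2, [3, 4]]
print(insert({"a": 1, "b": 2}, {"a": 3, "c": 4}))               # {'a': 1, 'b': 2, 'c': 4}
print(insert({"a": 1, "b": 2}, {"a": 3, "c": 4}, merge=True))   # {'a': [1, 3], 'b': 2, 'c': 4}
```

The recursion for clashing labels is what turns 1 and 3 into [1, 3] under merge: the atomic destination is first wrapped into an array, then the source is appended to it.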
Update (-u) is similar to the insert operation, but unlike insert, it rewrites the destination JSON element. Both operations share the same qualities:
- both are destination driven operations
- both merge JSON objects recursively
- both support merging (-m) semantic
- both support move (-p) semantic
- both support shell evaluation (-e) of the argument
Here's the matrix table for update operations with and without merging:
- update without merging:
  to \ from  |        [3,4]        |     {"a":3,"c":4}     |         "a":3         |      3
-------------+---------------------+-----------------------+-----------------------+-------------
    [1,2]    |        [3,4]        |     {"a":3,"c":4}     |           3           |      3
{"a":1,"b":2}|        [3,4]        |     {"a":3,"c":4}     |           3           |      3
     "a"     |        [3,4]        |     {"a":3,"c":4}     |           3           |      3
- update with merging (-m):
  to \ from  |        [3,4]        |     {"a":3,"c":4}     |         "a":3         |      3
-------------+---------------------+-----------------------+-----------------------+-------------
    [1,2]    |        [3,4]        |         [3,4]         |         [3,2]         |    [3,2]
{"a":1,"b":2}|    {"a":3,"b":4}    |  {"a":3,"b":2,"c":4}  |     {"a":3,"b":2}     |{"a":3,"b":2}
     "a"     |        [3,4]        |     {"a":3,"c":4}     |        {"a":3}        |      3
- when updating without -m, the operation is straightforward - the source overwrites the destination
- when objects are merge-updated, for clashing labels the source does overwrite the destination (unlike with insertion)
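The update tables can be modeled the same way. Again, this is a hedged Python sketch rather than jtc's code: update is a hypothetical name, the labeled-value column ("a":3) is not modeled, array elements are matched to object members in sorted-label order, and the recursion into nested objects is an assumption derived from the statement that objects are merged recursively.

```python
import copy

def update(dest, src, merge=False):
    """Return the result of updating `dest` with `src` (the inputs are not modified)."""
    dest, src = copy.deepcopy(dest), copy.deepcopy(src)
    if not merge or not isinstance(dest, (dict, list)):
        return src                                  # the source overwrites the destination
    if isinstance(dest, list):
        if isinstance(src, list):
            values = src
        elif isinstance(src, dict):
            values = list(src.values())
        else:
            values = [src]
        return values + dest[len(values):]          # overwrite position by position
    # object destination
    if isinstance(src, dict):
        for label, value in src.items():
            if isinstance(dest.get(label), dict) and isinstance(value, dict):
                dest[label] = update(dest[label], value, merge=True)
            else:
                dest[label] = value                 # clashing labels: the source wins
        return dest
    values = src if isinstance(src, list) else [src]
    for label, value in zip(sorted(dest), values):
        dest[label] = value
    return dest

print(update([1, 2], 3, merge=True))                             # [3, 2]
print(update({"a": 1, "b": 2}, {"a": 3, "c": 4}, merge=True))    # {'a': 3, 'b': 2, 'c': 4}
```

Compared with the insertion model, the only conceptual change is the direction of conflict resolution: here the source value replaces the destination value instead of being discarded or collected into an array.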
The directive lexeme <..>k allows accessing the label/index of the currently walked JSON element and even storing it in the namespace.
Another function featured by the lexeme is that the label could be reinterpreted as a JSON string value, which allows rewriting labels using the update operation (insertion into labels is not possible even semantically). However, that only applies if the <>k lexeme is the last executed lexeme and if the lexeme is empty.
As an exercise, let's capitalize all the labels within all the addresses in ab.json:
bash $ <ab.json jtc -w'<address>l:[:]<>k' -eu '<<<{{}}' tr '[:lower:]' '[:upper:]' \; / -w'<address>l:' -rl
"address": { "CITY": "New York", "POSTAL CODE": 10012, "STATE": "NY", "STREET ADDRESS": "599 Lafayette St" }
"address": { "CITY": "Seattle", "POSTAL CODE": 98104, "STATE": "WA", "STREET ADDRESS": "5423 Madison St" }
"address": { "CITY": "Denver", "POSTAL CODE": 80206, "STATE": "CO", "STREET ADDRESS": "6213 E Colfax Ave" }
bash $
The destination walk-path will not become invalid after the parent's label has been altered, thus allowing altering labels even of the nested elements in the same recursive update:
# list all the labels of the John's record:
bash $ <ab.json jtc -w'[name]:<John>[-1]<.*>L:<>k'
"address"
"city"
"postal code"
"state"
"street address"
"age"
"children"
"name"
"phone"
"number"
"type"
"number"
"type"
"spouse"
bash $
# capitalize all the labels in it:
bash $ <ab.json jtc -w'<John>[-1]<.*>L:<>k' -eu '<<<{{}}' tr '[:lower:]' '[:upper:]' \; / -w'<John>[-1]' -tc
{
"ADDRESS": { "CITY": "New York", "POSTAL CODE": 10012, "STATE": "NY", "STREET ADDRESS": "599 Lafayette St" },
"AGE": 25,
"CHILDREN": [ "Olivia" ],
"NAME": "John",
"PHONE": [
{ "NUMBER": "112-555-1234", "TYPE": "mobile" },
{ "NUMBER": "113-123-2368", "TYPE": "mobile" }
],
"SPOUSE": "Martha"
}
bash $
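For comparison, the net effect of the recursive label update above can be expressed as a short Python function. It is only a model of the result (upper_labels is a hypothetical name) and says nothing about how jtc performs the update internally.

```python
def upper_labels(node):
    """Return a copy of `node` with every object label upper-cased; values are untouched."""
    if isinstance(node, dict):
        return {label.upper(): upper_labels(value) for label, value in node.items()}
    if isinstance(node, list):
        return [upper_labels(value) for value in node]
    return node                                  # atomic values are left as they are

john = {
    "name": "John",
    "phone": [{"number": "112-555-1234", "type": "mobile"}],
    "spouse": "Martha",
}
print(upper_labels(john))
# {'NAME': 'John', 'PHONE': [{'NUMBER': '112-555-1234', 'TYPE': 'mobile'}], 'SPOUSE': 'Martha'}
```

As in the jtc output, nested labels are rewritten together with their parents, and only labels change.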
The labels can be updated with any atomic value (and not with the iterable values, obviously):
bash $ <ab.json jtc -w'<John><>k' -u true / -w'<John>' -l
"true": "John"
bash $ <ab.json jtc -w'<name>l<>k' -u '[true]' / -w'[0][0]' -tc
error: label could be updated only with JSON atomic value
...
If a source argument for either -i or -u is given in the form of a file or a literal JSON, then those obviously cannot be moved.
The move semantic is only applicable when the argument is given in the form of a <walk-path> (i.e., it refers to the input JSON); then, upon completing the operation, the source elements (referred to by the source walk-path) become possible to remove (purge). This is achievable by adding the option -p.
Let's move address from the last Directory record into the first one:
bash $ <ab.json jtc -w'[Directory][0][address]' -u'[Directory][-1:][address]' -p -tc
{
"Directory": [
{
"address": { "city": "Denver", "postal code": 80206, "state": "CO", "street address": "6213 E Colfax Ave" },
"age": 25,
"children": [ "Olivia" ],
"name": "John",
"phone": [
{ "number": "112-555-1234", "type": "mobile" },
{ "number": "113-123-2368", "type": "mobile" }
],
"spouse": "Martha"
},
{
"address": { "city": "Seattle", "postal code": 98104, "state": "WA", "street address": "5423 Madison St" },
"age": 31,
"children": [],
"name": "Ivan",
"phone": [
{ "number": "273-923-6483", "type": "home" },
{ "number": "223-283-0372", "type": "mobile" }
],
"spouse": null
},
{
"age": 25,
"children": [ "Robert", "Lila" ],
"name": "Jane",
"phone": [
{ "number": "358-303-0373", "type": "office" },
{ "number": "333-638-0238", "type": "home" }
],
"spouse": "Chuck"
}
]
}
bash $
- That leaves Jane "homeless", while John "moves" into Jane's place!
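The move semantic boils down to "update the destination, then purge the source". A minimal Python model of that sequence follows; the helper names and the list-based path notation are hypothetical and only serve the illustration.

```python
def resolve(doc, path):
    """Follow a list of labels/indices from the root and return the addressed element."""
    node = doc
    for step in path:
        node = node[step]
    return node

def move(doc, dest_path, src_path):
    """Update the destination with the source value, then purge the source."""
    value = resolve(doc, src_path)
    resolve(doc, dest_path[:-1])[dest_path[-1]] = value      # the update part
    del resolve(doc, src_path[:-1])[src_path[-1]]            # the purge part (-p)
    return doc

book = {"Directory": [
    {"name": "John", "address": {"city": "New York"}},
    {"name": "Jane", "address": {"city": "Denver"}},
]}
move(book, ["Directory", 0, "address"], ["Directory", -1, "address"])
print(book)
# {'Directory': [{'name': 'John', 'address': {'city': 'Denver'}}, {'name': 'Jane'}]}
```

The purge happens only after the update has been made, which matches the order described above.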
An argument for insert and update operations (-i, -u) may optionally undergo a shell evaluation (enabled by the preceding option -e).
E.g., let's capitalize all the name entries in the address book:
# dump all the names:
bash $ <ab.json jtc -w'<name>l:' -l
"name": "John"
"name": "Ivan"
"name": "Jane"
bash $
# capitalize the names through update using shell evaluation:
bash $ <ab.json jtc -w'<name>l:' -eu '<<<{{}}' tr "[:lower:]" "[:upper:]" \; / -w'<name>l:' -l
"name": "JOHN"
"name": "IVAN"
"name": "JANE"
bash $
Once options -e and -u (-i) are used together, the following rules must be observed:
- option -e must precede the first occurrence of -i/-u
- the cli argument sequence following option -i/-u must be terminated with the standalone escaped semicolon: \;
- the cli is also subjected to namespace interpolation before it gets shell evaluated
- the cli in the argument does not require any additional escaping (except that which would normally be required by a shell)
- if piping in the cli is required, then the pipe symbol itself has to be escaped and spelled standalone: \|
- the returned result of a shell evaluation must be either a valid JSON, or non-empty and non-error (in the latter case it'll be promoted to a JSON string value)
- failed (those returning a non-zero exit code) or empty results of the shell evaluations are ignored (the JSON entry won't be updated/inserted; rather, jtc proceeds to the next walked entry for another update attempt)
- templates (-T) / template-interpolation occurs and applies after the shell evaluation returns the result
If the shell cli does not deliver the expected result for some reason, it's debuggable with the -dd option.
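The rules about the returned result (valid JSON, promotion to a JSON string, ignoring failed or empty results) can be sketched in Python. This is a model under assumptions, not jtc's code: shell_eval and IGNORED are hypothetical names, and a Python child process stands in for the shell command so that the example is self-contained.

```python
import json
import subprocess
import sys

IGNORED = object()        # marks a result that must not be inserted/updated

def shell_eval(argv, stdin_text=""):
    """Run a command and classify its output according to the rules listed above."""
    done = subprocess.run(argv, input=stdin_text, capture_output=True, text=True)
    output = done.stdout.strip()
    if done.returncode != 0 or not output:
        return IGNORED                       # failed or empty results are ignored
    try:
        return json.loads(output)            # a valid JSON is used as is
    except json.JSONDecodeError:
        return output                        # anything else is promoted to a JSON string

# a stand-in for: <<<{{}} tr '[:lower:]' '[:upper:]'
upper = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
print(shell_eval(upper, '"John"'))                                        # JOHN
print(shell_eval([sys.executable, "-c", "print('hello world')"]))         # hello world
print(shell_eval([sys.executable, "-c", "import sys; sys.exit(3)"]) is IGNORED)   # True
```

A dedicated IGNORED marker is used instead of None because a command may legitimately return the JSON value null.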
Options -i, -u, -c allow mixing two kinds of their arguments:
- JSON argument (given in the form of either a file, or a literally spelled JSON, or a template)
- walk-path
- when those are used together, namely a <walk-path> argument(s) follows a JSON argument, e.g.:
jtc -u file.json -u'[Root][:]'
then all <walk-path> arguments (here [Root][:]) apply onto the JSON argument (here a JSON from file.json).
- when both kinds of arguments are used together, then only one JSON argument is allowed, while multiple walk-paths may be given
That way it is possible to facilitate walking over the specified JSON argument. Though be aware of a design limitation: multiple <walk-path> arguments will be processed in a sequential order, one by one (i.e., interleaving never occurs).
(Also, see operations with cross referenced lookups)
options -u
, -i
when used together with -e
also allow specifying multiple instances of the option usage:
- first option occurrence must prove a shell cli line, terminated with the standalone spelling of a semicolon
\;
- all the subsequent option usages must provide
<walk-path>
type of argument, which let specifying source(s) of interpolation (occurring before shell evaluation happens). However, in the case if the mixed argument types use is detected (in presence of-e
), then the semantic of thejtc
arguments would be like this:
jtc -w'<dst>' -eu <shell cli ...> \; -u'<src>'
Here, both<dst>
and<src>
walk the same input JSON. Shell cli evaluation / interpolation occurs from walking<src>
That way it's possible to decouple source(s) (of interpolation) from the destination(s): all trailing (subsequent) arguments of-u
will be used in every shell evaluation (interpolating respective JSON elements), while arguments pointed by (all)-w
option(s) will point where evaluated/returned resulting JSONs should be placed.
The described argument behavior facilitates transformation of a JSON when the source location of the transformation is not the same as the destination.
Hopefully this example will clarify:
- say (just for the sake of example), we want to add to every record's children the name of the person, and not just add it as is, but in all capitals (i.e., transform the record).
bash $ <ab.json jtc -w'<children>l:' -ei '<<<{{}}' tr '[:lower:]' '[:upper:]' \; -i'<name>l:' / -lrw'<name>l:' -w'<children>l:'
"name": "John"
"children": [ "Olivia", "JOHN" ]
"name": "Ivan"
"children": [ "IVAN" ]
"name": "Jane"
"children": [ "Robert", "Lila", "JANE" ]
bash $
- there, the source(s) of shell interpolation were the name records (provided with the -i'<name>l:' walks), while the destinations were children (given with -w'<children>l:')
If a single option instance (-eu / -ei) is used (w/o trailing options with walk arguments), then both the source (of interpolation) and the destination (of the operation) are provided by the -w option argument.
One use-case that namespaces facilitate quite nicely is when insert/update/purge/compare operations refer to different JSONs (e.g., in Mixing argument types) but one requires a reference from another.
Say, we have 2 JSONs:
main.json
:
bash $ <main.json jtc -tc
[
{ "name": "Abba", "rec": 1, "songs": [] },
{ "name": "Deep Purple", "rec": 3, "songs": [] },
{ "name": "Queen", "rec": 2, "songs": [] }
]
bash $
id.json
:
bash $ <id.json jtc -tc
[
{ "id": 3, "title": "Smoke on the Water" },
{ "id": 1, "title": "The Winner Takes It All" },
{ "id": 2, "title": "The Show Must Go On" }
]
bash $
The ask here is to insert song titles from id.json
into main.json
cross-referencing respective rec
to id
values.
The way to do it:
- first, walk main.json, finding and memorizing (each) rec value
- then, walk up to the songs entry (so that will be the destination pointer, where the song needs to be inserted).
The insert operation (-i
) here would need to find id
record in id.json
using memorized (in the destination walk) namespace and
insert respective title
:
bash $ <main.json jtc -w'<rec>l:<Rid>v[-1][songs]' -mi id.json -i'[id]:<Rid>s[-1][title]' -tc
[
{
"name": "Abba",
"rec": 1,
"songs": [ "The Winner Takes It All" ]
},
{
"name": "Deep Purple",
"rec": 3,
"songs": [ "Smoke on the Water" ]
},
{
"name": "Queen",
"rec": 2,
"songs": [ "The Show Must Go On" ]
}
]
bash $
For each destination walk (-w
) here, there will be a respective insert-walk (-i
) (-w
is always walked first). When destination
walk finishes walking, the namespace Rid
will be populated with a respective value from the rec
entry. That value will be reused
by insert-walk when walking its source JSON (id.json
) with the lexeme [id]:<Rid>s
that will find a respective id
.
The rest should be obvious by now.
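For readers more at home with a general-purpose language, the cross-referenced insert can be pictured as a lookup join. Below is a minimal Python sketch of that logic only; it is not how jtc implements it. The names MAIN, IDS and cross_insert are made up for the illustration, and the data is copied from main.json and id.json as shown above:

```python
import copy

# Data copied from main.json and id.json
MAIN = [
    {"name": "Abba", "rec": 1, "songs": []},
    {"name": "Deep Purple", "rec": 3, "songs": []},
    {"name": "Queen", "rec": 2, "songs": []},
]
IDS = [
    {"id": 3, "title": "Smoke on the Water"},
    {"id": 1, "title": "The Winner Takes It All"},
    {"id": 2, "title": "The Show Must Go On"},
]


def cross_insert(main, ids):
    """Append to each entry's songs the title whose id equals the entry's rec."""
    titles = {entry["id"]: entry["title"] for entry in ids}
    result = copy.deepcopy(main)  # leave the input untouched
    for entry in result:
        rid = entry["rec"]  # plays the role of the memorized Rid value
        if rid in titles:
            entry["songs"].append(titles[rid])
    return result


for entry in cross_insert(MAIN, IDS):
    print(entry)
```

Here the dictionary lookup stands in for the [id]:<Rid>s search, and the rid variable stands in for the Rid namespace.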
jtc
does not have a "walk" argument for -p
(purge) operation (-p
is a standalone option, when it's used only with -w
it will purge every resulted/walked entry).
So, how to facilitate a cross-referenced purge then? (i.e., when purging ids are located in a separate file)
The trick is to use update/insert -u
/-i
operation together with -p
. When the cli is given in this notation:
<dst.json jtc -w... -u <src.json> -u... -p
,
purging will be applied to walked destinations, but only predicated by a successful source walk:
bash $ <main.json jtc -w'<rec>l:<Rid>v[-1]' -u'[{"id":1}, {"id":3}]' -u'[id]:<Rid>s' -p
[
{
"name": "Queen",
"rec": 2,
"songs": []
}
]
bash $
The "complemented" purge operation (i.e. when we want to delete everything except referenced) is facilitated using -pp
:
bash $ <main.json jtc -w'[rec]:<Rid>N:[-1]<Entry>v' -u'[1, 3]' -u'<Rid>s' -T'{{Entry}}' -pp
[
{
"name": "Abba",
"rec": 1,
"songs": []
},
{
"name": "Deep Purple",
"rec": 3,
"songs": []
}
]
bash $
- memorizing the whole entry (in Entry
namespace) is required because the update operation w/o the template only replaces
records (and purges everything else), but that's not the goal - the goal is to retain all the entries, hence replacing the
updated entries with the template for the entire entry.
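The two cross-referenced purges amount to filtering the destination entries by membership of their rec value in the referenced set. A minimal Python sketch of that outcome follows. It models the result only, not jtc's walking, and the names MAIN, purge_referenced and keep_referenced are hypothetical:

```python
# Data copied from main.json
MAIN = [
    {"name": "Abba", "rec": 1, "songs": []},
    {"name": "Deep Purple", "rec": 3, "songs": []},
    {"name": "Queen", "rec": 2, "songs": []},
]


def purge_referenced(main, ref_ids):
    """Drop every entry whose rec is referenced (the -p case)."""
    return [entry for entry in main if entry["rec"] not in ref_ids]


def keep_referenced(main, ref_ids):
    """Keep only the entries whose rec is referenced (the -pp case)."""
    return [entry for entry in main if entry["rec"] in ref_ids]


print(purge_referenced(MAIN, {1, 3}))  # only the Queen entry remains
print(keep_referenced(MAIN, {1, 3}))   # the Abba and Deep Purple entries remain
```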
- -: bare qualifier (hyphen), ensures that input is read from stdin regardless of any filename arguments present
- -f: ensures that the final output is redirected to the filename (if one is given) instead of stdout
- -p: purges all walked (-w) JSON elements from a JSON tree
- -pp: purges all JSON elements except the walked ones (-w) from a JSON tree
- -s: swaps around JSON elements in a JSON tree pointed to by pairs of walks (i.e., at least 2 -w must be given)
- -i<arg_JSON>: inserts arg_JSON (which is either a file, or a literally spelled JSON, or a template) into the destinations pointed to by -w; multiple options with such arguments are allowed
- -i<arg_JSON> -i<walk-path>: inserts JSON elements found by walking arg_JSON with walk-path into the destinations pointed to by -w; only a single option with arg_JSON and multiple options with walk-path arguments are supported
- -i<walk-path>: insert-copies JSON elements from the input JSON pointed to by walk-path into the destinations pointed to by -w; multiple options with such arguments are allowed
- -pi<walk-path>: insert-moves JSON elements from the input JSON pointed to by walk-path into the destinations pointed to by -w; multiple options with such arguments are allowed
- -ppi<walk-path>: inserts JSON elements from the input JSON pointed to by walk-path into the destinations pointed to by -w, while purging all other (non-walked) elements from a JSON tree
- -ei <shell_cli> \;: inserts a JSON element resulting from a shell evaluation running shell_cli into the destinations pointed to by -w; shell_cli is run for every successful destination walk (-w) iteration; only a single invocation per option chain-set is supported
- -ei <shell_cli> \; -i<walk-path>: inserts a JSON element resulting from running shell_cli into the destinations pointed to by -w; shell_cli is run for every successful source walk-path iteration walking the input JSON; multiple options with a walk-path argument are supported
- -u...: update (rewrite) operations, all the same option modes and combinations as for -i apply
- -m: modifier option, when used together with -i, -u toggles insert/update behavior allowing "merging" behavior
-c allows comparing JSONs (or JSON elements pointed to by walk-paths): jtc will display the JSON delta (diffs) between the compared JSONs.
Let's compare phone
records from the first and the second entries of the address book:
bash $ <ab.json jtc -w'<phone>l' -c'<phone>l1' -l
"json_1": [
{
"number": "112-555-1234",
"type": "mobile"
},
{
"number": "113-123-2368"
}
]
"json_2": [
{
"number": "273-923-6483",
"type": "home"
},
{
"number": "223-283-0372"
}
]
bash $
When both JSONs are equal, an empty set is displayed and return code is 0.
bash $ <<<'123' jtc -c'123' -l
"json_1": {}
"json_2": {}
bash $ echo $?
0
bash $
Otherwise (JSONs are different) a non-zero code is returned:
bash $ <<<'[1,2,3]' jtc -c'[2,3]' -lr
"json_1": [ 1, 2, 3 ]
"json_2": [ 2, 3 ]
bash $ echo $?
4
bash $
If multiple pairs of JSONs are compared, a zero code is returned only when all the compared JSON pairs are equal.
JSON schema essentially is a JSON structure (JSON containers, labels, indices) without leaf data. I.e., two JSONs may have different contents (leaf data), while their structures could be the same (though the statement is rather loose - JSON schema does validate types of the leaf data as well).
E.g., if we add/insert a child into Ivan
's record, then the record would be different from the original:
bash $ <ab.json jtc -w'<Ivan>[-1][children]' -i'"Norma"' / -w'<Ivan>[-1]' -c ab.json -c'<Ivan>[-1]' -l
"json_1": {
"children": [
"Norma"
]
}
"json_2": {}
bash $
However, their schemas would be the same. To compare the schemas of two JSONs (loosely, i.e., exempting the data types of the leaves from the check), the label directive <>k used together with the <>c search suffix comes in handy:
bash $ <ab.json jtc -w'<Ivan>[-1][children]' -i'"Norma"' / -w'<Ivan>[-1]<>c:<>k' -c'ab.json' -c'<Ivan>[-1]<>c:<>k' -l
"json_1": {}
"json_2": {}
"json_1": {}
"json_2": {}
"json_1": {}
"json_2": {}
"json_1": {}
"json_2": {}
"json_1": {}
"json_2": {}
"json_1": {}
"json_2": {}
bash $ echo $?
0
bash $
NOTE: usage of <>k is restricted to JSON elements which have labels/indices. The JSON root does not have any of those, thus attempting to print a label of the root always results in an exception:
bash $ <ab.json jtc -w'<>k'
jtc json exception: walk_root_has_no_label
bash $
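The idea behind the loose schema comparison (compare only the labels/indices of the containers, ignoring the leaf data) can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not jtc's algorithm: the helper names are made up, the record below is an abridged copy of Ivan's entry, and None is used as a stand-in label for the starting element:

```python
def container_labels(node, label=None):
    """Yield the label (or index) of every container (object/array), depth-first."""
    if isinstance(node, dict):
        yield label
        for key in sorted(node):  # objects are visited in label order
            yield from container_labels(node[key], key)
    elif isinstance(node, list):
        yield label
        for index, child in enumerate(node):
            yield from container_labels(child, index)
    # atomic values (leaves) are deliberately ignored


def same_structure(a, b):
    """Loose check: both JSONs hold the same containers under the same labels."""
    return list(container_labels(a)) == list(container_labels(b))


ORIGINAL = {
    "name": "Ivan",
    "children": [],
    "phone": [
        {"number": "273-923-6483", "type": "home"},
        {"number": "223-283-0372", "type": "mobile"},
    ],
}
CHANGED = dict(ORIGINAL, children=["Norma"])

print(ORIGINAL == CHANGED)                # False: the contents differ
print(same_structure(ORIGINAL, CHANGED))  # True: the container layout is the same
```

Like the walk above, this check is loose by design: a leaf that is added, removed or changes its type goes unnoticed.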
ECMA-404 is relaxed w.r.t. the uniqueness of labels:
The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique ...
While JSON RFC 7158 is strict on the label uniqueness:
... The names within an object SHOULD be unique.
jtc
follows the RFC (and so considers JSONs with clashing labels to be ill-formed), because logically labels must provide
an addressing mechanism within objects and non-unique (clashing) labels break the addressing. In case the source JSON holds
duplicate labels, then by default jtc
parses and retains the first label only:
bash $ cat ill.json
{
"label": "first entry",
"label": "second entry"
}
bash $
# parse ill-formed json:
bash $ <ill.json jtc
{
"label": "first entry"
}
bash $
However, sometimes there's a requirement to parse in such ill-formed JSONs and retain all the values. Option -mm
allows
merging the values with clashing labels into a JSON array:
bash $ <ill.json jtc -mm
{
"label": [
"first entry",
"second entry"
]
}
bash $
Note: option -mm will not engulf a single -m; thus, if both behaviors are required, then both have to be provided (e.g.: -mmm)
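Both ways of handling clashing labels (retain the first value only, or merge all values into an array) can be reproduced with Python's standard json module, whose object_pairs_hook parameter receives every name/value pair of an object, duplicates included. This is only a sketch of the two policies; the function names are made up, and note that Python's own default differs (it keeps the last value):

```python
import json

ILL = '{ "label": "first entry", "label": "second entry" }'


def keep_first(pairs):
    """Retain only the first value seen for each label."""
    obj = {}
    for label, value in pairs:
        obj.setdefault(label, value)
    return obj


def merge_clashing(pairs):
    """Collect the values of clashing labels into a list; unique labels stay as-is."""
    obj, clashed = {}, set()
    for label, value in pairs:
        if label not in obj:
            obj[label] = value
        elif label in clashed:
            obj[label].append(value)
        else:
            obj[label] = [obj[label], value]
            clashed.add(label)
    return obj


print(json.loads(ILL, object_pairs_hook=keep_first))
print(json.loads(ILL, object_pairs_hook=merge_clashing))
```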
Both ECMA-404 and RFC 7158 agree that the JSON spec does not assign any significance to the ordering of name/value pairs within objects (unlike arrays, which are ordered sequences); thus, a JSON parser is free to handle it in any preferable way.
jtc, while being relaxed upon parsing in object values coming in any order, will always re-arrange objects by labels sorted in the ascending order (as can be seen in all the outputs in this guide). That provides some benefits when handling JSON manipulation queries, e.g.: the auto-generated tokens, when an object gets interpolated into a template-string, allow operating with object values deterministically; another benefit is that subscripting object elements with numerical indices also becomes more deterministic, etc.
Normally jtc would process only a single input JSON (within a single source). If multiple input JSONs are given (within a single source), the first JSON will be processed and the rest of the inputs will be silently ignored:
bash $ <<<'[ "1st json" ] { "2nd": "json" } "3rd json"' jtc -r
[ "1st json" ]
bash $
A couple of options allow altering this behavior to process all the input JSONs:
Option -a instructs jtc to process each of the input JSONs:
bash $ <<<'[ "1st json" ] { "2nd": "json" } "3rd json"' jtc -ar
[ "1st json" ]
{ "2nd": "json" }
"3rd json"
bash $
- the respective processing (of all given options) will occur for all of the input JSONs:
bash $ <<<'[ "1st json" ] { "2nd": "json" } "3rd json"' jtc -a -w'<json>R'
"1st json"
"json"
"3rd json"
bash $
All the input JSONs will be processed as long as they are valid; processing stops upon a parsing failure:
bash $ <<<'[ "1st json" ] { "2nd": json" } "3rd json"' jtc -ad
.display_opts(), option set[0]: -a -d (internally imposed: )
.init_inputs(), reading json from <stdin>
.write_json(), outputting json to <stdout>
[
"1st json"
]
.exception_locus_(), [ "1st json" ] { "2nd": j
.exception_spot_(), ------------------------>| (offset: 24)
jtc json parsing exception (<stdin>:24): expected_json_value
bash $
The exception locus is only shown up to the violating point and not past it because of the streamed_cin type of read. If the same stream of JSONs was in a file, then the read type would be buffered_file and the entire locus would be shown:
bash $ jtc -ad file.json
.display_opts(), option set[0]: -a -d 'file.json' (internally imposed: )
.init_inputs(), reading json from file-arguments:
.init_inputs(), file argument: file.json
.write_json(), outputting json to <stdout>
[
"1st json"
]
.exception_locus_(), { "2nd": json" } "3rd json"
.exception_spot_(), --------->| (offset: 9)
jtc json parsing exception (file.json:9): expected_json_value
bash $
Another option triggering the processing of all JSONs (from any number of sources) is -J: that option tells jtc that aggregation of all JSONs is required and thus assumes option -a implicitly (no need to give both):
bash $ <<<'[ "1st json" ] { "2nd": "json" } "3rd json"' jtc -J -w[0]
[
"1st json",
"json"
]
bash $
If multiple sources (input file arguments) are given, then there's no need to specify option -a: it's assumed implicitly. Actually, if multiple sources are given, then option -a enforces single-threaded JSON parsing, so use -a:
- with a single input source (stdin, or a single file argument) only when a stream of JSONs has to be processed, or
- with multiple input sources when you would like to enforce single-threaded JSON parsing
option -J
allows wrapping all processed input JSONs into a super JSON array (option -J
assumes option -a
, no need giving both):
bash $ <<<'[ "1st json" ] { "2nd": "json" } "3rd json"' jtc -J -w'<json>R'
[
"1st json",
"json",
"3rd json"
]
bash $
option -J also implicitly imposes -j, thus it can be used safely even with a single JSON at the input, with the same effect.
Though, when walking multiple input JSONs, each of the options has its own effect; this example clarifies:
# process and wrap each input JSON into an array:
bash $ jtc -w'[0][:][name]' -aj ab.json ab.json
[
"John",
"Ivan",
"Jane"
]
[
"John",
"Ivan",
"Jane"
]
bash $
# process all input JSONs and wrap them into an array:
bash $ jtc -w'[0][:][name]' -J ab.json ab.json
[
"John",
"Ivan",
"Jane",
"John",
"Ivan",
"Jane"
]
bash $
# process and wrap each input JSON into an array and then wrap all the processed into a super array:
bash $ jtc -w'[0][:][name]' -Jj ab.json ab.json
[
[
"John",
"Ivan",
"Jane"
],
[
"John",
"Ivan",
"Jane"
]
]
bash $
Note: jtc supports an unlimited number of files that can be supplied via standalone arguments. When multiple input files are given, option -a is assumed implicitly.
jtc may read inputs in 2 ways:
- buffered read
- streamed read
In the buffered read mode (which is default), the entire file (or <stdin>
) input is read into memory and only then JSON parsing is
attempted (with all due subsequent processing).
In the streamed read mode JSON parsing begins immediately as the first character is read (so, no memory is wasted to hold the input literal JSON).
The streamed read is activated when:
- option -a is given AND the input source is <stdin>
The option -J
overrides streamed read (reverting to buffered): the streamed read might be endless, while option -J
assumes a finite number of inputs to be processed and then displayed
From the JSON result point of view there's no difference between buffered and streamed reads - the result will be 100% consistent across both types of reads. However, streamed read finds its application when the streamed data are there (typically would be a network-based streaming)
We can see the difference in the parsing when debugging jtc
:
- in a buffered read mode, the debug will show the parsing point with the data following behind it:
bash $ <ab.json jtc -dddddd
.display_opts(), option set[0]: -d -d -d -d -d -d (internally imposed: )
.init_inputs(), reading json from <stdin>
..ss_init_(), initializing mode: buffered_cin
..ss_init_(), buffer (from <stdin>) size after initialization: 1674
..run_decomposed_optsets(), pass for set[0]
......parse_(), parsing point ->{| "Directory": [| {| "address": {| ...
......parse_(), parsing point ->"Directory": [| {| "address": {| "...
......parse_(), parsing point ->[| {| "address": {| "city": "New Y...
......parse_(), parsing point ->{| "address": {| "city": "New York",| ...
...
- in a streamed read mode, the parsing point would point to the last read character from the <stdin>
:
bash $ <ab.json jtc -dddddd -a
.display_opts(), option set[0]: -d -d -d -d -d -d -a (internally imposed: )
.init_inputs(), reading json from <stdin>
..ss_init_(), initializing mode: streamed_cin
..ss_init_(), buffer (stream) size after initialization: 1
..run_decomposed_optsets(), pass for set[0]
......parse_(), {<- parsing point
......parse_(), {| "<- parsing point
......parse_(), {| "Directory": [<- parsing point
......parse_(), {| "Directory": [| {<- parsing point
......parse_(), {| "Directory": [| {| "<- parsing point
......parse_(), {| "Directory": [| {| "address": {<- parsing point
......parse_(), ..."Directory": [| {| "address": {| "<- parsing point
...
Here's an example of how streamed read works in jtc
:
| Screen 1 | Screen 2 |
| ---------------------------------------------------- | ---------------------------------------------------- |
| bash $ nc -lk localhost 3000 | jtc -ra | bash $ <ab.json jtc -w'<address>l:' | nc localhost 3 |
| { "city": "New York", "postal code": 10012, "state": | 000 |
| "NY", "street address": "599 Lafayette St" } | bash $ |
| { "city": "Seattle", "postal code": 98104, "state": | bash $ <ab.json jtc -w'<name>l:' -w'<name>l:[-1][pho |
| "WA", "street address": "5423 Madison St" } | ne]' | nc localhost 3000 |
| { "city": "Denver", "postal code": 80206, "state": " | bash $ |
| CO", "street address": "6213 E Colfax Ave" } | bash $ <ab.json jtc -w'<name>l:<N>v[-1][children]' - |
| "John" | T'{"Parent":{{N}}, "progeny": {{}} }' | nc localhost |
| [ { "number": "112-555-1234", "type": "mobile" }, { | 3000 |
| "number": "113-123-2368", "type": "mobile" } ] | bash $ |
| "Ivan" | |
| [ { "number": "273-923-6483", "type": "home" }, { "n | |
| umber": "223-283-0372", "type": "mobile" } ] | |
| "Jane" | |
| [ { "number": "358-303-0373", "type": "office" }, { | |
| "number": "333-638-0238", "type": "home" } ] | |
| { "Parent": "John", "progeny": [ "Olivia" ] } | |
| { "Parent": "Ivan", "progeny": [] } | |
| { "Parent": "Jane", "progeny": [ "Robert", "Lila" ] | |
| } | |
| | |
In Screen 1, jtc listens to the stream data coming from the netcat utility and process-prints (in a compact format) all the input JSONs. It would stop once <stdin> is closed, but netcat is run with the -k option, which means it keeps listening endlessly.
In Screen 2, jtc sends to netcat a few walks (JSONs), which netcat relays to its counterpart in Screen 1.
When multiple file arguments are given, jtc by default will read and parse each file in a separate thread (provided a multi-core CPU is available). jtc will spawn only as many threads as there are CPU cores available or as there are file arguments given (whichever number is smaller). E.g., on an 8-core CPU with 10 files given, only up to 8 additional threads will be created.
The advantage of concurrent parsing only becomes noticeable when the JSON files are relatively big, or there are many of them. If there are too many very tiny JSONs, then such processing might be even slower (due to the thread creation overhead) than a single-threaded run.
To disable multithreaded parsing and revert to a single-threaded mode, use option -a (in the initial option set).
Compare:
bash $ # multithreaded input file parsing
bash $ /usr/bin/time jtc -J / -zz big.json big.json
30000033
18.33 real 28.69 user 2.49 sys
bash $
bash $ # single threaded input file parsing
bash $ /usr/bin/time jtc -aJ / -zz big.json big.json
30000033
29.34 real 27.50 user 1.77 sys
bash $
As was mentioned before, jtc performs one major operation at a time: standalone walking, insertion, update, purging, swapping, comparison. There are a number of supplementary operations that might complement the major operations, like: wrapping results into JSON arrays and objects, toggling various viewing and parsing modes, etc.
If multiple major operations are required, one way to achieve it would be piping an output of the prior operation into the input of the next one, e.g:
jtc <insert...> | jtc <swap...> | etc
However, such an approach is quite suboptimal: with every piping operation a serialization (outputting) and deserialization (parsing) of JSON occurs, and those are quite expensive (CPU cycles-wise) operations.
jtc permits chaining multiple operations using the solidus separator /. The above example could be collapsed into this:
jtc <insert...> / <swap...> / etc
without any effect on the result. The sets of all options in between the separators are known as option sets.
The advantage of such an approach is huge: processed JSONs are now passed from one option set to the next one in a compiled (binary) form (no CPU cycles wasted on printing / re-parsing). An additional benefit is that the namespace is now shared across all option sets.
There are a few options (mostly viewing and parsing) which are non-transient and may occur only in the first or in the last option set:
- -r: compact printing - may occur only in the last option set
- -rr: stringifying output JSON - may occur only in the last option set; if such an operation is required in an interim operation, use template stringification instead
- -t: output indentation - may occur only in the last option set
- -q: parse input with a strict solidus quoting - may occur only in the initial option set
- -qq: unquoting JSON strings, jsonizing stringified JSONs - may occur only in the last option set; if such an operation is required in an interim operation, use template jsonizing instead
- -z: additionally printing the size of each walked JSON - may occur only in the last option set
- -zz: printing size instead of JSON - may occur only in the last option set
- -f: forcing (redirecting) outputs into a file - may occur only in the last option set
- -: ensuring input is read from stdin - may occur in any of the option sets, but affects only the first one (where parsing occurs)
In the buffered reading mode the processing of JSON streams occurs per option set. I.e., consider the following syntax:
jtc <option-set1> / <option-set2> file.json
If file.json contains multiple JSONs (a.k.a. a stream of JSONs) and option-set1 carries the -a option, then all JSONs from the file will be processed first in option-set1, then all (walk) outputs are passed to the input of option-set2, where again all JSONs will be processed (provided option-set2 carries -a), and so on and so forth:
bash $ jtc -ar file.json
{ "1": "first JSON" }
{ "2": "second JSON" }
{ "3": "third JSON" }
bash $
bash $ jtc -raw[:] -u'[{{}}];' file.json
{ "1": [ "first JSON" ] }
{ "2": [ "second JSON" ] }
{ "3": [ "third JSON" ] }
bash $
bash $ jtc -aw[:] -u'[{{}}];' / -raw'[:][0]<(\w+) (\w+)>R[-1]' -u'[{{$1}}, {{$2}}];' file.json
{ "1": [ "first", "JSON" ] }
{ "2": [ "second", "JSON" ] }
{ "3": [ "third", "JSON" ] }
bash $
However, in the streamed mode each JSON will be processed through all option sets individually (in the streamed mode it's assumed that the stream of JSONs could be virtually endless); for the same reason the behavior of option -J is reduced to -j (it's impossible to aggregate an endless stream of JSONs).
Thus, if none of the option sets carries the -J option, then the result of the operations should be identical (it might not be identical if there is a namespace dependency in the walks: the difference in processing might result in a discrepancy of the results):
bash $ <file.json jtc -aw[:] -u'[{{}}];' / -raw'[:][0]<(\w+) (\w+)>R[-1]' -u'[{{$1}}, {{$2}}];'
{ "1": [ "first", "JSON" ] }
{ "2": [ "second", "JSON" ] }
{ "3": [ "third", "JSON" ] }
bash $
However, if -J
is present, then no aggregation will happen in the streamed reading mode:
bash $ # buffered reading mode
bash $ jtc -aw[0] / -J file.json
[
"first JSON",
"second JSON",
"third JSON"
]
bash $
bash $ # streamed reading mode
bash $ <file.json jtc -aw[0] / -J -tc
notice: in streamed_cin mode, behavior of option -J is reduced to -j
[ "first JSON" ]
[ "second JSON" ]
[ "third JSON" ]
bash $
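The difference between the two processing orders can be modeled with a few lines of Python. This is a conceptual sketch only: the two stage functions are hypothetical stand-ins for the option sets -aw[0] and -J, and FILE mirrors the three JSONs of file.json:

```python
FILE = [{"1": "first JSON"}, {"2": "second JSON"}, {"3": "third JSON"}]


def walk_first(batch):
    """Stand-in for an option set walking [0]: the first value of each JSON."""
    return [next(iter(doc.values())) for doc in batch]


def aggregate(batch):
    """Stand-in for aggregation: wrap everything the option set receives into one array."""
    return [batch]


def run_buffered(jsons, option_sets):
    """Every option set processes all the JSONs before the next set starts."""
    batch = list(jsons)
    for stage in option_sets:
        batch = stage(batch)
    return batch


def run_streamed(jsons, option_sets):
    """Each JSON travels through all the option sets on its own."""
    for doc in jsons:
        batch = [doc]
        for stage in option_sets:
            batch = stage(batch)
        yield from batch


print(run_buffered(FILE, [walk_first, aggregate]))
print(list(run_streamed(FILE, [walk_first, aggregate])))
```

In the buffered run the aggregating stage sees all three walked values at once, while in the streamed run it only ever sees one value at a time, which is why aggregation degrades to per-JSON wrapping there.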
CSV stands for comma separated values; thus, to convert a JSON into a csv file, it's required to dump all relevant JSON walks line by line, while separating JSON values either with , or with ; (the CSV format admits both).
There are a couple of tricks required to do so, but not difficult ones, so let's walk through it.
Say, we want to dump into csv format the following data from ab.json:
name, city, postal code, state, street address
- First let's build a walk going over all the names (memorizing them) and all the addresses:
bash $ <ab.json jtc -rw'<name>l:<N>v[-1][address]'
{ "city": "New York", "postal code": 10012, "state": "NY", "street address": "599 Lafayette St" }
{ "city": "Seattle", "postal code": 98104, "state": "WA", "street address": "5423 Madison St" }
{ "city": "Denver", "postal code": 80206, "state": "CO", "street address": "6213 E Colfax Ave" }
bash $
- not that difficult
- Now, using template it's possible to arrange all the data in the required format:
bash $ <ab.json jtc -rw'<name>l:<N>v[-1][address]' -qqT'"{N}, {$a}, {$b}, {$c}, {$d}"'
John, New York, 10012, NY, 599 Lafayette St
Ivan, Seattle, 98104, WA, 5423 Madison St
Jane, Denver, 80206, CO, 6213 E Colfax Ave
bash $
The template above is demo'ing iterable auto tokens - those are good when it's required to extract an iterable's members individually, or in a different order. In our case the query's order of items matches the order of the values in the address object, hence it's possible to simplify our template:
bash $ <ab.json jtc -rw'<name>l:<N>v[-1][address]' -qqT'"{N}, {}"'
John, New York, 10012, NY, 599 Lafayette St
Ivan, Seattle, 98104, WA, 5423 Madison St
Jane, Denver, 80206, CO, 6213 E Colfax Ave
bash $
If a header is required, it could be added using the unix echo command:
bash $ hdr='name, city, postal code, state, street address'
bash $ echo -e "$hdr\n$(<ab.json jtc -rw'<name>l:<N>v[-1][address]' -qqT'"{N}, {}"')"
name, city, postal code, state, street address
John, New York, 10012, NY, 599 Lafayette St
Ivan, Seattle, 98104, WA, 5423 Madison St
Jane, Denver, 80206, CO, 6213 E Colfax Ave
bash $
Another way to add a header is to use an additional walk and template with jtc:
bash $ <ab.json jtc -nqq -w' ' -T"\"$hdr\"" -w'<name>l:<N>v[-1][address]' -T'"{N}, {}"'
name, city, postal code, state, street address
John, New York, 10012, NY, 599 Lafayette St
Ivan, Seattle, 98104, WA, 5423 Madison St
Jane, Denver, 80206, CO, 6213 E Colfax Ave
bash $
DONE.
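For comparison, here is a minimal Python sketch of the same conversion using the standard csv module. It is not part of jtc; DIRECTORY is a hand-copied excerpt of ab.json (names and addresses only) and to_csv is a made-up helper. Unlike the template approach, csv.writer emits no space after the separator and quotes values that themselves contain a comma:

```python
import csv
import io

# Excerpt of ab.json: only the fields needed for the CSV
DIRECTORY = [
    {"name": "John", "address": {"city": "New York", "postal code": 10012,
                                 "state": "NY", "street address": "599 Lafayette St"}},
    {"name": "Ivan", "address": {"city": "Seattle", "postal code": 98104,
                                 "state": "WA", "street address": "5423 Madison St"}},
    {"name": "Jane", "address": {"city": "Denver", "postal code": 80206,
                                 "state": "CO", "street address": "6213 E Colfax Ave"}},
]
FIELDS = ["city", "postal code", "state", "street address"]


def to_csv(records):
    """Return a CSV text with a header line and one line per record."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name"] + FIELDS)
    for rec in records:
        writer.writerow([rec["name"]] + [rec["address"][f] for f in FIELDS])
    return out.getvalue()


print(to_csv(DIRECTORY), end="")
```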
A very common query for JSON is to process duplicates. Say, we deal with the following JSON:
bash $ jsn='[ "string", true, null, 3.14, "string", null ]'
bash $ <<<$jsn jtc
[
"string",
true,
null,
3.14,
"string",
null
]
bash $
So, let's purge all the duplicates:
bash $ <<<$jsn jtc -w'<>Q:' -p
[
"string",
true,
null,
3.14
]
bash $
Because switch -p is given, all the duplicate elements will be purged, thus leaving the list with only non-duplicate (unique) elements.
If the JSON structure is as simple as shown, then the same could be achieved differently: walk only unique elements and jsonize the output:
bash $ <<<$jsn jtc -w'><q:' -j
[
"string",
true,
null,
3.14
]
bash $
But there's a reverse action:
bash $ <<<$jsn jtc -w'<>Q:' -pp
[
"string",
null
]
bash $
That one is obvious - we just reversed the prior example.
Another way to achieve the same:
bash $ <<<$jsn jtc -w'><q:' -p
[
"string",
null
]
bash $
How about:
bash $ <<<$jsn jtc -w'<Dup>Q:[^0]<Dup>s:' -p
[
true,
3.14
]
bash $
It's just a tiny bit more complex:
<Dup>Q: - for each duplicate element, we'll memorize it into the Dup namespace, then [^0]<Dup>s: resets the search path back to the root and finds all such elements (i.e., all duplicates).
That way all duplicates (and their origins) will be removed, leaving the array only with those elements which have no duplicates.
and finally
bash $ <<<$jsn jtc -w'<Dup>Q:[^0]<Dup>s:' -pp
[
"string",
null,
"string",
null
]
bash $
it's just a reverse action.
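The four outcomes above can be cross-checked with a small Python sketch. It only models the results for a flat array such as the one used here (the searches in jtc are recursive, which this sketch does not attempt), and all the function names are made up. Values are keyed by their JSON text so that, e.g., true and 1 are not mistaken for equal:

```python
import json
from collections import Counter

ITEMS = json.loads('[ "string", true, null, 3.14, "string", null ]')


def key(value):
    """Canonical JSON text of a value, used as its identity."""
    return json.dumps(value, sort_keys=True)


def split_occurrences(items):
    """Return (first occurrences, repeated occurrences), both in original order."""
    seen, firsts, repeats = set(), [], []
    for value in items:
        if key(value) in seen:
            repeats.append(value)
        else:
            seen.add(key(value))
            firsts.append(value)
    return firsts, repeats


def by_multiplicity(items):
    """Return (elements occurring once, elements occurring more than once)."""
    counts = Counter(key(value) for value in items)
    singles = [v for v in items if counts[key(v)] == 1]
    multiples = [v for v in items if counts[key(v)] > 1]
    return singles, multiples


firsts, repeats = split_occurrences(ITEMS)
singles, multiples = by_multiplicity(ITEMS)
print(firsts)     # duplicates removed
print(repeats)    # only the duplicates
print(singles)    # only elements that have no duplicates
print(multiples)  # all elements that have duplicates, origins included
```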
Counting any number of properties in JSON could be done using the external wc unix utility. E.g., let's count all number entries in ab.json:
bash $ <ab.json jtc -w'<number>l:' | wc -l
6
bash $
However, the same is possible to achieve using jtc's own capability: the <..>I lexeme:
bash $ <ab.json jtc -w'<number>l:<cnt>I1' -T{cnt} -x/-1
6
bash $
- <cnt>I1 will arrange a namespace var cnt counting values starting from 0 with an increment of 1 upon each walk pass (iteration)
- -T{cnt} will interpolate it
- -x/-1 will display the last walk only
Say, now we want to count the same phone numbers, but for some reason starting from the offset 100
:
bash $ <ab.json jtc -w'<number>l:<cnt:100>I1' -T{cnt} -x/-1
106
bash $
Finally, let's count home numbers and mobile numbers separately:
bash $ <ab.json jtc -x'<phone>l:' -y'<home>:<hn>I1' -y'<mobile>:<mn>I1' -T'{"total home numbers":{hn},"total mobile numbers":{mn}}' -x/-1
{
"total home numbers": 2,
"total mobile numbers": 3
}
bash $
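The arithmetic behind these three counts is simple enough to restate in Python. The sketch below is illustrative only: PHONE_TYPES lists the type of each of the six phone records of ab.json in document order, and count_from is a made-up helper that mimics a counter advanced once per walk iteration:

```python
from collections import Counter

# The "type" of each phone record in ab.json, in document order
PHONE_TYPES = ["mobile", "mobile", "home", "mobile", "office", "home"]


def count_from(items, start=0, step=1):
    """Advance a counter by step for every item, beginning at start."""
    cnt = start
    for _ in items:
        cnt += step
    return cnt


print(count_from(PHONE_TYPES))             # 6
print(count_from(PHONE_TYPES, start=100))  # 106

per_type = Counter(PHONE_TYPES)
print({"total home numbers": per_type["home"],
       "total mobile numbers": per_type["mobile"]})
```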
Say, we have a matrix to transpose:
bash $ mtx='[[0,1,2,3,4],["a","b","c","d","e"],[null,true,2,"3",[4]]]'
bash $ <<<$mtx jtc -tc
[
[ 0, 1, 2, 3, 4 ],
[ "a", "b", "c", "d", "e" ],
[
null,
true,
2,
"3",
[ 4 ]
]
]
bash $
We can arrange walking through each slice while memorizing incremental index within the slice, it's trivial:
bash $ <<<$mtx jtc -w'[:][:]<I>k' -jr
[ 0, 1, 2, 3, 4, "a", "b", "c", "d", "e", null, true, 2, "3", [ 4 ] ]
bash $
However, we need to regroup such output for the new, transposed matrix, where columns and rows are swapped.
That could be facilitated if we label each value with its index within the slice (which is the row index in the transposed matrix):
bash $ <<<$mtx jtc -w'[:][:]<I>k' -T'{"{I}":{{}}}' -r
{ "0": 0 }
{ "1": 1 }
{ "2": 2 }
{ "3": 3 }
{ "4": 4 }
{ "0": "a" }
{ "1": "b" }
{ "2": "c" }
{ "3": "d" }
{ "4": "e" }
{ "0": null }
{ "1": true }
{ "2": 2 }
{ "3": "3" }
{ "4": [ 4 ] }
bash $
The last step is to reach out for labels inside each object (-ll
) and then regroup the output per each new group:
bash $ <<<$mtx jtc -w'[:][:]<I>k' -T'{"{I}":{{}}}' -ll / -jw[:] -tc
[
[ 0, "a", null ],
[ 1, "b", true ],
[ 2, "c", 2 ],
[ 3, "d", "3" ],
[
4,
"e",
[ 4 ]
]
]
bash $
DONE.
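The same two steps (tag every value with its index within the row, then regroup the values by that tag) look like this in Python. This is a sketch of the idea rather than of jtc's internals, and transpose is a made-up helper; for rows of equal length the built-in zip(*matrix) gives the same grouping:

```python
import json

MTX = json.loads('[[0,1,2,3,4],["a","b","c","d","e"],[null,true,2,"3",[4]]]')


def transpose(matrix):
    """Group values by their index within each row, then emit the groups in index order."""
    groups = {}
    for row in matrix:
        for index, value in enumerate(row):  # index plays the role of the I namespace
            groups.setdefault(index, []).append(value)
    return [groups[index] for index in sorted(groups)]


for row in transpose(MTX):
    print(row)
```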
Search Lexemes <>g
, <>G
allow sorted walk of entries in an ascending or descending order respectively. Let's sort
entries in the Directory of ab.json
by city
in the ascending order.
First, let's walk city
es in the ascending order:
bash $ <ab.json jtc -w'[city]:<>g:'
"Denver"
"New York"
"Seattle"
bash $
Sorting is achieved with the insert-move operations of the entries in the sorted order:
bash $ <ab.json jtc -w'[Directory]' -pi'[city]:<>g:[-2]' -tc
{
"Directory": [
{
"address": { "city": "Denver", "postal code": 80206, "state": "CO", "street address": "6213 E Colfax Ave" },
"age": 25,
"children": [ "Robert", "Lila" ],
"name": "Jane",
"phone": [
{ "number": "358-303-0373", "type": "office" },
{ "number": "333-638-0238", "type": "home" }
],
"spouse": "Chuck"
},
{
"address": { "city": "New York", "postal code": 10012, "state": "NY", "street address": "599 Lafayette St" },
"age": 25,
"children": [ "Olivia" ],
"name": "John",
"phone": [
{ "number": "112-555-1234", "type": "mobile" },
{ "number": "113-123-2368", "type": "mobile" }
],
"spouse": "Martha"
},
{
"address": { "city": "Seattle", "postal code": 98104, "state": "WA", "street address": "5423 Madison St" },
"age": 31,
"children": [],
"name": "Ivan",
"phone": [
{ "number": "273-923-6483", "type": "home" },
{ "number": "223-283-0372", "type": "mobile" }
],
"spouse": null
}
]
}
bash $
When walking the lexemes <>g, <>G, JSONs are sorted using the following priority resolution order:
null JSON < boolean JSON < numerical JSON < string JSON < JSON array < JSON object
E.g.: an empty JSON object {}
wins (has a better weight) over any array of any size, and so on.
When comparing the same iterable types (comparing atomic values is trivial) the following priority resolution order is applied:
- an iterable of the bigger size wins, otherwise (sizes are the same):
- deeper JSON wins over the shallower one, otherwise (both have the same nestedness):
- compared values child-by-child defines a winner, otherwise (all children values are the same):
- if it's an object then the labels are compared, otherwise (if it's an array, or all the labels are the same):
- JSON values are equal
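The resolution order above can be written down as a comparator. The Python sketch below follows the listed rules literally and should be read as an interpretation, not as jtc's source: it assumes that atomic values of the same type compare naturally (false before true, numbers by value, strings lexicographically), that an object's children are compared in label order, and that nestedness means the maximum depth of containers. All function names are made up:

```python
from functools import cmp_to_key


def cmp(a, b):
    """Three-way comparison: -1, 0 or 1."""
    return (a > b) - (a < b)


def rank(v):
    """null < boolean < number < string < array < object."""
    if v is None:
        return 0
    if isinstance(v, bool):  # must be tested before numbers: bool is an int in Python
        return 1
    if isinstance(v, (int, float)):
        return 2
    if isinstance(v, str):
        return 3
    return 4 if isinstance(v, list) else 5


def children(v):
    """Children of an iterable; an object's children are taken in label order."""
    return [v[k] for k in sorted(v)] if isinstance(v, dict) else v


def depth(v):
    """Nestedness: 0 for atomic values, 1 + the deepest child for iterables."""
    if not isinstance(v, (dict, list)):
        return 0
    return 1 + max((depth(c) for c in children(v)), default=0)


def compare(a, b):
    if rank(a) != rank(b):
        return cmp(rank(a), rank(b))
    if rank(a) < 4:  # atomic values of the same type
        return 0 if a is None else cmp(a, b)
    if len(a) != len(b):  # the bigger iterable wins
        return cmp(len(a), len(b))
    if depth(a) != depth(b):  # the deeper iterable wins
        return cmp(depth(a), depth(b))
    for x, y in zip(children(a), children(b)):  # child-by-child
        result = compare(x, y)
        if result:
            return result
    if isinstance(a, dict):  # finally, the labels
        return cmp(sorted(a), sorted(b))
    return 0


values = [{}, [1, 2, 3], "b", 3.14, None, True, "a"]
print(sorted(values, key=cmp_to_key(compare)))
```

Sorting with cmp_to_key(compare) gives the ascending order; passing reverse=True to sorted gives the descending one.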
Here
you could find some answers using jtc
for JSON queries taken from
stackoverflow.com