New sub, gsub, and ssub verbs #1361

Merged 4 commits on Aug 19, 2023
39 changes: 33 additions & 6 deletions docs/src/manpage.md
@@ -194,12 +194,13 @@ MILLER(1) MILLER(1)
VERB LIST
altkv bar bootstrap case cat check clean-whitespace count-distinct count
count-similar cut decimate fill-down fill-empty filter flatten format-values
-fraction gap grep group-by group-like having-fields head histogram json-parse
-json-stringify join label latin1-to-utf8 least-frequent merge-fields
-most-frequent nest nothing put regularize remove-empty-columns rename reorder
-repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
-sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unspace unsparsify
+fraction gap grep group-by group-like gsub having-fields head histogram
+json-parse json-stringify join label latin1-to-utf8 least-frequent
+merge-fields most-frequent nest nothing put regularize remove-empty-columns
+rename reorder repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle
+skip-trivial-records sort sort-within-records split ssub stats1 stats2 step
+sub summary tac tail tee template top utf8-to-latin1 unflatten uniq unspace
+unsparsify

FUNCTION LIST
abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -1245,6 +1246,15 @@ MILLER(1) MILLER(1)
Options:
-h|--help Show this message.

gsub
Usage: mlr gsub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and handling multiple matches, like the `gsub` DSL function.
See also the `sub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

having-fields
Usage: mlr having-fields [options]
Conditionally passes through records depending on each record's field names.
@@ -1853,6 +1863,14 @@ MILLER(1) MILLER(1)

See also the "tee" DSL function which lets you do more ad-hoc customization.

ssub
Usage: mlr ssub [options]
Replaces old string with new string in specified field(s), without regex support for
the old string, like the `ssub` DSL function. See also the `gsub` and `sub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

stats1
Usage: mlr stats1 [options]
Computes univariate statistics for one or more given fields, accumulated across
@@ -1990,6 +2008,15 @@ MILLER(1) MILLER(1)
https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
for more information on EWMA.

sub
Usage: mlr sub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and not handling multiple matches, like the `sub` DSL function.
See also the `gsub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

summary
Usage: mlr summary [options]
Show summary statistics about the input data.
39 changes: 33 additions & 6 deletions docs/src/manpage.txt
@@ -173,12 +173,13 @@ MILLER(1) MILLER(1)
VERB LIST
altkv bar bootstrap case cat check clean-whitespace count-distinct count
count-similar cut decimate fill-down fill-empty filter flatten format-values
-fraction gap grep group-by group-like having-fields head histogram json-parse
-json-stringify join label latin1-to-utf8 least-frequent merge-fields
-most-frequent nest nothing put regularize remove-empty-columns rename reorder
-repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
-sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unspace unsparsify
+fraction gap grep group-by group-like gsub having-fields head histogram
+json-parse json-stringify join label latin1-to-utf8 least-frequent
+merge-fields most-frequent nest nothing put regularize remove-empty-columns
+rename reorder repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle
+skip-trivial-records sort sort-within-records split ssub stats1 stats2 step
+sub summary tac tail tee template top utf8-to-latin1 unflatten uniq unspace
+unsparsify

FUNCTION LIST
abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -1224,6 +1225,15 @@ MILLER(1) MILLER(1)
Options:
-h|--help Show this message.

gsub
Usage: mlr gsub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and handling multiple matches, like the `gsub` DSL function.
See also the `sub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

having-fields
Usage: mlr having-fields [options]
Conditionally passes through records depending on each record's field names.
@@ -1832,6 +1842,14 @@ MILLER(1) MILLER(1)

See also the "tee" DSL function which lets you do more ad-hoc customization.

ssub
Usage: mlr ssub [options]
Replaces old string with new string in specified field(s), without regex support for
the old string, like the `ssub` DSL function. See also the `gsub` and `sub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

stats1
Usage: mlr stats1 [options]
Computes univariate statistics for one or more given fields, accumulated across
@@ -1969,6 +1987,15 @@ MILLER(1) MILLER(1)
https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
for more information on EWMA.

sub
Usage: mlr sub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and not handling multiple matches, like the `sub` DSL function.
See also the `gsub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

summary
Usage: mlr summary [options]
Show summary statistics about the input data.
146 changes: 146 additions & 0 deletions docs/src/reference-verbs.md
@@ -1447,6 +1447,55 @@ record_count resource
150 /path/to/second/file
</pre>

## gsub

<pre class="pre-highlight-in-pair">
<b>mlr gsub -h</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Usage: mlr gsub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and handling multiple matches, like the `gsub` DSL function.
See also the `sub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXlow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXlow circXe true 8 73 63.9785 4.2370
example.csv yeXlow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXXow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXXow circXe true 8 73 63.9785 4.2370
example.csv yeXXow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

## having-fields

<pre class="pre-highlight-in-pair">
@@ -3120,6 +3169,54 @@ then there will be split_yellow_triangle.csv, split_yellow_square.csv, etc.
See also the "tee" DSL function which lets you do more ad-hoc customization.
</pre>

## ssub

<pre class="pre-highlight-in-pair">
<b>mlr ssub -h</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Usage: mlr ssub [options]
Replaces old string with new string in specified field(s), without regex support for
the old string, like the `ssub` DSL function. See also the `gsub` and `sub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then sub -f filename . o</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
oxample.csv yellow triangle true 1 11 43.6498 9.8870
oxample.csv red square true 2 15 79.2778 0.0130
oxample.csv red circle true 3 16 13.8103 2.9010
oxample.csv red square false 4 48 77.5542 7.4670
oxample.csv purple triangle false 5 51 81.2290 8.5910
oxample.csv red square false 6 64 77.1991 9.5310
oxample.csv purple triangle false 7 65 80.1405 5.8240
oxample.csv yellow circle true 8 73 63.9785 4.2370
oxample.csv yellow circle true 9 87 63.5058 8.3350
oxample.csv purple square false 10 91 72.3735 8.2430
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then ssub -f filename . o</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
exampleocsv yellow triangle true 1 11 43.6498 9.8870
exampleocsv red square true 2 15 79.2778 0.0130
exampleocsv red circle true 3 16 13.8103 2.9010
exampleocsv red square false 4 48 77.5542 7.4670
exampleocsv purple triangle false 5 51 81.2290 8.5910
exampleocsv red square false 6 64 77.1991 9.5310
exampleocsv purple triangle false 7 65 80.1405 5.8240
exampleocsv yellow circle true 8 73 63.9785 4.2370
exampleocsv yellow circle true 9 87 63.5058 8.3350
exampleocsv purple square false 10 91 72.3735 8.2430
</pre>
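
The contrast above comes from how the old string is interpreted: as a regex, `.` matches any character, so the first character is replaced; as a literal, `.` matches only the dot. A minimal Go sketch of that distinction follows, assuming first-match-only replacement in both cases. The helper names `regexSubFirst` and `literalSubFirst` are hypothetical and this is not Miller's own code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// regexSubFirst treats pattern as a regular expression and replaces the
// first match. regexp.MustCompile panics on an invalid pattern, which is
// acceptable for a sketch. (Hypothetical helper.)
func regexSubFirst(input, pattern, replacement string) string {
	re := regexp.MustCompile(pattern)
	loc := re.FindStringIndex(input)
	if loc == nil {
		return input
	}
	return input[:loc[0]] + replacement + input[loc[1]:]
}

// literalSubFirst treats old as plain text with no special characters and
// replaces its first occurrence. (Hypothetical helper.)
func literalSubFirst(input, old, replacement string) string {
	return strings.Replace(input, old, replacement, 1)
}

func main() {
	fmt.Println(regexSubFirst("example.csv", ".", "o"))   // oxample.csv
	fmt.Println(literalSubFirst("example.csv", ".", "o")) // exampleocsv
}
```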

## stats1

<pre class="pre-highlight-in-pair">
@@ -3574,6 +3671,55 @@ $ each 10 uptime | mlr -p step -a delta -f 11

</pre>

## sub

<pre class="pre-highlight-in-pair">
<b>mlr sub -h</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Usage: mlr sub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and not handling multiple matches, like the `sub` DSL function.
See also the `gsub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXlow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXlow circXe true 8 73 63.9785 4.2370
example.csv yeXlow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXXow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXXow circXe true 8 73 63.9785 4.2370
example.csv yeXXow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

## summary

<pre class="pre-highlight-in-pair">
42 changes: 42 additions & 0 deletions docs/src/reference-verbs.md.in
@@ -487,6 +487,20 @@ GENMD-RUN-COMMAND
mlr --opprint group-like data/het.dkvp
GENMD-EOF

## gsub

GENMD-RUN-COMMAND
mlr gsub -h
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X
GENMD-EOF

## having-fields

GENMD-RUN-COMMAND
@@ -987,6 +1001,20 @@ GENMD-RUN-COMMAND
mlr split --help
GENMD-EOF

## ssub

GENMD-RUN-COMMAND
mlr ssub -h
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then sub -f filename . o
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then ssub -f filename . o
GENMD-EOF

## stats1

GENMD-RUN-COMMAND
@@ -1095,6 +1123,20 @@ Example deriving uptime-delta from system uptime:

GENMD-INCLUDE-ESCAPED(data/ping-delta-example.txt)

## sub

GENMD-RUN-COMMAND
mlr sub -h
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X
GENMD-EOF

## summary

GENMD-RUN-COMMAND
3 changes: 3 additions & 0 deletions internal/pkg/transformers/aaa_transformer_table.go
@@ -33,6 +33,7 @@ var TRANSFORMER_LOOKUP_TABLE = []TransformerSetup{
GrepSetup,
GroupBySetup,
GroupLikeSetup,
GsubSetup,
HavingFieldsSetup,
HeadSetup,
HistogramSetup,
@@ -62,9 +63,11 @@ var TRANSFORMER_LOOKUP_TABLE = []TransformerSetup{
SortSetup,
SortWithinRecordsSetup,
SplitSetup,
SsubSetup,
Stats1Setup,
Stats2Setup,
StepSetup,
SubSetup,
SummarySetup,
TacSetup,
TailSetup,