Fixes #70 (implicitly), #23. May also have an impact on the
"high memory usage" issues but I'm doing more testing there.
Adds a `-a`, `-all` flag which means "decode all the objects, pretending it's a JSON stream even if it's not actually."
Rationale: `gron` only decodes the first object, and `gron -s`
requires a "correctly" formatted JSON stream (one object per
line), but it's not uncommon to get multiple objects per line
from tools that don't support JSON stream formatting.
This does require a positionable stream, however, since the
JSON decoder can read past the end of an object to be sure it's
parsed correctly. `io.Seeker` doesn't work, unfortunately,
because whilst we know where we want to be (`d.InputOffset()`),
we don't actually know where we currently are, which precludes
the use of `io.SeekCurrent`, and, bizarrely, it turns out that
`io.SeekStart` gets progressively slower as you seek further and
further into your (in this case) `bytes.Buffer`. Thus we keep
track of where we want to be (`moved`) and create a
`bytes.NewReader` for each attempted decode at the correct
position. Crufty, definitely, and memory-allocation heavy,
probably, but it works and is surprisingly not that bad even
on large files.
My test 85MB JSON single line input takes ~64s (x86_64),
~43s (arm64) and ~275M to parse into 1024 objects comprising
1GB of output text. Compare to `jq`: ~25s (x86_64), ~11s (arm64)
using ~630M giving 350MB of output.