The code that reads nested lists in rust/arrow/src/json/reader.rs makes an extra copy (via Vec::clone) that caused a 20% slowdown in a benchmark compared to not cloning.
The goal of this ticket is to improve the performance of reading JSON in this case, likely by avoiding the clone.
As [~nevi_me] says:
{quote}
I suspect the main perf loss is from having to peek into JSON values in order to make the nesting work.
By this, I mean that if we have {"a": [_, _, _]}, we extract a's values into a Vec<Value>, i.e. [_, _, _].
By extracting values, we are able to then use the reader to read &[Value] without caring about its key (a).
The downside of this approach is that we have to clone values to get Vec<Value>, as I couldn't find an alternative.
{quote}
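To make the cost concrete, here is a minimal sketch of the two extraction strategies. It is not the code in reader.rs: the `Value` enum is a hypothetical stand-in for a JSON value type, and `extract_cloned` and `extract_borrowed` are illustrative names. Whether the reader could accept borrowed values instead of `&[Value]` is an assumption this ticket would need to verify.

```rust
// Hypothetical stand-in for a JSON value type (not serde_json's `Value`).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Null,
    Number(f64),
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
}

impl Value {
    // Look up `key` in an object and return a borrowed value, if present.
    fn get(&self, key: &str) -> Option<&Value> {
        match self {
            Value::Object(fields) => fields
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v),
            _ => None,
        }
    }
}

// Cloning approach: yields an owned Vec<Value> that can be passed on as
// &[Value], but deep-copies every nested element along the way.
fn extract_cloned(rows: &[Value], key: &str) -> Vec<Value> {
    let mut out = Vec::new();
    for row in rows {
        if let Some(Value::Array(items)) = row.get(key) {
            out.extend(items.iter().cloned());
        }
    }
    out
}

// Borrowing approach: collects references into the original rows, so no
// JSON values are copied. The consumer must then accept &[&Value].
fn extract_borrowed<'a>(rows: &'a [Value], key: &str) -> Vec<&'a Value> {
    let mut out = Vec::new();
    for row in rows {
        if let Some(Value::Array(items)) = row.get(key) {
            out.extend(items.iter());
        }
    }
    out
}
```

The borrowed variant still allocates a Vec of pointers, but it avoids copying strings and nested arrays, which is likely where most of the clone cost comes from.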
Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11002
More details can be found here:
apache/arrow#8938 (review)