feat: Add utility to print summary of a vector #11859
This pull request was exported from Phabricator. Differential Revision: D67229321
@mbasmanova LGTM. Thanks for adding this!
Summary:
Introduce VectorPrinter::summarizeToText helper function to generate a human-friendly summary of a vector.

The summary shows the overall hierarchy of the vector and annotates each node with data type, number of rows, encoding, and size in memory.

Dictionary nodes include the number of unique indices. Array and Map nodes specify the number of empty arrays / maps as well as the min/max/avg sizes of the arrays / maps.

```
ROW(4) 8 rows ROW 528B
   INTEGER 8 rows FLAT 32B
   ARRAY 8 rows ARRAY 288B
         Stats: 3 nulls, 1 empty, sizes: [2...4, avg 3]
      BIGINT 12 rows FLAT 128B
   MAP 8 rows DICTIONARY 192B
         Stats: 0 nulls, 4 unique
      MAP 4 rows MAP 160B
            Stats: 0 nulls, 1 empty, sizes: [1...4, avg 2]
         INTEGER 8 rows FLAT 32B
         REAL 8 rows FLAT 32B
   VARCHAR 8 rows CONSTANT 16B
```

The summary optionally includes unique node IDs to allow for easy referencing.

```
0 ROW(4) 8 rows ROW 528B
   0.0 INTEGER 8 rows FLAT 32B
   0.1 ARRAY 8 rows ARRAY 288B
         Stats: 3 nulls, 1 empty, sizes: [2...4, avg 3]
      0.1.0 BIGINT 12 rows FLAT 128B
   0.2 MAP 8 rows DICTIONARY 192B
         Stats: 0 nulls, 4 unique
      0.2.0 MAP 4 rows MAP 160B
            Stats: 0 nulls, 1 empty, sizes: [1...4, avg 2]
         0.2.0.0 INTEGER 8 rows FLAT 32B
         0.2.0.1 REAL 8 rows FLAT 32B
   0.3 VARCHAR 8 rows CONSTANT 16B
```

The number of RowVector children shown is limited to 5 by default, but the limit can be changed by specifying options.maxChildren:

```
ROW(4) 8 rows ROW 528B
   INTEGER 8 rows FLAT 32B
   ARRAY 8 rows ARRAY 288B
         Stats: 3 nulls, 1 empty, sizes: [2...4, avg 3]
      BIGINT 12 rows FLAT 128B
   ...2 more
```

Reviewed By: xiaoxmeng

Differential Revision: D67229321
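To make the `Stats:` annotation on Array and Map nodes concrete, here is a minimal standalone sketch of how such a line could be derived from per-row sizes. It is not the Velox implementation: `RowSize`, `SizeStats`, `computeSizeStats`, and `formatSizeStats` are hypothetical names, and the choices to exclude null and empty rows from min/max/avg and to truncate the average to an integer are assumptions inferred from the sample output (for example, 12 elements over 4 non-empty arrays gives `avg 3`).

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of an array or map column:
// std::nullopt models a null row, otherwise the value is the element count.
using RowSize = std::optional<int64_t>;

struct SizeStats {
  int64_t numNulls = 0;
  int64_t numEmpty = 0;
  int64_t numNonEmpty = 0;
  int64_t minSize = 0;
  int64_t maxSize = 0;
  int64_t totalSize = 0;
};

// Counts null and empty rows and tracks min / max / total size of the
// remaining (non-null, non-empty) rows. Assumption: empties are excluded
// from the size range, which matches the sample summary.
SizeStats computeSizeStats(const std::vector<RowSize>& rows) {
  SizeStats stats;
  for (const auto& row : rows) {
    if (!row.has_value()) {
      ++stats.numNulls;
      continue;
    }
    if (*row == 0) {
      ++stats.numEmpty;
      continue;
    }
    if (stats.numNonEmpty == 0) {
      stats.minSize = *row;
      stats.maxSize = *row;
    } else {
      stats.minSize = std::min(stats.minSize, *row);
      stats.maxSize = std::max(stats.maxSize, *row);
    }
    ++stats.numNonEmpty;
    stats.totalSize += *row;
  }
  return stats;
}

// Renders the stats in the same shape as the "Stats:" lines of the summary.
// The average uses integer division (an assumption based on the sample).
std::string formatSizeStats(const SizeStats& stats) {
  std::ostringstream out;
  out << "Stats: " << stats.numNulls << " nulls, " << stats.numEmpty
      << " empty";
  if (stats.numNonEmpty > 0) {
    out << ", sizes: [" << stats.minSize << "..." << stats.maxSize << ", avg "
        << stats.totalSize / stats.numNonEmpty << "]";
  }
  return out.str();
}
```

Feeding it eight rows with three nulls, one empty array, and non-empty sizes 2, 4, 3, 3 yields the same text as the ARRAY node in the example. Four map rows with sizes 1, 4, 0, 3 reproduce the inner MAP node's line.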
macos-14 CI failure seems unrelated: #11871
This pull request has been merged in 2d31862.
Summary:
Introduce VectorPrinter::summarizeToText helper function to generate human-friendly summary of a vector.
The summary shows the overall hierarchy of the vector and annotates each node with data type, number of rows, encoding, and size in memory.
Dictionary nodes include number of unique indices. Array and Map nodes specify number of empty arrays / maps as well as min/max/avg sizes of the arrays / maps.
The summary optionally includes unique node IDs to allow for easy referencing.
The number of RowVector children shown is limited to 5 by default, but the limit can be changed by specifying options.maxChildren.
Differential Revision: D67229321
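For readers who want a feel for the node-ID scheme and the child limit described in the summary, the following is a small standalone sketch over a toy tree. It does not use Velox types: `Node`, `SummaryOptions`, `includeNodeIds`, and `summarizeTree` are hypothetical names chosen for illustration (only `maxChildren` appears in the PR text), and unlike the real helper this sketch applies the limit to every node rather than only to RowVector children.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Toy tree node: a label plus child nodes (hypothetical, not a Velox type).
struct Node {
  std::string label;
  std::vector<Node> children;
};

// Hypothetical options. maxChildren mirrors the option named in the PR;
// includeNodeIds is an assumed name for the "optional node IDs" switch.
struct SummaryOptions {
  bool includeNodeIds = false;
  std::size_t maxChildren = 5;
};

// Recursively prints one node per line, indented by depth. A child's ID is
// its parent's ID followed by "." and the child's index, e.g. 0.2.0.1.
void summarizeNode(
    const Node& node,
    const std::string& id,
    int depth,
    const SummaryOptions& options,
    std::ostringstream& out) {
  out << std::string(depth * 3, ' ');
  if (options.includeNodeIds) {
    out << id << " ";
  }
  out << node.label << "\n";

  const std::size_t shown = std::min(node.children.size(), options.maxChildren);
  for (std::size_t i = 0; i < shown; ++i) {
    summarizeNode(
        node.children[i], id + "." + std::to_string(i), depth + 1, options, out);
  }
  // Report how many children were skipped because of the limit.
  if (shown < node.children.size()) {
    out << std::string((depth + 1) * 3, ' ') << "..."
        << (node.children.size() - shown) << " more\n";
  }
}

std::string summarizeTree(const Node& root, const SummaryOptions& options) {
  std::ostringstream out;
  summarizeNode(root, "0", 0, options, out);
  return out.str();
}
```

With a four-child root and `maxChildren` set to 2, the sketch prints the first two children, descends into their subtrees, and ends the root's child list with `...2 more`, which is the same shape as the truncated example in the PR description.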