feat: Add utility to print summary of a vector #11859

Closed

Conversation

mbasmanova
Contributor

Summary:
Introduce VectorPrinter::summarizeToText helper function to generate human-friendly summary of a vector.

The summary shows the overall hierarchy of the vector and annotates each node with data type, number of rows, encoding, and size in memory.

Dictionary nodes include number of unique indices. Array and Map nodes specify number of empty arrays / maps as well as min/max/avg sizes of the arrays / maps.

ROW(4) 8 rows ROW 528B
   INTEGER 8 rows FLAT 32B
   ARRAY 8 rows ARRAY 288B
         Stats: 3 nulls, 1 empty, sizes: [2...4, avg 3]
      BIGINT 12 rows FLAT 128B
   MAP 8 rows DICTIONARY 192B
         Stats: 0 nulls, 4 unique
      MAP 4 rows MAP 160B
            Stats: 0 nulls, 1 empty, sizes: [1...4, avg 2]
         INTEGER 8 rows FLAT 32B
         REAL 8 rows FLAT 32B
   VARCHAR 8 rows CONSTANT 16B
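
As a rough illustration of how the `Stats:` lines above could be derived, here is a minimal standalone sketch. It is not the Velox implementation: `SizeStats`, `computeSizeStats`, `toString`, and `countUniqueIndices` are hypothetical names, and the sketch assumes that min/max/avg cover only non-null, non-empty rows and that the average is truncated to an integer, which is consistent with the sample output (8 rows, 3 nulls, 1 empty, 12 elements, avg 3).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical holder for the numbers printed on a "Stats:" line.
struct SizeStats {
  size_t numNulls = 0;
  size_t numEmpty = 0;
  int32_t minSize = 0;
  int32_t maxSize = 0;
  int32_t avgSize = 0;
};

// Computes stats over per-row array/map sizes. isNull[i] marks a null row.
// Assumption: null and empty rows are counted separately and excluded from
// min/max/avg; the average uses integer (truncating) division.
SizeStats computeSizeStats(
    const std::vector<int32_t>& sizes,
    const std::vector<bool>& isNull) {
  assert(sizes.size() == isNull.size());
  SizeStats stats;
  int64_t total = 0;
  int64_t counted = 0;
  for (size_t i = 0; i < sizes.size(); ++i) {
    if (isNull[i]) {
      ++stats.numNulls;
    } else if (sizes[i] == 0) {
      ++stats.numEmpty;
    } else {
      stats.minSize = counted == 0 ? sizes[i] : std::min(stats.minSize, sizes[i]);
      stats.maxSize = std::max(stats.maxSize, sizes[i]);
      total += sizes[i];
      ++counted;
    }
  }
  if (counted > 0) {
    stats.avgSize = static_cast<int32_t>(total / counted);
  }
  return stats;
}

// Formats the stats in the style of the summary above.
std::string toString(const SizeStats& s) {
  return "Stats: " + std::to_string(s.numNulls) + " nulls, " +
      std::to_string(s.numEmpty) + " empty, sizes: [" +
      std::to_string(s.minSize) + "..." + std::to_string(s.maxSize) +
      ", avg " + std::to_string(s.avgSize) + "]";
}

// Counts distinct dictionary indices. Assumption: null rows were already
// filtered out of 'indices' by the caller.
size_t countUniqueIndices(const std::vector<int32_t>& indices) {
  return std::unordered_set<int32_t>(indices.begin(), indices.end()).size();
}
```

Under these assumptions, eight rows with sizes 2, 3, 3, 4, three nulls, and one empty array would produce the same `Stats:` line as the ARRAY node above, and eight dictionary indices cycling over four values would yield `4 unique`.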

The summary optionally includes unique node IDs to allow for easy referencing.

0 ROW(4) 8 rows ROW 528B
   0.0 INTEGER 8 rows FLAT 32B
   0.1 ARRAY 8 rows ARRAY 288B
         Stats: 3 nulls, 1 empty, sizes: [2...4, avg 3]
      0.1.0 BIGINT 12 rows FLAT 128B
   0.2 MAP 8 rows DICTIONARY 192B
         Stats: 0 nulls, 4 unique
      0.2.0 MAP 4 rows MAP 160B
            Stats: 0 nulls, 1 empty, sizes: [1...4, avg 2]
         0.2.0.0 INTEGER 8 rows FLAT 32B
         0.2.0.1 REAL 8 rows FLAT 32B
   0.3 VARCHAR 8 rows CONSTANT 16B
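
The dotted IDs above read as paths from the root: each child appends its zero-based position to the parent's ID. A minimal sketch of that numbering scheme follows. The `Node` struct and the `render` / `renderWithIds` functions are hypothetical and only mimic the layout shown here (three spaces of indentation per level); they omit the `Stats:` lines and are not taken from the Velox code.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical node of an already-summarized vector hierarchy.
struct Node {
  std::string label; // e.g. "INTEGER 8 rows FLAT 32B"
  std::vector<Node> children;
};

// Writes one line per node, prefixing each label with its path-style ID.
void render(
    const Node& node,
    const std::string& id,
    int depth,
    std::ostringstream& out) {
  assert(!id.empty()); // IDs are built from the root down.
  out << std::string(depth * 3, ' ') << id << ' ' << node.label << '\n';
  for (size_t i = 0; i < node.children.size(); ++i) {
    // A child's ID is the parent's ID plus the child's position.
    render(node.children[i], id + "." + std::to_string(i), depth + 1, out);
  }
}

// Renders the whole tree, giving the root the ID "0".
std::string renderWithIds(const Node& root) {
  std::ostringstream out;
  render(root, "0", 0, out);
  return out.str();
}
```

Because an ID encodes the full path, a reader can locate node `0.2.0.1` without counting indentation levels, which is presumably what makes the IDs convenient for referencing.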

The number of RowVector children shown is limited to 5, but can be increased by specifying options.maxChildren:

ROW(4) 8 rows ROW 528B
   INTEGER 8 rows FLAT 32B
   ARRAY 8 rows ARRAY 288B
         Stats: 3 nulls, 1 empty, sizes: [2...4, avg 3]
      BIGINT 12 rows FLAT 128B
   ...2 more
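
The truncation shown above (two children printed, then `...2 more`) can be sketched as follows. This is an illustration only: `limitChildren` is a hypothetical helper operating on pre-rendered child labels, and its `maxChildren` parameter merely mirrors the role of `options.maxChildren` described in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns at most maxChildren child lines, followed by a "...N more" marker
// when some children were left out.
std::vector<std::string> limitChildren(
    const std::vector<std::string>& children,
    size_t maxChildren) {
  assert(maxChildren > 0); // A zero limit would hide every child.
  const size_t shown = std::min(children.size(), maxChildren);
  std::vector<std::string> lines(children.begin(), children.begin() + shown);
  if (shown < children.size()) {
    lines.push_back("..." + std::to_string(children.size() - shown) + " more");
  }
  return lines;
}
```

With four children and a limit of 2, the sketch keeps the first two labels and appends `...2 more`; with a limit of 5 or higher, all four labels are returned and no marker is added.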

Differential Revision: D67229321

facebook-github-bot added the CLA Signed label Dec 14, 2024
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D67229321

netlify bot commented Dec 14, 2024

Deploy Preview for meta-velox canceled.

Latest commit: 74bf4d6
Latest deploy log: https://app.netlify.com/sites/meta-velox/deploys/676015b9d963020008c04eab

Contributor

@xiaoxmeng xiaoxmeng left a comment

@mbasmanova LGTM. Thanks for adding this!

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Dec 14, 2024
Reviewed By: xiaoxmeng

Differential Revision: D67229321

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Dec 14, 2024

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Dec 16, 2024

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Dec 16, 2024

mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Dec 16, 2024

@mbasmanova
Contributor Author

mbasmanova commented Dec 16, 2024

macos-14 CI failure seems unrelated: #11871

@facebook-github-bot
Contributor

This pull request has been merged in 2d31862.

Labels: CLA Signed, fb-exported, Merged
3 participants