Improvements to verbose mode output #2839

ghost · 2021-11-01T17:45:12Z

Draft PR, still in progress

When completed, this will close #2834.

facebook-github-bot · 2021-11-01T17:45:15Z

Hi @Svetlitski-FB!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

lib/zstd.h

lib/decompress/zstd_decompress.c

Cyan4973 · 2021-11-01T18:01:54Z

Generally speaking, for a CLI feature, we try to keep all code modifications at the CLI level only (directory programs/).
Library territory (directory lib/) is supposed to be off-limits, except in special circumstances where the feature is really not possible or too inconvenient without additional library support.

Cyan4973 · 2021-11-01T18:17:45Z

This PR delivers a feature which displays the window size of the frame currently being decompressed when the CLI has received verbosity level 4 or more (aka command -vv). A few points:

This is different from the feature requested in "zstd verbose output could give more info" (#2834), which asks for this information at compression time instead. That doesn't make the feature bad, just different from the request, so it cannot be used to close the aforementioned issue.
In #2834, the request is to imitate xz -vv during compression. However, there is nothing equivalent during decompression: unxz -vv doesn't provide any information regarding window size. While this is not necessarily a blocker, it makes the goal harder to define: how should the feature behave, since there is no model to just copy?
Here comes the kicker: while many files consist of a single frame, some files consist of multiple appended frames. Each frame is entitled to its own window size. How does the feature work in this case? Define it, and test this more complex case if you plan to deliver this feature.
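To make the multi-frame question concrete, here is a minimal sketch in C. It decodes a Window_Descriptor byte the way the Zstandard frame format lays it out (5-bit exponent, 3-bit mantissa) and then applies one possible policy for multi-frame files: report the largest window seen. The function names are hypothetical and this is not code from the PR. The sketch also assumes every frame carries a Window_Descriptor, which is not the case for single-segment frames.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a Window_Descriptor byte: the top 5 bits hold the exponent,
 * the low 3 bits hold the mantissa. Hypothetical helper for illustration. */
static uint64_t window_size_from_descriptor(uint8_t descriptor)
{
    unsigned const exponent = descriptor >> 3;
    unsigned const mantissa = descriptor & 7;
    uint64_t const windowBase = (uint64_t)1 << (10 + exponent);
    uint64_t const windowAdd = (windowBase / 8) * mantissa;
    return windowBase + windowAdd;
}

/* One possible policy (an assumption, not the PR's behavior):
 * for a file made of several frames, report the largest window size,
 * since that bounds what the decompressor must be able to hold. */
static uint64_t max_window_size(const uint8_t* descriptors, size_t nbFrames)
{
    uint64_t maxSize = 0;
    for (size_t i = 0; i < nbFrames; i++) {
        uint64_t const size = window_size_from_descriptor(descriptors[i]);
        if (size > maxSize) maxSize = size;
    }
    return maxSize;
}
```

Reporting each frame's window separately would be an equally valid definition; the point is that the behavior has to be chosen and tested explicitly.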

ghost · 2021-11-01T18:30:46Z

Thanks for the clarification, and for bringing that more complex case to light; I was definitely unaware of that possibility. I do believe this feature was requested, though (the first bullet point in the issue says: "Decompression memory requirements (rough number, it's not precise)").

Cyan4973 · 2021-11-01T18:37:13Z

It's possible to know the decompression memory requirement at compression time.
That's actually useful, because a user creating the compressed document might realize that they are demanding too many resources from the decompression side for later access to said document, and decide to alter the parameters.

It also corresponds to the information delivered by xz -vv (third line). Example (xz -vv enwik8):

xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limiter is disabled.
xz: Decompression will need 9 MiB of memory.
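A rough sketch of what an equivalent report could look like on the zstd side, written in C. It assumes the compression window log is already known and treats the window size as a rough lower bound on decompression memory; the real requirement includes additional overhead. The function name, message wording, and parameters are hypothetical and do not reflect the code merged in this PR.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not the zstd CLI's actual code.
 * Writes a rough decompression-memory notice into buf when
 * verbosity >= 4 (the level reached with -vv).
 * Returns the number of characters written, or 0 if nothing is reported. */
static int format_decompression_notice(char* buf, size_t capacity,
                                       int verbosity, unsigned windowLog)
{
    if (buf == NULL || capacity == 0) return 0;
    buf[0] = '\0';
    if (verbosity < 4 || windowLog > 62) return 0;

    unsigned long long const windowSize = 1ULL << windowLog;
    int written;
    if (windowSize >= (1ULL << 20)) {
        written = snprintf(buf, capacity,
            "Decompression will need at least %llu MiB of memory (window size)",
            windowSize >> 20);
    } else {
        /* Round up so small windows never print as 0 KiB. */
        written = snprintf(buf, capacity,
            "Decompression will need at least %llu KiB of memory (window size)",
            (windowSize + 1023) >> 10);
    }
    if (written < 0 || (size_t)written >= capacity) {
        buf[0] = '\0';   /* treat truncation as "nothing reported" */
        return 0;
    }
    return written;
}
```

A caller would typically pass the resulting string to fprintf(stderr, ...) so the notice does not mix with compressed data written to stdout.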

ghost · 2021-11-01T18:38:44Z

Understood, sorry for the misunderstanding.

Cyan4973 · 2021-11-01T20:49:18Z

The updated PR looks good to me.

We could work on merging this one, then you can continue the topic by delivering more value in another PR.

programs/fileio.c

ghost · 2021-11-01T21:36:41Z

Sure, or I could add more self-contained commits to this branch to reduce the number of merge commits that end up on dev and thereby avoid polluting the git history. Whichever you prefer.

Cyan4973 · 2021-11-01T21:50:47Z

Whichever method you select is fine with me.

If the delay between two consecutive commits is short, it seems better to keep stacking commits in the same PR.
But if the delay between consecutive commits grows too long, it becomes better to merge what is ready and leave yourself free to resume the work later, without the pressure to "not forget" previous work.

ghost · 2021-11-01T21:55:40Z

Alright, in that case if I can get the next change in by the end of the day today I'll keep it in this PR, and otherwise I'll start a new one.

facebook-github-bot · 2021-11-01T22:33:03Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

Cyan4973

one last type mismatch to fix

programs/fileio.c

Cyan4973 · 2021-11-05T01:26:21Z

Thanks @Svetlitski-FB !

Cyan4973 requested changes Nov 1, 2021

lib/zstd.h (outdated)

lib/decompress/zstd_decompress.c (outdated)

ghost force-pushed the improve-verbose-output branch from 237cd51 to fc957a9, November 1, 2021 20:31

Cyan4973 reviewed Nov 1, 2021

programs/fileio.c (outdated)

ghost force-pushed the improve-verbose-output branch from fc957a9 to 86e4ad1, November 1, 2021 21:54

facebook-github-bot added the CLA Signed label Nov 1, 2021

Cyan4973 requested changes Nov 2, 2021

programs/fileio.c (outdated)

ghost force-pushed the improve-verbose-output branch from f230f80 to 3ddd1a0, November 4, 2021 23:13

Report memory required to decompress while compressing in verbose mode

b388819

ghost force-pushed the improve-verbose-output branch 2 times, most recently from 5767632 to b388819, November 4, 2021 23:27

Cyan4973 marked this pull request as ready for review November 4, 2021 23:53

Cyan4973 approved these changes Nov 5, 2021

Cyan4973 merged commit 5375d75 into facebook:dev Nov 5, 2021
