Improvements to verbose mode output #2839

Merged (1 commit) on Nov 5, 2021

@ghost ghost commented Nov 1, 2021

Draft PR, still in progress

When completed, this will close #2834.

@facebook-github-bot

Hi @Svetlitski-FB!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

Review comment on lib/zstd.h (outdated, resolved)
Review comment on lib/decompress/zstd_decompress.c (outdated, resolved)
@Cyan4973
Contributor

Cyan4973 commented Nov 1, 2021

Generally speaking, for a CLI feature, we try to keep all code modifications at the CLI level only (directory programs/).
Library territory (directory lib/) is supposed to be off-limits, except in special circumstances where the feature is not possible, or too inconvenient, without additional library support.

@Cyan4973
Contributor

Cyan4973 commented Nov 1, 2021

This PR delivers a feature which displays the window size of the currently decompressed frame if the CLI has received verbosity level 4 or more (aka command -vv). A few points:

  • This differs from the feature requested in "zstd verbose output could give more info" (#2834), which asks for this information at compression time instead. That doesn't make the feature bad, just different from the request, so it cannot be used to close the aforementioned issue.
  • In #2834, the request is to imitate xz -vv during compression. However, there is nothing equivalent during decompression: unxz -vv doesn't provide any information about window size. While this is not necessarily a blocker, it introduces a risk around goal definition: how should the feature behave, since there is no model to copy?
  • Here comes the kicker: while many files consist of a single frame, some files consist of multiple appended frames. Each frame is entitled to its own window size. How does the feature work in this case? Define it, and test this more complex case if you plan to deliver this feature.
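
To make the multi-frame question concrete, here is a minimal sketch and not the code from this PR. It assumes the frame layout described in RFC 8878, where a frame that is not single-segment carries a one-byte Window_Descriptor (5-bit exponent, 3-bit mantissa). The function names and the "report the largest window" policy are hypothetical, chosen only for illustration; single-segment frames, which have no Window_Descriptor, are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a Window_Descriptor byte as laid out in RFC 8878:
 * bits 7-3 hold the exponent, bits 2-0 hold the mantissa. */
static uint64_t window_size_from_descriptor(uint8_t descriptor)
{
    unsigned const exponent = descriptor >> 3;
    unsigned const mantissa = descriptor & 7;
    uint64_t const base = (uint64_t)1 << (10 + exponent);  /* windowLog = 10 + exponent */
    return base + (base / 8) * mantissa;
}

/* Hypothetical policy for files made of several appended frames:
 * report the largest window among them. Takes one descriptor byte per frame. */
static uint64_t largest_window(const uint8_t *descriptors, size_t nbFrames)
{
    uint64_t largest = 0;
    size_t i;
    assert(descriptors != NULL || nbFrames == 0);
    for (i = 0; i < nbFrames; i++) {
        uint64_t const w = window_size_from_descriptor(descriptors[i]);
        if (w > largest) largest = w;
    }
    return largest;
}
```

Reporting one value per frame is an equally valid definition; the point is that the behavior has to be chosen explicitly and then tested on a multi-frame file.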

@ghost
Author

ghost commented Nov 1, 2021

Thanks for the clarification, and for bringing that more complex case to light; I was definitely unaware of that possibility. I do believe this feature was requested though (the first bullet point in the issue says: "Decompression memory requirements (rough number, it's not precise)").

@Cyan4973
Contributor

Cyan4973 commented Nov 1, 2021

I do believe this feature was requested though (the first bulletpoint in the issue says: "Decompression memory requirements (rough number, it's not precise)").

It's possible to know the decompression memory requirement at compression time.
That's actually useful, because a user creating the compressed document might realize that they are demanding too many resources from the decompression side for later access to that document, and decide to alter the parameters.

It also corresponds to the information delivered by xz -vv (third line). Example (xz -vv enwik8):

xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limiter is disabled.
xz: Decompression will need 9 MiB of memory.
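
As a rough illustration of why this number is available at compression time: the window size is fixed by the compression parameters, and it dominates the decoder's buffers. The sketch below is an assumption-laden approximation and not the figure zstd itself computes. The function names are hypothetical, and the "window plus two blocks" formula ignores the decoder's fixed context overhead. The 128 KiB block cap comes from the zstd format. For a real estimate, the library's experimental (static-linking-only) API provides ZSTD_estimateDStreamSize().

```c
#include <assert.h>
#include <stdint.h>

/* Maximum block size allowed by the zstd format (128 KiB). */
#define BLOCK_SIZE_MAX ((uint64_t)128 * 1024)

/* Hypothetical rough estimate: the window buffer plus one input block and one
 * output block. Deliberately ignores fixed decoder-context overhead. */
static uint64_t rough_decompression_memory(unsigned windowLog)
{
    assert(windowLog >= 10 && windowLog <= 41);  /* range representable in the frame format */
    uint64_t const windowSize = (uint64_t)1 << windowLog;
    uint64_t const blockSize = windowSize < BLOCK_SIZE_MAX ? windowSize : BLOCK_SIZE_MAX;
    return windowSize + 2 * blockSize;
}

/* Round a byte count up to whole MiB for display. */
static uint64_t to_mib_rounded_up(uint64_t bytes)
{
    return (bytes + (1u << 20) - 1) >> 20;
}
```

A compressor front end could print `to_mib_rounded_up(rough_decompression_memory(windowLog))` in verbose mode, as long as the value is presented as approximate.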

@ghost
Author

ghost commented Nov 1, 2021

Understood, sorry for the misunderstanding.

@ghost ghost force-pushed the improve-verbose-output branch from 237cd51 to fc957a9 Compare November 1, 2021 20:31
@Cyan4973
Contributor

Cyan4973 commented Nov 1, 2021

The updated PR looks good to me.

We could work on merging this one, then you can continue the topic by delivering more value in another PR.

Review comment on programs/fileio.c (outdated, resolved)
@ghost
Author

ghost commented Nov 1, 2021

Sure, or I could add more self-contained commits to this branch to reduce the number of merge commits that end up on dev and thereby avoid polluting the git history. Whichever you prefer.

@Cyan4973
Contributor

Cyan4973 commented Nov 1, 2021

Whichever method you select is fine with me.

If the delay between two consecutive commits is short, it seems better to keep stacking commits in the same PR.
But if the delay between consecutive commits grows too long, it becomes better to merge what is ready to be merged, leaving you free to resume the work later without the pressure to "not forget" previous work.

@ghost ghost force-pushed the improve-verbose-output branch from fc957a9 to 86e4ad1 Compare November 1, 2021 21:54
@ghost
Author

ghost commented Nov 1, 2021

Alright. In that case, if I can get the next change in by the end of the day today, I'll keep it in this PR; otherwise I'll start a new one.

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

Contributor

@Cyan4973 Cyan4973 left a comment

one last type mismatch to fix

Review comment on programs/fileio.c (outdated, resolved)
@ghost ghost force-pushed the improve-verbose-output branch from f230f80 to 3ddd1a0 Compare November 4, 2021 23:13
@ghost ghost force-pushed the improve-verbose-output branch 2 times, most recently from 5767632 to b388819 Compare November 4, 2021 23:27
@Cyan4973 Cyan4973 marked this pull request as ready for review November 4, 2021 23:53
@Cyan4973
Contributor

Cyan4973 commented Nov 5, 2021

Thanks @Svetlitski-FB!

@Cyan4973 Cyan4973 merged commit 5375d75 into facebook:dev Nov 5, 2021
Successfully merging this pull request may close these issues.

zstd verbose output could give more info
3 participants