[Codec] Finite scalar quantizer #7886
Conversation
jenkins

Force-pushed from 003735a to 78df06d
if len(vq_output_types) == 3 and vq_output_types[-1] == 'commit_loss':
    self.vector_quantizer_has_commit_loss = True
    logging.info('Vector quantizer supports commit loss.')
else:
    self.vector_quantizer_has_commit_loss = False
    logging.info('Vector quantizer does not support commit loss.')
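For illustration, here is how the flag set above might be consumed at the call site. This is a hedged sketch with hypothetical names (`apply_quantizer` and the stand-in quantizers are not the actual NeMo API), showing only the dispatch on the detected output types:

```python
def apply_quantizer(vector_quantizer, has_commit_loss, inputs, input_len,
                    commit_loss_scale=1.0):
    """Dispatch on the capability detected from the quantizer's output types."""
    if has_commit_loss:
        dequantized, indices, commit_loss = vector_quantizer(inputs, input_len)
    else:
        # e.g., FSQ returns only (dequantized, indices)
        dequantized, indices = vector_quantizer(inputs, input_len)
        commit_loss = 0.0
    return dequantized, indices, commit_loss_scale * commit_loss

rvq = lambda x, n: (x, [0] * n, 0.25)   # stand-in quantizer with commit loss
fsq = lambda x, n: (x, [0] * n)         # stand-in quantizer without it
print(apply_quantizer(rvq, True, [1.0], 1))   # ([1.0], [0], 0.25)
print(apply_quantizer(fsq, False, [1.0], 1))  # ([1.0], [0], 0.0)
```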
Would it be simpler to require the VQ to return a commit loss, and modify FSQ to return a commit loss of 0?
We could do that as well; it would let us skip the ~7 lines above. A disadvantage is that commit_loss_scale could still be set even though the quantizer always returns 0.0, which may be confusing.
I implemented something like that earlier: an abstract base class VectorQuantizer with methods forward (including commit_loss), encode and decode, and the same typecheck types.
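A minimal sketch of such a base class, with illustrative names (this is not the actual NeMo interface): every quantizer reports a commit loss, so callers need no capability check, and an FSQ-style subclass simply reports 0.0.

```python
from abc import ABC, abstractmethod

class VectorQuantizerBase(ABC):
    """Hypothetical shared interface for vector quantizers."""

    @abstractmethod
    def forward(self, inputs, input_len):
        """Return (dequantized, indices, commit_loss)."""

    @abstractmethod
    def encode(self, inputs, input_len):
        """Return codebook indices for the inputs."""

    @abstractmethod
    def decode(self, indices, input_len):
        """Return dequantized vectors for the indices."""

class FSQLike(VectorQuantizerBase):
    """FSQ has no commitment loss, so it always reports 0.0."""

    def forward(self, inputs, input_len):
        indices = self.encode(inputs, input_len)
        return inputs, indices, 0.0

    def encode(self, inputs, input_len):
        # Dummy indices, one per input element
        return list(range(len(inputs)))

    def decode(self, indices, input_len):
        return indices
```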
If it is not much extra work, I think having the abstract base class would be better. It does not matter much now with only 2 implementations, but should help long term as we inevitably add more types of quantizers.
Let's merge this PR, and we can factor the quantizer out into an abstract class in a follow-up.
Sounds good?
Sure, sounds good.
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
jenkins
LGTM
Signed-off-by: Chen Cui <chcui@nvidia.com>

* support packed dataset (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* [Codec] Finite scalar quantizer (NVIDIA#7886) (Signed-off-by: Ante Jukić <ajukic@nvidia.com>)
* upgrade to latest mcore and TE (NVIDIA#7908) (Signed-off-by: dimapihtar <dpihtar@gmail.com>)
* Tar codec (NVIDIA#7867)
* added missing torch import (NVIDIA#7913) (Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>)
* add cpu init check (NVIDIA#7889) (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* Fix pinned triton version (NVIDIA#7925) (Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>, Co-authored-by: pre-commit-ci[bot])
* fix tp_overlap config var name (NVIDIA#7928) (Signed-off-by: Xiaowei Ren <xren@nvidia.com>)
* add Dutch P&C FC model info (NVIDIA#7892) (Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>)
* fix issues with convert_nemo_llama_to_hf.py (NVIDIA#7922)
* fix collate_fn bug for TP > 1 (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* make packed dataset work (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* fix nan bug (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* support answer only loss (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* account for padding in cu_seqlens during dataloading for attn kernel (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* fix path for answer_only_loss = false (Signed-off-by: Chen Cui <chcui@nvidia.com>)
* Modify GPTSFTPackedDataset to respond to pad_to_max_length setting (Signed-off-by: Valerie Sarge <vsarge@nvidia.com>)
* Finite scalar quantizer (Signed-off-by: Ante Jukić <ajukic@nvidia.com>)
* Updated test (Signed-off-by: Ante Jukić <ajukic@nvidia.com>)

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
What does this PR do?
This PR adds a finite scalar quantizer (FSQ) as described in
Mentzer et al., "Finite Scalar Quantization: VQ-VAE Made Simple" (https://arxiv.org/abs/2309.15505v1).
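The core idea of FSQ can be sketched in a few lines: each latent dimension is bounded (e.g., via tanh) and rounded to one of a small number of levels, so the codebook is implicit. The NumPy sketch below is my own illustration, not the PR's implementation; it assumes an odd number of levels per dimension (even counts need a half-step offset, as in the paper) and omits the straight-through gradient estimator used in training.

```python
import numpy as np

def fsq_quantize(z, num_levels):
    """Quantize each latent dimension to one of num_levels[i] values."""
    L = np.asarray(num_levels, dtype=np.float64)
    half = (L - 1.0) / 2.0
    bounded = np.tanh(z) * half      # squash dim i into (-half_i, half_i)
    quantized = np.round(bounded)    # snap to the nearest integer level
    return quantized / half          # rescale back into [-1, 1]

def codebook_index(quantized, num_levels):
    """Map per-dimension levels to a single integer code in [0, prod(L))."""
    L = np.asarray(num_levels)
    half = (L - 1) / 2
    digits = np.rint(quantized * half + half).astype(int)  # shift to 0..L_i-1
    base = np.cumprod(np.concatenate(([1], L[:-1])))       # mixed-radix bases
    return int(np.sum(digits * base))

z = np.array([10.0, -10.0])           # saturates tanh at +1 / -1
q = fsq_quantize(z, [3, 5])
print(q)                              # [ 1. -1.]
print(codebook_index(q, [3, 5]))      # 2
```

Because quantization is a fixed rounding rule rather than a learned codebook, FSQ needs no commitment loss, which is what the commit-loss discussion in this PR is about.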
Collection: TTS
Changelog
- FiniteScalarQuantizer
Usage
This can be used as a drop-in replacement for
ResidualVectorQuantizer
, for example using the following configuration. To quantize larger vectors,
GroupFiniteScalarQuantizer
can be used to repeat the quantization pattern several times, for example using the following configuration. This corresponds to FSQ with
num_levels: [3, 4, 3, 4, 3, 4]
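As a quick sanity check (my own illustration, not from the PR), the effective codebook size of an FSQ configuration is the product of the per-dimension level counts:

```python
from math import prod

num_levels = [3, 4, 3, 4, 3, 4]
codebook_size = prod(num_levels)  # each dimension contributes L_i choices
print(codebook_size)  # 1728
```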
.
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs to various areas.
Additional Information