Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for mcore generate #9566

Merged
merged 13 commits into from
Jul 8, 2024
Merged

Adding support for mcore generate #9566

merged 13 commits into from
Jul 8, 2024

Conversation

shanmugamr1992
Copy link
Collaborator

What does this PR do ?

Supports generating though megatron core

Collection: NLP

Changelog

  • Added deprecation notice to old generate.
  • Added support for generation using mcore.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@github-actions github-actions bot added the NLP label Jun 28, 2024
Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
# Have to turn off activations_checkpoint_method for inference
try:
model.model.language_model.encoder.activations_checkpoint_method = None
except AttributeError:

Check notice

Code scanning / CodeQL

Empty except Note

'except' clause does nothing but pass and there is no explanatory comment.
self.eod = tokenizer.eod
self.vocab_size = tokenizer.vocab_size

def detokenize(self, tokens):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just add these methods to the TokenizerSpec in nemo so we don't need this wrapper-class?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is very specific to mcore generate, so I didnt want to corrupt the original tokenizer spec class.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are also reworking tokenizers later maybe for mcore 0.9 or the next release, where nemo and mcore will use the same tokenizer. At that point will unify the apis

@shanmugamr1992
Copy link
Collaborator Author

Tested that it works. The output of the old code and the new code are exactlly the same. The old code for tp2pp2 gpt3 43b, takes 15 seconds for generating 100 tokens . The new code takes 8.5 seconds all same settings. The text outputs are exactly the same:
OLD CODE :
TOTAL TIME TAKEN 14.98s


[{'sentences': ["Q: How are you?\n\nA: I'm doing well. How are you?", 'Q: How big is the universe?\n\nA: The observable universe is 93 billion light years in diameter.']
['Q: Explain the concept of linear algebra to a five year old?\n\nA: Linear algebra is a way to solve problems by using numbers and shapes. It helps us understand how things move and change, like how a ball rolls down a hill or how a car turns a corner. We can use linear algebra to make things move in the right direction, without having to try lots of different ways first.', 'Q: Write a poem about the sun and moon.\n\nThe Sun and Moon']

NEW CODE :
TOTAL TIME TAKEN 8.68s
------------- RESULT FOR PROMPT 0 ---------------
{'id': '0', 'input_prompt': 'Q: How are you?', 'generated_text': "\n\nA: I'm doing well. How are you?",

------------- RESULT FOR PROMPT 1 ---------------
{'id': '1', 'input_prompt': 'Q: How big is the universe?', 'generated_text': '\n\nA: The observable universe is 93 billion light years in diameter.',

------------- RESULT FOR PROMPT 2 ---------------
{'id': '2', 'input_prompt': 'Q: Explain the concept of linear algebra to a five year old?', 'generated_text': '\n\nA: Linear algebra is a way to solve problems by using numbers and shapes. It helps us understand how things move and change, like how a ball rolls down a hill or how a car turns a corner. We can use linear algebra to make things move in the right direction, without having to try lots of different ways first.',

------------- RESULT FOR PROMPT 3 ---------------
{'id': '3', 'input_prompt': 'Q: Write a poem about the sun and moon', 'generated_text': '.\n\nThe Sun and Moon',

Copy link
Collaborator

@ericharper ericharper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

Noting here that we still need ci tests and docs for this.

@shanmugamr1992 shanmugamr1992 merged commit aa397d7 into main Jul 8, 2024
115 checks passed
@shanmugamr1992 shanmugamr1992 deleted the mcore_generation branch July 8, 2024 16:52
@ko3n1g ko3n1g mentioned this pull request Jul 18, 2024
2 tasks
ertkonuk pushed a commit that referenced this pull request Jul 19, 2024
* Adding support for mcore generate

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* adding support

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* adding support

---------

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: shanmugamr <shanmugamr@nvidia.com>
Co-authored-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Signed-off-by: Tugrul Konuk <ertkonuk@gmail.com>
tonyjie pushed a commit to tonyjie/NeMo that referenced this pull request Aug 6, 2024
* Adding support for mcore generate

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* adding support

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* adding support

---------

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: shanmugamr <shanmugamr@nvidia.com>
Co-authored-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Signed-off-by: tonyjie <jl4257@cornell.edu>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 5, 2024
* Adding support for mcore generate

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* adding support

* Apply isort and black reformatting

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>

* adding support

---------

Signed-off-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com>
Co-authored-by: shanmugamr <shanmugamr@nvidia.com>
Co-authored-by: shanmugamr1992 <shanmugamr1992@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants