This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

End-to-end regression test between metaseq and huggingface #136

Closed
shijie-wu opened this issue Jun 3, 2022 · 8 comments
Labels: question (Further information is requested), test-coverage (Improving our test coverage)

shijie-wu commented Jun 3, 2022

We are interested in a full regression test between metaseq and huggingface. As @patrickvonplaten confirmed, the current regression test doesn't cover the merging step. We observed different behaviors between models that require merging (1.3B) and the one that doesn't (350M) (huggingface/transformers#17545). It's unclear what causes the difference at the moment due to the lack of an end-to-end regression test.

To run the full regression test, we would need to load the model into metaseq directly. However, this is not well supported currently (e.g. #78 and #31). It would be great if people at Meta could help perform a full regression test between metaseq and huggingface 😄

@suchenzang @stephenroller

@shijie-wu shijie-wu added the question Further information is requested label Jun 3, 2022
patrickvonplaten (Contributor) commented

@shijie-wu could you specify a bit what is meant by

"We observed different behaviors between models that require merging and the ones that don't"

What exactly is different?

Thanks!

shijie-wu (Author) commented

> @shijie-wu could you specify a bit what is meant by
>
> "We observed different behaviors between models that require merging and the ones that don't"
>
> What exactly is different?
>
> Thanks!

Hi @patrickvonplaten

I'm referring to huggingface/transformers#17545


patrickvonplaten commented Jun 8, 2022

@shijie-wu, could you by any chance reproduce the result using the metaseq code base - or, the opposite, show that this doesn't happen with the metaseq codebase? I think this could really help in finding out if there is a bug :-)

shijie-wu (Author) commented

> @shijie-wu, could you by any chance reproduce the result using the metaseq code base - or, the opposite, show that this doesn't happen with the metaseq codebase? I think this could really help in finding out if there is a bug :-)

@patrickvonplaten I tried to load OPT-1.3B into metaseq but unfortunately it doesn't work (error message). I would love to conduct the end-to-end regression test once I figure out what went wrong. Similar issues (#78 and #31) have been raised before regarding loading checkpoints into metaseq.

patrickvonplaten (Contributor) commented

Ah yeah I see - could you try to run this snippet first: #88 (comment)

It should put all the sharded checkpoints into one :-)
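For context, a minimal sketch of what consolidating model-parallel shards involves (the actual snippet lives in the #88 comment linked above; the function and parameter names here are illustrative, not metaseq APIs). Model-parallel checkpoints split each weight across ranks, so merging means concatenating each parameter's shards back together along its partitioned axis:

```python
# Hypothetical illustration of merging model-parallel (MP) checkpoint shards.
# Tensors are represented as nested Python lists to keep the sketch dependency-free.

def merge_mp_shards(shards, partition_axis):
    """Concatenate per-rank parameter shards back into full weights.

    shards: list of dicts (one per MP rank) mapping param name -> 2D list.
    partition_axis: dict mapping param name -> 0 (row-parallel) or 1 (column-parallel).
    """
    merged = {}
    for name in shards[0]:
        parts = [s[name] for s in shards]
        if partition_axis[name] == 0:
            # Row-parallel: stack the shards along the first dimension.
            merged[name] = [row for part in parts for row in part]
        else:
            # Column-parallel: concatenate each row across the shards.
            merged[name] = [sum((part[i] for part in parts), [])
                            for i in range(len(parts[0]))]
    return merged

# Toy example: two MP ranks, each holding one row of a 2x2 weight.
shards = [
    {"w": [[1, 2]]},   # rank 0
    {"w": [[3, 4]]},   # rank 1
]
print(merge_mp_shards(shards, {"w": 0}))  # {'w': [[1, 2], [3, 4]]}
```

The real consolidation script also has to handle which parameters are replicated (not sharded) across ranks, which this toy sketch omits.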

punitkoura (Contributor) commented

@patrickvonplaten @shijie-wu I added an end-to-end integration test between metaseq and HF in this PR #214

The test confirms that the logits and predicted words for

1. the model-parallel checkpoint,
2. the metaseq singleton checkpoint, and
3. the HF checkpoint

are the same. Let me know if this test contains what you were looking for. Thanks!
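As a rough sketch of the kind of check such a test performs (illustrative only; see PR #214 for the actual test code, which runs the real models), the idea is to take next-token logits from each of the three checkpoints and assert that they agree within a small tolerance and produce the same argmax prediction:

```python
# Hedged sketch of an end-to-end logits comparison; the variable values are
# toy stand-ins for real model outputs, not data from the actual test.
import math

def logits_match(a, b, tol=1e-4):
    """True if two logit vectors agree element-wise within tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )

def argmax(logits):
    """Index of the largest logit, i.e. the predicted token id."""
    return max(range(len(logits)), key=logits.__getitem__)

# Toy next-token logits from the three checkpoint variants.
mp_logits        = [0.10, 2.30, -1.20]      # model-parallel checkpoint
singleton_logits = [0.10, 2.30, -1.20]      # metaseq singleton checkpoint
hf_logits        = [0.10003, 2.29998, -1.20001]  # HF checkpoint

for other in (singleton_logits, hf_logits):
    assert logits_match(mp_logits, other)
    assert argmax(mp_logits) == argmax(other)
print("all three checkpoints agree")  # prints when every comparison passes
```

Comparing both the raw logits (within tolerance, since fp16 round-trips introduce small differences) and the argmax prediction is what distinguishes an end-to-end check from a weight-by-weight diff.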

patrickvonplaten (Contributor) commented

Wow awesome thanks a lot @punitkoura !

@suchenzang suchenzang added the test-coverage Improving our test coverage label Oct 6, 2022
suchenzang (Contributor) commented

Completed in #214 - thanks @punitkoura !!
