
[master] Add new BERT calibration dataset #1171

Merged
merged 1 commit into from
Jul 12, 2022

Conversation

nvitramble
Contributor

@nvitramble nvitramble commented Jun 30, 2022

The purpose of this change is to:

  1. Provide an alternative calibration dataset for BERT that defines the calibration examples in terms of qas (question-answer) ids in dev-v1.1.json.
  2. Clarify the meaning of the integers in the previous calibration dataset.

The new calibration examples are randomly selected from dev-v1.1.json. Defining the examples in terms of qas ids can be more convenient for some PTQ workflows. Other benchmarks (e.g. resnet50) already provide multiple calibration datasets (see here).
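The random selection described above could be sketched as follows. This is a minimal illustration, assuming the standard SQuAD v1.1 JSON layout (`data` → `paragraphs` → `qas`); the helper names and the tiny in-memory stand-in for dev-v1.1.json are hypothetical, not part of the PR.

```python
import random

def collect_qas_ids(squad_data):
    """Collect every question id ("qas" id) from a SQuAD-style dict."""
    ids = []
    for article in squad_data["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                ids.append(qa["id"])
    return ids

def sample_calibration_ids(squad_data, n, seed=0):
    """Randomly select n qas ids to serve as a calibration set.

    A fixed seed makes the selection reproducible.
    """
    ids = collect_qas_ids(squad_data)
    rng = random.Random(seed)
    return sorted(rng.sample(ids, n))

# Hypothetical miniature stand-in for dev-v1.1.json.
example = {"data": [{"paragraphs": [{"context": "c", "qas": [
    {"id": "q1", "question": "?"},
    {"id": "q2", "question": "?"},
    {"id": "q3", "question": "?"},
]}]}]}
calib = sample_calibration_ids(example, 2, seed=42)
```

Sampling by id rather than by feature index keeps the calibration set stable regardless of how a downstream featurizer orders its output.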

@github-actions

github-actions bot commented Jun 30, 2022

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@nv-ananjappa
Contributor

LGTM. @psyhtest Please review.

@nvitramble nvitramble force-pushed the dev-itramble-bert-calib branch from 1c336c1 to e53d936 Compare July 1, 2022 00:12
@pgmpablo157321
Contributor

@nvitramble Looks OK! If you would like this change to be included in Inference_v2.1, please also make a PR to the r2.1 branch.

@nvitramble nvitramble changed the title Add new BERT calibration dataset [master] Add new BERT calibration dataset Jul 1, 2022
@psyhtest
Contributor

psyhtest commented Jul 5, 2022

Other benchmarks (e.g. resnet50) already provide multiple calibration datasets

ResNet50 is the only benchmark with two calibration datasets, and that is only a historical accident: NVIDIA and Intel proposed their calibration datasets at the same time, so the grudging consensus was to allow submitters to use either.

The new calibration examples are randomly selected from dev-v1.1.json.
Defining the examples in terms of qas ids can be more convenient for some PTQ workflows.

This may well be the case. But why not simply convert the old calibration examples into the new format?

@nvitramble
Contributor Author

nvitramble commented Jul 5, 2022

This may well be the case. But why simply not convert the old calibration examples into the new format?

As discussed in this morning's WG, there is not a 1:1 mapping from (question, context) pairs in dev-v1.1.json to features since contexts get split when len(context + question) > max_seq_len.
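The splitting described here, where one (question, context) pair can yield several features, can be illustrated with a rough feature-count sketch. The window sizes, special-token accounting, and function name are illustrative assumptions, not the reference featurizer.

```python
import math

def num_features(context_len, question_len, max_seq_len=384, doc_stride=128):
    """Estimate how many features one (question, context) pair yields.

    When the tokenized context does not fit in max_seq_len alongside the
    question, it is split into overlapping windows advanced by doc_stride,
    so the qas-to-feature mapping is one-to-many, not 1:1.
    """
    # Reserve room for the question plus [CLS] and two [SEP] specials
    # (BERT-style packing; an assumption for this sketch).
    avail = max_seq_len - question_len - 3
    if context_len <= avail:
        return 1  # short context: a single feature
    # Each extra window covers doc_stride new tokens.
    return 1 + math.ceil((context_len - avail) / doc_stride)
```

Because `num_features` can exceed 1, a list of feature indices cannot be converted back to qas ids without first fixing a particular featurizer.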

And to follow up from the WG: is the current calibration dataset used by any submitters? My understanding is that it is unused; for int8, submitters would use the QAT'ed models (which include quantization scales) here and here.

@nvitramble
Contributor Author

To explain the motivation more, the current calibration dataset assumes a deterministic mapping from dev-v1.1.json question+context pairs to features. Different implementations of the featurizing function may not maintain the same ordering (e.g. if using huggingface). It is more robust to define the calibration examples in terms of unique ids, since that makes no assumptions about the featurizer (e.g. for ImageNet the calibration set is defined in terms of image file names, for LibriSpeech the calibration set is defined in terms of wav file names, etc.).
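The id-based lookup this comment argues for might look like the following sketch. The SQuAD-style dict and the helper names are hypothetical stand-ins for dev-v1.1.json and the submitter's own tooling.

```python
def build_id_index(squad_data):
    """Map each qas id to its (question, context) pair.

    Keying by id makes no assumption about the order in which a
    featurizer (e.g. Hugging Face's) walks the file.
    """
    index = {}
    for article in squad_data["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                index[qa["id"]] = (qa["question"], paragraph["context"])
    return index

def select_calibration_examples(squad_data, calib_ids):
    """Fetch the (question, context) pairs named by a list of qas ids."""
    index = build_id_index(squad_data)
    return [index[i] for i in calib_ids]

# Hypothetical miniature stand-in for dev-v1.1.json.
squad = {"data": [{"paragraphs": [{
    "context": "Paris is the capital of France.",
    "qas": [{"id": "q-0001", "question": "What is the capital of France?"}],
}]}]}
examples = select_calibration_examples(squad, ["q-0001"])
```

Any implementation that resolves the same ids recovers the same calibration examples, which is the robustness property the comment describes.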

@nvitramble nvitramble force-pushed the dev-itramble-bert-calib branch from e53d936 to 28c0d40 Compare July 6, 2022 02:19
@rnaidu02
Contributor

Either one of the calibration sets is allowed.

@rnaidu02 rnaidu02 merged commit f576d52 into mlcommons:master Jul 12, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Jul 12, 2022
@nvpohanh
Contributor

@nvitramble Should we cherry-pick this to r2.1 branch?

@nvpohanh
Contributor

Nvm, it's already done in #1172
