BLEURT: Learning Robust Metrics for Text Generation
https://github.com/google-research/bleurt
The BLEURT class can be instantiated with the checkpoints provided by the original repository. See here for the list.
The corresponding model names are "bleurt-{tiny,base,large}-{128,512}" and should be passed to the constructor of the class.
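The brace pattern above expands to six model names. As a minimal sketch (the helper `bleurt_model_names` is hypothetical and not part of the library), the full list can be enumerated like this:

```python
from itertools import product

# Name components taken from the pattern "bleurt-{tiny,base,large}-{128,512}".
SIZES = ("tiny", "base", "large")
SUFFIXES = (128, 512)


def bleurt_model_names():
    """Return every name matched by the brace pattern (hypothetical helper)."""
    return [f"bleurt-{size}-{suffix}" for size, suffix in product(SIZES, SUFFIXES)]


names = bleurt_model_names()
for name in names:
    print(name)
```

Any one of these strings would be the value passed as the `model` argument, provided the corresponding checkpoint was downloaded when the image was built.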
- BLEURT
- Description: A learned evaluation metric for natural language generation
- Name: sellam2020-bleurt
- Usage:
    from repro.models.sellam2020 import BLEURT

    model = BLEURT(model="bleurt-base-128")
    inputs = [
        {"candidate": "The candidate text", "references": ["The reference"]}
    ]
    scores = model.predict_batch(inputs)
BLEURT only supports single references (the argument to references should be a list of length 1).
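If your data has several references per candidate, one possible workaround is to split each input into single-reference inputs before scoring. The sketch below is not part of the library: `expand_to_single_references` is a hypothetical helper, and it makes no assumption about the return format of `predict_batch`. How the per-pair scores are combined afterwards (for example, by taking the maximum or the mean) is left to the caller.

```python
def expand_to_single_references(inputs):
    """Split multi-reference inputs into one input per (candidate, reference) pair.

    Hypothetical helper. Returns the expanded inputs together with a parallel
    list that records, for each expanded input, the index of the original
    input it came from.
    """
    expanded = []
    owners = []  # owners[i] is the index of the original input for expanded[i]
    for index, item in enumerate(inputs):
        for reference in item["references"]:
            expanded.append(
                {"candidate": item["candidate"], "references": [reference]}
            )
            owners.append(index)
    return expanded, owners


inputs = [
    {
        "candidate": "The candidate text",
        "references": ["The reference", "Another reference"],
    },
    {"candidate": "A second candidate", "references": ["Its only reference"]},
]
expanded, owners = expand_to_single_references(inputs)
```

Each element of `expanded` now has a reference list of length 1, so it satisfies the constraint above, and `owners` lets you group the resulting scores back by original input.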
- Image name: sellam2020
- Build command:
    repro setup sellam2020 \
      [--not-tiny-128] \
      [--not-base-128] \
      [--tiny-512] \
      [--base-512] \
      [--large-128] \
      [--large-512] \
      [--silent]
  The arguments specify which BLEURT models should be downloaded. Both bleurt-tiny-128 and bleurt-base-128 are downloaded by default.
- Requires network: No
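The build flags can be read as opt-outs for the two default models and opt-ins for the other four. The sketch below only illustrates that reading; it is not the actual setup code. `models_to_download` is a hypothetical function, and it assumes that `--silent` affects output only and does not change which models are fetched.

```python
# Models fetched when no flags are given (stated in the build description).
DEFAULT_MODELS = {"bleurt-tiny-128", "bleurt-base-128"}

# Assumed meaning of each flag, inferred from the flag names.
OPT_OUT_FLAGS = {
    "--not-tiny-128": "bleurt-tiny-128",
    "--not-base-128": "bleurt-base-128",
}
OPT_IN_FLAGS = {
    "--tiny-512": "bleurt-tiny-512",
    "--base-512": "bleurt-base-512",
    "--large-128": "bleurt-large-128",
    "--large-512": "bleurt-large-512",
}


def models_to_download(flags):
    """Return the sorted model names implied by a list of build flags."""
    models = set(DEFAULT_MODELS)
    for flag in flags:
        if flag in OPT_OUT_FLAGS:
            models.discard(OPT_OUT_FLAGS[flag])
        elif flag in OPT_IN_FLAGS:
            models.add(OPT_IN_FLAGS[flag])
        elif flag != "--silent":  # assumed not to affect the model selection
            raise ValueError(f"Unknown flag: {flag}")
    return sorted(models)


print(models_to_download([]))
print(models_to_download(["--not-tiny-128", "--large-512"]))
```

Under this reading, a model name passed to the constructor only works if the build included the matching flag (or the model is one of the two defaults and was not opted out).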
To run the unit tests for this model, build the image and then invoke pytest:
    repro setup sellam2020
    pytest models/sellam2020/tests
- Regression unit tests pass
- Correctness unit tests pass
  The unit tests are based on examples in the official repository. See here.
- Model runs on full test dataset
  Not tested
- Predictions approximately replicate results reported in the paper
  Not tested
- Predictions exactly replicate results reported in the paper
  Not tested