
Add features to beam search that are supported in other libraries #5205

Closed
danieldeutsch opened this issue May 17, 2021 · 4 comments · Fixed by #5216

@danieldeutsch
Contributor

Is your feature request related to a problem? Please describe.
Other libraries, such as transformers and fairseq, implement several beam search options that AllenNLP lacks and that would be useful to have. For instance:

  1. Minimum-length requirements
  2. Scoring the final sequences by the average log probability per token instead of the summed log probability.
  3. Normalizing the final scores by some length penalty
  4. Blocking repeated n-grams from the output

All 4 of these are used in the BART paper, but the generic AllenNLP beam search code does not support them. I have not run the exact training config for the CNN/DailyMail BART model, but I have run one almost identical to it, and I could not reproduce the BART paper's results without implementing 1-3 myself. 4 is more complicated and I haven't implemented it yet.
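
To make 1-3 concrete, here is a rough, framework-agnostic sketch (plain PyTorch, not AllenNLP's actual API; all function names are hypothetical) of the three operations:

```python
import torch


def enforce_min_length(log_probs: torch.Tensor, end_index: int, timestep: int, min_steps: int) -> torch.Tensor:
    # (1) Block the end-of-sequence token until at least `min_steps` tokens
    # have been generated, forcing the beam to keep extending sequences.
    if timestep < min_steps:
        log_probs[:, end_index] = float("-inf")
    return log_probs


def average_log_prob(summed_log_probs: torch.Tensor, lengths: torch.Tensor) -> torch.Tensor:
    # (2) Re-rank finished sequences by the per-token average log probability
    # rather than the raw sum, which otherwise favors shorter outputs.
    return summed_log_probs / lengths.float()


def length_penalized(summed_log_probs: torch.Tensor, lengths: torch.Tensor, alpha: float = 0.6) -> torch.Tensor:
    # (3) The length penalty from Wu et al. (2016): alpha = 0.0 leaves scores
    # unchanged; larger alpha increasingly favors longer sequences.
    penalty = ((5.0 + lengths.float()) / 6.0) ** alpha
    return summed_log_probs / penalty
```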

Describe the solution you'd like
I think there are workarounds for all 4 problems that could be implemented in the model code, but they all seem generically useful enough to belong in the beam search code.

If you would like, I can create a separate issue for each of the above requests (although I think 2 and 3 should be solved together) and submit my own PRs.

@danieldeutsch
Contributor Author

Looks like #5113 is asking for 3 as well.

@epwalsh
Member

epwalsh commented May 17, 2021

These would all be great! Looks like you've got 1-3 covered (I'll finish reviewing shortly), and I think 4 could be implemented as a Sampler.

@danieldeutsch
Contributor Author

I do think 4 could be a Sampler, but blocking repeated n-grams and picking which sampling technique you use seem like orthogonal decisions to me. I was imagining it could be implemented as an abstract Constraint class that would be able to zero out predictions on each step. Each constraint could have its own state, which, in this case, would keep track of which n-grams have already appeared. It might not be necessary to make it an abstract class, since I can't immediately think of other commonly enforced constraints that would fit this interface (although min_steps could also be implemented this way).
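
A minimal sketch of the interface I have in mind (everything here is hypothetical, not an existing AllenNLP class), with repeated n-gram blocking as the concrete example:

```python
from typing import Any, Dict, List

import torch


class Constraint:
    """Hypothetical base class: mask out disallowed predictions at each step."""

    def init_state(self, batch_size: int) -> List[Dict[str, Any]]:
        raise NotImplementedError

    def apply(self, state: List[Dict[str, Any]], log_probs: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError

    def update_state(self, state: List[Dict[str, Any]], last_prediction: torch.Tensor) -> List[Dict[str, Any]]:
        raise NotImplementedError


class RepeatedNGramBlockingConstraint(Constraint):
    def __init__(self, ngram_size: int) -> None:
        self.ngram_size = ngram_size

    def init_state(self, batch_size: int) -> List[Dict[str, Any]]:
        # Per sequence: the set of n-grams generated so far, plus the current
        # (n-1)-token suffix that any new n-gram would extend.
        return [{"seen_ngrams": set(), "prefix": []} for _ in range(batch_size)]

    def apply(self, state, log_probs):
        for i, beam in enumerate(state):
            prefix = tuple(beam["prefix"])
            for ngram in beam["seen_ngrams"]:
                if ngram[:-1] == prefix:
                    # Choosing ngram[-1] next would repeat a previous n-gram.
                    log_probs[i, ngram[-1]] = float("-inf")
        return log_probs

    def update_state(self, state, last_prediction):
        for i, beam in enumerate(state):
            token = last_prediction[i].item()
            if len(beam["prefix"]) == self.ngram_size - 1:
                beam["seen_ngrams"].add(tuple(beam["prefix"]) + (token,))
            beam["prefix"].append(token)
            if len(beam["prefix"]) > self.ngram_size - 1:
                beam["prefix"].pop(0)
        return state
```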

@epwalsh
Member

epwalsh commented May 17, 2021

Actually I really like this idea of having a Constraint abstract class. This would allow people to use our BeamSearch for constrained generation tasks like semantic parsing, where the generated tokens have to form valid SQL or whatever.
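
As a purely illustrative example of that (the `grammar` object and its `valid_next_tokens` method are assumptions, not anything in AllenNLP), a grammar-based constraint could reuse the Constraint sketch above:

```python
class GrammarConstraint(Constraint):
    # Illustrative only: keep just the tokens the grammar allows next, so
    # every completed sequence is valid by construction (e.g. well-formed SQL).
    def __init__(self, grammar) -> None:
        self.grammar = grammar  # assumed to expose valid_next_tokens(tokens)

    def init_state(self, batch_size):
        return [{"tokens": []} for _ in range(batch_size)]

    def apply(self, state, log_probs):
        mask = torch.full_like(log_probs, float("-inf"))
        for i, beam in enumerate(state):
            allowed = list(self.grammar.valid_next_tokens(beam["tokens"]))
            mask[i, allowed] = 0.0
        return log_probs + mask

    def update_state(self, state, last_prediction):
        for i, beam in enumerate(state):
            beam["tokens"].append(last_prediction[i].item())
        return state
```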
