
v1.1.0

Released by @adrianeboyd · 18 Oct 13:10 · commit 1f1606c

✨ New features and improvements

  • Refactor and improve transformer serialization for better support of inline transformer components and of replacing listeners.
  • Provide the transformer model output as ModelOutput instead of tuples in TransformerData.model_output and FullTransformerBatch.model_output. For backwards compatibility, the tuple format remains available under TransformerData.tensors and FullTransformerBatch.tensors. See the transformer API docs for more details.
  • Add support for transformer_config settings such as output_attentions. Any additional output is stored under TransformerData.model_output (see the sketch after this list). More details are in the TransformerModel docs.
  • Add support for mixed-precision training.
  • Improve training speed by streamlining allocations for tokenizer output.
  • Extend support for transformers up to v4.11.x.
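
As a rough illustration of the new settings (a sketch, not taken from the release itself; the model name, span getter values, and the printed keys are assumptions), transformer_config entries are passed through to the underlying transformers model, and mixed_precision toggles mixed-precision training:

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "transformer",
    config={
        "model": {
            "@architectures": "spacy-transformers.TransformerModel.v3",
            # Placeholder model and span getter choices for this sketch.
            "name": "distilbert-base-uncased",
            "tokenizer_config": {"use_fast": True},
            "get_spans": {
                "@span_getters": "spacy-transformers.strided_spans.v1",
                "window": 128,
                "stride": 96,
            },
            # New in v1.1: settings forwarded to the transformers model config.
            "transformer_config": {"output_attentions": True},
            # New in v1.1: set to True to train with mixed precision on GPU.
            "mixed_precision": False,
        }
    },
)
nlp.initialize()

doc = nlp("Extra outputs such as attention weights end up in model_output.")
# With output_attentions=True, attention weights should appear alongside the
# usual hidden states in this doc's ModelOutput.
print(list(doc._.trf_data.model_output.keys()))
```

The same settings can equally be placed in the [components.transformer.model] block of a training config.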

🔴 Bug fixes

  • Fix support for GPT2 models.

⚠️ Backwards incompatibilities

  • The serialization format for transformer components has changed in v1.1 and is not fully compatible with spacy-transformers v1.0.x: pipelines trained with v1.0.x can be loaded with v1.1.x, but pipelines saved with v1.1.x cannot be loaded with v1.0.x.
  • TransformerData.tensors and FullTransformerBatch.tensors return a tuple instead of a list (see the example below).
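
For code migrating from v1.0.x, a minimal sketch of the two access paths (the pipeline name and the printed keys are assumptions):

```python
import spacy

# Assumes a pipeline with a "transformer" component is installed,
# e.g. a trained package such as en_core_web_trf.
nlp = spacy.load("en_core_web_trf")
doc = nlp("The tuple format is kept for backwards compatibility.")
trf_data = doc._.trf_data

# v1.1: the full transformers ModelOutput with named fields.
print(list(trf_data.model_output.keys()))  # e.g. ['last_hidden_state', 'pooler_output']

# Backwards-compatible view: now a tuple rather than a list, so indexing
# still works, but code that mutated the old list in place will not.
print(type(trf_data.tensors))     # <class 'tuple'>
print(trf_data.tensors[0].shape)  # last hidden state, one batch row per span
```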

👥 Contributors

@adrianeboyd, @bryant1410, @danieldk, @honnibal, @ines, @KennethEnevoldsen, @svlandeg