Spark NLP 4.3.2: Patch release
Overview
Spark NLP 4.3.2 adds S3 support to the training classes that read and load the CoNLL and CoNLL-U formats, support for NER tags without any schema in NerConverter, improved dedicated and self-hosted examples with more guides, and other enhancements and bug fixes!
As always, we would like to thank our community for their feedback, questions, and feature requests.
New Features & Enhancements
- Add S3 support for CoNLL(), POS(), CoNLLU() training classes #13596
- Add support for non-schema NER (I- or B-) tags in NerConverter annotator #13642
- Improve self-hosted examples with better documentation, Docker examples, no broken links, and more #13575
- Improve error handling for validation evaluation in ClassifierDL and MultiClassifierDL trainable annotators #13615
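To make the non-schema NER item concrete, here is a minimal pure-Python sketch of how tagged tokens can be grouped into chunks whether or not the tags carry an I- or B- prefix. It is an illustration only and not NerConverter's actual implementation; the function names `split_tag` and `tags_to_chunks` are hypothetical, and it assumes "O" marks tokens outside any entity.

```python
from typing import List, Optional, Tuple


def split_tag(tag: str) -> Tuple[str, str]:
    """Split a tag into (prefix, label); prefix is "" for non-schema tags."""
    if tag[:2] in ("B-", "I-"):
        return tag[0], tag[2:]
    return "", tag


def tags_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group consecutive tokens that share an entity label into chunks.

    Hypothetical helper: a sketch of the idea, not Spark NLP code.
    """
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the chunk collected so far, if any.
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        prefix, label = split_tag(tag)
        if label == "O":
            # Assumption: "O" means the token is outside any entity.
            flush()
            current_tokens, current_label = [], None
            continue
        # A B- prefix or a change of label starts a new chunk.
        if prefix == "B" or label != current_label:
            flush()
            current_tokens = []
        current_tokens.append(token)
        current_label = label

    flush()
    return chunks


tokens = ["John", "Smith", "lives", "in", "New", "York"]
plain_tags = ["PER", "PER", "O", "O", "LOC", "LOC"]
iob_tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC"]

plain_chunks = tags_to_chunks(tokens, plain_tags)
iob_chunks = tags_to_chunks(tokens, iob_tags)
```

In this sketch both tag styles yield the same two chunks, which is the behaviour the release note describes at a high level: tags without a schema are still converted into entity chunks.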
Bug Fixes
- Fix Date2Chunk and Chunk2Doc annotators' compatibility with PipelineModel #13609
- Fix DependencyParserModel predicting all Chunks as <no-type> #13620
- Remove the calculationsCol parameter from MultiDocumentAssembler in Python, since the parameter doesn't actually exist #13594
Documentation
- Import models from TF Hub & HuggingFace
- Spark NLP Notebooks
- Models Hub with new models
- Spark NLP Articles
- Spark NLP in Action
- Spark NLP Documentation
- Spark NLP Scala APIs
- Spark NLP Python APIs
Community support
- Slack: For live discussion with the Spark NLP community and the team
- GitHub: Bug reports, feature requests, and contributions
- Discussions: Engage with other community members, share ideas, and show off how you use Spark NLP!
- Medium: Spark NLP articles
- YouTube: Spark NLP video tutorials
Installation
Python
#PyPI
pip install spark-nlp==4.3.2
Spark Packages
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x (Scala 2.12):
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.3.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.3.2
GPU
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.3.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.3.2
Apple Silicon (M1 & M2)
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:4.3.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:4.3.2
AArch64
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.3.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.3.2
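The Spark Packages coordinates above all follow one pattern: group ID, artifact name with the Scala version suffix, then the release version. The sketch below builds that string for each variant; the helper name `spark_nlp_coordinate` and the short variant keys are hypothetical conveniences, not part of Spark NLP.

```python
SPARK_NLP_VERSION = "4.3.2"
SCALA_VERSION = "2.12"

# Artifact names as listed in the commands above; the dictionary keys
# are illustrative shorthand chosen for this sketch.
ARTIFACTS = {
    "cpu": "spark-nlp",
    "gpu": "spark-nlp-gpu",
    "silicon": "spark-nlp-silicon",
    "aarch64": "spark-nlp-aarch64",
}


def spark_nlp_coordinate(variant: str = "cpu") -> str:
    """Return the Maven coordinate for one Spark NLP 4.3.2 variant."""
    if variant not in ARTIFACTS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(ARTIFACTS)}"
        )
    artifact = ARTIFACTS[variant]
    return f"com.johnsnowlabs.nlp:{artifact}_{SCALA_VERSION}:{SPARK_NLP_VERSION}"


coordinate = spark_nlp_coordinate("gpu")
```

The resulting string is what `--packages` expects on the command line; when building a session programmatically, the same value can be supplied through Spark's `spark.jars.packages` configuration key.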
Maven
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp_2.12</artifactId>
<version>4.3.2</version>
</dependency>
spark-nlp-gpu:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu_2.12</artifactId>
<version>4.3.2</version>
</dependency>
spark-nlp-silicon:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-silicon_2.12</artifactId>
<version>4.3.2</version>
</dependency>
spark-nlp-aarch64:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-aarch64_2.12</artifactId>
<version>4.3.2</version>
</dependency>
FAT JARs
- CPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-4.3.2.jar
- GPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-4.3.2.jar
- M1 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-silicon-assembly-4.3.2.jar
- AArch64 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-aarch64-assembly-4.3.2.jar
What's Changed
- Updated documentation by @dcecchini in #13556
- Add new demos by @agsfer in #13595
- Update menu Items by @agsfer in #13600
- update release notes by @Cabir40 in #13597
- SPARKNLP-742: Improve Examples Folder by @DevinTDHa in #13575
- Removed parameter calculationsCol by @dcecchini in #13594
- SPARKNLP-88 Adding support for S3 in CoNLL, POS, CoNLLU by @danilojsl in #13596
- Sparknlp 747 Date2Chunk and Chunk2Doc are not in the correct Python module by @maziyarpanahi in #13609
- SPARKNLP-746: Handle empty validation sets by @DevinTDHa in #13615
- SPARKNLP-750 DependencyParserModel Outputs All Chunks as <no-type> by @danilojsl in #13620
- SPARKNLP-786 Add support for non-schema NER tags by @maziyarpanahi in #13642
- Models hub by @maziyarpanahi in #13651
- release/432-release-candidate by @maziyarpanahi in #13648
- Models hub by @maziyarpanahi in #13654
New Contributors
- @ahmet-mesut made their first contribution in #13598
Full Changelog: 4.3.1...4.3.2