Convert spaCy output to FoLiA XML documents. FoLiA documents are also supported as input.
$ pip install spacy2folia
You also need to download the spaCy models you want to use, for example:
$ python -m spacy download en_core_web_sm
Using the command line tool on an input file named test.txt:
$ spacy2folia --model en_core_web_sm test.txt
This results in a document test.folia.xml in the current working directory.
You can also invoke the command line tool on one or more FoLiA documents as input:
$ spacy2folia --model en_core_web_sm document.folia.xml
The output file is written to the current working directory, so it may overwrite the input if the input is in the same directory!
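One way to avoid overwriting the input is to run the tool from a separate output directory. The following is a minimal sketch, not part of spacy2folia itself: the helper names build_command and run_in_output_dir are hypothetical, and it relies only on the --model option and the write-to-current-directory behaviour described above.

```python
import subprocess
from pathlib import Path


def build_command(inputs, model="en_core_web_sm"):
    """Build the spacy2folia command line (hypothetical helper).

    Input paths are made absolute so they stay valid when the tool
    is started from a different working directory.
    """
    return ["spacy2folia", "--model", model] + [str(Path(p).resolve()) for p in inputs]


def run_in_output_dir(inputs, outdir, model="en_core_web_sm"):
    """Run spacy2folia with outdir as the working directory (hypothetical helper)."""
    outdir = Path(outdir).resolve()
    # Refuse to run if any input lives in the output directory,
    # since the output would then be written next to (or over) it.
    for p in inputs:
        if Path(p).resolve().parent == outdir:
            raise ValueError(f"{p} is inside the output directory {outdir}")
    outdir.mkdir(parents=True, exist_ok=True)
    # Assumes the spacy2folia executable is on PATH.
    subprocess.run(build_command(inputs, model), cwd=outdir, check=True)
```

For example, run_in_output_dir(["corpus/document.folia.xml"], "annotated") would leave the original in corpus/ untouched and place the result in annotated/.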
Usage from Python:
import spacy
from spacy2folia import spacy2folia
text = "Input text goes here"
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
foliadoc = spacy2folia.convert(doc, "example", paragraphs=True)
foliadoc.save("/tmp/output.folia.xml")
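When converting several text files, it can help to load the model once and derive each document ID from the file name. The sketch below assumes that the second argument to convert is the document ID and that it has to be usable as an XML ID (starting with a letter or underscore, no spaces). The helpers make_doc_id and convert_files are hypothetical names, not part of spacy2folia.

```python
import re
from pathlib import Path


def make_doc_id(path):
    """Derive an XML-ID-safe document ID from a file name (hypothetical helper)."""
    stem = Path(path).stem
    # Replace anything other than ASCII letters, digits, '_', '.' and '-'.
    doc_id = re.sub(r"[^A-Za-z0-9_.-]", "_", stem)
    # An XML ID may not start with a digit, '.' or '-'.
    if not re.match(r"[A-Za-z_]", doc_id):
        doc_id = "_" + doc_id
    return doc_id


def convert_files(paths, model="en_core_web_sm", outdir="."):
    """Convert plain-text files to FoLiA, loading the spaCy model only once."""
    # Imported here so make_doc_id stays usable without spaCy installed.
    import spacy
    from spacy2folia import spacy2folia

    nlp = spacy.load(model)
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        doc_id = make_doc_id(path)
        foliadoc = spacy2folia.convert(nlp(text), doc_id, paragraphs=True)
        foliadoc.save(str(Path(outdir) / (doc_id + ".folia.xml")))
```

Note that two input files whose names map to the same ID would write to the same output file, so a real script may want to check for collisions first.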
Usage from Python with FoLiA input:
import spacy
import folia.main as folia
from spacy2folia import spacy2folia
foliadoc = folia.Document(file="/tmp/input.folia.xml")
nlp = spacy.load("en_core_web_sm")
spacy2folia.convert_folia(foliadoc, nlp)
foliadoc.save("/tmp/output.folia.xml")