cltl/SpaCy-to-NAF

spacy-to-naf is a spaCy wrapper that converts text or NAF input to NAF. The converter always extracts a tokenized text layer, and can additionally extract terms, deps, entities, and chunks layers.

Installation

Install spaCy and spacy-to-naf:

pip install spacy
pip install spacy-to-naf

Download a spaCy model, e.g. 'en_core_web_sm':

python -m spacy download en_core_web_sm

Usage

Specify the spaCy model and the NAF layers to create (the text layer is always created).

from spacy_to_naf.converter import Converter
converter = Converter('en_core_web_sm', add_terms=True, add_deps=True, add_entities=True, add_chunks=True)

The input may be a text string, a directory of text files, or a directory of NAF files.

Text input

To convert text to a file 'example.naf' in the current directory:

text = "The cat sat on the mat. Felix was his name."
naf = converter.run(text, 'example.naf', '.')

The converter additionally returns a NafParser object for further processing.
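
Because NAF is an XML format, a written file can also be inspected with Python's standard library, independently of the returned NafParser object. The following is a minimal sketch under stated assumptions: the sample document is a hand-written, simplified illustration of a NAF text layer (a `text` element holding `wf` word forms with `id` and `sent` attributes), not actual converter output, and `tokens_by_sentence` is a hypothetical helper, not part of spacy-to-naf.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written, simplified NAF fragment for illustration only (assumption);
# real converter output also contains a header and further attributes.
SAMPLE_NAF = """<NAF xml:lang="en">
  <text>
    <wf id="w1" sent="1">The</wf>
    <wf id="w2" sent="1">cat</wf>
    <wf id="w3" sent="1">sat</wf>
    <wf id="w4" sent="2">Felix</wf>
    <wf id="w5" sent="2">was</wf>
  </text>
</NAF>"""


def tokens_by_sentence(root):
    """Group the text of all <wf> elements by their 'sent' attribute."""
    sentences = defaultdict(list)
    for wf in root.iter("wf"):
        sentences[wf.get("sent")].append(wf.text)
    return dict(sentences)


root = ET.fromstring(SAMPLE_NAF)
sentences = tokens_by_sentence(root)
print(sentences)
```

To apply the same helper to a file on disk, `ET.parse(path).getroot()` would replace the `ET.fromstring` call, where `path` points at the NAF file that was written.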

Processing files

To process text files from a 'text_in' directory to a 'naf_out' directory:

converter.convert_text_files('text_in', 'naf_out')

Note that input text files are expected to end in '.txt'.
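
Given the '.txt' requirement, it may help to check an input directory before converting it. The sketch below uses only the standard library; `split_by_extension` is a hypothetical helper, and the case-sensitive suffix comparison is an assumption about how the converter matches file names.

```python
import tempfile
from pathlib import Path


def split_by_extension(directory, extension=".txt"):
    """Return (matching, other) file names in `directory`, each sorted."""
    matching, other = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories
        # Case-sensitive comparison (assumption about the converter's matching).
        if path.suffix == extension:
            matching.append(path.name)
        else:
            other.append(path.name)
    return matching, other


# Demonstrate on a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "notes.md"):
        (Path(tmp) / name).write_text("The cat sat on the mat.", encoding="utf-8")
    matching, other = split_by_extension(tmp)

print("would be converted:", matching)
print("would need renaming:", other)
```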

To process NAF files from 'naf_in' to 'naf_out':

converter.convert_naf_files('naf_in', 'naf_out')

Each output file carries the same name as its input file, apart from the extension.
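
The naming rule can be sketched as a small path computation. `expected_output_path` is a hypothetical helper for illustration, and the '.naf' output extension is an assumption based on the 'example.naf' file name used above; the source does not state the extension the converter writes.

```python
from pathlib import Path


def expected_output_path(input_path, output_dir, extension=".naf"):
    """Keep the input file's base name and swap its extension.

    The default '.naf' extension is an assumption, not confirmed by the source.
    """
    return Path(output_dir) / (Path(input_path).stem + extension)


out = expected_output_path("text_in/report.txt", "naf_out")
print(out.as_posix())  # naf_out/report.naf
```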
