ctakes dependency parser
Dependency parsers provide syntactic information about sentences.
Unlike deep parsers, they do not explicitly find phrases (e.g., NP or VP); rather, they find the dependencies between words.
For example, "hormone replacement therapy" would have deep structure:
(NP (NML (NN hormone) (NN replacement)) (NN therapy))
but its dependency structure would show that "hormone" depends on "replacement" and "replacement" in turn depends on "therapy".
Below, the first column of numbers indicates the ID of the word, and the second number indicates what it is dependent on.
23 hormone hormone NN 24 NMOD
24 replacement replacement NN 25 NMOD
25 therapy therapy NN 22 PMOD
Dependencies can be labeled as well, e.g., we could specify that "hormone" is in a noun-modifier (i.e., NMOD) relationship with "replacement" in the example above (the last column).
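To make the column layout concrete, here is a minimal Java sketch that reads the three example rows and resolves each word's head. It is not part of cTAKES: the class `DependencyRows`, its `Row` type, and the helper names are hypothetical, and the reading of the middle columns as word form, lemma, and part-of-speech tag is an assumption inferred from the example.

```java
import java.util.ArrayList;
import java.util.List;

public class DependencyRows {

    /** One token row. Column meanings beyond ID, head and label are assumed. */
    static final class Row {
        final int id;
        final String form;
        final String lemma;
        final String pos;
        final int head;
        final String label;

        Row(int id, String form, String lemma, String pos, int head, String label) {
            this.id = id;
            this.form = form;
            this.lemma = lemma;
            this.pos = pos;
            this.head = head;
            this.label = label;
        }
    }

    /** Parses one token per line, with columns separated by whitespace. */
    static List<Row> parse(String text) {
        List<Row> rows = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] f = line.trim().split("\\s+");
            rows.add(new Row(Integer.parseInt(f[0]), f[1], f[2], f[3],
                    Integer.parseInt(f[4]), f[5]));
        }
        return rows;
    }

    /** Returns the head word's form, or "<id N>" if the head is not among the rows. */
    static String headForm(List<Row> rows, Row row) {
        for (Row candidate : rows) {
            if (candidate.id == row.head) {
                return candidate.form;
            }
        }
        return "<id " + row.head + ">";
    }

    public static void main(String[] args) {
        String text = "23 hormone hormone NN 24 NMOD\n"
                + "24 replacement replacement NN 25 NMOD\n"
                + "25 therapy therapy NN 22 PMOD\n";
        List<Row> rows = parse(text);
        for (Row row : rows) {
            System.out.println(row.form + " --" + row.label + "--> " + headForm(rows, row));
        }
    }
}
```

Running `main` prints `hormone --NMOD--> replacement`, `replacement --NMOD--> therapy`, and `therapy --PMOD--> <id 22>`. The last head falls outside the excerpt because word 22 is not among the rows shown.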
This project provides an Apache UIMA wrapper and some utilities for ClearParser,
a transition-based dependency parser that achieves state-of-the-art accuracy and speed.
ClearParser is described in:
"K-best, Locally Pruned, Transition-based Dependency Parsing Using Robust Risk Minimization."
Jinho D. Choi, Nicolas Nicolov, Collections of Recent Advances in Natural Language Processing V,
205-216, John Benjamins, Amsterdam & Philadelphia, 2009.
The semantic role labeler assigns the predicate-argument structure of the sentence. (Who did what to whom when and where.)
Collection Readers
Annotation Engines
Output Writers
Utilities
Reads in dependency tree training/test data in a tab-delimited format.
Source class: DependencyFileCollectionReader
Source package: org.apache.ctakes.dependency.parser.cr
Parent class: org.apache.uima.collection.CollectionReader_ImplBase
Products: Base Token, Sentence, Dependency Node
No available configuration parameters.
Reads document texts and annotations from XMI files specified in a provided list.
Source class: XMIReader
Source package: org.apache.ctakes.dependency.parser.ae.util
Parent class: org.apache.uima.fit.component.JCasCollectionReader_ImplBase
Products: Document Id
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
files | The XMI files to be loaded | List | Yes |
Adds Semantic Roles Relations.
Source class: ClearNLPSemanticRoleLabelerAE
Source package: org.apache.ctakes.dependency.parser.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Dependencies: Sentence, Base Token, Dependency Node
Products: Semantic Relation
No available configuration parameters.
Adds Semantic Roles Relations.
Source class: ThreadSafeClearNlpSemRoleLabeler
Source package: org.apache.ctakes.dependency.parser.concurrent
Parent class: org.apache.ctakes.dependency.parser.ae.ClearNLPSemanticRoleLabelerAE
Dependencies: Sentence, Base Token, Dependency Node
Products: Semantic Relation
No available configuration parameters.
Writes information about Dependency Nodes to file.
Source class: DependencyNodeWriter
Source package: org.apache.ctakes.dependency.parser
Parent class: org.apache.uima.collection.CasConsumer_ImplBase
Dependencies: Sentence, Dependency Node
No available configuration parameters.
Analyses Sentence Structure, storing information in nodes.
Source class: ClearNLPDependencyParserAE
Source package: org.apache.ctakes.dependency.parser.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Dependencies: Sentence, Base Token
Products: Dependency Node
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
LemmatizerDataFile | This parameter provides the data file required for the MorphEnAnalyzer. If not specified, this analysis engine will use a default model from the resources directory | String | Yes | org/apache/ctakes/dependency/parser/models/lemmatizer/dictionary-1.3.1.jar |
ParserModelFileName | This parameter provides the file name of the dependency parser model required by the factory method provided by ClearNLPUtil. If not specified, this analysis engine will use a default model from the resources directory | String | Yes | org/apache/ctakes/dependency/parser/models/dependency/mayo-en-dep-1.3.0.jar |
UseLemmatizer | If true, use the default ClearNLP lemmatizer, otherwise use lemmas from the BaseToken normalizedToken field | boolean | Yes | true |
MaxTokens | The maximum length sentence to parse. Longer sentences will have a basic dependency structure created where every node's head is the sentence node. | int | No |
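The `MaxTokens` fallback described in the table can be pictured with a short sketch. This is not the cTAKES implementation: the class `FlatFallback`, its method names, and the use of `0` as the sentence node's ID are assumptions made only for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class FlatFallback {

    /** Assumed ID for the sentence node; the actual representation in cTAKES may differ. */
    static final int SENTENCE_NODE_ID = 0;

    /** True when the sentence is longer than the configured maximum. */
    static boolean exceedsLimit(List<String> tokens, int maxTokens) {
        return tokens.size() > maxTokens;
    }

    /** Builds the flat structure: every token's head is the sentence node. */
    static int[] flatHeads(List<String> tokens) {
        int[] heads = new int[tokens.size()];
        Arrays.fill(heads, SENTENCE_NODE_ID);
        return heads;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("hormone", "replacement", "therapy");
        int maxTokens = 2; // hypothetical limit, chosen so the fallback triggers
        if (exceedsLimit(tokens, maxTokens)) {
            // Too long to parse: attach every token directly to the sentence node.
            System.out.println(Arrays.toString(flatHeads(tokens)));
        } else {
            System.out.println("Sentence is within the limit; the parser would run normally.");
        }
    }
}
```

A flat structure like this still gives every token a Dependency Node, so downstream components that expect one node per token still receive one even when the sentence was too long to parse.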
Analyses Sentence Structure, storing information in nodes.
Source class: ThreadSafeClearNlpDepParser
Source package: org.apache.ctakes.dependency.parser.concurrent
Parent class: org.apache.ctakes.dependency.parser.ae.ClearNLPDependencyParserAE
Dependencies: Sentence, Base Token
Products: Dependency Node
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
LemmatizerDataFile | This parameter provides the data file required for the MorphEnAnalyzer. If not specified, this analysis engine will use a default model from the resources directory | String | Yes | org/apache/ctakes/dependency/parser/models/lemmatizer/dictionary-1.3.1.jar |
ParserModelFileName | This parameter provides the file name of the dependency parser model required by the factory method provided by ClearNLPUtil. If not specified, this analysis engine will use a default model from the resources directory | String | Yes | org/apache/ctakes/dependency/parser/models/dependency/mayo-en-dep-1.3.0.jar |
UseLemmatizer | If true, use the default ClearNLP lemmatizer, otherwise use lemmas from the BaseToken normalizedToken field | boolean | Yes | true |
MaxTokens | The maximum length sentence to parse. Longer sentences will have a basic dependency structure created where every node's head is the sentence node. | int | No |