Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML

nicolay-r/AREkit

AREkit 0.25.1

AREkit (Attitude and Relation Extraction Toolkit) is a Python toolkit for document-level attitude and relation extraction between text objects in mass-media news.

Description

This toolkit aims at memory-efficient data processing in Relation Extraction (RE) related tasks.

Figure: AREkit pipeline design. For more details, see the paper "ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction".

In particular, this framework provides the following features:

  • pipelines and iterators for serializing large-scale collections without out-of-memory issues,
  • 🔗 EL (entity-linking) API support for objects,
  • ➰ avoidance of cyclic connections,
  • 📏 distance consideration between relation participants (measured in terms or sentences),
  • 📑 relation annotation and filtering rules,
  • *️⃣ entity formatting or masking, and more.
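The iterator-based approach in the first item can be illustrated with plain Python generators. This is a minimal sketch and not the AREkit API: `read_documents`, `split_sentences`, and `run_pipeline` are hypothetical names chosen for illustration, and the sentence splitting is deliberately naive.

```python
from typing import Callable, Iterable, Iterator, List


def read_documents(texts: Iterable[str]) -> Iterator[str]:
    """Yield documents one at a time (hypothetical stage, not AREkit code)."""
    for text in texts:
        yield text


def split_sentences(docs: Iterable[str]) -> Iterator[str]:
    """Naively split each document on periods and yield non-empty sentences."""
    for doc in docs:
        for sentence in doc.split("."):
            sentence = sentence.strip()
            if sentence:
                yield sentence


def run_pipeline(source: Iterable, stages: List[Callable[[Iterable], Iterator]]) -> Iterable:
    """Chain lazy stages; nothing is computed until the result is iterated."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


if __name__ == "__main__":
    corpus = ["A b. C d.", "E f."]
    for sentence in run_pipeline(corpus, [read_documents, split_sentences]):
        print(sentence)
```

Because every stage is a generator, only one item is in flight at a time, so memory use does not grow with the size of the collection.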

The core functionality includes:

  • API for document presentation with EL (Entity Linking, i.e. Object Synonymy) support for preparing sentence-level relations (referred to as contexts);
  • API for context extraction;
  • Transfer of relations from the sentence level to the document level, and more.
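To make these ideas concrete, the sketch below shows one way context candidates could be formed under a distance limit, and how sentence-level labels could be aggregated per document. It does not use AREkit: the `Entity` class, `candidate_pairs`, and `to_document_level` are hypothetical, and majority voting is an assumed aggregation rule, not necessarily the one the toolkit applies.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import permutations
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Entity:
    value: str     # surface form in the text
    group: int     # hypothetical entity-linking id shared by synonyms
    position: int  # index of the term within the sentence


def candidate_pairs(entities: List[Entity], max_distance: int) -> Iterator[Tuple[Entity, Entity]]:
    """Yield ordered (source, target) pairs that pass two simple filters."""
    for source, target in permutations(entities, 2):
        if source.group == target.group:
            continue  # synonyms of the same object: skip the cyclic connection
        if abs(source.position - target.position) > max_distance:
            continue  # participants are too far apart, measured in terms
        yield source, target


def to_document_level(labels: Iterable[Tuple[Tuple[int, int], str]]) -> Dict[Tuple[int, int], str]:
    """Aggregate sentence-level labels per (source_group, target_group) by majority vote (assumed rule)."""
    votes: Dict[Tuple[int, int], Counter] = {}
    for pair, label in labels:
        votes.setdefault(pair, Counter())[label] += 1
    return {pair: counter.most_common(1)[0][0] for pair, counter in votes.items()}


if __name__ == "__main__":
    entities = [Entity("Russia", 0, 0), Entity("RF", 0, 2), Entity("USA", 1, 5)]
    for source, target in candidate_pairs(entities, max_distance=10):
        print(source.value, "->", target.value)
```

Keying the aggregation on entity-linking groups rather than surface forms is what lets mentions such as "Russia" and "RF" contribute to the same document-level relation.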

Installation

pip install git+https://github.com/nicolay-r/AREkit.git@0.25.1-rc
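After installation, the installed version can be checked from Python with the standard library. This is a small sketch that assumes the distribution is registered under the name `arekit`; the helper `installed_version` is hypothetical and returns `None` when the package is not found.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "arekit" is the assumed distribution name.
    print(installed_version("arekit"))
```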

Usage

Please follow the tutorial section on the project Wiki for more details.

How to cite

Great research is accompanied by faithful references. If you use or extend our work, please cite it as follows:

@inproceedings{rusnachenko2024arelight,
  title={ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction},
  author={Rusnachenko, Nicolay and Liang, Huizhi and Kolomeets, Maxim and Shi, Lei},
  booktitle={European Conference on Information Retrieval},
  year={2024},
  organization={Springer}
}