This project is a work in progress. It aims to perform sentiment analysis using a text GCN.
- Aspect terms are identified from user opinions.
- Dependency parsing is used to capture syntactic structure.
- A Graph Convolutional Network is used to capture dependencies between aspects and opinions.
- A stratified split is used to ensure an even distribution of aspect classes across the train, validation, and test data.
- To predict the aspect terms, MultilabelClassification from the simpletransformers library is used as the baseline.
- Connect the adjacency matrix update code to the main pipeline.
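As a rough illustration of how a dependency parse can feed a graph convolution, the sketch below builds an adjacency matrix from dependency arcs and applies one GCN layer. It is not the code from this repository: the function names, the hand-written arcs, the toy features, and the weights are all assumptions made for the example, and the symmetric normalization shown is the commonly used GCN formulation rather than a confirmed detail of this project.

```python
import math


def build_adjacency(num_tokens, edges):
    """Build a symmetric adjacency matrix with self-loops from (head, dependent) pairs."""
    adj = [[0.0] * num_tokens for _ in range(num_tokens)]
    for i in range(num_tokens):
        adj[i][i] = 1.0  # self-loop so each token keeps its own features
    for head, dep in edges:
        adj[head][dep] = 1.0
        adj[dep][head] = 1.0  # treat dependency arcs as undirected
    return adj


def normalize(adj):
    """Symmetric normalization: D^-1/2 * A * D^-1/2."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj_norm, features, weights):
    """One GCN layer: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical sentence; the arcs are written by hand, not produced by a parser.
tokens = ["The", "battery", "life", "is", "great"]
edges = [(2, 0), (2, 1), (3, 2), (3, 4)]  # (head index, dependent index)

adj_norm = normalize(build_adjacency(len(tokens), edges))
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 2.0]]  # toy 2-d token features
weights = [[1.0, -1.0], [0.5, 1.0]]  # toy 2x2 weight matrix
output = gcn_layer(adj_norm, features, weights)  # one 2-d vector per token
```

After one layer, each token's vector mixes in the features of its syntactic neighbours, which is how an aspect term can pick up information from the opinion word it is attached to.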
Six datasets are used to evaluate our model.
All datasets are cleaned with the text processing pipeline described in the paper. The pipeline is also described in the utils folder of the absa_gnn module in this repository.
The cleaned data is stored in the data folder of this repository. The format of the data is [text labels].
Text contains the cleaned text from the datasets mentioned above; labels contains a multi-hot vector as described in the paper.
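To make the [text labels] layout concrete, here is a minimal sketch of multi-hot encoding. The sample sentences, the aspect names, and the helper functions are hypothetical; the actual label vocabulary and column order used in the data folder are defined by the paper and may differ.

```python
def build_label_index(label_lists):
    """Map each distinct label to a fixed column position (sorted for reproducibility)."""
    distinct = sorted({label for labels in label_lists for label in labels})
    return {label: i for i, label in enumerate(distinct)}


def to_multi_hot(labels, label_index):
    """Return a 0/1 vector with a 1 in the column of every label present."""
    vec = [0] * len(label_index)
    for label in labels:
        vec[label_index[label]] = 1
    return vec


# Hypothetical cleaned samples: (text, aspect labels)
samples = [
    ("the battery life is great", ["battery"]),
    ("screen is sharp but the price is high", ["screen", "price"]),
]

label_index = build_label_index([labels for _, labels in samples])
rows = [(text, to_multi_hot(labels, label_index)) for text, labels in samples]
```

Each row pairs the cleaned text with a vector whose length equals the number of distinct aspect classes, so a sentence mentioning several aspects has several ones.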
For detailed information about the files present in each dataset folder, please navigate to the data folder.
Please cite us if you find the above cleaned datasets helpful in your work.
Browse the corresponding folders in the absa_gnn module for the relevant details.
- Python3
$ sudo apt install python3-venv
$ sudo apt install openjdk-8-jre-headless
If pip install fails with wheel-related errors, set Java 8 as the default:
$ sudo update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
This is required by the language check package.
$ git clone https://github.com/abhinavg97/ABSA_GNN.git
$ cd ABSA_GNN
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ sudo apt install python3-tk
$ python -m spacy download en_core_web_lg
$ python -m spacy download en_core_web_sm
Run our model
$ python main.py
Run the baseline
$ python baseline.py
Logging is handled by PyTorch Lightning, which uses TensorBoard by default.
Visualize the metrics:
$ tensorboard --logdir lightning_logs/
or
$ python3 -m tensorboard.main --logdir lightning_logs/
$ docker image build -t image_name:tag .
$ docker container run --name absa_gnn --mount source=volume_name,target=/usr/src/app image_name:tag
The mounted volume is stored on the host under /var/lib/docker/volumes/
Note: You need sudo permissions to access the above directory.
@misc{
author = {Gupta, Abhinav and Ghosh, Samujjwal and Konjengbam, Anand},
title = {ABSA GNN},
year = {2020},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/abhinavg97/ABSA_GNN}}
}