GraphWSD (TextGraphs-16 COLING 2022)

Official code repository: Word Sense Disambiguation of French Lexicographical Examples Using Lexical Networks

Folder Description

/data/ : contains instructions to organize the data/ folder

/scripts/ : contains individual script modules

Steps to run the experiments:

Clone the repo: git clone https://github.com/ATILF-UMR7118/GraphWSD.git
Create the virtualenv and install the dependencies:

python3 -m venv wsdvenv
. wsdvenv/bin/activate
pip3 install --upgrade pip
cd GraphWSD/
pip3 install -r requirements.txt

Follow the instructions provided in the data/ folder.
To run the models for NOUN/VERB WSD:

(a.1) STRUCT model

python3  ~/GraphWSD/scripts/wsd_ewiser.py \
        --data ~/GraphWSD/data/ortolang/nountmp/ \
        --save_dir ~/GraphWSD/scripts/ortolog/ \
        --num 100  --model_num onoun_ewiser_29061156 --mtype ewiser --save-model \
        --learning 0.001  --hidden 8000 --batch 64 --device cuda --embed 768 --lm camembert-base

(a.2) STRUCT* model

python3  ~/GraphWSD/scripts/wsd_ewiser.py \
        --data ~/GraphWSD/data/ortolang/nountmp/ \
        --save_dir ~/GraphWSD/scripts/ortolog/ \
        --num 100  --model_num onoun_ewiser_29061156 --mtype ewiser --save-model \
        --learning 0.001  --hidden 8000 --batch 64 --device cuda --embed 768 --lm camembert-base --trainable

(a.3) STRUCT** model

python3  ~/GraphWSD/scripts/wsd_ewiser.py \
        --data ~/GraphWSD/data/ortolang/nountmp/ \
        --save_dir ~/GraphWSD/scripts/ortolog/ \
        --num 100  --model_num onoun_ewiser_29061156 --mtype ewiser --save-model \
        --learning 0.001  --hidden 8000 --batch 64 --device cuda --embed 768 --lm camembert-base --fragment --trainable

(b.1) SEM model

python3  ~/GraphWSD/scripts/wsd_ewiser.py \
        --data ~/GraphWSD/data/ortolang/nountmp/ --num 100 \
        --save_dir ~/GraphWSD/scripts/ortolog/  --model_num onoun_seml_29061522 \
        --mtype ewiserc --save-model --batch 64  --device cuda --semantics \
          --hidden-dim 8000  --embed 768 --lm camembert-base

(b.2) SEM* model

python3  ~/GraphWSD/scripts/wsd_ewiser.py \
        --data ~/GraphWSD/data/ortolang/nountmp/ --num 100 \
        --save_dir ~/GraphWSD/scripts/ortolog/  --model_num onoun_seml_29061522 \
        --mtype ewiserc --save-model --batch 64  --device cuda --semantics \
          --hidden-dim 8000  --embed 768 --lm camembert-base --trainable

(b.3) SEM** model

python3  ~/GraphWSD/scripts/wsd_ewiser.py \
        --data ~/GraphWSD/data/ortolang/nountmp/ --num 100 \
        --save_dir ~/GraphWSD/scripts/ortolog/  --model_num onoun_seml_29061522 \
        --mtype ewiserc --save-model --batch 64  --device cuda --semantics \
          --hidden-dim 8000  --embed 768 --lm camembert-base --fragment --trainable

Different configurations

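STRUCT and SEM are two strategies to initialize the adjacency matrix $A$. Each entry $a_{i,j} \in A$ is $\sum_{r \in S_{i,j}} w(r)$, where $S_{i,j}$ is the set of edges between nodes $i$ and $j$ and $w(r)$ is the weight of edge $r$.

STRUCT: $a_{i,j}$ counts the edges between $i$ and $j$ (equivalently, $w(r) = 1$ for every edge).

SEM: $a_{i,j}$ sums the weights (strengths) of the edges between $i$ and $j$. The SEM model requires --semantics.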
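The two initializations can be illustrated with a minimal sketch. It assumes a hypothetical edge list of (i, j, weight) triples and treats edges as directed; the function name, the data format, and the dictionary representation of $A$ are illustrative assumptions and not taken from the repository's scripts.

```python
from collections import defaultdict


def build_adjacency(edges, use_weights=False):
    """Return {(i, j): a_ij}, where a_ij sums w(r) over the edges r from i to j.

    edges: iterable of (i, j, weight) triples (hypothetical format).
    use_weights=False -> STRUCT-style: every edge contributes 1.
    use_weights=True  -> SEM-style: every edge contributes its weight.
    """
    adjacency = defaultdict(float)
    for i, j, weight in edges:
        adjacency[(i, j)] += weight if use_weights else 1.0
    return dict(adjacency)


# Toy network: two parallel edges between nodes 0 and 1, one edge between 1 and 2.
edges = [(0, 1, 0.5), (0, 1, 2.0), (1, 2, 1.5)]

struct_A = build_adjacency(edges)                   # edge counts
sem_A = build_adjacency(edges, use_weights=True)    # summed edge weights

print(struct_A)
print(sem_A)
```

On this toy input the pair (0, 1) gets the value 2 under STRUCT (two edges) and 2.5 under SEM (0.5 + 2.0).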

Config	Interpretation	Command-line flags
STRUCT/SEM	$A$ is frozen	(none)
STRUCT/SEM *	$a_{i,j}$ is trainable	--trainable
STRUCT/SEM **	$w(r)$ is trainable	--trainable --fragment
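The mapping from configuration to extra flags can be summarized with a small helper. This is only a sketch: the function `extra_flags` is hypothetical and not part of the repository, and it covers only the --semantics, --trainable, and --fragment flags. The SEM commands above also pass --mtype ewiserc instead of --mtype ewiser, which this helper does not handle.

```python
def extra_flags(strategy, variant=""):
    """Return the extra flags for a configuration such as ("SEM", "**").

    strategy: "STRUCT" or "SEM"
    variant:  "" (A frozen), "*" (a_ij trainable), or "**" (w(r) trainable)
    """
    flags = []
    if strategy == "SEM":
        flags.append("--semantics")      # SEM requires --semantics
    elif strategy != "STRUCT":
        raise ValueError(f"unknown strategy: {strategy!r}")

    if variant == "*":
        flags.append("--trainable")
    elif variant == "**":
        flags += ["--trainable", "--fragment"]
    elif variant != "":
        raise ValueError(f"unknown variant: {variant!r}")
    return flags


# List the extra flags for all six configurations.
for strategy in ("STRUCT", "SEM"):
    for variant in ("", "*", "**"):
        shown = " ".join(extra_flags(strategy, variant)) or "(none)"
        print(f"{strategy}{variant}: {shown}")
```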

For questions about this repository, contact: asinha@atilf.fr

Citation

@inproceedings{sinha2022word,
  title={Word sense disambiguation of french lexicographical examples using lexical networks},
  author={Sinha, Aman and Ollinger, Sandrine and Constant, Mathieu},
  booktitle={TextGraphs-16: Graph-based Methods for Natural Language Processing},
  pages={70--76},
  year={2022}
}
