This repository provides Python code to reproduce the experiments from the article A Human Subject Study of Named Entity Recognition (NER) in Conversational Music Recommendation Queries, accepted for publication at EACL 2023.

Clone the repository and enter its directory:
git clone git@github.com:deezer/music-ner-eacl2023.git
cd music-ner-eacl2023
Build the Docker image and run it in a container with an interactive bash session (the current Docker image requires a CUDA-capable GPU):

make build
make run-bash
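Once inside the container, you can quickly check that PyTorch sees the GPU. A minimal sanity check, assuming PyTorch is available in the Poetry environment (it is a dependency of the transformer models used below):

```python
# Minimal GPU sanity check; run inside the container with `poetry run python3`.
# Assumes PyTorch is installed in the Poetry environment.
import torch

if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
else:
    print("No CUDA-capable GPU visible to PyTorch")
```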
Print the data statistics shown in Table 2 of the paper:
poetry run python3 music-ner/datasets/stats.py --data_dir=data/dataset1
poetry run python3 music-ner/datasets/stats.py --data_dir=data/dataset2
poetry run python3 music-ner/datasets/stats.py --data_dir=data/dataset3
poetry run python3 music-ner/datasets/stats.py --data_dir=data/dataset4
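These statistics amount to counting queries and entity mentions per type. For reference, a minimal sketch of that computation, assuming a CoNLL-style layout with one token and BIO tag per line and blank lines between queries (the actual file names and format under data/dataset*/ may differ):

```python
# Minimal sketch: count queries and entity mentions per type in a BIO file.
# Assumes "token<TAB>tag" lines with blank lines separating queries; the
# actual layout under data/dataset*/ may differ.
from collections import Counter

def count_entities(path):
    counts, n_queries, in_query = Counter(), 0, False
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                if in_query:
                    n_queries += 1
                in_query = False
                continue
            in_query = True
            tag = line.rsplit("\t", 1)[-1]
            if tag.startswith("B-"):  # each B- tag opens one entity mention
                counts[tag[2:]] += 1
    if in_query:  # file may not end with a blank line
        n_queries += 1
    return n_queries, counts
```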
Prepare ground-truth sets with seen and rare / unseen entities:
poetry run python3 music-ner/datasets/create_seen_rare_ds.py --data_dir data/dataset1/ --th_seen=1 --th_rare_unseen=0
poetry run python3 music-ner/datasets/create_seen_rare_ds.py --data_dir data/dataset2/ --th_seen=1 --th_rare_unseen=0
poetry run python3 music-ner/datasets/create_seen_rare_ds.py --data_dir data/dataset3/ --th_seen=1 --th_rare_unseen=0
poetry run python3 music-ner/datasets/create_seen_rare_ds.py --data_dir data/dataset4/ --th_seen=1 --th_rare_unseen=0
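The --th_seen and --th_rare_unseen flags are frequency thresholds on how often a test entity occurs in the training split. As an illustration of this kind of partition (the exact criterion implemented in create_seen_rare_ds.py may differ):

```python
# Illustrative split of test entities by training-set frequency; the exact
# criterion in create_seen_rare_ds.py may differ from this sketch.
def split_entities(test_entities, train_counts, th_seen=1, th_rare_unseen=0):
    seen = {e for e in test_entities if train_counts.get(e, 0) > th_seen}
    rare_unseen = {e for e in test_entities if train_counts.get(e, 0) <= th_rare_unseen}
    return seen, rare_unseen
```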
Note: small variations between runs, and hence deviations from the exact scores reported in the paper, may occur, but these differences are not statistically significant.
Fine-tune multiple transformers (BERT, RoBERTa, and MPNet) to perform music NER and print results (Table 4):
./music-ner/scripts/run_ner_model_selection.sh
poetry run python3 music-ner/tables-and-stats/transformer_baselines.py --results_dir output
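transformer_baselines.py aggregates the metric files that the fine-tuning runs write under output/. To inspect the runs yourself, a hedged sketch, assuming each run produces a predict_results.json with a "predict_f1" key as in the Hugging Face token-classification examples (the repo's actual file names may differ):

```python
# Hedged sketch: average test F1 across runs found under output/.
# File and key names follow the Hugging Face run_ner.py convention and are
# assumptions here; adapt them to the repo's actual output layout.
import glob, json, statistics

scores = []
for path in glob.glob("output/**/predict_results.json", recursive=True):
    with open(path) as f:
        scores.append(json.load(f)["predict_f1"])
if scores:
    print(f"mean F1 = {statistics.mean(scores):.3f} over {len(scores)} runs")
```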
Fine-tune BERT to perform music NER, export the human annotation results in the same JSON format as the one produced by the transformers, and print results (Tables 5 and 6):
./music-ner/scripts/run_ner.sh
poetry run python3 music-ner/src/compute_human_performance.py --data_dir data/dataset1 --output_dir output/dataset1
poetry run python3 music-ner/src/compute_human_performance.py --data_dir data/dataset2 --output_dir output/dataset2
poetry run python3 music-ner/src/compute_human_performance.py --data_dir data/dataset3 --output_dir output/dataset3
poetry run python3 music-ner/src/compute_human_performance.py --data_dir data/dataset4 --output_dir output/dataset4
poetry run python3 music-ner/tables-and-stats/human_vs_bert.py --results_dir output
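Comparing human annotations against model output requires both in the same span-level representation. As an illustration of that normalization step (not the repo's exact code), converting token-level BIO tags into (start, end, type) spans:

```python
# Illustrative BIO-to-span conversion (end index exclusive); the export code in
# compute_human_performance.py may handle more edge cases, e.g. type changes
# inside an I- continuation, which this sketch ignores.
def bio_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags + ["O"]):  # the "O" sentinel flushes a final span
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i, tags[start][2:]))
            start = None
        if tag.startswith("B-"):
            start = i
    return spans

print(bio_to_spans(["O", "B-Artist", "I-Artist", "O", "B-WoA"]))
# [(1, 3, 'Artist'), (4, 5, 'WoA')]
```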
Run experiments for the seen and rare / unseen ground-truth sets and print results (Table 7):
./music-ner/scripts/run_ner_seen_ents.sh
./music-ner/scripts/run_ner_rare_unseen_ents.sh
poetry run python3 music-ner/tables-and-stats/seen_vs_unseen.py --results_dir output
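Both scripts write their metrics under output/, one set per subset. To eyeball the gap between the two conditions on one dataset, a hedged sketch (the directory and key names below are assumptions; adapt them to the actual layout of output/):

```python
# Hedged sketch: contrast F1 on the seen vs. rare/unseen subsets of dataset1.
# Directory and key names are assumptions; check the actual layout of output/.
import json

for subset in ("seen", "rare_unseen"):
    with open(f"output/dataset1_{subset}/predict_results.json") as f:
        print(subset, round(json.load(f)["predict_f1"], 3))
```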
Reproduce Figure 1 with the detailed error analysis for the BERT and human predictors:
poetry run python3 music-ner/tables-and-stats/graph_error_analysis.py --results_dir output
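The error analysis goes beyond exact-match scores by categorizing each predicted span against the gold annotations. As a generic illustration of such a taxonomy (not necessarily the categories used in the paper or in graph_error_analysis.py):

```python
# Illustrative NER error taxonomy (not necessarily the paper's categories):
# classify one predicted span (start, end, type) against the gold spans.
def classify(pred, gold_spans):
    start, end, label = pred
    for g_start, g_end, g_label in gold_spans:
        if (start, end) == (g_start, g_end):
            return "correct" if label == g_label else "wrong_type"
        if start < g_end and g_start < end:  # partial token overlap
            return "boundary" if label == g_label else "boundary_and_type"
    return "spurious"
```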
Please cite our paper if you use this data or code in your work:
@InProceedings{Epure2023,
  title={A Human Subject Study of Named Entity Recognition (NER) in Conversational Music Recommendation Queries},
  author={Epure, Elena and Hennequin, Romain},
  booktitle={Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
  month={May},
  year={2023}
}