Splice

The Role of Information Extraction Tasks in Automatic Literary Character Network Construction

Reproducing Results

First, you should:

1. Install the dependencies: run poetry install if you use poetry, or pip install -r requirements.txt otherwise.
2. Get the litbank dataset.

The main experiment can be run with xp.py:

python xp.py with\
	   min_graph_nodes=10\
	   co_occurrences_dist=32\
	   litbank.root="/path/to/litbank"
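The with key=value form resembles the command-line interface of the Sacred experiment framework, where a dotted key such as litbank.root addresses a nested configuration entry. This is inferred from the command syntax and is not stated in this README. The sketch below is a stdlib-only illustration of how such overrides could map to a nested configuration; parse_overrides is a hypothetical helper, not part of Splice or Sacred.

```python
import ast


def parse_overrides(args):
    """Turn ['a=1', 'b.c=x'] into a nested dict (illustrative only)."""
    config = {}
    for arg in args:
        key, _, raw = arg.partition("=")
        try:
            # Interpret numbers, quoted strings, booleans, etc.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Fall back to the raw string, e.g. for unquoted paths.
            value = raw
        # Walk (and create) the nested dicts named by the dotted key.
        node = config
        *parents, leaf = key.split(".")
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


overrides = [
    "min_graph_nodes=10",
    "co_occurrences_dist=32",
    "litbank.root=/path/to/litbank",
]
config = parse_overrides(overrides)
print(config)
```

Here the two numeric settings end up as integers at the top level, while litbank.root becomes the root entry of a nested litbank dictionary.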

Degradation Experiments

The following commands run all of the degradation experiments:

MAIN_XP_RUN="/path/to/main/xp/run"

python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=NER degradation_name=add_wrong_entity degradation_steps=1000 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=NER degradation_name=remove_correct_entity degradation_steps=200 degradation_report_frequency=0.5
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=add_wrong_mention degradation_steps=200 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=remove_correct_mention degradation_steps=1000 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=add_wrong_link degradation_steps=500 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=remove_correct_link degradation_steps=1000 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=coref_all degradation_steps=1000 degradation_report_frequency=0.05
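Since the seven degradation commands differ only in four parameters, they can also be generated from a table. The following is a minimal sketch, not a script shipped with the repository: DEGRADATIONS, build_command, run_all and the dry-run default are illustrative assumptions, while the parameter values are copied from the commands listed in this section.

```python
import shlex
import subprocess

# (task_name, degradation_name, degradation_steps, degradation_report_frequency)
DEGRADATIONS = [
    ("NER", "add_wrong_entity", 1000, 0.05),
    ("NER", "remove_correct_entity", 200, 0.5),
    ("coref", "add_wrong_mention", 200, 0.05),
    ("coref", "remove_correct_mention", 1000, 0.05),
    ("coref", "add_wrong_link", 500, 0.05),
    ("coref", "remove_correct_link", 1000, 0.05),
    ("coref", "coref_all", 1000, 0.05),
]


def build_command(input_dir, task, name, steps, frequency):
    """Build the argument list for one degradation experiment."""
    return [
        "python", "xp_metrics_over_degradation.py", "with",
        f"input_dir={input_dir}",
        f"task_name={task}",
        f"degradation_name={name}",
        f"degradation_steps={steps}",
        f"degradation_report_frequency={frequency}",
    ]


def run_all(input_dir, dry_run=True):
    """Print every command; execute them only when dry_run is False."""
    commands = [build_command(input_dir, *row) for row in DEGRADATIONS]
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # Stop at the first failing experiment.
            subprocess.run(command, check=True)
    return commands


commands = run_all("/path/to/main/xp/run")
```

With dry_run left at True the sketch only prints the commands, which makes it easy to compare them against the list in this section before launching anything.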

End-to-end LLM-based Pipelines

The E2E-Coref experiment can be reproduced with the xp_e2e_llm_coref.py script:

MAIN_XP_RUN="/path/to/main/xp/run"
LITBANK_PATH="/path/to/litbank"

python xp_e2e_llm_coref.py with\
	   input_dir="${MAIN_XP_RUN}"\
	   model="gpt3.5"\
	   openAI_API_key="insert your openAI key"\
	   litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_coref.py with\
	   input_dir="${MAIN_XP_RUN}"\
	   model="gpt40"\
	   openAI_API_key="insert your openAI key"\
	   litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_coref.py with\
	   input_dir="${MAIN_XP_RUN}"\
	   model="llama3-8b-instruct"\
	   hg_access_token="insert your Huggingface access token"\
	   device="cuda"\
	   litbank.root="${LITBANK_PATH}"

Similarly, the E2E-Graphml experiment can be reproduced with the xp_e2e_llm_graphml.py script:

MAIN_XP_RUN="/path/to/main/xp/run"
LITBANK_PATH="/path/to/litbank"

python xp_e2e_llm_graphml.py with\
	   input_dir="${MAIN_XP_RUN}"\
	   model="gpt3.5"\
	   openAI_API_key="insert your openAI key"\
	   litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_graphml.py with\
	   input_dir="${MAIN_XP_RUN}"\
	   model="gpt40"\
	   openAI_API_key="insert your openAI key"\
	   litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_graphml.py with\
	   input_dir="${MAIN_XP_RUN}"\
	   model="llama3-8b-instruct"\
	   hg_access_token="insert your Huggingface access token"\
	   device="cuda"\
	   litbank.root="${LITBANK_PATH}"
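To avoid pasting credentials into the command line by hand, the six end-to-end runs can be assembled programmatically. This is a hedged sketch rather than part of the repository: build_e2e_command is a hypothetical helper, and the environment variable names OPENAI_API_KEY and HF_TOKEN are assumptions chosen for illustration (the scripts themselves only receive the openAI_API_key and hg_access_token parameters shown in this section).

```python
import os

SCRIPTS = ["xp_e2e_llm_coref.py", "xp_e2e_llm_graphml.py"]
# Model names copied verbatim from the commands in this README.
MODELS = ["gpt3.5", "gpt40", "llama3-8b-instruct"]


def build_e2e_command(script, model, input_dir, litbank_root, env=os.environ):
    """Build the argument list for one end-to-end LLM experiment."""
    command = ["python", script, "with", f"input_dir={input_dir}", f"model={model}"]
    if model.startswith("gpt"):
        # Assumed variable name: OPENAI_API_KEY.
        command.append(f"openAI_API_key={env.get('OPENAI_API_KEY', '')}")
    else:
        # Assumed variable name: HF_TOKEN.
        command.append(f"hg_access_token={env.get('HF_TOKEN', '')}")
        command.append("device=cuda")
    command.append(f"litbank.root={litbank_root}")
    return command


commands = [
    build_e2e_command(script, model, "/path/to/main/xp/run", "/path/to/litbank")
    for script in SCRIPTS
    for model in MODELS
]
for command in commands:
    # Show only the script and model so that no credential is printed.
    print(command[1], command[4])
```

Each resulting list can be passed to subprocess.run(command, check=True) once the paths and credentials are in place.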

Printing / Plotting Results

Table / Figure	Corresponding Script
Table 1	`print_main_task_results.py`
Table 2	`print_main_graph_results.py`
Table 3
Figure 1	`plot_degradation_metrics.py`
Figure 2	`plot_ner_degradation_metrics.py`
Figure 3	`plot_coref_degradation_metrics.py`
Table 4	`print_e2e_graph_results.py`
