The Role of Information Extraction Tasks in Automatic Literary Character Network Construction
First, you should:
- install dependencies. Either use `poetry install` if you have poetry, or `pip install -r requirements.txt` otherwise.
- get the litbank dataset
The main experiment can be run with xp.py:
python xp.py with \
    min_graph_nodes=10 \
    co_occurrences_dist=32 \
    litbank.root="/path/to/litbank"
The following script will run all of the degradation experiments:
MAIN_XP_RUN="/path/to/main/xp/run"
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=NER degradation_name=add_wrong_entity degradation_steps=1000 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=NER degradation_name=remove_correct_entity degradation_steps=200 degradation_report_frequency=0.5
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=add_wrong_mention degradation_steps=200 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=remove_correct_mention degradation_steps=1000 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=add_wrong_link degradation_steps=500 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=remove_correct_link degradation_steps=1000 degradation_report_frequency=0.05
python xp_metrics_over_degradation.py with input_dir="${MAIN_XP_RUN}" task_name=coref degradation_name=coref_all degradation_steps=1000 degradation_report_frequency=0.05
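The seven invocations above differ only in task name, degradation name, step count, and report frequency, so they can also be generated from a small table. The following is a minimal sketch and not part of the repository: the values in `DEGRADATIONS` are copied from the commands above, while `build_degradation_commands`, `run_all`, and the `dry_run` switch are hypothetical helpers introduced for illustration.

```python
import shlex
import subprocess

# (task_name, degradation_name, degradation_steps, degradation_report_frequency)
# Values copied from the commands listed above.
DEGRADATIONS = [
    ("NER", "add_wrong_entity", 1000, 0.05),
    ("NER", "remove_correct_entity", 200, 0.5),
    ("coref", "add_wrong_mention", 200, 0.05),
    ("coref", "remove_correct_mention", 1000, 0.05),
    ("coref", "add_wrong_link", 500, 0.05),
    ("coref", "remove_correct_link", 1000, 0.05),
    ("coref", "coref_all", 1000, 0.05),
]


def build_degradation_commands(main_xp_run):
    """Return one argument list per degradation experiment (hypothetical helper)."""
    commands = []
    for task, name, steps, freq in DEGRADATIONS:
        commands.append([
            "python", "xp_metrics_over_degradation.py", "with",
            f"input_dir={main_xp_run}",
            f"task_name={task}",
            f"degradation_name={name}",
            f"degradation_steps={steps}",
            f"degradation_report_frequency={freq}",
        ])
    return commands


def run_all(main_xp_run, dry_run=True):
    """Print every command; execute it as well when dry_run is False."""
    for argv in build_degradation_commands(main_xp_run):
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops at the first failing experiment.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands.
    run_all("/path/to/main/xp/run", dry_run=True)
```

With `dry_run=True` the sketch only prints the commands, which makes it easy to compare them against the list above before launching anything.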
The E2E-Coref experiment can be reproduced with the xp_e2e_llm_coref.py script:
MAIN_XP_RUN="/path/to/main/xp/run"
LITBANK_PATH="/path/to/litbank"
python xp_e2e_llm_coref.py with \
    input_dir="${MAIN_XP_RUN}" \
    model="gpt3.5" \
    openAI_API_key="insert your openAI key" \
    litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_coref.py with \
    input_dir="${MAIN_XP_RUN}" \
    model="gpt40" \
    openAI_API_key="insert your openAI key" \
    litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_coref.py with \
    input_dir="${MAIN_XP_RUN}" \
    model="llama3-8b-instruct" \
    hg_access_token="insert your Huggingface access token" \
    device="cuda" \
    litbank.root="${LITBANK_PATH}"
Similarly, the E2E-Graphml experiment can be reproduced with the xp_e2e_llm_graphml.py script:
MAIN_XP_RUN="/path/to/main/xp/run"
LITBANK_PATH="/path/to/litbank"

python xp_e2e_llm_graphml.py with \
    input_dir="${MAIN_XP_RUN}" \
    model="gpt3.5" \
    openAI_API_key="insert your openAI key" \
    litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_graphml.py with \
    input_dir="${MAIN_XP_RUN}" \
    model="gpt40" \
    openAI_API_key="insert your openAI key" \
    litbank.root="${LITBANK_PATH}"

python xp_e2e_llm_graphml.py with \
    input_dir="${MAIN_XP_RUN}" \
    model="llama3-8b-instruct" \
    hg_access_token="insert your Huggingface access token" \
    device="cuda" \
    litbank.root="${LITBANK_PATH}"
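The E2E commands above take credentials as plain parameters. One way to avoid writing keys into scripts or shell history is to read them from environment variables when assembling the command. The following is a minimal sketch under stated assumptions: the parameter names (`openAI_API_key`, `hg_access_token`, `device`, `litbank.root`) and model names come from the commands above, whereas the environment variable names `OPENAI_API_KEY` and `HF_TOKEN` and the helper `build_e2e_command` are hypothetical and not part of the repository.

```python
import os

# Model names as they appear in the commands above.
OPENAI_MODELS = {"gpt3.5", "gpt40"}
HF_MODELS = {"llama3-8b-instruct"}


def build_e2e_command(script, model, main_xp_run, litbank_root, env=None):
    """Assemble the argument list for one E2E run (hypothetical helper).

    Credentials are read from `env` (defaults to os.environ).
    OPENAI_API_KEY and HF_TOKEN are assumed variable names.
    """
    env = os.environ if env is None else env
    argv = [
        "python", script, "with",
        f"input_dir={main_xp_run}",
        f"model={model}",
    ]
    if model in OPENAI_MODELS:
        # Raises KeyError if the variable is not set.
        argv.append(f"openAI_API_key={env['OPENAI_API_KEY']}")
    elif model in HF_MODELS:
        argv.append(f"hg_access_token={env['HF_TOKEN']}")
        argv.append("device=cuda")
    else:
        raise ValueError(f"unknown model: {model}")
    argv.append(f"litbank.root={litbank_root}")
    return argv
```

The returned list can be passed to `subprocess.run(argv, check=True)`. A missing variable raises `KeyError` before anything is launched, so a run never starts with the placeholder key.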
Table / Figure | Corresponding Script |
---|---|
Table 1 | print_main_task_results.py |
Table 2 | print_main_graph_results.py |
Table 3 | |
Figure 1 | plot_degradation_metrics.py |
Figure 2 | plot_ner_degradation_metrics.py |
Figure 3 | plot_coref_degradation_metrics.py |
Table 4 | print_e2e_graph_results.py |