Duluth at SemEval-2020 Task 7

This is the codebase for our SemEval 2020 paper: Duluth at SemEval-2020 Task 7: Using Surprise as a Key to Unlock Humorous Headlines.

@inproceedings{duluth2020humor,
    title = "Duluth at SemEval-2020 Task 7: Using Surprise as a Key to Unlock Humorous Headlines",
    author = "Shuning Jin and Yue Yin and XianE Tang and Ted Pedersen",
    booktitle = "Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval 2020)",
    year = "2020",
    url = "https://arxiv.org/abs/2009.02795"
}

Task Introduction

SemEval-2020 Task 7: Assessing Humor in Edited News Headlines

Competition page
- Subtask 1: regression task to predict the funniness score of an edited headline
- Subtask 2: classification task to predict the funnier between two edited headlines
Leaderboard for official evaluation
- webpage version: “Evaluation-Task-1” is for Subtask 1 and “Evaluation-Task-2” is for Subtask 2.
- cleaned csv version
- Our system ranks 11/49 (0.531 RMSE) in Subtask 1, and 9/32 (0.632 accuracy) in Subtask 2.

1 Configuration

conda environment

Packages are specified in environment.yml.

Requires conda3 (Anaconda 3 or Miniconda 3).

Create the conda environment:
```
conda env create -f environment.yml
```

Activate/deactivate the environment:

```
# linux/mac (conda>=4.4):
conda activate humor
conda deactivate
# linux/mac (conda<4.4):
source activate humor
source deactivate
# windows:
activate humor
deactivate
```

spaCy

```
python -m spacy download en
```

HuggingFace Transformers Cache Directory

We use the transformers library by HuggingFace. Set a cache directory so that each model is downloaded only once.

```
# replace `/path/to/cache/directory` with your directory
CACHE=/path/to/cache/directory
bash scripts/path_setup.sh HUGGINGFACE_TRANSFORMERS_CACHE $CACHE
# this will add the following line to ~/.bash_profile (mac) or ~/.bashrc (linux)
# export HUGGINGFACE_TRANSFORMERS_CACHE=/path/to/cache/directory
```
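
To set the variable by hand, the documented effect of the setup script (one export line appended to the shell startup file) can be sketched as below. This is a minimal sketch under that assumption, not the contents of scripts/path_setup.sh; `add_export_line` and the scratch file `demo_rc` are hypothetical names used only for illustration.

```shell
# add_export_line is a hypothetical helper, not part of this repository.
# Usage: add_export_line <rc_file> <variable_name> <value>
add_export_line() {
    rc_file=$1
    line="export $2=$3"
    touch "$rc_file"
    # -x matches the whole line, -F treats the pattern as a fixed string;
    # the line is appended only when it is not already present
    grep -qxF "$line" "$rc_file" || echo "$line" >> "$rc_file"
}

# Demonstration on a scratch file so that no real startup file is modified
demo_rc=$(mktemp)
add_export_line "$demo_rc" HUGGINGFACE_TRANSFORMERS_CACHE /path/to/cache/directory
# Running it a second time leaves the file unchanged
add_export_line "$demo_rc" HUGGINGFACE_TRANSFORMERS_CACHE /path/to/cache/directory
cat "$demo_rc"
```

To apply it for real, pass `~/.bashrc` (linux) or `~/.bash_profile` (mac) as the first argument, then open a new shell or `source` that file.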

2 Data

See the data directory.
Datasets: Humicroedit (official task data) and Funlines (additional training data).
You can download the data from the source websites, or simply run
```
bash scripts/download_data.sh
```
This produces the same data as in the data directory.

3 Experiment output

```
experiment_directory
├── log.log
├── params.json
# if args.save_model
├── model_state.th
# if args.tensorboard
├── tensorboard_train
├── tensorboard_val
# if args.do_eval
└── output-{eval_data_name}.csv
```
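
After a run, it can be handy to check which of the files listed above were actually written, since several depend on flags. The following is a minimal sketch; `check_outputs` is a hypothetical helper (not part of this repository), and the demo directory only imitates a run made without the optional flags.

```shell
# check_outputs is a hypothetical helper, not part of this repository.
# Usage: check_outputs <experiment_directory>
check_outputs() {
    exp_dir=$1
    for f in log.log params.json model_state.th tensorboard_train tensorboard_val; do
        if [ -e "$exp_dir/$f" ]; then
            echo "found    $f"
        else
            echo "missing  $f"
        fi
    done
    # The csv name depends on the evaluation data, so match it by pattern
    ls "$exp_dir"/output-*.csv 2>/dev/null || echo "missing  output-*.csv"
}

# Demonstration: a scratch directory holding only the two unconditional files
demo_dir=$(mktemp -d)
touch "$demo_dir/log.log" "$demo_dir/params.json"
report=$(check_outputs "$demo_dir")
echo "$report"
```

A "missing" line is not necessarily an error: model_state.th, the tensorboard directories, and the csv are only produced when the corresponding argument is set.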

To see tensorboard output:

```
open http://localhost:6006
tensorboard --logdir tensorboard_train
# you may need to wait a few seconds and refresh the page
```

```
open http://localhost:6006
tensorboard --logdir tensorboard_val
# you may need to wait a few seconds and refresh the page
```

4 Scripts

Baseline
- Baseline 1 uses the average score; Baseline 2 uses the majority label.
- Output is written to the baseline_output directory.
```
bash scripts/baseline.sh > baseline_output/results.log
```
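
To make the two baselines concrete, here is a toy illustration of the idea: predict the mean score for every headline (scored by RMSE) and predict the most frequent label for every pair (scored by accuracy). It is a sketch on made-up numbers, not the logic of scripts/baseline.sh; the values in `grades` and `labels` are hypothetical and do not come from the task data.

```shell
# Hypothetical toy values, not taken from Humicroedit or Funlines
grades="0.2 1.0 1.8 0.6 1.4"   # funniness scores (Subtask 1 style)
labels="1 2 1 1 2"             # which headline is funnier (Subtask 2 style)

# Baseline 1: the constant prediction is the mean score
mean=$(echo "$grades" | tr ' ' '\n' | awk '{ s += $1 } END { printf "%.2f", s / NR }')

# RMSE of predicting that mean for every item
rmse=$(echo "$grades" | tr ' ' '\n' |
    awk -v m="$mean" '{ d = $1 - m; s += d * d } END { printf "%.3f", sqrt(s / NR) }')

# Baseline 2: the constant prediction is the most frequent label
majority=$(echo "$labels" | tr ' ' '\n' | sort | uniq -c | sort -rn | head -n 1 | awk '{ print $2 }')

# Accuracy of predicting that label for every item
accuracy=$(echo "$labels" | tr ' ' '\n' |
    awk -v m="$majority" '$1 == m { c++ } END { printf "%.2f", c / NR }')

echo "mean=$mean rmse=$rmse"                  # mean=1.00 rmse=0.566
echo "majority=$majority accuracy=$accuracy"  # majority=1 accuracy=0.60
```

In this sketch the mean and the majority label are computed on the same items they are scored on, purely to keep the example short.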
To reproduce our main experiment based on the contrastive approach (Table 2 in the paper):
- produce the highlighted numbers (i.e., best within each model type), with this script
```
bash scripts/table2_best.sh
```
- produce the whole table, with this script
```
bash scripts/table2.sh
```
- Table 2

To reproduce the additional analysis on the non-contrastive approach (Table 3 in the paper):
- use this script
```
bash scripts/table3.sh
```
- Table 3
