
analyzing_verb_alternations_plms

Verb Alternation Classes and Large Pre-trained Model Embeddings

Installation

  1. Clone the repository: git clone git@github.com:kvah/analyzing_verb_alternations_plms.git

  2. cd into the cloned directory: cd analyzing_verb_alternations_plms/

  3. Create the conda environment: conda env create -n 575nn --file ./conda_environment.yaml

  4. Activate the conda environment: conda activate 575nn

  5. Install alternationprober as an editable package with pip: pip install -e .
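
To quickly confirm that the editable install worked, check that the package is importable from the active environment (a minimal sanity check, not part of the documented workflow):

import alternationprober

# Prints the location of the installed package if the editable install succeeded.
print(alternationprober.__file__)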

Development and Tests

To run the tests: pytest ./tests

The alternationprober package

It provides the following command-line utilities:

  • get_bert_word_embeddings: command-line utility to produce static word-level embeddings from the LAVA dataset.
    • usage: get_bert_word_embeddings [--model_name]
    • model_name: bert-base-uncased, roberta-base, google/electra-base-discriminator, microsoft/deberta-base
    • Produces two output files:
      • ./data/embeddings/static/{model_name}.npy: a 2d numpy array with the word embeddings
      • ./data/embeddings/bert-word-embeddings-lava-vocab.json: a mapping from each vocabulary item to its index in the numpy array. (The order is the same as in the original file from the LAVA dataset.)

The embeddings and associated vocabulary mapping can be loaded like so:

import json

import numpy as np

from alternationprober.constants import (PATH_TO_BERT_WORD_EMBEDDINGS_FILE,
                                          PATH_TO_LAVA_VOCAB)

# 2d array of static word embeddings, one row per LAVA vocabulary item.
embeddings = np.load(PATH_TO_BERT_WORD_EMBEDDINGS_FILE, allow_pickle=True)

# Mapping from vocabulary item to its row index in the embeddings array.
with PATH_TO_LAVA_VOCAB.open("r") as f:
    vocabulary_to_index = json.load(f)
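
For example, to look up the static embedding of a single verb (the verb chosen here is only illustrative; any item from the LAVA vocabulary works):

# Illustrative lookup: assumes the verb appears in the LAVA vocabulary.
verb = "break"
if verb in vocabulary_to_index:
    verb_embedding = embeddings[vocabulary_to_index[verb]]
    print(verb, verb_embedding.shape)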
  • get_bert_context_word_embeddings: command-line utility to produce contextual word embeddings from the LAVA dataset.
    • usage: get_bert_context_word_embeddings [--model_name]
    • model_name: bert-base-uncased, roberta-base, google/electra-base-discriminator, microsoft/deberta-base
    • Produces the following output file:
      • ./data/embeddings/context/{model_name}.npy: This is a 2d numpy array with the contextual word embeddings

The contextual embeddings can be loaded like so (the vocabulary mapping is the same as for the static embeddings):

import numpy as np

from alternationprober.constants import PATH_TO_BERT_CONTEXT_WORD_EMBEDDINGS_FILE

# Array of contextual word embeddings for the LAVA vocabulary.
context_embeddings = np.load(PATH_TO_BERT_CONTEXT_WORD_EMBEDDINGS_FILE)
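
As a quick sanity check on the loaded array (a sketch only; the exact shape depends on the model and on how layers are stored):

# Inspect the array before using the embeddings downstream.
print(context_embeddings.shape, context_embeddings.dtype)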
  • run_linear_classifier_experiment: Runs our experiment to predict alternation classes from static model embeddings derived from the LAVA dataset (or from contextual embeddings with --use_context_embeddings). A conceptual sketch of such a probe follows this list.

    • usage: run_linear_classifier_experiment [--output_directory] [--use_context_embeddings] [--model_name]
    • Example: To run the linear classifier experiment with contextual embeddings from bert-base-uncased:
      • run_linear_classifier_experiment --use_context_embeddings --model_name bert-base-uncased
    • Note: <output_directory> will default to ./results/linear-probe-for-word-embeddings
    • Note: ./download-datasets.sh and the jupyter notebook ./data_analysis.ipynb must be run first to make the data available.
  • run_linear_classifier_sentence_experiment: Runs our experiment to predict sentence grammaticality from contextual embeddings derived from the FAVA dataset.

    • usage: run_linear_classifier_sentence_experiment [--output_directory] [--use_context_embeddings] [--model_name]
    • Example: To run the sentence-level linear classifier experiment with contextual embeddings from bert-base-uncased:
      • run_linear_classifier_sentence_experiment --use_context_embeddings --model_name bert-base-uncased
    • Note: <output_directory> will default to ./results/linear-probe-for-sentence-embeddings
    • Note: ./download-datasets.sh and the jupyter notebook ./data_analysis.ipynb must be run first to make the data available.
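
Both commands fit a linear probe on top of pre-computed embeddings. The sketch below illustrates the general idea with scikit-learn; the label vector here is random and purely illustrative (the real experiments use the LAVA/FAVA annotations), and the actual probing code lives in the package:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from alternationprober.constants import PATH_TO_BERT_WORD_EMBEDDINGS_FILE

# Static word embeddings, one row per LAVA vocabulary item.
X = np.load(PATH_TO_BERT_WORD_EMBEDDINGS_FILE, allow_pickle=True)

# Illustrative 0/1 labels standing in for membership in one alternation class.
y = np.random.default_rng(0).integers(0, 2, size=X.shape[0])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Fit a simple linear probe and report held-out accuracy.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))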

Other Related Resources in This Repository

download-datasets.sh

Shell script to download the LAVA and FAVA datasets to the ./data directory.

Usage: sh download-datasets.sh
