
Explore Discovering Latent Knowledge Without Supervision

This repository is a fork of the original codebase at https://github.com/collin-burns/discovering_latent_knowledge. For a detailed README.md, please refer to their repository.

The full paper can be found in the paper directory.

1. Use logical conjunctions to find the “truth” direction of the classifier

This part was done by Naomi Bashkansky. See jupyter_notebook/conjunction.ipynb for an overview.
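
For intuition only, here is a rough Python sketch of the kind of construction this part explores: forming conjunction statements from pairs of statements and penalizing a probe whose probability on the conjunction disagrees with the product of its probabilities on the individual statements. The actual prompts, probe, and loss are in the notebook; everything below (function names and the product-based consistency term) is an illustrative assumption.

import torch

def make_conjunction(statement_a, statement_b):
    # e.g. "The movie was good and the review is positive"
    return f"{statement_a} and {statement_b}"

def conjunction_consistency(p_a, p_b, p_ab):
    # p_a, p_b: probe outputs on the individual statements;
    # p_ab: probe output on their conjunction (all in [0, 1]).
    # One natural constraint on a truth-like direction: p(a and b) should
    # roughly match p(a) * p(b).
    return ((p_ab - p_a * p_b) ** 2).mean()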

2. Towards better elicitation of latent knowledge from autoregressive models

This part was done by Chuyue Tang. Some bugs in the original codebase were fixed, and the code was modified to run the experiments in this project.

1) Using CCS

First, use generate.py to (1) create contrast pairs and (2) generate hidden states from a model.

python generate.py --model_name gpt-j --dataset_name amazon_polarity --num_examples 500 --all_layers
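
Conceptually, generate.py does something like the sketch below: build two versions of each example that assert opposite labels, run them through the model, and keep the hidden states at every layer. This is only an illustration; the actual prompt template, tokenization details, and storage format in generate.py may differ.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", output_hidden_states=True)
model.eval()

def contrast_pair(review_text):
    # The two halves of a contrast pair assert opposite labels; exactly one is true.
    base = f"Consider the following review: {review_text}\nThe sentiment of this review is"
    return base + " negative", base + " positive"

@torch.no_grad()
def all_layer_hidden_states(text):
    ids = tokenizer(text, return_tensors="pt")
    out = model(**ids)
    # out.hidden_states: one (1, seq_len, hidden_dim) tensor per layer (plus embeddings);
    # keep the last-token representation from each layer, as with --all_layers.
    return torch.stack([h[0, -1] for h in out.hidden_states])

neg_text, pos_text = contrast_pair("Great sound quality and the battery lasts for days.")
h_neg, h_pos = all_layer_hidden_states(neg_text), all_layer_hidden_states(pos_text)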

To generate a multi-shot context, specify context_num.

python generate.py --model_name gpt-j --dataset_name amazon_polarity --num_examples 500 --context_num 10 --all_layers
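
As a hedged illustration, a multi-shot context simply prepends context_num labeled examples before the query statement, roughly as follows (the exact separators and wording used by generate.py may differ):

def build_multishot_prompt(context_examples, query_prompt):
    # context_examples: list of (review_text, "positive"/"negative") pairs
    # shown with their true labels before the query statement.
    shots = [
        f"Consider the following review: {text}\nThe sentiment of this review is {label}."
        for text, label in context_examples
    ]
    return "\n\n".join(shots) + "\n\n" + query_prompt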

You can also generate data both with and without context at the same time by setting context_both (the resulting hidden states are later used by visualize.py to generate a PCA plot comparing the two settings).

python generate.py --model_name gpt-j --dataset_name amazon_polarity --num_examples 100 --context_num 10 --all_layers --context_both
python visualize.py --model_name gpt-j --dataset_name amazon_polarity --num_examples 100 --context_num 10 --all_layers --context_both
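
The comparison plot can be thought of as the following sketch: project the contrast-pair hidden states (or their differences) from the two settings onto their top two principal components and plot them side by side. Variable names and the exact quantity projected are assumptions; see visualize.py for the real implementation.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_pca_comparison(h_no_context, h_with_context, labels, path="pca_comparison.png"):
    # h_*: (n_examples, hidden_dim) arrays at a chosen layer, e.g. the
    # difference between the two halves of each contrast pair.
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, h, title in zip(axes, [h_no_context, h_with_context], ["zero-shot", "10-shot context"]):
        z = PCA(n_components=2).fit_transform(h - h.mean(axis=0))
        ax.scatter(z[:, 0], z[:, 1], c=labels, cmap="coolwarm", s=10)
        ax.set_title(title)
        ax.set_xlabel("PC 1")
        ax.set_ylabel("PC 2")
    fig.tight_layout()
    fig.savefig(path)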

Due to time constraints, we do not guarantee that these commands will work well with other model or dataset settings.

After generating hidden states, use evaluate.py to run our main method, CCS, on them. We strongly recommend running it with the same arguments you used for generate.py.

In addition to evaluating the performance of CCS, evaluate.py verifies that logistic regression (LR) accuracy is reasonable, and we add a PCA baseline (referred to as TPC in the paper). The script also produces a plot of accuracy (y-axis) against layer (x-axis) for CCS, LR, and PCA.
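
For reference, the core of CCS as described in the original paper can be sketched as follows: learn a linear probe whose outputs on the two halves of each contrast pair are consistent (they should sum to one) and confident (not stuck at 0.5). This is a minimal sketch rather than the exact code in evaluate.py; normalization, random restarts, and train/test splitting are omitted.

import torch

def train_ccs_probe(x_pos, x_neg, n_epochs=1000, lr=1e-3):
    # x_pos, x_neg: (n_examples, hidden_dim) hidden states of the two halves
    # of each contrast pair, ideally mean-normalized per half.
    probe = torch.nn.Sequential(torch.nn.Linear(x_pos.shape[1], 1), torch.nn.Sigmoid())
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    for _ in range(n_epochs):
        p_pos, p_neg = probe(x_pos), probe(x_neg)
        consistency = ((p_pos - (1 - p_neg)) ** 2).mean()   # p("x is true") should equal 1 - p("x is false")
        confidence = (torch.min(p_pos, p_neg) ** 2).mean()  # discourage the degenerate p = 0.5 solution
        loss = consistency + confidence
        opt.zero_grad()
        loss.backward()
        opt.step()
    return probe

def ccs_accuracy(probe, x_pos, x_neg, labels):
    with torch.no_grad():
        avg = 0.5 * (probe(x_pos) + (1 - probe(x_neg))).squeeze(-1)
    acc = ((avg > 0.5).float() == labels).float().mean().item()
    return max(acc, 1 - acc)  # the sign of the learned direction is arbitrary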

2) Using VINC

Another part of this project builds on the still-in-development codebase https://github.com/EleutherAI/elk.git. First, install the requirements following the instructions in their codebase. Then run

cd elk
python elk elicit EleutherAI/gpt-j-6b amazon_polarity --int8 True --max_examples 250 250 --num_variants 1 --num_shots 10 --corrupt_prob 0.0

Here, num_variants specifies how many different prompt paraphrases to use (by default, all available prompt formats from promptsource are used); num_shots specifies how many in-context examples to include before the query statement; corrupt_prob is the probability with which each context example's label is flipped.
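
For concreteness, corrupt_prob behaves conceptually like the sketch below: each in-context example's label is flipped independently with the given probability. The function and variable names here are illustrative and not taken from the elk codebase.

import random

def maybe_corrupt_labels(context_examples, corrupt_prob, seed=0):
    # context_examples: list of (text, label) pairs with label in {"positive", "negative"}.
    rng = random.Random(seed)
    corrupted = []
    for text, label in context_examples:
        if rng.random() < corrupt_prob:
            label = "positive" if label == "negative" else "negative"
        corrupted.append((text, label))
    return corrupted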

Requirements

This codebase was tested with Python 3.7.5 and PyTorch 1.12. It also uses the datasets and promptsource packages for loading and formatting datasets.
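
As a small usage sketch of those two packages (the chosen template below is simply whichever one promptsource lists first for amazon_polarity, not necessarily the one our scripts use):

from datasets import load_dataset
from promptsource.templates import DatasetTemplates

dataset = load_dataset("amazon_polarity", split="test")
templates = DatasetTemplates("amazon_polarity")
print(templates.all_template_names)                 # available prompt formats

template = templates[templates.all_template_names[0]]
prompt_text, answer = template.apply(dataset[0])    # formatted input and target strings
print(prompt_text, "->", answer)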

3. Extending the CCS baselines with LogitLens

This part was done by Chloe Loughridge. See jupyter_notebook/logitlens.ipynb for an overview, and look for the logitlens.py files.
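
For background, the logit lens projects each layer's hidden state through the model's final layer norm and unembedding matrix to see which tokens that layer already favors. Below is a generic GPT-J sketch of the idea; it is not the repo's logitlens.py.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", output_hidden_states=True)
model.eval()

@torch.no_grad()
def logit_lens_top_tokens(text, k=5):
    ids = tokenizer(text, return_tensors="pt")
    hidden_states = model(**ids).hidden_states  # one tensor per layer (plus embeddings)
    top_tokens = []
    for h in hidden_states:
        # Decode the last position of each layer with the final layer norm + unembedding.
        logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
        top_tokens.append(tokenizer.convert_ids_to_tokens(logits.topk(k).indices.tolist()))
    return top_tokens

for layer, tokens in enumerate(logit_lens_top_tokens("Paris is the capital of")):
    print(layer, tokens)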

About

An extension of the CCS work on discovering latent knowledge, developed as a course project for CS229 at Harvard.
