# Rational LAMOL

Lifelong learning (LL) aims to train a neural network on a stream of tasks while retaining knowledge from previous tasks. However, many prior attempts in NLP still suffer from catastrophic forgetting, where the model completely forgets what it learned in previous tasks. In this paper, we introduce Rational LAMOL, a novel end-to-end LL framework for language models. To alleviate catastrophic forgetting, Rational LAMOL enhances LAMOL, a recent LL model, by applying critical freezing guided by human rationales. When human rationales are not available, we propose exploiting unsupervised generated rationales as substitutes.

**Code mostly taken from LAMOL.**

## Dataset

The datasets used in the experiments are BoolQ, Movie Reviews, and SciFact.

| Dataset | Download Link |
| --- | --- |
| BoolQ | ERASER |
| Movie Reviews | ERASER |
| SciFact | Link |

## Training

Model training directly follows LAMOL's, with a few distinctions.

### Block level

To freeze a critical block, run `train_freeze_block.py` with the additional argument `--layer_to_freeze $LAYER`, where `$LAYER` is a transformer block index between 0 and 11.
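For example, to train with transformer block 5 frozen (a hypothetical invocation; any remaining flags follow LAMOL's training script and are omitted here):

```bash
# Freeze transformer block 5 during training; other LAMOL training
# arguments (tasks, model, seed, ...) are omitted for brevity.
python train_freeze_block.py --layer_to_freeze 5
```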

### Head level

To freeze critical heads, modify this line. Critical heads to be frozen are specified as `(layer_idx, [head_idx])`; e.g. `(1, [1,2,3])` means that heads 1, 2, and 3 of layer 1 will be kept frozen.
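For illustration, such a specification might look like the following sketch (the variable name is hypothetical; only the tuple format above is prescribed):

```python
# Hypothetical container for the (layer_idx, [head_idx]) entries described
# above; the actual variable lives in the training script.
heads_to_freeze = [
    (1, [1, 2, 3]),  # keep heads 1, 2, and 3 of block 1 frozen
    (7, [0, 5]),     # keep heads 0 and 5 of block 7 frozen
]
```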

## Critical Component Identification (CCI)

To identify critical components, run `run_critical_freezing.py` (a sample invocation is shown after the argument table below).

Currently, we have only experimented with using the previous task's rationales to identify the components.

Arguments for CCI:

| Argument | Description |
| --- | --- |
| `head_level` | Use head-level granularity? |
| `head_level_top_k` | Number of top heads to select |
| `data_dir` | One of `movies`, `boolq`, `scifact`. Data will be loaded from `./data/{data_dir}/val.jsonl` |
| `old_model_dir` | Folder of the old model, e.g. `./bms_model/boolq/` |
| `new_model_dir` | Folder of the new model, e.g. `./bms_model/movies/` |
| `mo_gt_method` | Method to select from Model Old to Ground Truth |
| `mn_mo_method` | Method to select from Model New to Model Old |
| `device` | Device to use (CPU/GPU) |
| `n`/`n_ann` | Maximum number of annotations to use, e.g. 200 (we found 200 to be sufficient) |
| `gen_rat` | Use generated rationales? |
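A hypothetical invocation using the arguments above (boolean flags are assumed to be simple switches; the choices for `mo_gt_method` and `mn_mo_method` are repo-specific and omitted):

```bash
# Hypothetical example: identify the top-10 critical heads when moving from
# a model trained on BoolQ (old) to one further trained on Movie Reviews (new).
python run_critical_freezing.py \
    --head_level \
    --head_level_top_k 10 \
    --data_dir boolq \
    --old_model_dir ./bms_model/boolq/ \
    --new_model_dir ./bms_model/movies/ \
    --n_ann 200
```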

## Unsupervised Rationale Generation

Any rationale generation module can be used; in this work, we used InvRat.

Generated rationales have to be in the same format as ERASER's jsonl files and placed in the same directory as the human rationales. Then simply run CCI with `--gen_rat`.
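For example (hypothetical; remaining flags as in the CCI example above):

```bash
# Hypothetical: same CCI invocation as above, now scoring with generated
# rationales instead of human ones (other flags omitted).
python run_critical_freezing.py --data_dir movies --gen_rat
```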

TODO:

- Write proper README
- Upload code
- Upload the SciFact dataset used in the paper
- Refactor code; use a submodule to properly give credit to LAMOL
