For specialized domains, there is often not a wealth of data with which to train large machine learning models. In such limited data / compute settings, various methods aim to do more with less, such as finetuning from a pretrained model, modulating difficulty levels as data are presented to a model (curriculum learning), and considering the role of model type / size. Approaches to efficient machine learning also take inspiration from human learning by considering use cases where machine learning systems have access to approximately the same number of words experienced by a 13-year-old child (100M words). We investigate the role of three primary variables in a limited data regime as part of the multimodal track of the BabyLM challenge. We contrast: (i) curriculum learning, (ii) pretraining (with text-only data), and (iii) model type. We modulate these variables and assess them on two types of tasks: (a) multimodal (text+image) and (b) unimodal (text-only) tasks. We find that curriculum learning benefits multimodal evaluations over non-curriculum-learning models, particularly when combined with text-only pretraining. On text-only tasks, curriculum learning appears to help models with smaller trainable parameter counts. We suggest possible reasons, based on architectural differences and training designs, as to why one might observe such results.
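The curriculum learning mentioned above amounts to ordering training data from easy to hard before it is presented to the model. The sketch below is a minimal illustration of that idea, not the repository's actual implementation: the `difficulty` function and its caption-length heuristic are hypothetical stand-ins for whatever scoring the training code uses.

```python
# Minimal sketch of curriculum learning: present training examples
# in order of increasing difficulty. The difficulty metric here
# (caption length) is a hypothetical stand-in; the repository may
# score difficulty differently.

def difficulty(example: dict) -> float:
    """Toy difficulty score: longer captions are treated as harder."""
    return len(example["text"].split())

def curriculum_order(dataset: list[dict]) -> list[dict]:
    """Sort the dataset from easiest to hardest before training."""
    return sorted(dataset, key=difficulty)

if __name__ == "__main__":
    data = [
        {"text": "a dog runs across a wide green field"},
        {"text": "a cat sleeps"},
        {"text": "two children play near the river"},
    ]
    # Easiest (shortest) captions are yielded first.
    for example in curriculum_order(data):
        print(difficulty(example), example["text"])
```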
About
Code repository for our submission to the 2024 BabyLM Challenge (CoNLL / EMNLP 2024). Forked from simpleParadox/baby_lm_2024.
Languages
- Python 54.4%
- Jupyter Notebook 45.5%
- Shell 0.1%