FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model
FLEUR uses the LLaVA model to evaluate image captions (other vision-language models can be substituted if desired). Follow the setup instructions in the LLaVA GitHub README; no additional training is required.
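For a rough sense of the idea (this is not the repository's fleur.py), the sketch below prompts a LLaVA-style model to grade a single image-caption pair through the Hugging Face transformers interface. The checkpoint name, prompt wording, and score parsing are assumptions made for this illustration; the actual scripts may differ in prompt design and in how the final score is computed.

```python
# Minimal illustrative sketch, not the repository's exact code.
# Assumptions: the llava-hf/llava-1.5-7b-hf checkpoint, a simple
# "rate from 0.0 to 1.0" prompt, and naive parsing of the reply.
import re
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def score_caption(image_path: str, caption: str) -> float:
    """Ask the model how well the caption describes the image (0.0-1.0)."""
    image = Image.open(image_path).convert("RGB")
    prompt = (
        "USER: <image>\n"
        "Rate how well the caption describes the image on a scale from 0.0 "
        f"to 1.0, and answer with the number only.\nCaption: {caption}\n"
        "ASSISTANT:"
    )
    inputs = processor(images=image, text=prompt, return_tensors="pt").to(
        model.device, torch.float16
    )
    output = model.generate(**inputs, max_new_tokens=10, do_sample=False)
    answer = processor.decode(output[0], skip_special_tokens=True)
    # Take the number at the end of the reply; fall back to 0.0 if none found.
    match = re.search(r"([01](?:\.\d+)?)\s*$", answer)
    return float(match.group(1)) if match else 0.0
```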
- Running code for FLEUR:
CUDA_VISIBLE_DEVICES=0,1 python fleur.py
- Running code for RefFLEUR:
CUDA_VISIBLE_DEVICES=0,1 python reffleur.py
- Running code for score explanation:
CUDA_VISIBLE_DEVICES=0,1 python fleur_exp.py
The evaluation results will be saved as txt files in the results folder.
- Computing correlations:
Update the file names of the annotation file and the evaluation result file in compute_correlation.py, then run:
python compute_correlation.py
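As a minimal sketch of this step (assuming the result file holds one score per line and a matching file of human ratings exists; the actual compute_correlation.py may use different file names and formats), the correlation can be computed with scipy:

```python
# Illustrative sketch with assumed file layouts, not the repository's script.
from scipy.stats import kendalltau, spearmanr

# Assumed: one metric score per line and one matching human rating per line.
with open("results/fleur_scores.txt") as f:
    metric_scores = [float(line) for line in f if line.strip()]
with open("annotations/human_scores.txt") as f:
    human_scores = [float(line) for line in f if line.strip()]

tau, tau_p = kendalltau(metric_scores, human_scores)
rho, rho_p = spearmanr(metric_scores, human_scores)
print(f"Kendall tau: {tau:.4f} (p={tau_p:.3g})")
print(f"Spearman rho: {rho:.4f} (p={rho_p:.3g})")
```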