Phi3KnowChem

Submission to the L+M-24 shared task at the ACL 2024 Language + Molecule Workshop. We trained Phi-3-mini-4k for the molecule captioning task. Our work shows that a continued pretraining phase without direct exposure to SMILES representations significantly improved the model's performance, yielding a 300% increase in BLEU scores.
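To make the size of a 300% increase concrete, here is a minimal sketch of the arithmetic for a relative increase. The function name and the scores are hypothetical and chosen only for illustration; they are not results reported by this work.

```python
def relative_increase_percent(baseline: float, improved: float) -> float:
    """Return the relative change from baseline to improved, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical BLEU scores on a 0-100 scale, for illustration only.
baseline_bleu = 10.0
improved_bleu = 40.0

# A 300% increase means the improved score is four times the baseline.
increase = relative_increase_percent(baseline_bleu, improved_bleu)
print(f"{increase:.0f}% increase")
```

In other words, a 300% increase corresponds to a score that is 4x the baseline, not 3x.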

Model Weight

The model weights are available on 🔽Hugging Face.

Evaluation Dataset Download

The dataset used for evaluation can be found at LPM Dataset.

Running Evaluation

You can generate captions/descriptions with the code in this repository. First, clone the repository and set up the environment:

git clone https://github.com/bluesky333/Phi3KnowChem
cd Phi3KnowChem
conda create -n knowchem python=3.10 -y
conda activate knowchem
pip install -r requirements.txt

🏃 Inference

python inference-caption.py -c bluesky333/Phi3KnowChem -d Phi3KnowChem -o Phi3KnowChem --max-seq-len 2048 --batch-size 1
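If you want to script the inference step, for example to try several batch sizes, the command above could be assembled from Python. This is a minimal sketch, not part of the repository: `build_inference_command` is a hypothetical helper, it uses only the flags shown in the command above, and the reading of `-c`, `-d`, and `-o` as checkpoint, dataset, and output is an assumption based on the flag names.

```python
import subprocess
import sys


def build_inference_command(checkpoint, dataset, output,
                            max_seq_len=2048, batch_size=1):
    """Assemble the argument list for inference-caption.py.

    Hypothetical helper. The meanings of -c, -d and -o are assumed
    (checkpoint, dataset, output); check the script before relying on them.
    """
    return [
        sys.executable, "inference-caption.py",
        "-c", checkpoint,
        "-d", dataset,
        "-o", output,
        "--max-seq-len", str(max_seq_len),
        "--batch-size", str(batch_size),
    ]


# Same values as the command shown above.
cmd = build_inference_command(
    "bluesky333/Phi3KnowChem", "Phi3KnowChem", "Phi3KnowChem"
)
print(" ".join(cmd[1:]))

# To actually launch inference, run this from the repository root
# inside the activated environment:
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems, and `check=True` makes a failed run raise an exception instead of passing silently.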
