Whisper Small Ro - Iazar

A fine-tuned version of openai/whisper-small, trained on audio data collected within the TekWill project. It achieves the following results on the evaluation set:

Loss: 0.8207
WER: 46.2651
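
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between the reference and the model output, divided by the number of reference words. The sketch below is a minimal, dependency-free illustration, not the evaluation code used for this model. It assumes the figure above is reported as a percentage and that words are split on whitespace with no text normalization; `word_error_rate` is a name introduced here for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage for a single reference/hypothesis pair."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)
```

A WER of about 46 therefore means that, under these assumptions, roughly 46 word edits are needed per 100 reference words.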

Description

This model is intended for transcribing Moldovan speech to text.
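
A minimal usage sketch with the Transformers `pipeline` API follows. It rests on several assumptions: that the checkpoint is published on the Hugging Face Hub under the repository name `Yehoward/whisper-small-ro` (the Hub id may differ), that it loads as a standard Whisper checkpoint, and that forcing Romanian as the decoding language is appropriate. `transcribe` is a helper name introduced here for illustration.

```python
# Assumed model id, taken from the repository name; replace it if the
# checkpoint is hosted under a different id or stored locally.
MODEL_ID = "Yehoward/whisper-small-ro"

# Decoding options passed through to Whisper's generate() call.
GENERATE_KWARGS = {"language": "romanian", "task": "transcribe"}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, generate_kwargs=GENERATE_KWARGS)
    return result["text"]
```

Calling `transcribe("sample.wav")`, where `sample.wav` is a placeholder path, would download the weights on first use and return the transcript as a string.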

Data

The model was trained on both Common Voice data and data collected within the project.

Performance

We tested several models to see whether fine-tuning brought any improvement.

Transcription by the pretrained Whisper model.

Transcription by the model trained only on Common Voice data.

Transcription by the model trained only on data collected within the project.

Transcription by the model trained on both Common Voice data and data collected within the project.

Training procedure

Training code

We used Google Colab to train the model.

More details: https://github.com/Yehoward/Iazar?tab=readme-ov-file#code_de_antrenare_iazaripynb

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 50
training_steps: 200
mixed_precision_training: Native AMP
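
As a sketch of how the values above might map onto Transformers training arguments, and of the learning rate they imply at each step: the dictionary keys below are real `Seq2SeqTrainingArguments` parameter names, but the mapping itself is an assumption (for example, it treats the listed batch sizes as per-device values on a single device and "Native AMP" as `fp16=True`). `linear_warmup_lr` is a helper introduced here that mirrors a linear schedule with warmup; it is not taken from the training notebook.

```python
# Assumed mapping of the listed hyperparameters to Trainer argument names.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "warmup_steps": 50,
    "max_steps": 200,
    "fp16": True,  # assumption: "Native AMP" mixed precision
}


def linear_warmup_lr(
    step: int,
    base_lr: float = TRAINING_KWARGS["learning_rate"],
    warmup_steps: int = TRAINING_KWARGS["warmup_steps"],
    total_steps: int = TRAINING_KWARGS["max_steps"],
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)
```

With these values the rate rises to 1e-05 at step 50 and falls back to zero at step 200. The dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `"out"` is a placeholder directory.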

Training results

Training loss	Epoch	Step	Validation loss	WER
0.0005	66.6667	200	0.8207	46.2651
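
The epoch and step counts in this row allow a rough estimate of how many training examples were seen per epoch. The arithmetic below is a back-of-the-envelope sketch, assuming a single device, no gradient accumulation, and a data loader that keeps the last partial batch; none of these is confirmed by the source.

```python
# Values taken from the results row and the hyperparameter list.
total_steps = 200
epochs = 66.6667
train_batch_size = 16

# Optimizer steps needed to pass over the training set once.
steps_per_epoch = round(total_steps / epochs)

# With a partial final batch allowed, the training set size N satisfies
# ceil(N / train_batch_size) == steps_per_epoch.
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

print(steps_per_epoch, min_examples, max_examples)
```

Under those assumptions, each epoch covers three batches, which corresponds to a training set of between 33 and 48 examples.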

Framework versions

Transformers 4.40.1
PyTorch 2.2.1+cu121
Datasets 2.19.0
Tokenizers 0.19.1
