Model training and evaluation tools for a Polish-Kashubian translator.

kashubian-translator/pl-csb-model

Project Setup

Prerequisites

Creating a Python virtual environment before proceeding is recommended. The documentation for Python's venv module describes how to set one up.

Installation

  1. Install Dependencies:

    pip install -r requirements.txt
  2. Install Autopep8 Pre-commit Hook:

    pre-commit install

Model Creation

To train a new translation model, run the following command:

python model_utilities train

Model Evaluation

Once the model is trained, you can evaluate it by running:

python model_utilities evaluate

Translation Using the Created Model

To use the trained model for translation, execute the following command:

python model_utilities translate <text to translate>

By default, the model translates from Polish to Kashubian. To translate from Kashubian to Polish instead, pass true as a second argument:

python model_utilities translate <text to translate> true

For debugging, you can omit the text entirely:

python model_utilities translate

This will translate "Wsiądźmy do tego autobusu" from Polish to Kashubian.
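
The argument handling described above can be illustrated with a minimal sketch. This is not the repository's actual code: the helper name parse_translate_args is hypothetical, and treating the literal string "true" as the reverse switch is an assumption based only on the commands shown in this README.

```python
# Default debug sentence quoted in this README.
DEFAULT_TEXT = "Wsiądźmy do tego autobusu"


def parse_translate_args(argv):
    """Hypothetical helper: map `translate [text] [true]` to (text, reverse)."""
    # argv holds only the arguments that follow the `translate` subcommand.
    text = argv[0] if len(argv) >= 1 else DEFAULT_TEXT
    # Assumption: the string "true" (case-insensitive) selects Kashubian -> Polish.
    reverse = len(argv) >= 2 and argv[1].lower() == "true"
    return text, reverse


# The three call shapes documented above.
for args in ([], ["Dzień dobry"], ["Dzień dobry", "true"]):
    text, reverse = parse_translate_args(args)
    direction = "Kashubian -> Polish" if reverse else "Polish -> Kashubian"
    print(f"{direction}: {text}")
```

With no arguments the sketch falls back to the default sentence and the default direction, which mirrors the debug call.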

Configuration

All key settings for the model, such as the pretrained model to be used, output model names, and training parameters, can be configured in the config.ini file.

Batch Size Configuration

Choose the batch size setting in the config.ini file to fit the memory of the device used for training; a value that is too large can exhaust device memory. For example, if you are using a GPU with 8 GB of memory, set:

BatchSize=8
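
To check what value a script would see, the setting can be read with Python's standard configparser module. The sketch below is illustrative only: the section name TRAINING is an assumption, since this README does not show the layout of config.ini, and the sample is parsed from a string so the snippet runs on its own.

```python
import configparser

# Hypothetical config.ini contents; the [TRAINING] section name is an assumption.
SAMPLE_CONFIG = """
[TRAINING]
BatchSize=8
"""

parser = configparser.ConfigParser()
# For a real file, parser.read("config.ini") would be used instead.
parser.read_string(SAMPLE_CONFIG)

# getint converts the stored string to an integer and raises ValueError if it is not one.
batch_size = parser.getint("TRAINING", "BatchSize")
print(batch_size)
```

Option names are matched case-insensitively by configparser's defaults, so BatchSize and batchsize refer to the same key; section names are case-sensitive.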