This project implements a Neural Machine Translation (NMT) system in PyTorch. It translates text from English to Hindi using a Transformer model.
- dev_test/: Contains development and test datasets.
- env/: Python virtual environment directory.
- getModel.py: Contains the get_model function and TransformerModel class.
- model.py: Defines the neural network architecture, including FeedforwardNeuralNetModel, Encoder, Decoder, and Transformer.
- process.py: Contains data processing functions, including create_dataset.
- train.py: Main training script that defines the CustomDataset class and training loop.
- train_old.py: Previous version of the training script.
- Clone the repository:
  git clone <repository-url>
  cd <repository-directory>
- Create a virtual environment:
  python -m venv env
- Activate the virtual environment:
  - On Windows:
    .\env\Scripts\activate
  - On macOS/Linux:
    source env/bin/activate
- Install the required packages:
  pip install -r requirements.txt
- Prepare the data: Ensure that your data files are placed in the dev_test/ directory.
- Train the model: Run the training script:
  python train.py
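Before starting a training run, it can help to confirm that the setup steps took effect. The following is a minimal sketch, not part of this repository: check_setup is a hypothetical helper, and it assumes the data directory is named dev_test and that torch is the package to look for.

```python
import importlib.util
from pathlib import Path
from typing import List, Sequence


def check_setup(
    data_dir: str = "dev_test", packages: Sequence[str] = ("torch",)
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # The data step expects the dataset files to live in this directory.
    if not Path(data_dir).is_dir():
        problems.append(f"data directory not found: {data_dir}")
    # find_spec returns None when a top-level package cannot be located.
    for name in packages:
        if importlib.util.find_spec(name) is None:
            problems.append(f"package not installed: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print(problem)
```

Run it from the repository root with the virtual environment activated; no output means both checks passed.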
The model architecture is defined in model.py. It includes:
- FeedforwardNeuralNetModel
- Encoder
- Decoder
- Transformer
Data processing functions are defined in process.py. The create_dataset function prepares the dataset for training.
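The signature and behavior of create_dataset are not documented here, so the sketch below only illustrates the kind of preparation a translation dataset usually needs: pairing aligned English and Hindi lines and dropping pairs with an empty side. The name pair_sentences and the one-sentence-per-line input format are assumptions, not the repository's API.

```python
from typing import Iterable, List, Tuple


def pair_sentences(
    english_lines: Iterable[str], hindi_lines: Iterable[str]
) -> List[Tuple[str, str]]:
    """Pair aligned English/Hindi lines, skipping pairs with an empty side."""
    pairs = []
    # zip stops at the shorter input, so unmatched trailing lines are ignored.
    for en, hi in zip(english_lines, hindi_lines):
        en, hi = en.strip(), hi.strip()
        if en and hi:
            pairs.append((en, hi))
    return pairs


# Hypothetical sample data; the second English line is blank and is dropped.
sample_pairs = pair_sentences(
    ["Hello\n", "  \n", "Thank you\n"],
    ["नमस्ते\n", "खाली\n", "धन्यवाद\n"],
)
print(sample_pairs)
```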
The main training script is train.py. It defines the CustomDataset class and the training loop.
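The contents of CustomDataset are not shown in this README. As a rough illustration, a map-style dataset for PyTorch's DataLoader needs only __len__ and __getitem__, so the sketch below is written in plain Python and runs without PyTorch installed. PairDataset, build_vocab, and encode are hypothetical names, whitespace tokenization is an assumption, and a real implementation would likely subclass torch.utils.data.Dataset and return tensors.

```python
from typing import Dict, List, Sequence, Tuple

PAD, UNK = "<pad>", "<unk>"


def build_vocab(sentences: Sequence[str]) -> Dict[str, int]:
    """Map each whitespace token to an integer id; ids 0 and 1 are reserved."""
    vocab = {PAD: 0, UNK: 1}
    for sentence in sentences:
        for token in sentence.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(sentence: str, vocab: Dict[str, int]) -> List[int]:
    """Convert a sentence to token ids, using the UNK id for unseen tokens."""
    return [vocab.get(token, vocab[UNK]) for token in sentence.split()]


class PairDataset:
    """Map-style dataset over (source, target) sentence pairs."""

    def __init__(self, pairs: Sequence[Tuple[str, str]]) -> None:
        self.pairs = list(pairs)
        self.src_vocab = build_vocab([src for src, _ in self.pairs])
        self.tgt_vocab = build_vocab([tgt for _, tgt in self.pairs])

    def __len__(self) -> int:
        return len(self.pairs)

    def __getitem__(self, index: int) -> Tuple[List[int], List[int]]:
        src, tgt = self.pairs[index]
        return encode(src, self.src_vocab), encode(tgt, self.tgt_vocab)


# Hypothetical sample pairs for illustration only.
dataset = PairDataset([("hello world", "नमस्ते दुनिया"), ("hello", "नमस्ते")])
print(len(dataset), dataset[0], dataset[1])
```

Sequences in a batch have different lengths, so a training loop built on such a dataset would also need to pad them (here id 0 is reserved for that purpose) before stacking them into tensors.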
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License. See the LICENSE file for details.