mmiemon/Movie-rating-prediction

A template folder structure is attached. Your submission must conform to that structure, which is explained below. All timings are approximate for a mid-range laptop (Core i5, 8 GB RAM, no GPU). Split your data as follows: 80% for training, 10% for validation, and 10% for testing. Your data should be in *.zip format, as in the template. If your programs need the data unzipped, unzip it from within your program (for example, with the Python zipfile module) into the Tmp folder, and have your program remove those files once it is done.
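The 80/10/10 split described above can be sketched as follows. This is only an illustration: the function name split_indices, the fixed seed, and the choice to split by shuffled sample index are assumptions, not part of the template.

```python
import random


def split_indices(n_samples, seed=0):
    """Return (train, validation, test) index lists in an 80/10/10 ratio.

    Hypothetical helper: the name, the seed and the index-based approach
    are assumptions made for illustration.
    """
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible between runs.
    random.Random(seed).shuffle(indices)

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n_samples * 8 // 10
    n_val = n_samples // 10

    train = indices[:n_train]
    validation = indices[n_train:n_train + n_val]
    # Whatever remains (about 10%) becomes the test set.
    test = indices[n_train + n_val:]
    return train, validation, test
```

Because the test set takes the remainder, every sample lands in exactly one subset even when the sample count is not a multiple of 10.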

Data\data.zip :- Original unprocessed raw data. This is the only file that you place manually. All the data files listed below should be generated by data.py. Do not upload this file with your submission, as you have already given it to me.

Data\Train\Under_10_min_training\data.zip :- A subset of the training data on which 5 epochs of training take less than 10 min.

Data\Train\Under_90_min_tuning\data.zip :- A subset of the training data on which 10 epochs of training take less than 90 min. This subset should be used for each hyperparameter combination during tuning.

Data\Train\Best_hyperparameter_80_percent\data.zip :- The 80 percent training data. This should be used for training with the optimal hyperparameter settings. The learned model must be saved so that it can be used separately with the test data.

Data\Validation\3_samples\data.zip :- A 3 sample set for validation

Data\Validation\Validation_10_percent\data.zip :- 10 percent validation data. This should be used to evaluate each hyperparameter combination during tuning.

Data\Test\Test_10_percent\data.zip :- 10 percent test data. This should be used to evaluate performance of your saved model.

tuning_results.txt :- Performance for each hyperparameter combination during tuning

hyperparameter.txt :- Optimal hyperparameters after tuning

model.h5 :- Your saved model in HDF5 format

Results.docx :- Tuning and test results in table format.

script.bat :- Install any dependencies for your program.

data.py :- All data preprocessing code

train.py :- All training and model saving code

tune.py :- All tuning, hyperparameter search and validation code. This calls the training module from train.py.

test.py :- All model loading and testing code

Lib\ :- Any other code or libraries you need

Tmp\ :- Created at runtime. All temporary data should reside in this folder. It is deleted at the end of execution.
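One way to keep unzipped data inside Tmp and remove it afterwards is a small context manager built on the standard zipfile, tempfile and shutil modules. This is a minimal sketch: the helper name extracted and the use of a unique subfolder per archive are assumptions, not requirements of the template.

```python
import shutil
import tempfile
import zipfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def extracted(zip_path, tmp_root="Tmp"):
    """Unzip zip_path into a fresh subfolder of tmp_root, then clean it up.

    Hypothetical helper: the name and the per-archive subfolder are
    assumptions made for illustration.
    """
    Path(tmp_root).mkdir(parents=True, exist_ok=True)
    # A unique subfolder avoids clashes, since every archive is named data.zip.
    target = Path(tempfile.mkdtemp(dir=tmp_root))
    try:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target)
        yield target
    finally:
        # Remove the extracted files even if the caller raised an error.
        shutil.rmtree(target, ignore_errors=True)
```

A program would then read its data inside a block such as `with extracted(zip_path) as folder:`, and the extracted files are gone once the block exits.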

The execution order is given below. We assume that the current directory is your project folder. The following script is for Windows; on Linux, the python command does not have the .exe extension. Command-line arguments on the same line as the python command are input files; the files that follow are output files.

###############################################################################################################################################################

md Tmp

script.bat

python.exe data.py .\Data\data.zip .\Data\Train\Best_hyperparameter_80_percent\ .\Data\Validation\Validation_10_percent\ .\Data\Test\Test_10_percent\ .\Data\Train\Under_10_min_training\ .\Data\Train\Under_90_min_tuning\ .\Data\Validation\3_samples\

python.exe tune.py .\Data\Train\Under_90_min_tuning\data.zip .\Data\Validation\Validation_10_percent\data.zip .\tuning_results.txt .\hyperparameter.txt

python.exe train.py .\Data\Train\Best_hyperparameter_80_percent\data.zip .\hyperparameter.txt .\model.h5

python.exe test.py .\Data\Test\Test_10_percent\data.zip .\model.h5

rd Tmp /s /q

###############################################################################################################################################################
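The positional arguments passed to tune.py in the script above could be read with argparse, as in the sketch below. The function name parse_tune_args and the argument names are assumptions chosen for illustration; only the order of the four paths comes from the script.

```python
import argparse


def parse_tune_args(argv=None):
    """Parse the four positional paths given to tune.py.

    Hypothetical helper: the argument names are assumptions; the order
    follows the tune.py command in the execution script.
    """
    parser = argparse.ArgumentParser(description="Hyperparameter tuning")
    parser.add_argument("train_zip", help="training subset used for tuning")
    parser.add_argument("validation_zip", help="10 percent validation data")
    parser.add_argument("results_txt", help="performance of each combination")
    parser.add_argument("hyperparameter_txt", help="optimal hyperparameters")
    # With argv=None, argparse falls back to sys.argv[1:].
    return parser.parse_args(argv)
```

The same pattern would apply to data.py, train.py and test.py, each with its own list of paths.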
