The objective of this project is to improve a factory supply chain by using a transformer to sort product queries, deciding which products should be produced and which should be delayed. The dataset consists of 1000 real-life sequences of product queries; the output label is the product that was actually produced within that context (more details can be found in data_analysis.ipynb). We used an encoder model (the well-known BERT) to predict the product that should be produced next, given a sequence of input products sorted by query date.
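As a rough illustration of this framing (a sketch, not the exact code in fine_tune.ipynb; the model checkpoint, label count, and token format below are assumptions), a date-sorted query sequence can be treated as a "sentence" and classified into the product to produce:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

NUM_PRODUCTS = 50  # hypothetical number of distinct products in the dataset

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_PRODUCTS
)

# A sequence of product queries sorted by date, rendered as text.
sequence = "product_12 product_7 product_7 product_31"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, NUM_PRODUCTS)
predicted_product = logits.argmax(dim=-1).item()
```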
The project consists of the following files:
- fine_tune.ipynb: notebook in which the BERT model is fine-tuned. With Google Colab and a T4 GPU, the model can be trained in under 15 minutes.
- data.h5: file in which the training and testing data are stored, ready to be fed to the model (see the loading sketch after this list).
- data_analysis.ipynb: notebook in which the preprocessing and dataset creation are mostly done (this is where the main logic and paradigm are laid out).
- load_model.ipynb: notebook in which the fine-tuned model is loaded and its accuracy is assessed in depth.
- gaby_bert.py: test file in which we experimented with transformers.
- datasets: folder containing the raw data in CSV format.
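A hypothetical way to inspect data.h5 before training (the dataset key names in the comments are assumptions, not documented here):

```python
import h5py

with h5py.File("data.h5", "r") as f:
    print(list(f.keys()))          # inspect the actual dataset names
    # X_train = f["X_train"][:]    # e.g., tokenized input sequences
    # y_train = f["y_train"][:]    # e.g., produced-product labels
```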
The model attains a solid accuracy score: 65% test accuracy when only the greatest logit is considered (top-1), and 85% test accuracy when the top 3 logits of each prediction are considered (a sketch of this evaluation follows below). These results suggest that, while traditional RL algorithms may be well suited to scenarios like sorting in an infinite environment, transformers can also be useful and are easier to implement than the logic and error-analysis challenges that designing an RL environment entails. Furthermore, compared to RL paradigms, transformers may be better suited to scenarios in which the reward function is difficult to design due to a lack of data, as in our case study, where we lacked an accurate measure to maximize or minimize (no cost or revenue data was present in the dataset).
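A minimal sketch of how such top-1 and top-3 scores can be computed from model logits (variable names are illustrative; this mirrors, but is not, the code in load_model.ipynb):

```python
import torch

def top_k_accuracy(logits: torch.Tensor, labels: torch.Tensor, k: int) -> float:
    """Fraction of samples whose true label is among the k largest logits."""
    topk = logits.topk(k, dim=-1).indices             # (n_samples, k)
    hits = (topk == labels.unsqueeze(-1)).any(dim=-1)  # (n_samples,)
    return hits.float().mean().item()

# Assuming `logits` is (n_samples, n_products) and `labels` is (n_samples,):
# top_1 = top_k_accuracy(logits, labels, k=1)  # ~0.65 on the test set
# top_3 = top_k_accuracy(logits, labels, k=3)  # ~0.85 on the test set
```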