- Develop 2 different ML Models for the estimation of the Delivery Date (EDD or ETA)
- Data exploration and preparation: understanding, cleaning, variable retention/Feature selection, etc.
- Model Selection
- Data Partitioning: Training and testing
- Model Comparison/Evaluation
- Parameter tuning
- Develop a simple API for deploying the output. Feel free to use whichever framework best suits this task.
- Used a Jupyter notebook to carry out some preliminary analysis of the data.
- Performed some transformations, e.g. converting individual dates to epoch times so they could be used as numeric inputs for regression.
- Decided to use Gradient Boosting and Random Forest as my two models. Both are tree-based ensemble methods that are robust and produced good results on this data.
- Found a paper discussing methods used for EDD estimation; it suggested boosting models as well as the forest model. Link: https://arxiv.org/pdf/2009.11598.pdf
- Once the models had been trained and checked, they were exported using pickle.
- A simple website was created using Flask, a Python framework. The models are hosted in the backend of the website and are accessed through a REST API.
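
The modelling steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the feature layout and the `to_epoch` helper are assumptions made for the example, timestamps are assumed to be UTC, and the hyperparameters are scikit-learn defaults rather than tuned values.

```python
import pickle
from datetime import datetime, timezone

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split


def to_epoch(ts: str) -> float:
    """Convert 'YYYY-MM-DD HH:MM:SS' to epoch seconds (hypothetical helper, UTC assumed)."""
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


# Synthetic stand-in for the prepared dataset (NOT the real delivery data).
rng = np.random.default_rng(0)
n = 500
base = to_epoch("2020-09-01 00:00:00")
pickup = base + rng.integers(0, 30 * 86400, size=n)   # pick-up time, epoch seconds
transit = pickup + rng.integers(0, 86400, size=n)     # transit scan, epoch seconds
region = rng.integers(0, 50, size=n)                  # encoded delivery region
courier = rng.integers(0, 5, size=n)                  # encoded courier
X = np.column_stack([region, courier, pickup, transit])
# Invented target: delivery time in epoch seconds, later for higher courier codes.
y = transit + (1 + 0.75 * courier) * 86400 + rng.normal(0, 3600, size=n)

# Data partitioning: hold out 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The two model families named above, compared on the same split.
models = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Mean absolute error, converted from seconds to hours for readability.
    scores[name] = mean_absolute_error(y_test, model.predict(X_test)) / 3600

# Export with pickle and load again, as the Flask backend would do at start-up.
blob = pickle.dumps(models["random_forest"])
restored = pickle.loads(blob)
```

Comparing both models on the same held-out split keeps the evaluation like for like. In a real deployment `pickle.dump` would write to a file that the web app opens; only unpickle files you trust.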
I used SAS VA, which is a tool similar to Power BI. The report allows the user to look at deliveries at a per-courier level. Have a look here: https://viyawaves.sas.com/SASVisualAnalytics/?reportUri=%2Freports%2Freports%2Ffb24327b-4213-42e5-8ca6-2d4363b93ab2&sectionIndex=0&sso_guest=true&sas-welcome=false
- To use the program, first clone the repo.
- After cloning the repo, install the dependencies listed in requirements.txt.
- Change directory to EDT-wbsite and then run: python app.py
- The website should be served at: http://127.0.0.1:5000/
- Enter information into the text boxes.
- Because the values were encoded during the prep stages, categorical values such as GBR are now represented by numbers, so the form expects the numeric codes.
- I recommend using these values as an example for the time being (feel free to experiment with other values as well): Delivery Region: 44, Courier: 3, Return Tracking: 0, Delivery Location: 3, Transit Date: 2020-09-02 13:19:00, Pick up date: 2020-09-02 00:00:00
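
To show how the example inputs above might be turned into the numeric row a model expects, here is a small sketch using only the standard library. The field names, the feature order, and the `build_feature_row` helper are hypothetical (the real app may differ), and timestamps are assumed to be UTC.

```python
from datetime import datetime, timezone


def to_epoch(ts: str) -> float:
    """Convert 'YYYY-MM-DD HH:MM:SS' to epoch seconds (UTC assumed)."""
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def build_feature_row(form: dict) -> list:
    """Turn form values into a flat numeric row (hypothetical feature order)."""
    return [
        float(form["delivery_region"]),    # encoded region, e.g. 44
        float(form["courier"]),            # encoded courier
        float(form["return_tracking"]),    # 0 or 1
        float(form["delivery_location"]),  # encoded location
        to_epoch(form["transit_date"]),    # date converted to epoch seconds
        to_epoch(form["pickup_date"]),     # date converted to epoch seconds
    ]


# The example values recommended above, keyed by assumed field names.
example = {
    "delivery_region": 44,
    "courier": 3,
    "return_tracking": 0,
    "delivery_location": 3,
    "transit_date": "2020-09-02 13:19:00",
    "pickup_date": "2020-09-02 00:00:00",
}

row = build_feature_row(example)
print(row)
```

A row like this could then be passed to a loaded model as `model.predict([row])`, provided the order matches the columns used during training.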