The CM3111 Big Data Analytics Project is a series of experiments written in the R programming language and Python. It supports the LaTeX software system.
This project is designed to help reproduce the analysis of a dataset on the quality of a product sold by a wine business. The product analysed is the red variant of the Portuguese "Vinho Verde" wine. The analysis separates the business's good products from its bad products using common machine learning algorithms such as logistic regression and random forest.
The objective of analysing the red variant of the Portuguese "Vinho Verde" wine is to separate the products whose chemical composition qualifies them as good quality from those whose chemical composition does not, which are classified as bad quality.
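The good/bad separation described above can be sketched as a simple labelling step that would come before any model is trained. This is a minimal illustration, not the project's own code: it assumes each record carries a numeric `quality` score, and the cutoff of 7, the helper names `label_quality` and `split_by_quality`, and the sample records are all hypothetical.

```python
# Hypothetical cutoff: the README does not state the score that separates
# good wines from bad ones, so 7 is an assumption used only for illustration.
GOOD_QUALITY_THRESHOLD = 7


def label_quality(score, threshold=GOOD_QUALITY_THRESHOLD):
    """Return "good" when the score meets the threshold, otherwise "bad"."""
    return "good" if score >= threshold else "bad"


def split_by_quality(wines, threshold=GOOD_QUALITY_THRESHOLD):
    """Partition wine records (dicts with a "quality" key) into two lists."""
    good, bad = [], []
    for wine in wines:
        if label_quality(wine["quality"], threshold) == "good":
            good.append(wine)
        else:
            bad.append(wine)
    return good, bad


# Made-up records for illustration only; these are not rows from the dataset.
sample_wines = [
    {"alcohol": 9.4, "quality": 5},
    {"alcohol": 12.8, "quality": 8},
    {"alcohol": 10.5, "quality": 7},
]

good_wines, bad_wines = split_by_quality(sample_wines)
```

A binary label of this kind is what a classifier such as logistic regression or random forest would then be trained to predict from the chemical measurements.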
- R (4.0.2+ required)
- RStudio (1.3.1056+ required)
- Python (2.7 required)
- ggplot2 (required)
- pandas (required)
- cx_Freeze (optional)
- LaTeX (optional)
- Git (optional)
The [CM3111 Big Data Analytics Project installation guides] include instructions for installing the project as part of a local application.
python <path/to/main.py>
- Path to the entry point file. If unspecified, the current working directory is used.
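The fallback described above could be implemented along the following lines. This is only a sketch of the stated behaviour; `resolve_entry_point` is a hypothetical helper name and is not taken from the project's source.

```python
import os


def resolve_entry_point(path=None):
    """Return an absolute path, falling back to the current working directory.

    Hypothetical helper: mirrors the documented default, not the project's code.
    """
    if not path:
        # No path supplied: use the directory the command was run from.
        return os.getcwd()
    return os.path.abspath(path)
```

Calling `resolve_entry_point()` with no argument returns the current working directory, while a relative path such as `main.py` is resolved against it.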