This project is a framework that automates basic machine learning, helping a team quickly get initial results and an idea of which algorithms might be useful. It is not a replacement for custom-built systems that leverage machine learning.
The goal is to aid developers using machine learning algorithms in finding the best algorithms and optimal configurations for their specific situation. This is accomplished by recording as much information about a given model as the developer wants, and then analyzing all the data to find which algorithms work best on a dataset and with what settings they work best.
- Automatically applies a selection of machine learning algorithms to the input data, based on the settings it is given
- Has a large number of tests that make sure the included algorithms still run and haven't become outdated
- Can be put in validation or application mode (train/test mode)
- Records results from a machine learning algorithm test
- Saves results in a Firebase Database
- (Coming Soon) View results in a WebUI
- (Coming Soon) Analyze data from results in the WebUI
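The settings-driven selection of algorithms and the validation/application modes listed above can be pictured with a short sketch. This is not the project's actual code: the `ALGORITHMS` registry, the `settings` keys, and the `run` function are hypothetical names, and two toy baseline predictors stand in for real Scikit-learn models.

```python
# Hypothetical sketch: names and settings keys are assumptions, not the project's API.

def mean_baseline(train_targets):
    """Toy 'algorithm': always predict the mean of the training targets."""
    mean = sum(train_targets) / float(len(train_targets))
    return lambda row: mean

def last_value_baseline(train_targets):
    """Toy 'algorithm': always predict the last training target seen."""
    last = train_targets[-1]
    return lambda row: last

# Registry mapping a settings name to a function that trains a predictor.
ALGORITHMS = {
    "mean": mean_baseline,
    "last_value": last_value_baseline,
}

def run(settings, train_targets, rows, targets=None):
    """Apply every algorithm named in settings["algorithms"].

    In "validation" mode, rows come with known targets and each algorithm
    gets a mean absolute error. In "application" mode, rows are unlabeled
    and each algorithm's predictions are returned instead.
    """
    output = {}
    for name in settings["algorithms"]:
        predict = ALGORITHMS[name](train_targets)
        predictions = [predict(row) for row in rows]
        if settings["mode"] == "validation":
            errors = [abs(p - t) for p, t in zip(predictions, targets)]
            output[name] = sum(errors) / float(len(errors))
        else:
            output[name] = predictions
    return output

train_targets = [100.0, 200.0, 300.0]
scores = run({"algorithms": ["mean", "last_value"], "mode": "validation"},
             train_targets, rows=[{"rooms": 3}, {"rooms": 4}],
             targets=[200.0, 250.0])
predictions = run({"algorithms": ["mean"], "mode": "application"},
                  train_targets, rows=[{"rooms": 5}])
```

Keeping the algorithms in a registry keyed by name means the settings alone decide which ones run, so adding an algorithm does not require touching the loop.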
Unless otherwise stated, when we say "test" we mean a way of determining whether an algorithm works (as opposed to testing the actual code, etc.).
Test data is unlabeled data: it does not have a column or label that represents the target a model is trying to predict. So if a model predicts housing prices, "test data" will not have the housing prices listed.
Training data, on the other hand, does have the target column, because the model needs that column in order to be trained.
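As an illustration of the distinction, here is a minimal sketch built on the housing-price example. The column names (`rooms`, `area`, `price`) and the `split_features_and_target` helper are made up for this example and are not part of the project.

```python
# Hypothetical housing data; column names are assumptions for illustration.
TARGET = "price"

training_data = [
    {"rooms": 3, "area": 120, "price": 250000},
    {"rooms": 2, "area": 80, "price": 180000},
]

# Test data has the same feature columns but no target column.
test_data = [
    {"rooms": 4, "area": 150},
]

def split_features_and_target(rows, target):
    """Separate each labeled row into its features and its target value."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets

features, targets = split_features_and_target(training_data, TARGET)

# None of the test rows carry the target column.
is_labeled = [TARGET in row for row in test_data]
```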
All the information about the test, including hyper-parameters of the model used, information on the test data, results of the test, etc.
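A record of this kind might look like the following sketch. The field names and values are hypothetical placeholders, not the schema the project actually stores; the point is only that one JSON-serializable object can carry the hyper-parameters, a description of the test data, and the outcome together.

```python
import json

# Hypothetical result record; every field name and value here is an assumption.
result = {
    "algorithm": "example_classifier",
    "hyper_parameters": {"max_depth": 5, "n_estimators": 100},
    "test_data": {"rows": 1000, "columns": 12},
    "results": {"accuracy": 0.5},
}

# Serializing to JSON yields a document that a JSON-based store could accept.
encoded = json.dumps(result, sort_keys=True)
decoded = json.loads(encoded)
```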
These instructions will get you a copy of the entire project up and running on your local machine for development and testing purposes. If you wish to deploy submodules individually, please see the instructions for that specific module. See deployment for notes on how to deploy the project on a live system.
What you need in order to run this project. Detailed instructions are included in "Installing".
- A *nix system.
- Python 2.7 (latest version) and various Python libraries (Scikit-learn, numpy, scipy, etc.)
- JavaScript, NodeJS, ReactJS, among others.
The following instructions cover setup and install of the entire system.
- Go into your Unix system and install SciPy
  - Note that installation might be different for different systems
  - For Ubuntu, try:
    sudo apt-get install python-numpy python-scipy python-pandas python-sympy python-nose
- Install Scikit-learn using pip
Follow the instructions here, including submodules
Download and setup the test data so that unit tests run properly
To verify correct setup, please run the tests.
Go to the top directory of the project and run the following:
python -m unittest discover
This will run tests against every module, including ones that test R and JavaScript modules. It will not run all tests, but every module will be covered.
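For reference, `unittest` discovery picks up files matching `test*.py` by default and runs the `unittest.TestCase` subclasses inside them. The sketch below shows the general shape of such a test; the file name `test_smoke.py` and the `run_algorithm` stand-in are hypothetical and not taken from this repository.

```python
# test_smoke.py (hypothetical file name; discovery matches test*.py by default)
import unittest

def run_algorithm(values):
    """Stand-in for a call into one of the project's algorithms."""
    return sum(values) / float(len(values))

class AlgorithmSmokeTest(unittest.TestCase):
    def test_algorithm_still_runs(self):
        # A smoke test only checks that the call completes and returns
        # a sensible value, not that the model is accurate.
        self.assertEqual(run_algorithm([1, 2, 3]), 2.0)
```

A test in this style fails as soon as an included algorithm stops running, which is the kind of breakage the discovery run is meant to surface.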
Documentation pending
Explain what these tests test and why
Give an example
We measure code quality with CodeClimate. To see that data, go here.
Pending
- Bash on Ubuntu on Windows
- Pycharm
- Python, JavaScript
- Firebase
- Python: Scikit-learn, scipy, pandas, numpy
- JavaScript: NodeJS
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
We do not currently use versioning, but we will likely adopt SemVer eventually. For the versions available, see the tags on this repository.
- Alexander Clines - Initial work - asclines
- Isaac Griswold-Steiner - Initial work - ASAAR
- Zakery Fyke - Initial work - ZakeryFyke
- Ryan McBerg - Initial work - RyanMcBerg
See also the list of contributors who participated in this project.
We haven't dealt with licensing yet.
- Hat tip to anyone whose code was used
- template for README
- The labels used in the issues section were inspired by this site
- Issue and PR Templates were inspired by this site