diff --git a/README.md b/README.md index c74ca987a..b8009c9bb 100644 --- a/README.md +++ b/README.md @@ -1,25 +1,24 @@ [![Logo](./docs/source/_static/logo_blank_small.png)]() -[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](https://github.com/EpistasisLab/pennai/blob/master/LICENSE) [![PennAI CI/CD](https://github.com/EpistasisLab/pennai/actions/workflows/pennai_tests.yml/badge.svg)](https://github.com/EpistasisLab/pennai/actions/workflows/pennai_tests.yml) [![Coverage Status](https://coveralls.io/repos/github/EpistasisLab/pennai/badge.svg)](https://coveralls.io/github/EpistasisLab/pennai) +[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](https://github.com/EpistasisLab/Aliro/blob/master/LICENSE) [![Aliro CI/CD](https://github.com/EpistasisLab/Aliro/actions/workflows/pennai_tests.yml/badge.svg)](https://github.com/EpistasisLab/Aliro/actions/workflows/pennai_tests.yml) [![Coverage Status](https://coveralls.io/repos/github/EpistasisLab/pennai/badge.svg)](https://coveralls.io/github/EpistasisLab/pennai) News ================================== +**04/18/2022: PennAI** is becoming **Aliro**
+Over the next few weeks, PennAI will be rebranded as **Aliro.** The repository name will be updated on **TBD** and thus the URL for this project will change as well. -**PennAI** is becoming **Aliro** -**04/18/2022:** Over the next few weeks, PennAI will be rebranded as **Aliro.** The repository name will be updated on **TBD** and thus the URL for this project will change as well. - -PennAI: AI-Driven Data Science +Aliro: AI-Driven Data Science ================================== -**PennAI** is an easy-to-use data science assistant. +**Aliro** is an easy-to-use data science assistant. It allows researchers without machine learning or coding expertise to run supervised machine learning analysis through a clean web interface. It provides results visualization and reproducible scripts so that the analysis can be taken anywhere. -And, it has an *AI* assistant that can choose the analysis to run for you. Dataset profiles are generated and added to a knowledgebase as experiments are run, and the AI assistant learns from this to give more informed recommendations as it is used. PennAI comes with an initial knowledgebase generated from the [PMLB benchmark suite](https://github.com/EpistasisLab/penn-ml-benchmarks). +And, it has an *AI* assistant that can choose the analysis to run for you. Dataset profiles are generated and added to a knowledgebase as experiments are run, and the AI assistant learns from this to give more informed recommendations as it is used. Aliro comes with an initial knowledgebase generated from the [PMLB benchmark suite](https://github.com/EpistasisLab/penn-ml-benchmarks). 
-[**Documentation**](https://epistasislab.github.io/pennai/) +[**Documentation**](https://epistasislab.github.io/Aliro/) -[**Latest Production Release**](https://github.com/EpistasisLab/pennai/releases/latest) +[**Latest Production Release**](https://github.com/EpistasisLab/Aliro/releases/latest) Browse the repo: - [User Guide](./docs/guides/userGuide.md) @@ -28,7 +27,7 @@ Browse the repo: About the Project ================= -PennAI is actively developed by the [Institute for Biomedical Informatics](http://upibi.org) at the University of Pennsylvania. +Aliro is actively developed by the [Institute for Biomedical Informatics](http://upibi.org) at the University of Pennsylvania. Contributors include Heather Williams, Weixuan Fu, William La Cava, Josh Cohen, Steve Vitale, Sharon Tartarone, Randal Olson, Patryk Orzechowski, and Jason Moore. diff --git a/ai/recommender/README.md b/ai/recommender/README.md index 360c8efce..374396aa1 100644 --- a/ai/recommender/README.md +++ b/ai/recommender/README.md @@ -97,9 +97,9 @@ You should now be able to start the AI with your recommender. The easiest way to do so is to add your recommender to the `config/ai.env` file. Edit this file so that `AI_RECOMMENDER=myrec`. -Then when PennAI is launched, it will run with your recommender. +Then when Aliro is launched, it will run with your recommender. -For more control and for testing, launch PennAI with `AI_AUTOSTART=0` set in the +For more control and for testing, launch Aliro with `AI_AUTOSTART=0` set in the `config/ai.env` file. Then, attach to the `pennai_lab_1` docker container with the command diff --git a/data/knowledgebases/README.md b/data/knowledgebases/README.md index 6a68e032e..b257d06a2 100644 --- a/data/knowledgebases/README.md +++ b/data/knowledgebases/README.md @@ -1,9 +1,9 @@ # Knowledgebases Knowledgebases are collections of previous results from machine learning analyses -that are used to bootstrap PennAI. +that are used to bootstrap Aliro. 
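Conceptually, a knowledgebase is just a flat table of prior experiment results. Below is a minimal sketch of reading one with the Python standard library; the file name and column names in the demo are illustrative, not Aliro's actual schema:

```python
import csv
import gzip


def load_knowledgebase(path):
    """Read a .tsv.gz knowledgebase of prior experiment results into a list of dicts."""
    with gzip.open(path, "rt", newline="") as f:
        return list(csv.DictReader(f, delimiter="\t"))
```

Each row pairs a dataset with an algorithm, its parameters, and the resulting score, which is what the recommender learns from.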
-The results are stored in a .tsv.gz file. By default PennAI loads results from the +The results are stored in a .tsv.gz file. By default Aliro loads results from the benchmark of scikit-learn described in these papers: - Olson, Randal S., William La Cava, Patryk Orzechowski, Ryan J. Urbanowicz, and diff --git a/data/recommenders/pennaiweb/README.md b/data/recommenders/pennaiweb/README.md index bf6446fcb..6843da117 100644 --- a/data/recommenders/pennaiweb/README.md +++ b/data/recommenders/pennaiweb/README.md @@ -1,3 +1,3 @@ -# Serialized recommenders for use with the PennAI web interface +# Serialized recommenders for use with the Aliro web interface Pretrained recommenders are currently provided for the SVD recommender, one for regression and one for classification. diff --git a/data/recommenders/scikitlearn/README.md b/data/recommenders/scikitlearn/README.md index b340adc59..465eb3a92 100644 --- a/data/recommenders/scikitlearn/README.md +++ b/data/recommenders/scikitlearn/README.md @@ -1 +1 @@ -Serialized recommenders for use with the scikit-learn PennAI interface +Serialized recommenders for use with the scikit-learn Aliro interface diff --git a/docker/lab/Dockerfile b/docker/lab/Dockerfile index 67e56d98f..5328c9bdf 100644 --- a/docker/lab/Dockerfile +++ b/docker/lab/Dockerfile @@ -1,4 +1,4 @@ -FROM python:3.7.4-stretch +FROM python:3.7.11-stretch #nodejs RUN wget --quiet https://nodejs.org/dist/v11.14.0/node-v11.14.0-linux-x64.tar.xz -O ~/node.tar.xz && \ diff --git a/docs/guides/Scikit_Learn_API_Guide.md b/docs/guides/Scikit_Learn_API_Guide.md index 35e354251..bb71fc38f 100644 --- a/docs/guides/Scikit_Learn_API_Guide.md +++ b/docs/guides/Scikit_Learn_API_Guide.md @@ -1,7 +1,7 @@ # User Guide of PennAIpy ### Installation of AI engine as a standalone python package ### -PennAI AI engine is built on top of several existing Python libraries, including: +The Aliro AI engine is built on top of several existing Python libraries, including: * 
[NumPy](http://www.numpy.org/) @@ -20,7 +20,7 @@ PennAI AI engine is built on top of several existing Python libraries, including Most of the necessary Python packages can be installed via the [Anaconda Python distribution](https://www.anaconda.com/products/individual), which we strongly recommend that you use. -You can install PennAI AI engine using `pip`. +You can install the Aliro AI engine using `pip`. NumPy, SciPy, scikit-learn, pandas and joblib can be installed in Anaconda via the command: @@ -28,7 +28,7 @@ NumPy, SciPy, scikit-learn, pandas and joblib can be installed in Anaconda via t conda install numpy scipy scikit-learn pandas joblib simplejson ``` -Surprise was tweaked for the PennAI AI engine and it can be install with `pip` via the command below. **Note: [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) is required for building the surprise package in Windows OS. Please download and run the installer with selecting "C++ Build tools". Additionally, the latest version of [`cython`](https://cython.org) is required and it can be installed/updated via `pip install --upgrade cython`.** +Surprise was tweaked for the Aliro AI engine and it can be installed with `pip` via the command below. **Note: [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) is required for building the surprise package on Windows. Please download and run the installer and select "C++ Build tools". 
Additionally, the latest version of [`cython`](https://cython.org) is required and it can be installed/updated via `pip install --upgrade cython`.** ```Shell pip install --no-cache-dir git+https://github.com/lacava/surprise.git@1.1.1.1 @@ -40,9 +40,9 @@ Finally to install AI engine itself, run the following command: pip install pennaipy ``` -### Example of using PennAI AI engine ### +### Example of using Aliro AI engine ### -The following code illustrates how PennAI can be employed for performing a simple _classification task_ over the Iris dataset. +The following code illustrates how Aliro can be employed for performing a simple _classification task_ over the Iris dataset. ```Python from pennai.sklearn import PennAIClassifier @@ -68,17 +68,17 @@ print(pennai.score(X_test, y_test)) ``` -### Default knowledgebase/metafeatures of PennAI AI engine +### Default knowledgebase/metafeatures of Aliro AI engine -If you don't specify `knowledgebase` and `kb_metafeatures` in `PennAIClassifier` or `PennAIRegressor`, PennAI AI engine will use default knowledgebase based on [pmlb](https://github.com/EpistasisLab/penn-ml-benchmarks)(version0.3). +If you don't specify `knowledgebase` and `kb_metafeatures` in `PennAIClassifier` or `PennAIRegressor`, Aliro AI engine will use default knowledgebase based on [pmlb](https://github.com/EpistasisLab/penn-ml-benchmarks)(version0.3). 
| | Default Knowledgebase | Default Metafeatures | |----------------|------------------------------------------------|-----------------------------------------| -| Classification | [sklearn-benchmark-data-knowledgebase-r6.tsv.gz](https://github.com/EpistasisLab/pennai/blob/master/data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz) | [pmlb_classification_metafeatures.csv.gz](https://github.com/EpistasisLab/pennai/blob/master/data/knowledgebases/pmlb_classification_metafeatures.csv.gz) | -| Regression | [pmlb_regression_results.tsv.gz](https://github.com/EpistasisLab/pennai/blob/master/data/knowledgebases/pmlb_regression_results.tsv.gz) | [pmlb_regression_metafeatures.csv.gz](https://github.com/EpistasisLab/pennai/blob/master/data/knowledgebases/pmlb_regression_metafeatures.csv.gz) | +| Classification | [sklearn-benchmark-data-knowledgebase-r6.tsv.gz](https://github.com/EpistasisLab/Aliro/blob/master/data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz) | [pmlb_classification_metafeatures.csv.gz](https://github.com/EpistasisLab/Aliro/blob/master/data/knowledgebases/pmlb_classification_metafeatures.csv.gz) | +| Regression | [pmlb_regression_results.tsv.gz](https://github.com/EpistasisLab/Aliro/blob/master/data/knowledgebases/pmlb_regression_results.tsv.gz) | [pmlb_regression_metafeatures.csv.gz](https://github.com/EpistasisLab/Aliro/blob/master/data/knowledgebases/pmlb_regression_metafeatures.csv.gz) | -### Example of using PennAI AI engine with non-default knowledgebase/metafeature. ### +### Example of using Aliro AI engine with non-default knowledgebase/metafeature. 
### ```Python @@ -92,8 +92,8 @@ iris = load_iris() X_train, X_test, y_train, y_test = train_test_split(iris.data.astype(np.float64), iris.target.astype(np.float64), train_size=0.75, test_size=0.25, random_state=42) -classification_kb = "https://github.com/EpistasisLab/pennai/raw/ai_sklearn_api/data/knowledgebases/sklearn-benchmark5-data-knowledgebase-small.tsv.gz" -classification_metafeatures="https://github.com/EpistasisLab/pennai/raw/ai_sklearn_api/data/knowledgebases/pmlb_classification_metafeatures.csv.gz" +classification_kb = "https://github.com/EpistasisLab/Aliro/raw/ai_sklearn_api/data/knowledgebases/sklearn-benchmark5-data-knowledgebase-small.tsv.gz" +classification_metafeatures="https://github.com/EpistasisLab/Aliro/raw/ai_sklearn_api/data/knowledgebases/pmlb_classification_metafeatures.csv.gz" pennai = PennAIClassifier( rec_class=KNNMetaRecommender, @@ -110,9 +110,9 @@ print(pennai.score(X_test, y_test)) ``` -### Example of using PennAI AI engine with pre-trained SVG recommender ### +### Example of using Aliro AI engine with pre-trained SVD recommender ### -The pre-trained SVG recommender is provided for saving computational time for initializing the recommender with default knowledgebase in PennAI. +The pre-trained SVD recommender is provided to save the computational time of initializing the recommender with the default knowledgebase in Aliro. 
The following code illustrates how to use the pre-trained SVD recommender: ```Python from pennai.sklearn import PennAIClassifier @@ -129,7 +129,7 @@ X_train, X_test, y_train, y_test = train_test_split(iris.data.astype(np.float64) iris.target.astype(np.float64), train_size=0.75, test_size=0.25, random_state=42) # download the pre-trained SVD recommender from pennai's github -urllib.request.urlretrieve("https://github.com/EpistasisLab/pennai/raw/ai_sklearn_api/data/recommenders/scikitlearn/SVDRecommender_classifier_accuracy_pmlb.pkl.gz", "SVDRecommender_classifier_accuracy_pmlb.pkl.gz") +urllib.request.urlretrieve("https://github.com/EpistasisLab/Aliro/raw/ai_sklearn_api/data/recommenders/scikitlearn/SVDRecommender_classifier_accuracy_pmlb.pkl.gz", "SVDRecommender_classifier_accuracy_pmlb.pkl.gz") serialized_rec = "SVDRecommender_classifier_accuracy_pmlb.pkl.gz" pennai = PennAIClassifier( diff --git a/docs/guides/developerGuide.md b/docs/guides/developerGuide.md index 7682e8eab..ac06b35f0 100644 --- a/docs/guides/developerGuide.md +++ b/docs/guides/developerGuide.md @@ -5,7 +5,7 @@ ### Requirements Install Docker and docker-compose as per the main installation requirements (see :ref:`user-guide`). - Docker setup - - Shared Drive: (Windows only) Share the drive that will have the PennAI source code with the Docker desktop [Docker Shared Drives](https://docs.docker.com/docker-for-windows/#shared-drives) + - Shared Drive: (Windows only) Share the drive that will have the Aliro source code with the Docker desktop [Docker Shared Drives](https://docs.docker.com/docker-for-windows/#shared-drives) #### Optional dependencies for development/testing: - Python and python test runners (in most cases unnecessary; needed only to run unit tests locally outside of docker) @@ -16,24 +16,24 @@ Install Docker and docker-compose as per the main installation requirements (see - [https://nodejs.org/en/](https://nodejs.org/en/) ### Building Docker Images -1. 
Clone the PennAI project using `git clone git@github.com:EpistasisLab/pennai.git` +1. Clone the Aliro project using `git clone git@github.com:EpistasisLab/Aliro.git` -2. Set up your local PennAI configuration file. From the pennai directory, copy `config\ai.env-template` to `config\ai.env`. +2. Set up your local Aliro configuration file. From the Aliro directory, copy `config\ai.env-template` to `config\ai.env`. -3. Build the development service images by running `docker-compose build` from the pennai directory. It will take several minutes for the images to be built the first time this is run. +3. Build the development service images by running `docker-compose build` from the Aliro directory. It will take several minutes for the images to be built the first time this is run. ### Starting and Stopping ### -To start PennAI, from the PennAI directory run the command `docker-compose up`. To stop PennAI, kill the process with `ctrl+c` and wait for the process to exit. +To start Aliro, from the Aliro directory, run the command `docker-compose up`. To stop Aliro, kill the process with `ctrl+c` and wait for the process to exit. -PennAI can be run with multiple machine instances using the `docker-compose-multi-machine.yml` docker compose file, as per: `docker-compose up -f ./docker-compose-multi-machine.yml` +Aliro can be run with multiple machine instances using the `docker-compose-multi-machine.yml` docker compose file, as per: `docker-compose -f ./docker-compose-multi-machine.yml up` To reset the docker volumes, restart using the `--force-recreate` flag or run `docker-compose down` after the server has been stopped. ## Development Notes - After any code changes are pulled, **ALWAYS** rerun `docker-compose build`, and when you first reload the webpage, do a hard refresh with ctrl+F5 instead of just F5 to clear any deprecated code out of the browser cache. 
-- Whenever there are updates to any of the npm libraries as configured with `package.json` files, the images should be rebuilt and the renew-anon-volumes flag should be used when starting PennAI `docker-compose up --renew-anon-volumes` or `docker-compose up -V`. +- Whenever there are updates to any of the npm libraries as configured with `package.json` files, the images should be rebuilt and the renew-anon-volumes flag should be used when starting Aliro: `docker-compose up --renew-anon-volumes` or `docker-compose up -V`. - Use `docker-compose build` to rebuild the images for all services (lab, machine, dbmongo) if their dockerfiles or the contents of their build directories have changed. See the [docker build docs](https://docs.docker.com/compose/reference/build/). - To get the cpu and memory status of the running containers, use `docker stats` - To clear out all files not checked into git, use `git clean -xdf` @@ -56,7 +56,7 @@ The frontend UI source is in `\lab\webapp` and is managed using [webpack](https: There are two ways to enable watch mode: -* To enable watch mode after PennAi has been started, do the following: +* To enable watch mode after Aliro has been started, do the following: ``` docker exec -it "pennai_lab_1" /bin/bash cd $PROJECT_ROOT/lab/webapp @@ -69,7 +69,7 @@ There are two ways to enable watch mode: To update or add NPM package dependencies: * Update the appropriate `package.json` file * Rebuild the images (`docker-compose build`, `docker-compose -f .\docker-compose-int-test.yml build` etc.) -* Refresh anonymous volumes when restarting PennAI with `docker-compose up --renew-anon-volumes` or `docker-compose up -V` +* Refresh anonymous volumes when restarting Aliro with `docker-compose up --renew-anon-volumes` or `docker-compose up -V` --- @@ -77,20 +77,20 @@ Package management for node is configured in three places: the main backend API Node package installation (`npm install`) takes place as part of the `docker build` process. 
If there are changes to a `package.json` file, then during the build those changes will be detected and the updated npm packages will be installed. -When not using the production docker-compose file, node packages are installed in docker anonymous volumes `lab/node_modules`, `lab/webapp/node_modules`, `machine/node_modules`. When starting PennAI after the packages have been rebuilt, the `--renew-anon-volumes` flag should be used. +When not using the production docker-compose file, node packages are installed in docker anonymous volumes `lab/node_modules`, `lab/webapp/node_modules`, `machine/node_modules`. When starting Aliro after the packages have been rebuilt, the `--renew-anon-volumes` flag should be used. ## Architecture Overview -PennAI is designed as a multi-component docker architecture that uses a variety of technologies including Docker, Python, Node.js, scikit-learn and MongoDb. The project contains multiple docker containers that are orchestrated by a docker-compose file. +Aliro is designed as a multi-component docker architecture that uses a variety of technologies including Docker, Python, Node.js, scikit-learn and MongoDB. The project contains multiple docker containers that are orchestrated by a docker-compose file. -![PennAI Architecture Diagram](https://raw.githubusercontent.com/EpistasisLab/pennai/master/docs/source/_static/pennai_architecture.png?raw=true "PennAI Architecture Diagram") +![Aliro Architecture Diagram](https://raw.githubusercontent.com/EpistasisLab/Aliro/master/docs/source/_static/pennai_architecture.png?raw=true "Aliro Architecture Diagram") ##### Controller Engine (aka _The Server_) The central component is the controller engine, a server written in Node.js. This component is responsible for managing communication between the other components using a REST API. ##### Database A MongoDB database is used for persistent storage. 
##### UI Component (aka _The Client_) -The UI component (_Vizualization / UI Engine_ in the diagram above) is a web application written in javascript that uses the React library to create the user interface and the Redux library to manage server state. It allows users to upload datasets for analysis, request AI recommendations for a dataset, manually run machine learning experiments, and displays experiment results in an intuitive way. The AI engine is written in Python. As users make requests to perform analysis on datasets, the AI engine will generate new machine learning experiment recommendations and communicate them to the controller engine. The AI engine contains a knowledgebase of previously run experiments, results and dataset metafeatures that it uses to inform the recommendations it makes. Knowledgable users can write their own custom recommendation system. The machine learning component is responsible for running machine learning experiments on datasets. It has a node.js server that is used to communicate with the controller engine, and uses python to execute scikit learn algorithms on datasets and communicate results back to the central server. A PennAI instance can support multiple instances of machine learning engines, enabling multiple experiments to be run in parallel. +The UI component (_Visualization / UI Engine_ in the diagram above) is a web application written in JavaScript that uses the React library to create the user interface and the Redux library to manage server state. It allows users to upload datasets for analysis, request AI recommendations for a dataset, manually run machine learning experiments, and view experiment results in an intuitive way. The AI engine is written in Python. As users make requests to perform analysis on datasets, the AI engine will generate new machine learning experiment recommendations and communicate them to the controller engine. 
The AI engine contains a knowledgebase of previously run experiments, results and dataset metafeatures that it uses to inform the recommendations it makes. Knowledgeable users can write their own custom recommendation system. The machine learning component is responsible for running machine learning experiments on datasets. It has a Node.js server that is used to communicate with the controller engine, and uses Python to execute scikit-learn algorithms on datasets and communicate results back to the central server. An Aliro instance can support multiple instances of machine learning engines, enabling multiple experiments to be run in parallel. ## Code Documentation - Sphinx documentation can be built in the context of a docker container with the command `docker-compose -f .\docker-compose-doc-builder.yml up --abort-on-container-exit`. @@ -116,11 +116,11 @@ Note: If the npm packages have been updated, the unit tests docker image need to - Results: - The results in xunit format will be in `.\target\test-reports\int_jest_xunit.xml` - The results in html format will be in `.\target\test-reports\html\int_jest_test_report.html` -- Docs: See [Documentation](https://github.com/EpistasisLab/pennai/blob/master/tests/integration/readme.md) for details. +- Docs: See [Documentation](https://github.com/EpistasisLab/Aliro/blob/master/tests/integration/readme.md) for details. ### Unit -There are several unit test suites for the various components of PennAI. The unit test suites can be run together in the context of a docker environment or directly on the host system, or an individual test suite can be run by itself. +There are several unit test suites for the various components of Aliro. The unit test suites can be run together in the context of a docker environment or directly on the host system, or an individual test suite can be run by itself. The default location of the test output is the `.\target\test-reports\` directory. 
@@ -199,7 +199,7 @@ To create a production release: Release procedure: -0. **Test production build.** In the master branch with all changes applied, run `docker-compose -f ./docker-compose-production.yml build` followed by `docker-compose -f ./docker-compose-production.yml up -V`. This should start an instance of PennAI using the production build environment. Test that it works as expected. +0. **Test production build.** In the master branch with all changes applied, run `docker-compose -f ./docker-compose-production.yml build` followed by `docker-compose -f ./docker-compose-production.yml up -V`. This should start an instance of Aliro using the production build environment. Test that it works as expected. 1. **Update the `.env` file with a new version number.** In the master branch, update the TAG environment variable in `.env` to the current production version as per [semantic versioning](https://semver.org/) and the python package version specification [PEP440](https://www.python.org/dev/peps/pep-0440). Development images should have a tag indicating it is a [pre-release](https://www.python.org/dev/peps/pep-0440/#pre-releases) (for example, `a0`). @@ -221,7 +221,7 @@ git checkout production bash release/deploy_production_release.sh ``` -5. **Test DockerHub images and production code.** Test that the production release works with the newly uploaded DockerHub images by navigating to the directory `target/production/pennai-${VERSION}` and running `docker-compose up`. This should start an instance of PennAI that loads the newest images from DockerHub. Test that this works as expected. Check that in the enviromental variables section of the admin page, 'TAG' matches the current version. +5. **Test DockerHub images and production code.** Test that the production release works with the newly uploaded DockerHub images by navigating to the directory `target/production/pennai-${VERSION}` and running `docker-compose up`. 
This should start an instance of Aliro that loads the newest images from DockerHub. Test that this works as expected. Check that in the environment variables section of the admin page, 'TAG' matches the current version. 6. **Create Github Release.** If the test is successful, create a github release using the github web interface. Base the release on the tagged production commit. Attach the file `target/production/pennai-${VERSION}.zip` as an archive asset. @@ -233,4 +233,4 @@ bash release/deploy_production_release.sh 2. Unzip the archive ### Running from production build -1. From the pennai directory, run the command `docker-compose up` to start the PennAI server. +1. From the Aliro directory, run the command `docker-compose up` to start the Aliro server. diff --git a/docs/guides/userGuide.md b/docs/guides/userGuide.md index 103980623..b13b0d153 100644 --- a/docs/guides/userGuide.md +++ b/docs/guides/userGuide.md @@ -1,9 +1,9 @@ # User Guide -PennAI is a platform to help researchers leverage supervised machine learning techniques to analyze data without needing an extensive data science background, and can also assist more experienced users with tasks such as choosing appropriate models for data. Users interact with PennAI via a web interface that allows them to execute machine learning experiments and explore generated models, and has an AI recommendation engine that will automatically choose appropriate models and parameters. Dataset profiles are generated and added to a knowledgebase as experiments are run, and the recommendation engine learns from this to give more informed recommendations as it is used. This allows the AI recommender to become tailored to specific data environments. PennAI comes with an initial knowledgebase generated from the PMLB benchmark suite. 
+Aliro is a platform to help researchers leverage supervised machine learning techniques to analyze data without needing an extensive data science background, and can also assist more experienced users with tasks such as choosing appropriate models for data. Users interact with Aliro via a web interface that allows them to execute machine learning experiments and explore generated models, and has an AI recommendation engine that will automatically choose appropriate models and parameters. Dataset profiles are generated and added to a knowledgebase as experiments are run, and the recommendation engine learns from this to give more informed recommendations as it is used. This allows the AI recommender to become tailored to specific data environments. Aliro comes with an initial knowledgebase generated from the PMLB benchmark suite. ## Installation -PennAI is a multi-container docker project that uses ([Docker-Compose](https://docs.docker.com/compose/)). +Aliro is a multi-container docker project that uses [Docker-Compose](https://docs.docker.com/compose/). ### Requirements - Docker @@ -11,23 +11,23 @@ PennAI is a multi-container docker project that uses ([Docker-Compose](https://d - [Official Docker Website Getting Started](https://docs.docker.com/engine/getstarted/step_one/) - [Official Docker Installation for Windows](https://docs.docker.com/docker-for-windows/install/) - **Runtime Memory**: (Mac and Windows only) If using **Windows** or **Mac**, we recommend the docker VM be configured with at least 6GB of runtime memory ([Mac configuration](https://docs.docker.com/docker-for-mac/#advanced), [Windows configuration](https://docs.docker.com/docker-for-windows/#advanced)). By default, the docker VM on Windows or Mac starts with 2GB of runtime memory. - - **File Sharing**: (Windows only) Share the drive that will contain the PennAI directory with Docker by opening Docker Desktop, navigating to Resources->File Sharing and sharing the drive. 
[Docker Desktop File Sharing](https://docs.docker.com/docker-for-windows/#file-sharing) + - **File Sharing**: (Windows only) Share the drive that will contain the Aliro directory with Docker by opening Docker Desktop, navigating to Resources->File Sharing and sharing the drive. [Docker Desktop File Sharing](https://docs.docker.com/docker-for-windows/#file-sharing) - Docker-Compose (Version 1.22.0 or greater, Linux only) - Separate installation is only needed for linux, docker-compose is bundled with windows and mac docker installations - [Linux Docker-Compose Installation](https://docs.docker.com/compose/install/) ### Installation -1. Download the production zip `pennai-*.zip` from the asset section of the [latest release](https://github.com/EpistasisLab/pennai/releases/latest) (note that this is different from the source code zip file). +1. Download the production zip `pennai-*.zip` from the asset section of the [latest release](https://github.com/EpistasisLab/Aliro/releases/latest) (note that this is different from the source code zip file). 2. Unzip the archive -## Using PennAI +## Using Aliro ### Starting and Stopping -To start PennAI, from the command line, navigate to the PennAI directory run the command `docker-compose up`. To stop PennAI, kill the process with `ctrl+c` and wait for the server to shut down. It may take a few minutes to build the first time PennAI is run. +To start Aliro, from the command line, navigate to the Aliro directory and run the command `docker-compose up`. To stop Aliro, kill the process with `ctrl+c` and wait for the server to shut down. It may take a few minutes to build the first time Aliro is run. -To reset the datasets and experiments in the server, start PennAI with the command `docker-compose up --force-recreate` or run the command `docker-compose down` after the server has stopped. 
+To reset the datasets and experiments in the server, start Aliro with the command `docker-compose up --force-recreate` or run the command `docker-compose down` after the server has stopped. ### User Interface -Once the webserver is up, connect to to access the website. You should see the **Datasets** page. If it is your first time starting PennAI, there should be a message instructing one to add new datasets. +Once the webserver is up, connect to it to access the website. You should see the **Datasets** page. If it is your first time starting Aliro, there should be a message instructing one to add new datasets. ### Adding Datasets One can add new datasets using a UI form within the website or manually add new datasets to the data directory. Datasets have the following restrictions: @@ -37,14 +37,14 @@ One can add new datasets using a UI form within the website or manually add new * Only the label column or categorical or ordinal features can contain string values. * Files must be smaller than 8MB -Some example datasets can be found in the classification section of the [Penn Machine Learning Benchmarks](https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification) github repository. +Some example datasets can be found in the classification section of the [Penn Machine Learning Benchmarks](https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets) github repository. #### Uploading Using the Website #### To upload new datasets from the website, click the "Add new Datasets" button on the Datasets page to navigate to the upload form. Select a file using the form's file browser and enter the corresponding information about the dataset: the name of the dependent column, a JSON of key/value pairs of ordinal features, for example ```{"ord" : ["first", "second", "third"]}```, and a comma separated list of categorical column names without quotes, such as `cat1, cat2`. 
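The upload restrictions above can be checked locally before uploading. Below is a rough pre-flight sketch in Python; it is not Aliro's actual validation code, and it assumes a comma-separated file with a header row, covering only the file-size and string-value rules quoted above:

```python
import csv
import os

MAX_BYTES = 8 * 1024 * 1024  # datasets must be smaller than 8MB


def check_dataset(path, label_column, string_columns=()):
    """Return a list of problems found; an empty list means the basic checks passed.

    string_columns lists categorical/ordinal columns allowed to hold strings;
    the label column may always hold strings.
    """
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is not smaller than 8MB")
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        fields = reader.fieldnames or []
        if label_column not in fields:
            problems.append("missing label column: " + label_column)
        allowed = set(string_columns) | {label_column}
        for row in reader:
            for col, val in row.items():
                if col not in allowed:
                    try:
                        float(val)
                    except (TypeError, ValueError):
                        problems.append(f"string value {val!r} in numeric column {col!r}")
                        return problems
    return problems
```

A dataset that passes these checks can still be rejected by the server, which performs its own validation on upload.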
 Once uploaded, the dataset should be available to use within the system.
 
 #### Adding Initial Datasets to the Data Directory ####
-Labeled datasets can also be loaded when PennAI starts by adding them to the `data/datasets/user` directory. PennAI must be restarted if new datasets are added while it is running. If errors are encountered when validating a dataset, they will appear in a log file in `target/logs/loadInitialDatasets.log` and that dataset will not be uploaded. Data can be placed in subfolders in this directory.
+Labeled datasets can also be loaded when Aliro starts by adding them to the `data/datasets/user` directory. Aliro must be restarted if new datasets are added while it is running. If errors are encountered when validating a dataset, they will appear in a log file in `target/logs/loadInitialDatasets.log` and that dataset will not be uploaded. Data can be placed in subfolders in this directory.
@@ -71,4 +71,4 @@ From the **Datasets** page, click 'completed experiments' to navigate to the **E
 ### Downloading and Using Models ###
 A pickled version of the fitted model and an example script for using that model can be downloaded for any completed experiment from the **Experiments** page.
-Please see the [jupiter notebook script demo](https://github.com/EpistasisLab/pennai/blob/production/docs/PennAI_Demo/Demo_of_using_exported_scripts_from_PennAI.ipynb) for instructions on using the scripts and model exported from PennAI to reproduce the findings on the results page and classify new datasets.
+Please see the [Jupyter notebook script demo](https://github.com/EpistasisLab/Aliro/blob/production/docs/PennAI_Demo/Demo_of_using_exported_scripts_from_PennAI.ipynb) for instructions on using the scripts and model exported from Aliro to reproduce the findings on the results page and classify new datasets.
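Since the exported model is a standard Python pickle, reloading it for prediction typically looks like the sketch below. The file name, the stand-in estimator class, and the feature layout are assumptions for illustration; the linked notebook demo shows the exact exported workflow.

```python
import pickle

# Stand-in for the fitted estimator the export contains; any estimator
# exposing .predict() (e.g. a scikit-learn model) is used the same way.
class StubModel:
    def predict(self, X):
        return [1 if sum(row) > 1.0 else 0 for row in X]

# Writing the pickle here only makes the sketch self-contained; in
# practice 'model.pkl' is the file downloaded from the Experiments page.
with open("model.pkl", "wb") as f:
    pickle.dump(StubModel(), f)

with open("model.pkl", "rb") as f:
    model = pickle.load(f)

print(model.predict([[0.2, 0.3], [0.9, 0.8]]))
# [0, 1]
```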
diff --git a/lab/webapp/src/components/App/Navbar/index.jsx b/lab/webapp/src/components/App/Navbar/index.jsx
index b1b03f399..677c9dbea 100644
--- a/lab/webapp/src/components/App/Navbar/index.jsx
+++ b/lab/webapp/src/components/App/Navbar/index.jsx
@@ -49,7 +49,7 @@ function Navbar({ preferences }) {
   return (
-
+
diff --git a/lab/webapp/webpack.config.js b/lab/webapp/webpack.config.js
index 4b4c2c74e..f1fe19e07 100644
--- a/lab/webapp/webpack.config.js
+++ b/lab/webapp/webpack.config.js
@@ -77,7 +77,7 @@ var config = {
       inject: false,
       template: require('html-webpack-template'),
-      title: 'PennAI Launchpad',
+      title: 'Aliro Launchpad',
       headHtmlSnippet: `
       `,
diff --git a/machine/learn/README.md b/machine/learn/README.md
index 0204d9f5e..194371f86 100644
--- a/machine/learn/README.md
+++ b/machine/learn/README.md
@@ -1,8 +1,8 @@
 # Learn Overview
-The scripts in this folder will build and evaluate model based on experiment requests from PennAI API.
+The scripts in this folder will build and evaluate models based on experiment requests from the Aliro API.
 
-To see options for machine learning algorithms defined in `projects.json` of PennAI, attach machine docker container (e.g. `docker exec -it "pennai_machine_1" /bin/bash`) and type `python learn/driver.py -h` to see options for machine learning algorithms in PennAI.
+To see the options for the machine learning algorithms defined in `projects.json` of Aliro, attach to the machine docker container (e.g. `docker exec -it "pennai_machine_1" /bin/bash`) and type `python learn/driver.py -h`.
 
 To check options for a specific algorithm from [scikit learn](https://scikit-learn.org/stable/), e.g [DecisionTreeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html), type `python learn/driver.py DecisionTreeClassifier -h`.
diff --git a/tests/integration/int_test_runner.sh b/tests/integration/int_test_runner.sh
index e30290a08..b482c7fc3 100644
--- a/tests/integration/int_test_runner.sh
+++ b/tests/integration/int_test_runner.sh
@@ -1,7 +1,7 @@
-#!/bin/bash
-echo "starting tests..."
-npm test
-
-echo "cleanup"
-rm -rf '/appsrc/ai/__pycache__/*'
-rm -rf '/appsrc/ai/metalearning/__pycache__/*'
+#!/bin/bash
+echo "starting tests..."
+npm test
+
+echo "cleanup"
+rm -rf '/appsrc/ai/__pycache__/*'
+rm -rf '/appsrc/ai/metalearning/__pycache__/*'
diff --git a/tests/integration/wait_pennai.sh b/tests/integration/wait_pennai.sh
index 45a8ae44c..76b30a7b5 100644
--- a/tests/integration/wait_pennai.sh
+++ b/tests/integration/wait_pennai.sh
@@ -1,12 +1,12 @@
-#!/bin/bash
-
-echo "waiting for lab to be responsive..."
-/opt/wait-for-it.sh -t 300 lab:5080 -- echo "lab wait over"
-
-echo "waiting for machine to be responsive..."
-/opt/wait-for-it.sh -t 30 machine:5081 -- echo "machine wait over"
-
-
-# for now, hardcode some time for the datasets to get loaded
-echo "hardcoded sleep to load datasets..."
-sleep 10s
+#!/bin/bash
+
+echo "waiting for lab to be responsive..."
+/opt/wait-for-it.sh -t 300 lab:5080 -- echo "lab wait over"
+
+echo "waiting for machine to be responsive..."
+/opt/wait-for-it.sh -t 30 machine:5081 -- echo "machine wait over"
+
+
+# for now, hardcode some time for the datasets to get loaded
+echo "hardcoded sleep to load datasets..."
+sleep 10s
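The `wait_pennai.sh` script above polls the lab and machine ports with `wait-for-it.sh` but then falls back to a hardcoded `sleep 10s` for dataset loading. The same wait-for-port idea can be expressed as a small polling loop; this Python sketch is illustrative only (the throwaway local listener stands in for a service such as `lab:5080`, and the timeout values are arbitrary):

```python
import socket
import threading
import time

def wait_for_port(host, port, timeout=30.0, interval=0.2):
    """Poll until a TCP connection to host:port succeeds, or give up at timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Throwaway local listener standing in for the real service; it only
# starts accepting connections after a short delay, so the poll loop
# actually has to retry before succeeding.
server = socket.socket()
server.bind(("127.0.0.1", 0))
port = server.getsockname()[1]

def delayed_listen():
    time.sleep(0.5)
    server.listen()

threading.Thread(target=delayed_listen, daemon=True).start()
print(wait_for_port("127.0.0.1", port))
# True
```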