forked from feathr-ai/feathr
Add a Sandbox for Feathr (feathr-ai#966)
* Update registry-access-control.md * Update README.md * add logo * Update README.md * Add docs for how to create bacpac file * update dockerfile * update * Update local_quickstart_nyc_taxi_demo.ipynb * Update FeathrSandbox.Dockerfile * add SQLIte connection * Update local_quickstart_nyc_taxi_demo.ipynb * update local registry * update registry * update * add dockerfile * Change to ORM * Update db_registry.py * update registry * delete unused files * don't change the existing registry code * update * Update main.py * update configs * make jupyter runnable * add readme * Update start.sh * Revert "Add docs for how to create bacpac file" This reverts commit 2837926. * delete unused files * Update local_quickstart_nyc_taxi_demo.ipynb * Update local_quickstart_nyc_taxi_demo.ipynb * Fix redis issues * Update client.py * Update _env_config_reader.py * add docs * Update quickstart_local_sandbox.md * Update quickstart_local_sandbox.md * Update quickstart_local_sandbox.md * Update quickstart_local_sandbox.md * merge ORM based sql registry to sql registry * fix typo * improve usability * Update FeathrSandbox.Dockerfile * Update FeathrSandbox.Dockerfile * Update start_local.sh * Update FeathrSandbox.Dockerfile * update instructions * Add code server * Remove unused dockerfile * disable code server * update samples * Update feathr_init_script.py * update notebook * Update FeathrSandbox.Dockerfile * Update local_quickstart_notebook.ipynb * Update _feathr_registry_client.py * Update setup.py * remove numpy * Update quickstart_local_sandbox.md * Update quickstart_local_sandbox.md * Add search function in sandbox * Update db_registry_orm.py * Update db_registry_orm.py * Update db_registry_orm.py * fix search issue * udpate * Update FeathrSandbox.Dockerfile * update * Update feathr_init_script.py * merge ORM based registry * Merge * Update main.py * Delete db_registry_orm.py * update dependencies * Update .prettierrc * update docs * Update database.py * Update database.py * Update 
database.py * Add CI docker push * Optimize image size * Update local_quickstart_notebook.ipynb * Update start_local.sh * update based on comments
1 parent ae752c5, commit 290ceb3
Showing 25 changed files with 2,004 additions and 205 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
# TODO: persist the SQLite file in the volumes | ||
|
||
# Stage 1: build frontend ui | ||
FROM node:16-alpine as ui-build | ||
WORKDIR /usr/src/ui | ||
COPY ./ui . | ||
|
||
## Use api endpoint from same host and build production static bundle | ||
RUN echo 'REACT_APP_API_ENDPOINT=http://localhost:8000' >> .env.production | ||
RUN npm install && npm run build | ||
|
||
|
||
FROM jupyter/pyspark-notebook | ||
|
||
USER root | ||
|
||
## Install dependencies | ||
RUN apt-get update -y && apt-get install -y nginx freetds-dev sqlite3 libsqlite3-dev lsb-release redis gnupg redis-server lsof | ||
|
||
# UI Section | ||
## Remove default nginx index page and copy ui static bundle files | ||
RUN rm -rf /usr/share/nginx/html/* | ||
COPY --from=ui-build /usr/src/ui/build /usr/share/nginx/html | ||
COPY ./deploy/nginx.conf /etc/nginx/nginx.conf | ||
|
||
|
||
# Feathr Package Installation Section | ||
# always install feathr from main | ||
WORKDIR /home/jovyan/work | ||
COPY --chown=1000:100 ./feathr_project ./feathr_project | ||
RUN python -m pip install -e ./feathr_project | ||
|
||
|
||
# Registry Section | ||
# install registry | ||
COPY ./registry /usr/src/registry | ||
WORKDIR /usr/src/registry/sql-registry | ||
RUN pip install -r requirements.txt | ||
|
||
|
||
|
||
## Start service and then start nginx | ||
WORKDIR /usr/src/registry | ||
COPY ./feathr-sandbox/start_local.sh /usr/src/registry/ | ||
|
||
# install code server | ||
# RUN curl -fsSL https://code-server.dev/install.sh | sh | ||
|
||
# default dir by the jupyter image | ||
WORKDIR /home/jovyan/work | ||
USER jovyan | ||
# copy as the jovyan user | ||
# UID is like this: uid=1000(jovyan) gid=100(users) groups=100(users) | ||
COPY --chown=1000:100 ./docs/samples/local_quickstart_notebook.ipynb . | ||
COPY --chown=1000:100 ./feathr-sandbox/feathr_init_script.py . | ||
|
||
# Run the script so that maven cache can be added for better experience. Otherwise users might have to wait for some time for the maven cache to be ready. | ||
RUN python feathr_init_script.py | ||
RUN python -m pip install interpret | ||
|
||
USER root | ||
WORKDIR /usr/src/registry | ||
RUN ["chmod", "+x", "/usr/src/registry/start_local.sh"] | ||
|
||
# remove ^M chars in Linux to make sure the script can run | ||
RUN sed -i "s/\r//g" /usr/src/registry/start_local.sh | ||
|
||
|
||
# install a Kafka single node instance | ||
# Reference: https://www.looklinux.com/how-to-install-apache-kafka-single-node-on-ubuntu/ | ||
RUN wget https://downloads.apache.org/kafka/3.3.1/kafka_2.12-3.3.1.tgz && tar xzf kafka_2.12-3.3.1.tgz && mv kafka_2.12-3.3.1 /usr/local/kafka && rm kafka_2.12-3.3.1.tgz | ||
|
||
# /usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties | ||
# /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties | ||
|
||
WORKDIR /home/jovyan/work | ||
|
||
|
||
# 80: Feathr UI | ||
# 8000: Feathr REST API | ||
# 8888: Jupyter | ||
# 8080: VsCode | ||
# 7080: Interpret | ||
EXPOSE 80 8000 8080 8888 7080 2181 | ||
# run the service so we can initialize | ||
# RUN ["/bin/bash", "/usr/src/registry/start.sh"] | ||
CMD ["/bin/bash", "/usr/src/registry/start_local.sh"] |
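The port comments and `EXPOSE` line at the end of the Dockerfile above suggest a simple readiness check once the container is running. The following is a minimal sketch, not part of the commit: the helper names `port_is_open` and `check_sandbox_ports` are hypothetical, and the port-to-service mapping is copied from the Dockerfile comments (port 2181 is exposed but unlabeled there, so it is left out).

```python
import socket
from typing import Dict

# Port-to-service mapping copied from the comments in FeathrSandbox.Dockerfile.
SANDBOX_PORTS: Dict[int, str] = {
    80: "Feathr UI",
    8000: "Feathr REST API",
    8888: "Jupyter",
    8080: "VsCode",
    7080: "Interpret",
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_sandbox_ports(host: str = "127.0.0.1", timeout: float = 1.0) -> Dict[str, bool]:
    """Probe every documented sandbox port and report which ones accept connections."""
    return {
        f"{port} ({name})": port_is_open(host, port, timeout)
        for port, name in SANDBOX_PORTS.items()
    }
```

Calling `check_sandbox_ports()` on the host after `docker run` returns a dictionary of port labels to booleans. Since the code-server install is commented out in the Dockerfile, port 8080 will probably report as closed, and 7080 may only open once an `interpret` dashboard is actually shown.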
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
--- | ||
layout: default | ||
title: Quick Start Guide with Local Sandbox | ||
--- | ||
|
||
# Feathr Quick Start Guide with Local Sandbox | ||
|
||
We provide a local sandbox so users can use Feathr easily. The goal of the Feathr Sandbox is to: | ||
|
||
- make it easier for users to get started, | ||
- make it easy to validate feature definitions and new ideas, | ||
- make it easier for Feathr developers to set up an environment and develop new things, | ||
- provide an interactive experience: running a job usually takes less than 1 minute. | ||
|
||
As an end user, you can become productive and try out Feathr in less than 5 minutes. | ||
|
||
The Sandbox is ideal for: | ||
|
||
- Feathr users who want to get started quickly | ||
- Feathr developers who want to test new features, since this Docker image should contain everything they need. It comes with the Python package installed in editable mode so developers can iterate easily. | ||
|
||
## Getting Started | ||
|
||
To get started, simply run the command below. Note that the image is around 5 GB, so it might take a while to pull it from Docker Hub. | ||
|
||
```bash | ||
# 80: Feathr UI 8000: Feathr API 8888: Jupyter 8080: VsCode 7080: Interpret | ||
docker run -it --rm -p 8888:8888 -p 8000:8000 -p 80:80 -p 8080:8080 -p 7080:7080 --env CONNECTION_STR="Server=" --env API_BASE="api/v1" --env FEATHR_SANDBOX=True -e GRANT_SUDO=yes feathrfeaturestore/feathr-sandbox | ||
``` | ||
|
||
Once the container starts, Jupyter should be available at `http://127.0.0.1:8888/`. Double-click the notebook file to open it, and you should see the Feathr sample notebook. Click the triangle button in the Jupyter notebook and the whole notebook will run locally. | ||
|
||
The default Jupyter notebook is here: | ||
```bash | ||
http://localhost:8888/lab/workspaces/auto-w/tree/local_quickstart_notebook.ipynb | ||
``` | ||
|
||
![Feathr Notebook](./images/feathr-sandbox.png) | ||
|
||
|
||
After running the notebook, all the features will be registered in the UI, and you can visit the Feathr UI at: | ||
|
||
```bash | ||
http://localhost:80 | ||
``` | ||
|
||
|
||
After executing those scripts, you should be able to see a project called `local_spark` in the Feathr UI. You can also view lineage in the Feathr UI and explore all the details. | ||
![Feathr UI](./images/feathr-sandbox-ui.png) | ||
|
||
![Feathr UI](./images/feathr-sandbox-lineage.png) | ||
|
||
## Components | ||
|
||
The Feathr sandbox comes with: | ||
- Built-in Jupyter Notebook | ||
- Pre-installed data science packages such as `interpret` so that data science development becomes easy | ||
- Pre-installed Feathr package | ||
- A local Spark environment for dev/test purposes | ||
- Feathr samples that can run locally | ||
- A local Feathr registry backed by SQLite | ||
- Feathr UI | ||
- Feathr Registry API | ||
- Local Redis server | ||
|
||
|
||
## Build Docker Container | ||
|
||
If you want to build the Feathr sandbox yourself, run the command below in the Feathr root directory: | ||
|
||
```bash | ||
docker build -f FeathrSandbox.Dockerfile -t feathrfeaturestore/feathr-sandbox . | ||
``` | ||
|
||
|
||
## For Feathr Developers | ||
The Feathr package is copied to the user folder and installed with the `pip install -e` option, which means you can do interactive development in the Python package. For example, if you want to validate changes, instead of setting up the environment, you can simply go to the | ||
|
||
|
||
Note that if you are using a Jupyter notebook to run the code, make sure you restart the Jupyter notebook so the kernel can reload the Feathr package. | ||
You should be able to see the | ||
|
||
![Feathr Dev Experience](./images/feathr-sandbox-dev-experience.png) | ||
|
||
In the future, a VS Code Server might be installed so that you can do interactive development in the Docker container. |
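The developer note above recommends restarting the notebook after editing the editable-installed package. As an alternative for quick experiments, the standard library's `importlib.reload` can refresh already-imported modules in a running kernel. The helper below is a hedged sketch, not part of the sandbox: `reload_package` is a hypothetical name, and the approach does not update objects created before the reload or names bound with `from ... import ...`, so restarting remains the more reliable option.

```python
import importlib
import sys
from typing import List


def reload_package(package_name: str) -> List[str]:
    """Reload a package and all of its already-imported submodules.

    Submodules are reloaded deepest-first so that parent modules
    re-executing their imports pick up the refreshed children.
    Returns the names of the modules that were reloaded.
    """
    importlib.invalidate_caches()
    prefix = package_name + "."
    names = sorted(
        (n for n in sys.modules if n == package_name or n.startswith(prefix)),
        key=lambda n: n.count("."),
        reverse=True,
    )
    reloaded = []
    for name in names:
        module = sys.modules.get(name)
        if module is None:
            # sys.modules may hold None placeholders; skip them.
            continue
        importlib.reload(module)
        reloaded.append(name)
    return reloaded
```

In a notebook cell this could be used as `reload_package("feathr")` after saving a change under `./feathr_project`, assuming the package has already been imported in that kernel.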