Adding a working Docker setup for developing sparkmagic (jupyter-incubator#361)

* Adding a working Docker setup for developing sparkmagic

It includes the Jupyter notebook as well as the Livy+Spark endpoint.
Documentation is in the README.

* Pre-configure the ~/.sparkmagic/config.json

Now you can just launch a PySpark wrapper kernel and have it work
out of the box.

* Add R to Livy container

Also added an R section to example_config.json so it works
out of the box. It's a good thing to have in the example config
anyway; otherwise, how would users ever know it was meant to be
there?

* Add more detail to the README container section

* Add dev_mode build-arg.

Disabled by default. When enabled, builds the container using your local
copy of sparkmagic, so that you can test your development changes inside
the container.

* Adding missing kernels

The setup was missing the Scala and Python 2 kernels. Confirmed
that Python 2 and Python 3 are indeed separate environments on the
spark container.
apetresc authored and aggFTW committed Jun 1, 2017
1 parent 5f5744a commit 610c916
Showing 5 changed files with 126 additions and 1 deletion.
31 changes: 31 additions & 0 deletions Dockerfile.jupyter
@@ -0,0 +1,31 @@
FROM jupyter/base-notebook:d0b2d159cc6c

ARG dev_mode=false

USER $NB_USER

# Install sparkmagic. If the dev_mode build arg is "true", install from the
# local copies in the build context; otherwise, install from PyPI.
COPY hdijupyterutils hdijupyterutils/
COPY autovizwidget autovizwidget/
COPY sparkmagic sparkmagic/
RUN if [ "$dev_mode" = "true" ]; then \
cd hdijupyterutils && pip install . && cd ../ && \
cd autovizwidget && pip install . && cd ../ && \
cd sparkmagic && pip install . && cd ../ ; \
else pip install sparkmagic ; fi

RUN mkdir /home/$NB_USER/.sparkmagic
COPY sparkmagic/example_config.json /home/$NB_USER/.sparkmagic/config.json
RUN sed -i 's/localhost/spark/g' /home/$NB_USER/.sparkmagic/config.json
RUN jupyter nbextension enable --py --sys-prefix widgetsnbextension
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkkernel
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/pysparkkernel
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/pyspark3kernel
RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkrkernel
RUN jupyter serverextension enable --py sparkmagic

USER root
RUN chown $NB_USER /home/$NB_USER/.sparkmagic/config.json
RUN rm -rf hdijupyterutils/ autovizwidget/ sparkmagic/
USER $NB_USER
35 changes: 35 additions & 0 deletions Dockerfile.spark
@@ -0,0 +1,35 @@
FROM gettyimages/spark:2.1.0-hadoop-2.7

RUN apt-get update && apt-get install -yq --no-install-recommends --force-yes \
        git \
        openjdk-7-jdk \
        maven \
        python2.7 \
        python3.4 \
        r-base \
        r-base-core && \
    rm -rf /var/lib/apt/lists/*

ENV LIVY_BUILD_VERSION livy-server-0.3.0
ENV LIVY_APP_PATH /apps/$LIVY_BUILD_VERSION
ENV LIVY_BUILD_PATH /apps/build/livy
ENV PYSPARK_PYTHON python2.7
ENV PYSPARK3_PYTHON python3.4

RUN mkdir -p /apps/build && \
    cd /apps/build && \
    git clone https://github.com/cloudera/livy.git && \
    cd $LIVY_BUILD_PATH && \
    git checkout v0.3.0 && \
    mvn -DskipTests -Dspark.version=$SPARK_VERSION clean package && \
    ls -al $LIVY_BUILD_PATH && ls -al $LIVY_BUILD_PATH/assembly && ls -al $LIVY_BUILD_PATH/assembly/target && \
    unzip $LIVY_BUILD_PATH/assembly/target/$LIVY_BUILD_VERSION.zip -d /apps && \
    rm -rf $LIVY_BUILD_PATH && \
    mkdir -p $LIVY_APP_PATH/upload && \
    mkdir -p $LIVY_APP_PATH/logs


EXPOSE 8998

CMD ["/apps/livy-server-0.3.0/bin/livy-server"]

35 changes: 34 additions & 1 deletion README.md
@@ -54,6 +54,39 @@ See [Pyspark](examples/Pyspark Kernel.ipynb) and [Spark](examples/Spark Kernel.i

    jupyter serverextension enable --py sparkmagic

## Docker

The included `docker-compose.yml` file will let you spin up a full
sparkmagic stack that includes a Jupyter notebook with the appropriate
extensions installed, and a Livy server backed by a local-mode Spark instance.
(This setup is just for testing and developing sparkmagic itself; in
practice, sparkmagic is not very useful if your Spark instance is on the
same machine!)

In order to use it, make sure you have both [Docker](https://docker.com)
and [Docker Compose](https://docs.docker.com/compose/) installed, and then
run:

    docker-compose build
    docker-compose up

You will then be able to access the Jupyter notebook in your browser at
http://localhost:8888. Inside this notebook, you can configure a
sparkmagic endpoint at http://spark:8998. This endpoint is able to
launch both Scala and Python sessions. You can also choose to start a
wrapper kernel for Scala, Python, or R from the list of kernels.
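
For example, from a plain Python notebook you can create a session against
that endpoint with the sparkmagic magics. This is a minimal sketch, assuming
the standard `%spark add` flags for session name (`-s`), language (`-l`),
and endpoint URL (`-u`); the session name `test` is arbitrary:

    %load_ext sparkmagic.magics
    %spark add -s test -l python -u http://spark:8998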

To shut down the containers, you can interrupt `docker-compose` with
`Ctrl-C`, and optionally remove the containers with `docker-compose down`.

If you are developing sparkmagic and want to test out your changes in
the Docker container without needing to push a version to PyPI, you can
set the `dev_mode` build arg in `docker-compose.yml` to `true`, and then
re-build the container. This will cause the container to install your
local version of autovizwidget, hdijupyterutils, and sparkmagic. Make
sure to re-run `docker-compose build` before each test run.
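
For reference, the only change needed is the `dev_mode` build arg on the
`jupyter` service in `docker-compose.yml` (fragment shown below; the rest
of the file is unchanged):

    jupyter:
      image: jupyter/sparkmagic
      build:
        context: .
        dockerfile: Dockerfile.jupyter
        args:
          dev_mode: "true"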

### Server extension API

#### `/reconnectsparkmagic`:
@@ -125,4 +158,4 @@ To run unit tests, run:

    nosetests hdijupyterutils autovizwidget sparkmagic

If you want to see an enhancement made but don't have time to work on it yourself, feel free to submit an [issue](https://github.com/jupyter-incubator/sparkmagic/issues) for us to deal with.
21 changes: 21 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,21 @@
version: "3"
services:
spark:
image: jupyter/sparkmagic-livy
build:
context: .
dockerfile: Dockerfile.spark
hostname: spark
ports:
- "8998:8998"
jupyter:
image: jupyter/sparkmagic
build:
context: .
dockerfile: Dockerfile.jupyter
args:
dev_mode: "false"
links:
- spark
ports:
- "8888:8888"
5 changes: 5 additions & 0 deletions sparkmagic/example_config.json
@@ -10,6 +10,11 @@
"password": "",
"url": "http://localhost:8998"
},
"kernel_r_credentials": {
"username": "",
"password": "",
"url": "http://localhost:8998"
},

"logging_config": {
"version": 1,
