Demo of running an OU maintained Docker container as the computational environment.
Note that the workspace can be slow to build (watch the logs!) and once the JupyterLab environment has loaded, it may take a little while for the kernel to become available for the first loaded notebook. (Try restarting it after a while, or stopping and closing the notebook, then opning it again, if the kernel does not seem to be responsive).
I've also noticed JupyterLab somtimes gets a bit stuck; hard refreshing the browser with devtools open to clear the cache generally fixes it...
Try it here: https://github.com/codespaces/new/ouseful-testing/codespaces-jupyter-tm351?editor=jupyter
The demo is interesting for several reasons:
-
a custom container can be provided that defines a complex computational environment for use in the Codespace. In the current example, the container is a container used in the Open University module TM351 Data Management and Analysis. This environment includes:
- a PostgreSQL server with preconfigured user accounts;
- a MongoDB server seeded with a several data collections;
- we can deliver a complex computational environment to students that we can update as required by updating the original Docker image
-
Codespaces provide a generous amount of free hosted compute hours per month, meaning an install free user experience;
- students can use the Codespaces environment for free without any hosting burden on, or costs to, the OU
-
the editing environment is separate from the containerised computational environment. Codespaces can be used to provide install free, customisable, browser based VS Code and JupyterLab environments:
- the VS Code editor can be customised by installing additional extensions via the
devcontainer.json
file; - the JupyterLab editing environment can be customised by installing JuptyerLab extensions via the
requirements.txt
file; - we can separate concerns of delivering a customised computational environment (via the Docker image) and a customised user editing environment (via
.devcontainer
config files in the repo)
- the VS Code editor can be customised by installing additional extensions via the
-
file edits can be persisted in a Github repository using the VS Code and JupyterLab git extensions. (Modified files are persisted in the container and can also be published to a new branch);
- provides a natural rationale for getting students into the habit of using version control / git
-
the JupyterLab environment does supports a local file access extension but this doesn't appear to work in this context at the moment...
-
language pack extensions can be added to the JupyterLab environment; the current environment includes French and Chinese language packs, along with English (the default); change language from the JupyterLab
Settings > Language
menu option;- we can support localised language packs for foreign students, or locally customised labels as required; users can install additional language packs themselves
The full TM351 environment includes a proxied OpenRefine server, although this seems to knock the Codespace container over. I think this is because OpenRefine is a resource heavy Java application that seems happiest with at least 4GB of memory available.
We can specify the default Codespaces editor to be JuptyerLab from the Github user settings page:
Create a repository from the Use this template button:
To access the environment:
- create a Github account;
- click on Use this template;
- create a private repository based on the template;
- open a new Codespace from your own repository (see above for how to ensure that JupyterLab is the default launch UI).
Note that it may take several minutes for the container to be built the first time you use it or if you delete the workspace and create a new one. If you stop a workspace and restart it at a later time, the set-up should be quicker.
When the Codespace container is built, the JupyterLab UI should be automatically opened in your browser with access to that environment.
If you get something like this:
then hard refresh the browser page.
The intial display may appear a bit broken... Try clicking things to reset the view...
A wide variety of JupyterLab extensions are preinstalled in the environment, including a branding pack and cell execution status indicators.
Empinken styling is supported, and a growing number of MyST styling features are supported by the jupyerlab-myst
extension.
The pre-installed jupyterlab-git
extension allows you to commit and push changes back to the code repository.
The extension also includes a differ, so you can inspect changes you have made to a file, but not committed:
To upload notebooks, students can download a (controlled) release from the VLE, unzip them, and then drag and drop the notebook directory onto their Github repo page to upload them to the repo. If their Codespace already exists, they should be able to use the git tools to synch the uploaded files into their workspace.
When using the devcontainer locally by opening the cloned repo directory using VS Code, it seems that the dev containr does not install its own JupyterLab server? However, there is the JupyterLab environment that we built into the original TM351 container, and it seems we can run that by issuing the jupytr lab
command from the terminal inside a VS Code environment attached to the container:
If we choose to open the Codespace in the browser (which is to say, VS Code in the browser), or in VS Code (which is to say, VS Code running locally), or if we clone the repo and open the directory in a local version of VS Code with the Dev Containers extension installed, we can run the code cells by selecting the /usr/bin/python3
environment:
To customise the JupyterLab environment published by the devcontainer, we install the required JupyterLab extensions from the requirements.txt
file included in this repo: "updateContentCommand": "python3 -m pip install -r requirements.txt"
The container used in the demo mmh352/tm351:23j.0b8
is a standalone container that includea all the required services packages and applications.
The container features first-run as well as on-start procedures:
- first run: the first run process includes a step to copy the seeded database db files, as shipped within the original container, to a user mounted location in order the persist any changes that are made to the database to a persistent location outside the container (either the user's desktop for the local VCE, or the persistent user storage area in the OU hosted VCE).
- start procedure: the start procedure performs various seeding checks and calls first run routines as required, and then starts the PostgreSQL and MongoDB services.
In the devcontainer.json
, we replicate the original container start procedure in the following way:
- first run:
"postCreateCommand": "sudo /var/startup/start_jh_extras"
- on start:
"postStartCommand": "sudo service postgresql restart && sudo mongod --fork --logpath /dev/stdout --dbpath ${MONGO_DB_PATH}"
We hold the launch of the UI until the environment has been properly seeded and the postStartCommand
has started the database services.
Updates to the PostgreSQL and MongoDB databases are not persisted outsude the container.