The `jupyter` package provides a type-safe, high-level interface for interacting with Jupyter kernels and clients using the Jupyter messaging protocol, without having to deal with the low-level details of network communication and message encoding and decoding. Specifically, the package provides a quick way of writing Jupyter kernels and Jupyter clients in Haskell.
If you are looking for a Haskell kernel (for evaluating Haskell in the Jupyter notebook or another frontend), take a look at IHaskell.
The Jupyter project is a set of tools and applications for working with interactive coding environments. Jupyter provides the architecture and frontends (also called clients), and delegates the language-specific details to external programs called Jupyter kernels.
Jupyter kernels are interpreters for a language which communicate with Jupyter clients using the messaging protocol. The messaging protocol consists primarily of a request-reply pattern, in which the kernel acts as a server that responds to client requests. Clients can send requests to the kernel for executing code, generating autocompletion suggestions, looking up documentation or metadata about symbols, searching through execution history, and more.
Jupyter clients (also known as frontends) are programs that connect to kernels using the messaging protocol, typically interacting with the kernels via a request-reply pattern. Clients can do whatever they want – currently, frontends exist for notebook interfaces, a graphical terminal, a console interface, Vim plugins for evaluating code, Emacs modes, and probably more.
The most commonly used and complex frontend is likely the Jupyter notebook, which allows you to create notebook documents with complex interactive visualizations, Markdown documentation, code execution and autocompletion, and a variety of other IDE-like features. You can try using the notebook with an online demo notebook.
The following screenshots are examples of the Jupyter notebook, taken from their website:
`jupyter` can be installed similarly to most Haskell packages, either via `stack` or `cabal`:
# Stack (recommended)
stack install jupyter
# Cabal (not recommended)
cabal install jupyter
`jupyter` depends on the `zeromq4-haskell` package, which requires ZeroMQ to be installed. Depending on your platform, this may require compiling ZeroMQ from source. If you use a Mac, it is recommended that you install Homebrew (if you have not installed it already) and use it to install ZeroMQ.
Installation Commands:
- Mac OS X (Homebrew, recommended):
brew install zeromq
- Mac OS X (MacPorts):
sudo port install zmq
- Ubuntu:
sudo apt-get install libzmq3-dev
- Other: Install ZeroMQ from source:
git clone https://github.com/zeromq/zeromq4-x.git libzmq
cd libzmq
./autogen.sh && ./configure && make
sudo make install
sudo ldconfig
The Jupyter messaging protocol is centered around a request-reply pattern: clients send requests to kernels, and kernels answer every request with a reply. Although that is the primary communication pattern, several messaging patterns occur between clients and kernels:
- Request-Reply: Clients send requests to kernels, and kernels reply with precisely one response to every request.
- Outputs: Kernels broadcast outputs to all connected clients. (A client request can have only one reply, as mentioned previously, but can also trigger any number of outputs being sent separately.)
- Kernel Requests: Kernels can send requests to individual clients; this is currently used only for querying for standard input, such as when the Python kernel calls `raw_input()` or the Haskell kernel calls `getLine`. (Requests for input are sent from the kernel to the client, and clients get the input from the user and send it to the kernel.)
- Comms: For ease of extension and plugin development, the messaging protocol supports `comm` channels; a `comm` is a one-directional communication channel that can be opened on either the kernel or client side, and arbitrary data may be sent by either the client to the kernel or the kernel to the client. This can be used for doing things that the messaging spec does not explicitly support, such as the ipywidget support in the notebook.
The `jupyter` package encodes these messaging patterns in the type system. Each type of message corresponds to its own data type, and clients and kernels are created by supplying appropriate handlers for all messages that they can receive. For example, client requests correspond to the `ClientRequest` data type:
-- | A request sent by a client to a kernel.
data ClientRequest
  = ExecuteRequest CodeBlock ExecuteOptions
  | InspectRequest CodeBlock CodeOffset
  | CompleteRequest CodeBlock CodeOffset
  | ...

-- | A code cell
newtype CodeBlock = CodeBlock Text

-- | A cursor offset in the code cell
newtype CodeOffset = CodeOffset Int
Each of the `ClientRequest` constructors has a corresponding `KernelReply` constructor:
-- | A reply sent by a kernel to a client.
data KernelReply
  = ExecuteReply ExecuteResult
  | InspectReply InspectResult
  | CompleteReply CompleteResult
  | ...
Kernels may send `KernelOutput` messages to publish outputs to the clients:
-- | Outputs sent by kernels to all connected clients
data KernelOutput
  = StreamOutput Stream Text
  | DisplayDataOutput DisplayData
  | ...

-- | Which stream to write to
data Stream = StreamStdout | StreamStderr

-- | Display data mime bundle, with one value at most per mimetype.
newtype DisplayData = DisplayData (Map MimeType Text)

data MimeType
  = MimePlainText
  | MimeHtml
  | MimePng
  | ...
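To illustrate the "one value at most per mimetype" property, here is a minimal sketch using simplified stand-in types. It assumes `String` in place of `Text` and models only three mimetypes; the helpers `mkDisplayData` and `lookupMime` are hypothetical names for this illustration and are not part of the package.

```haskell
import qualified Data.Map as Map

-- Stand-in for the package's MimeType; only three cases are modelled here.
data MimeType = MimePlainText | MimeHtml | MimePng
  deriving (Eq, Ord, Show)

-- Stand-in for DisplayData, using String instead of Text.
newtype DisplayData = DisplayData (Map.Map MimeType String)
  deriving (Eq, Show)

-- Build a bundle from (mimetype, value) pairs. Because the bundle is a Map,
-- a later pair with the same mimetype replaces the earlier one.
mkDisplayData :: [(MimeType, String)] -> DisplayData
mkDisplayData = DisplayData . Map.fromList

-- Look up the representation for one mimetype, if the bundle has one.
lookupMime :: MimeType -> DisplayData -> Maybe String
lookupMime mime (DisplayData bundle) = Map.lookup mime bundle

-- A bundle built from three pairs, two of which share a mimetype.
example :: DisplayData
example = mkDisplayData
  [ (MimePlainText, "3")
  , (MimeHtml, "<b>3</b>")
  , (MimePlainText, "three")  -- replaces the earlier plain-text value
  ]
```

A frontend would typically pick the richest mimetype it can render and fall back to plain text when nothing else is present in the bundle.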
The other message types are represented by the `KernelRequest` data type (requests from the kernel to a single client, e.g. for standard input), `ClientReply` (replies to `KernelRequest` messages), and `Comm` messages (arbitrary unstructured communication between frontends and kernels).
A kernel is an executable with two distinct but related functions: first, registering the kernel with Jupyter, and second, communicating via the messaging protocol with any connected clients.
The kernel must be able to register itself with the Jupyter client system on the user's machine, so that clients running on the machine know what kernels are available and how to invoke each one.
The Jupyter project provides a command-line tool (invoked via `jupyter kernelspec install`) which installs a kernel when provided with a directory known as a kernelspec. The `jupyter` package automates creating and populating this directory with all needed files and invoking `jupyter kernelspec install` via the `installKernel` function in `Jupyter.Install`:
installKernel :: InstallUser -- ^ Whether to install globally or just for this user.
              -> Kernelspec  -- ^ Record describing the kernel.
              -> IO InstallResult

-- | Install locally (with --user) or globally (without --user).
data InstallUser = InstallLocal | InstallGlobal

-- | Did the install succeed or fail?
data InstallResult = InstallSuccessful
                   | InstallFailed Text -- ^ Constructor with reason describing failure.
A kernel is described by a `Kernelspec`:
data Kernelspec =
  Kernelspec
    { kernelspecDisplayName :: Text
    , kernelspecLanguage :: Text
    , kernelspecCommand :: FilePath -> FilePath -> [String]
    , kernelspecJsFile :: Maybe FilePath
    , kernelspecLogoFile :: Maybe FilePath
    , kernelspecEnv :: Map Text Text
    }
The key required bits are the display name (what to call this kernel in UI elements), the language name (what to call this kernel in code and command-line interfaces), and the command (how the kernel can be invoked). The `kernelspecCommand` function is provided with the absolute canonical path to the currently running executable and to a connection file (see the next section for more on connection files), and must generate a command-line invocation.
For testing and demo purposes, `jupyter` provides a helper function `simpleKernelspec` to generate a default kernelspec just from the three fields described above:
simpleKernelspec :: Text -- ^ Display name
                 -> Text -- ^ Language name
                 -> (FilePath -> FilePath -> [String]) -- ^ Command generator
                 -> Kernelspec
Using `simpleKernelspec`, we can put together the smallest viable snippet for installing a kernelspec:
-- | Install a kernel called "Basic" for a language called "basic".
--
-- The kernel is started by calling this same executable with the command-line
-- argument "kernel" followed by a path to the connection file.
install :: IO InstallResult
install =
  let invocation exePath connectionFilePath = [exePath, "kernel", connectionFilePath]
      kernelspec = simpleKernelspec "Basic" "basic" invocation
  in installKernel InstallLocal kernelspec
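Since installation can fail, a caller will usually want to inspect the result rather than discard it. The following is a minimal sketch of doing so. It assumes a local stand-in for the result type (with `String` in place of `Text`), and `describeInstall` and `reportInstall` are hypothetical helpers, not functions provided by the package.

```haskell
import System.Exit (exitFailure)
import System.IO (hPutStrLn, stderr)

-- Local stand-in mirroring the InstallResult type described above
-- (String replaces Text so the sketch needs only base).
data InstallResult = InstallSuccessful | InstallFailed String
  deriving (Eq, Show)

-- Turn the result into a message for the user.
describeInstall :: InstallResult -> String
describeInstall InstallSuccessful = "Kernel installed."
describeInstall (InstallFailed reason) = "Kernel installation failed: " ++ reason

-- Print the message; on failure, write to stderr and exit with a
-- non-zero status so that scripts can detect the problem.
reportInstall :: InstallResult -> IO ()
reportInstall InstallSuccessful = putStrLn (describeInstall InstallSuccessful)
reportInstall failure = do
  hPutStrLn stderr (describeInstall failure)
  exitFailure
```

With the real types, the same shape would be applied to the value returned by `installKernel`.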
Once the kernel is registered, clients can start the kernel, passing it a connection file as a command-line parameter. A connection file contains a JSON-encoded kernel profile (`KernelProfile`), which specifies low-level details such as the IP address to serve on, the transport method, and the ports for the ZeroMQ sockets used for communication.

To decode the JSON-encoded profile, the `readProfile` utility is provided:
-- | Try to read a kernel profile from a file; return Nothing if parsing fails.
readProfile :: FilePath -> IO (Maybe KernelProfile)
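To make the role of the profile more concrete, here is a small sketch of how a transport, an IP address, and a port combine into a ZeroMQ-style endpoint string of the form `transport://address:port`. The `ProfileSketch` record and its field names are hypothetical and only model three of the sockets; the package's `KernelProfile` has its own definition, and the sketch assumes the TCP transport.

```haskell
-- Hypothetical, simplified view of a kernel profile. The real KernelProfile
-- in the package has its own field names and more fields.
data ProfileSketch = ProfileSketch
  { sketchTransport :: String  -- e.g. "tcp"
  , sketchIp        :: String  -- e.g. "127.0.0.1"
  , sketchShellPort :: Int     -- request-reply traffic
  , sketchIopubPort :: Int     -- broadcast outputs
  , sketchStdinPort :: Int     -- standard input requests
  }

-- Build the endpoint string for one port of the profile.
endpoint :: ProfileSketch -> Int -> String
endpoint profile port =
  sketchTransport profile ++ "://" ++ sketchIp profile ++ ":" ++ show port

-- Endpoints for the three sockets modelled in this sketch.
allEndpoints :: ProfileSketch -> [String]
allEndpoints profile =
  map (endpoint profile)
      [sketchShellPort profile, sketchIopubPort profile, sketchStdinPort profile]

-- Example profile with made-up port numbers.
exampleProfile :: ProfileSketch
exampleProfile = ProfileSketch "tcp" "127.0.0.1" 5555 5556 5557
```

The package handles socket setup itself, so kernel authors normally never build these strings by hand.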
Obtaining the `KernelProfile` enables you to call the main interface to the `Jupyter.Kernel` module:
serve :: KernelProfile        -- ^ Specifies how to communicate with clients
      -> CommHandler          -- ^ What to do when you receive a Comm message
      -> ClientRequestHandler -- ^ What to do when you receive a ClientRequest
      -> IO ()
The kernel behaviour is specified by the two message handlers, the `CommHandler` and the `ClientRequestHandler`. The `ClientRequestHandler` receives a `ClientRequest` and must generate a `KernelReply` to send to the client:
type ClientRequestHandler = KernelCallbacks -> ClientRequest -> IO KernelReply
The constructor of the output `KernelReply` must match the constructor of the `ClientRequest`.
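The matching rule can be sketched as a predicate over simplified stand-in types. The payloads below are invented for illustration and only three message kinds are modelled; `replyMatches` is a hypothetical helper, not part of the package.

```haskell
-- Simplified stand-ins for three request constructors (payloads are
-- plain Strings and Ints rather than the package's own types).
data ClientRequest
  = ExecuteRequest String
  | InspectRequest String Int
  | CompleteRequest String Int

-- Simplified stand-ins for the corresponding reply constructors.
data KernelReply
  = ExecuteReply String
  | InspectReply String
  | CompleteReply [String]

-- True when the reply constructor corresponds to the request constructor.
replyMatches :: ClientRequest -> KernelReply -> Bool
replyMatches ExecuteRequest{}  ExecuteReply{}  = True
replyMatches InspectRequest{}  InspectReply{}  = True
replyMatches CompleteRequest{} CompleteReply{} = True
replyMatches _                 _               = False
```

A handler written as one `case` over the request, returning the matching reply constructor in each branch, satisfies this rule by construction.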
Besides generating the `KernelReply`, the `ClientRequestHandler` may also send messages to the client using the publishing callbacks:
data KernelCallbacks = PublishCallbacks {
    sendKernelOutput :: KernelOutput -> IO (),
    sendComm :: Comm -> IO (),
    sendKernelRequest :: KernelRequest -> IO ClientReply
  }
For example, during code execution, the kernel will receive an `ExecuteRequest`, run the requested code, using `sendKernelOutput` to send `KernelOutput` messages to the client with intermediate and final outputs of the running code, and then generate an `ExecuteReply` that is returned once the code is done running.
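As a rough illustration of this flow, the sketch below "executes" a code block by echoing each of its lines as a stdout output and then returns a reply. All types are local stand-ins: the `Callbacks` record models only the output callback, the `Int` payload of `ExecuteReply` is invented, and `handleExecute`, `collectingCallbacks`, and `runExample` are hypothetical names.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Stand-ins for the output types (String replaces Text).
data Stream = StreamStdout | StreamStderr
  deriving (Eq, Show)

data KernelOutput = StreamOutput Stream String
  deriving (Eq, Show)

-- Stand-in reply; the Int payload (number of outputs published) is invented.
data KernelReply = ExecuteReply Int
  deriving (Eq, Show)

-- Only the output-publishing callback is modelled here.
newtype Callbacks = Callbacks { sendKernelOutput :: KernelOutput -> IO () }

-- "Execute" a code block: publish every line as a stdout output while
-- running, then return the reply once everything has been published.
handleExecute :: Callbacks -> String -> IO KernelReply
handleExecute callbacks code = do
  let codeLines = lines code
  mapM_ (sendKernelOutput callbacks . StreamOutput StreamStdout) codeLines
  return (ExecuteReply (length codeLines))

-- Callbacks that record outputs in an IORef, for trying the handler
-- without a real client attached.
collectingCallbacks :: IORef [KernelOutput] -> Callbacks
collectingCallbacks ref =
  Callbacks { sendKernelOutput = \output -> modifyIORef' ref (++ [output]) }

-- Run the handler on a two-line code block and return what was published.
runExample :: IO ([KernelOutput], KernelReply)
runExample = do
  ref <- newIORef []
  reply <- handleExecute (collectingCallbacks ref) "a\nb"
  outputs <- readIORef ref
  return (outputs, reply)
```

The important property is the ordering: outputs go out through the callback while the code runs, and the reply is produced only at the end.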
The `CommHandler` is similar, but is called when the kernel receives `Comm` messages instead of `ClientRequest` messages. Many kernels will not want to support any `Comm` messages, so a default handler `defaultCommHandler` is provided, which simply ignores all `Comm` messages.
Unlike the `CommHandler`, the `ClientRequestHandler` must generate a reply to every request; it does not have the option of returning no output. Since there are quite a few request types, a default implementation is provided as `defaultClientRequestHandler`, which responds to almost all messages with an empty response:
defaultClientRequestHandler :: KernelProfile -- ^ The profile this kernel is serving
                            -> KernelInfo    -- ^ Basic metadata about the kernel
                            -> ClientRequestHandler
The `KernelInfo` record required for the `defaultClientRequestHandler` has quite a bit of information in it, so for any production kernel you will want to fill out all the fields, but for demo purposes `jupyter` provides the utility `simpleKernelInfo`:
simpleKernelInfo :: Text -- ^ Name to give this kernel
                 -> KernelInfo
Putting this all together, the shortest code snippet which runs a valid (but useless) Jupyter kernel is as follows:
runKernel :: FilePath -> IO ()
runKernel profilePath = do
  Just profile <- readProfile profilePath
  serve profile defaultCommHandler $
    defaultClientRequestHandler profile $ simpleKernelInfo "Basic"
Recall from the section on registering the kernel that the kernel provides an invocation to Jupyter; in our example, our kernel is invoked as `$0 kernel $CONNECTION_FILE` (where `$0` is the path to the current executable). Our `runKernel` function from the previous section must receive the file path passed as the `$CONNECTION_FILE`, so these two must be combined in the same executable with a bit of flag parsing:
import Control.Monad (void)
import System.Environment (getArgs)

main :: IO ()
main = do
  args <- getArgs
  case args of
    -- If invoked with the 'install' argument, then generate and register the kernelspec
    ["install"] ->
      void $ installKernel InstallGlobal $
        simpleKernelspec "Basic" "basic" $ \exe connect -> [exe, "kernel", connect]

    -- If invoked with the 'kernel' argument, then serve the kernel
    ["kernel", profilePath] -> do
      Just profile <- readProfile profilePath
      serve profile defaultCommHandler $
        defaultClientRequestHandler profile $ simpleKernelInfo "Basic"

    _ -> putStrLn "Invalid arguments."
This example is available in the `examples/basic` subdirectory, and you can build and run it with `stack`:
stack build jupyter:kernel-basic
stack exec kernel-basic install
Once it is installed in this manner, you can run the kernel and connect to it from frontends; you can try it with `jupyter console --kernel basic`. The kernel does not do much, though!

In order to write a more useful kernel, we would need to supply a more useful client request handler; the client handler would need to parse the code being sent for execution, execute it, and publish any results of the execution to the frontends using the `sendKernelOutput` callback. An example kernel that implements a simple calculator and handles most message types is provided in the `examples/calculator` directory, and can be built and run similarly to the `basic` kernel (see above).
In the Jupyter messaging protocol, the kernel never needs to send requests to the frontend, with the exception of one instance: reading from standard input. Because knowing when standard input is read requires executing code (something only the kernel can do), only the kernel can initiate reading from standard input.
Since the kernel may be running as a subprocess of the frontend, or can even be running on a remote machine, the kernel must be able to somehow intercept reads from standard input and turn them into requests to the Jupyter frontend that requested the code execution. To facilitate this, the `KernelCallbacks` record provided to the `ClientRequestHandler` has a `sendKernelRequest` callback:
-- Send a request to the client and wait for a reply in a blocking manner.
sendKernelRequest :: KernelCallbacks -> KernelRequest -> IO ClientReply

-- Request from kernel to client for standard input.
data KernelRequest = InputRequest InputOptions
data InputOptions = InputOptions { inputPrompt :: Text, inputPassword :: Bool }

-- Response to an InputRequest.
data ClientReply = InputReply Text
The `KernelRequest` and `ClientReply` data types are meant to mirror the more widely used `ClientRequest` and `KernelReply` data types; at the moment, these data types are used only for standard input, but future versions of the messaging protocol may introduce more messages.
An example of a kernel that uses these to read from standard input during code execution is available in the `examples/stdin` folder.
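The round trip for standard input can be sketched without a running client by passing in the request-sending function explicitly. The types below are local stand-ins (with `String` in place of `Text`), and `requestLine` and `fakeClient` are hypothetical helpers; in a real kernel the sending function would come from the `sendKernelRequest` callback.

```haskell
-- Local stand-ins mirroring the stdin-related types (String replaces Text).
data InputOptions = InputOptions { inputPrompt :: String, inputPassword :: Bool }
data KernelRequest = InputRequest InputOptions
data ClientReply = InputReply String

-- Ask the client for one line of input and block until the reply arrives.
-- The first argument plays the role of the sendKernelRequest callback.
requestLine :: (KernelRequest -> IO ClientReply) -> String -> IO String
requestLine send prompt = do
  InputReply text <- send (InputRequest (InputOptions prompt False))
  return text

-- A fake client that answers every input request by echoing the prompt,
-- useful for exercising requestLine without a frontend.
fakeClient :: KernelRequest -> IO ClientReply
fakeClient (InputRequest options) =
  return (InputReply ("reply to " ++ inputPrompt options))
```

A kernel for a language with an input primitive would call something like `requestLine` whenever the running code reads from standard input.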
Jupyter clients (also commonly called frontends) are programs which connect to a running kernel (possibly starting the kernel themselves) and then query it using the ZeroMQ-based messaging protocol. The `jupyter` library uses the same data types for querying kernels as for receiving client messages, so if you understand how to write kernels, using the client interface will be straightforward.
Any kernel that registers itself with Jupyter using `jupyter kernelspec install` (or the utilities from the `jupyter` library) can then be located using `jupyter kernelspec list`. The `jupyter` package provides two convenient wrappers around `jupyter kernelspec list`:
-- Locate a single kernelspec in the Jupyter registry.
findKernel :: Text -> IO (Maybe Kernelspec)
-- List all kernelspecs in the Jupyter registry.
findKernels :: IO [Kernelspec]
Using the `Kernelspec` and the `kernelspecCommand` field, we can find out how to launch any registered kernel as a separate process. (`System.Process` and the `spawnProcess` function may prove useful here.)
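One way to launch the invocation produced by a command generator is sketched below. It assumes the command is a list whose first element is the program and whose remaining elements are its arguments; `splitCommand` and `launch` are hypothetical helpers, while `spawnProcess` is the function from `System.Process`.

```haskell
import System.Process (ProcessHandle, spawnProcess)

-- Split a command-line invocation into the program and its arguments.
-- Returns Nothing for an empty invocation instead of crashing.
splitCommand :: [String] -> Maybe (FilePath, [String])
splitCommand [] = Nothing
splitCommand (program : args) = Just (program, args)

-- Launch the invocation as a separate process, returning its handle,
-- or Nothing if the invocation was empty.
launch :: [String] -> IO (Maybe ProcessHandle)
launch command =
  case splitCommand command of
    Nothing -> return Nothing
    Just (program, args) -> Just <$> spawnProcess program args
```

Handling the empty list explicitly avoids the runtime error that `head` and `tail` would raise on a malformed kernelspec.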
Before we can communicate with a kernel, we must first set up handlers for what to do when the kernel sends messages to the client. The following handlers are required, stored in the `ClientHandlers` record:
data ClientHandlers =
  ClientHandlers
    { kernelRequestHandler :: (Comm -> IO ()) -> KernelRequest -> IO ClientReply
    , commHandler :: (Comm -> IO ()) -> Comm -> IO ()
    , kernelOutputHandler :: (Comm -> IO ()) -> KernelOutput -> IO ()
    }
Each of the handlers receives a `Comm -> IO ()` callback; this may be used to send `Comm` messages to whatever kernel sent the message being handled.
- The `kernelRequestHandler` receives a `KernelRequest` (likely a request for standard input from the user), and must generate an appropriate `ClientReply`, with a constructor matching the one of the request.
- The `commHandler` may do anything in response to a `Comm` message, including doing nothing; since doing nothing is a common option, the `defaultClientCommHandler` function is provided that does exactly this (that is, nothing).
- The `kernelOutputHandler` is called whenever a `KernelOutput` message is produced by the kernel. This can be used to display output to the user, update a kernel busy / idle status indicator, etc.
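As an illustration of the last handler, the sketch below renders outputs as text and prints them. The types are simplified local stand-ins (display data is reduced to a single plain-text `String`, and `Comm` carries no payload), and `renderOutput` and `printOutputHandler` are hypothetical names.

```haskell
-- Simplified stand-ins for the message types used by the handler.
data Comm = Comm

data Stream = StreamStdout | StreamStderr

data KernelOutput
  = StreamOutput Stream String
  | DisplayDataOutput String  -- simplified: plain-text representation only

-- Decide how an output should be shown to the user.
renderOutput :: KernelOutput -> String
renderOutput (StreamOutput StreamStdout text) = text
renderOutput (StreamOutput StreamStderr text) = "[stderr] " ++ text
renderOutput (DisplayDataOutput text)         = "[display] " ++ text

-- A handler with the same shape as kernelOutputHandler: it ignores the
-- Comm-sending callback and prints the rendered output.
printOutputHandler :: (Comm -> IO ()) -> KernelOutput -> IO ()
printOutputHandler _ output = putStrLn (renderOutput output)
```

Keeping the rendering logic in a pure function makes it easy to test separately from the `IO` handler.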
Once a `ClientHandlers` value is set up, the `runClient` function can be used to run any `Client` command:
runClient :: Maybe KernelProfile
          -> Maybe Username
          -> ClientHandlers
          -> (KernelProfile -> Client a)
          -> IO a
In addition to the handlers, `runClient` takes an optional `KernelProfile` to connect to and an optional username. If no profile is provided, one is chosen automatically; if no username is provided, a default username is used. The profile that was used (whether autogenerated or set by the caller) is provided to an action `KernelProfile -> Client a`, and the `Client` action returned is run in `IO`.
To send requests to kernels (and receive replies), construct the appropriate `Client` action; these are thin wrappers around `IO` that allow you to use `sendClientComm` and, more importantly, `sendClientRequest` to communicate with kernels.
Before `sendClientRequest` can be used, though, the connection to the kernel must be verified. This is done by `connectKernel`, which blocks until the kernel connects to the client and returns a `KernelConnection` for use with `sendClientRequest`. Once we have a `KernelConnection`, we can query the kernel, as in:
getKernelInfoReply :: KernelConnection -> Client KernelInfo
getKernelInfoReply connection = do
  KernelInfoReply info <- sendClientRequest connection KernelInfoRequest
  return info
For example, the `KernelInfo` for the `python3` kernel can be obtained as follows:
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.IO.Class (MonadIO(liftIO))
import System.Process (spawnProcess)

import Jupyter.Client (runClient, sendClientRequest, ClientHandlers(..), connectKernel,
                       defaultClientCommHandler, findKernel, writeProfile, Kernelspec(..))
import Jupyter.Messages (ClientRequest(KernelInfoRequest), KernelReply(KernelInfoReply),
                         KernelRequest(InputRequest), ClientReply(InputReply))

main :: IO ()
main = do
  -- Find the kernelspec for the python 3 kernel
  Just kernelspec <- findKernel "python3"

  -- Start the client connection
  runClient Nothing Nothing handlers $ \profile -> do
    -- Write the profile connection file to a JSON file
    liftIO $ writeProfile profile "profile.json"

    -- Launch the kernel process, giving it the path to the JSON file
    let command = kernelspecCommand kernelspec "" "profile.json"
    _ <- liftIO $ spawnProcess (head command) (tail command)

    -- Send a kernel info request and get the reply
    connection <- connectKernel
    KernelInfoReply info <- sendClientRequest connection KernelInfoRequest
    liftIO $ print info

handlers :: ClientHandlers
handlers = ClientHandlers {
    -- Do nothing on comm messages
    commHandler = defaultClientCommHandler,

    -- Return a fake stdin string if asked for stdin
    kernelRequestHandler = \_ req ->
      case req of
        InputRequest{} -> return $ InputReply "Fake Stdin",

    -- Do nothing on kernel outputs
    kernelOutputHandler = \_ _ -> return ()
  }
This full example is in the `examples/client-kernel-info` directory, and can be built with `stack build jupyter:client-kernel-info`, and executed with `stack exec client-kernel-info`. (It will work if the `python3` kernel is installed, but not otherwise!)
Any and all contributions are welcome!
If you'd like to submit a feature request, bug report, or request for additional documentation, or have any other questions, feel free to file an issue.
If you would like to help out, pick an issue you are interested in and comment on it, indicating that you'd like to work on it. I (or other project contributors) will try to promptly merge pull requests and respond to any questions you may have. If you'd like to talk to me (@gibiansky) outside of GitHub, feel free to email me; my email is available on my GitHub profile sidebar.
- `stack build`: Use `stack build` to build the library and run the examples.
- `stack test`: For any bug fix or feature addition, please make sure to extend the test suite as well, and verify that `stack test` runs your test and succeeds. Travis CI is used to test all pull requests, and must be passing before they will be merged.
- `stack exec python python/tests.py`: Part of the test suite is triggered from Python (to be able to use the `jupyter_client` library); make sure that the Python test suite passes as well. You can create a Python environment for yourself separate from your global one with the `pyvenv` command: `pyvenv env && source env/bin/activate`.
- `stack haddock`: Please make sure that all top-level identifiers are well documented; specifically, run `stack haddock` and ensure that all modules have 100% complete documentation. This is tested automatically on Travis CI, as well.