New docs #236

Merged (22 commits) on Jan 9, 2023.
18 changes: 18 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,18 @@
name: ci
on:
  push:
    branches:
      - newdocs
**Contributor:** Change trigger to push on master and add cache.

**Contributor (author):** Wasn't sure how to add cache, but added master.

**Contributor:** This should work:

      - name: Setup caching
        uses: actions/cache@v3
        with:
          key: ${{ github.sha }}
          path: .cache

Check out: https://gaseri.org/en/blog/2022-11-01-publishing-material-for-mkdocs-website-to-github-pages-using-custom-actions-workflow/

      - master
permissions:
  contents: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: 3.x
      - run: pip install mkdocs-material
      # - run: mkdocs gh-deploy --force
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
src/data
docs/site
venv
_cache
docs/build
3 changes: 0 additions & 3 deletions .netlify/state.json

This file was deleted.

6 changes: 1 addition & 5 deletions MAINTAINERS.md
@@ -51,8 +51,4 @@ A **new major GHC version** has been released. Here's what you need to do:

## Documentation

The docs are built with Sphinx. Once installed, cd to the `docs` directory, then run `make html` to build locally. CI does this automatically, so to update the docs, just update the markdown (e.g. docs/source/usage.md), and push.

## Website

The website is also hosted in the repo (`/monad-bayes-site`), and is built with `hakyll`. Do `stack exec site build` to build. CI **does not** automatically build the site, so to update, you will need to run this command, and only then push to github.
The docs are built with MkDocs. Serve them locally with `mkdocs serve`. The site is served online with Netlify.
4 changes: 2 additions & 2 deletions README.md
@@ -1,4 +1,4 @@
# [Monad-Bayes](https://monad-bayes-site.netlify.app/_site/about.html)
# [Monad-Bayes](https://monad-bayes.netlify.app/)

A library for probabilistic programming in Haskell.

@@ -7,7 +7,7 @@ A library for probabilistic programming in Haskell.
[![Hackage Deps](https://img.shields.io/hackage-deps/v/monad-bayes.svg)](http://packdeps.haskellers.com/reverse/monad-bayes)
[![Build status](https://badge.buildkite.com/147af088063e8619fcf52ecf93fa7dd3353a2e8a252ef8e6ad.svg?branch=master)](https://buildkite.com/tweag-1/monad-bayes) -->

[See the website](https://monad-bayes-site.netlify.app/_site/about.html) for an overview of the documentation, library, tutorials, examples (and a link to this very source code).
[See the docs](https://monad-bayes.netlify.app/) for a user guide, notebook-style tutorials, an example gallery, and a detailed account of the implementation.

<!-- Monad-Bayes is a library for **probabilistic programming in Haskell**. The emphasis is on composition of inference algorithms, and is implemented in terms of monad transformers. -->

20 changes: 0 additions & 20 deletions docs/Makefile

This file was deleted.

18 changes: 18 additions & 0 deletions docs/docs/examples.md
@@ -0,0 +1,18 @@
---
title: Example Gallery
---

## [Histograms](/notebooks/Histogram.html)

## [JSON (with `lens`)](/notebooks/Lenses.html)

## [Diagrams](/notebooks/Diagrams.html)

## [Probabilistic Parsing](/notebooks/Parsing.html)

## [Streams (with `pipes`)](/notebooks/Streaming.html)

## [Ising models](/notebooks/Ising.html)

## [Physics](/notebooks/ClassicalPhysics.html)

3 files renamed without changes
32 changes: 32 additions & 0 deletions docs/docs/index.md
@@ -0,0 +1,32 @@
# Welcome to Monad-Bayes

Monad-Bayes is a library for **probabilistic programming** written in **Haskell**.

**Define distributions** [as programs](/notebooks/Introduction.html)

**Perform inference** [with a variety of standard methods](tutorials.md) [defined compositionally](http://approximateinference.org/accepted/ScibiorGhahramani2016.pdf)

**Integrate with Haskell code** [like this](examples.md) because Monad-Bayes is just a library, not a separate language

## Example

```haskell
model :: Distribution Double
model = do
  x <- bernoulli 0.5
  normal (if x then (-3) else 3) 1

image :: Distribution Plot
image = fmap (plot . histogram 200) (replicateM 100000 model)

sampler image
```

The program `model` is a mixture of Gaussians. Its type `Distribution Double` represents a distribution over reals.
`image` is a program too: as its type shows, it is a distribution over plots, namely those that arise from forming a 200-bin histogram out of 100000 independent, identically distributed (iid) draws from `model`.
To sample from `image`, we simply write `sampler image`, with the result shown below:


<img src="images/plot.png"
width="450"
height="300" />
16 changes: 16 additions & 0 deletions docs/docs/javascripts/mathjax.js
@@ -0,0 +1,16 @@
window.MathJax = {
  tex: {
    inlineMath: [["\\(", "\\)"]],
    displayMath: [["\\[", "\\]"]],
    processEscapes: true,
    processEnvironments: true
  },
  options: {
    ignoreHtmlClass: ".*|",
    processHtmlClass: "arithmatex"
  }
};

document$.subscribe(() => {
  MathJax.typesetPromise()
})
3 files renamed without changes.
@@ -14614,7 +14614,7 @@
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput " data-mime-type="text/markdown">
<h1 id="Introduction-to-Monad-Bayes">Introduction to Monad-Bayes<a class="anchor-link" href="#Introduction-to-Monad-Bayes">&#182;</a></h1><p>This serves as an interactive alternative to <a href="https://monad-bayes.netlify.app/probprog.html">the user guide</a>. This isn't intended as a tutorial to Haskell, but if you're familiar with probabilistic programming, the general flow of the code should look familiar.</p>
<h1 id="Introduction-to-Monad-Bayes">Introduction to Monad-Bayes<a class="anchor-link" href="#Introduction-to-Monad-Bayes">&#182;</a></h1><p>This serves as an interactive alternative to the user guide. This isn't intended as a tutorial to Haskell, but if you're familiar with probabilistic programming, the general flow of the code should look familiar.</p>
<p>To get a sense of how probabilistic programming with monad-bayes works, consider the following:</p>

</div>
8 files renamed without changes.
83 changes: 28 additions & 55 deletions docs/source/probprog.md → docs/docs/probprog.md
@@ -1,19 +1,27 @@
# Quickstart
# User Guide

Probabilistic programming is all about being able to write probabilistic models as programs. For instance, here is a Bayesian linear regression model, which we would write equationally as:

```{math}

$$
\beta \sim \operatorname{normal}(0, 2)
$$

$$
\alpha \sim \operatorname{normal}(0, 2)
$$

$$
\sigma^2 \sim \operatorname{gamma}(4, 4)
$$

$$
\epsilon_{n} \sim \operatorname{normal}(0, \sigma)
$$

$$
y_{n}=\alpha+\beta x_{n}+\epsilon_{n}
```
$$

but in code as:
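
A sketch of what such a model can look like, under stated assumptions: `normalPdf` and `factor` come from `Control.Monad.Bayes.Class`, and returning the parameters as a triple is one arbitrary choice, not necessarily the library's:

```haskell
import Control.Monad (forM_)
import Control.Monad.Bayes.Class

regression :: MonadMeasure m => [(Double, Double)] -> m (Double, Double, Double)
regression xsys = do
  beta   <- normal 0 2
  alpha  <- normal 0 2
  sigma2 <- gamma 4 4
  let sigma = sqrt sigma2
  -- weight each trace by the likelihood of the observed (x, y) pairs
  forM_ xsys $ \(x, y) ->
    factor (normalPdf (alpha + beta * x) sigma y)
  return (alpha, beta, sigma)
```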

@@ -36,7 +44,7 @@ regression xsys = do

This is the *model*. To perform *inference*, suppose we have a data set `xsys` like:

![](_static/priorpred.png)
![](images/priorpred.png)

To run the model

@@ -51,7 +59,7 @@ mhRunsRegression = sampler

This yields 1000 samples from an MCMC walk using an MH kernel. `mh n` produces a distribution over chains of length `n`, along with the probability of that chain. Sampling a chain and plotting its final state gives:

![](_static/regress.png)
![](/images/regress.png)

Monad-bayes provides a variety of MCMC and SMC methods, and methods arising from the composition of the two.

@@ -65,9 +73,9 @@ Monad-bayes provides a variety of MCMC and SMC methods, and methods arising from

Other probabilistic programming languages with fairly similar APIs include WebPPL and Gen. This cognitive-science oriented introduction to [WebPPL](https://probmods.org/) is an excellent resource for learning about probabilistic programming. The [tutorials for Gen](https://www.gen.dev/tutorials/) are also very good, particularly for learning about traces.

# Specifying distributions
## Specifying distributions

A distribution in monad-bayes over a set {math}`X`, is of type:
A distribution in monad-bayes over a set $X$ is of type:

```haskell
MonadMeasure m => m X
@@ -81,7 +89,7 @@ Monad-bayes provides standard distributions, such as
random :: Distribution Double
```

which is distributed uniformly over {math}`[0,1]`.
which is distributed uniformly over $[0,1]$.

The full set is listed at https://hackage.haskell.org/package/monad-bayes-0.1.1.0/docs/Control-Monad-Bayes-Class.html
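
A few more primitives, as a quick sketch (the left-hand names are made up; the calls on the right are standard methods, assuming `Control.Monad.Bayes.Class` is in scope):

```haskell
coinFlip :: MonadMeasure m => m Bool
coinFlip = bernoulli 0.5       -- True with probability 0.5

standardNormal :: MonadMeasure m => m Double
standardNormal = normal 0 1    -- mean 0, standard deviation 1

eventCount :: MonadMeasure m => m Int
eventCount = poisson 3         -- Poisson with rate 3
```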

@@ -100,9 +108,9 @@ monad-bayes also lets us construct new distributions out of these. `MonadMeasure
fmap (> 0.5) random :: MonadMeasure m => m Bool
```

This is the uniform distribution over {math}`(0.5, 1]`.
This is the distribution over `Bool` obtained by drawing uniformly from $[0,1]$ and testing whether the draw falls in $(0.5, 1]$; in other words, a fair coin flip.

As an important special case, if `x :: MonadMeasure m => m (a,b)` is a joint distribution over two variables, then `fmap fst a :: MonadMeasure m => m a` **marginalizes** out the second variable. That is to say, `fmap fst a` is the distribution {math}`p(a)`, where {math}`p(a) = \int_b p(a,b)`.
As an important special case, if `x :: MonadMeasure m => m (a,b)` is a joint distribution over two variables, then `fmap fst x :: MonadMeasure m => m a` **marginalizes** out the second variable. That is to say, `fmap fst x` is the distribution $p(a)$, where $p(a) = \int_b p(a,b)$.
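
For instance, here is a minimal sketch with a made-up joint distribution (again using only `fmap`):

```haskell
-- a made-up joint distribution over (Bool, Double)
joint :: MonadMeasure m => m (Bool, Double)
joint = fmap (\u -> (u > 0.5, u)) random

-- its first marginal, with the Double integrated out
marginal :: MonadMeasure m => m Bool
marginal = fmap fst joint
```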

The above examples use only the functor instance for `m`, but we also have the monad instance, as used in:

@@ -113,13 +121,12 @@ example = bernoulli 0.5 >>= (\x -> if x then random else normal 0 1)

It's easiest to understand this distribution as a probabilistic program: it's the distribution you get by first sampling from `bernoulli 0.5`, then checking the result. If the result is `True`, then sample from `random`, else from `normal 0 1`. As a distribution, this has a PDF:

```{math}
f(x) = 1[0\leq x \leq 1]*0.5 + \mathcal{N}(0,1)(x)*0.5
```
$$
f(x) = 1[0\leq x \leq 1]*0.5 + \mathcal{N}(0,1)(x)*0.5
$$



<!-- $$ \int\_{[0,1]} 1[x>0.5]* + (1[x\leq 0.5]*N(0,1)(x)) dx $$ -->

Equivalently, we could write this in do-notation as:
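
A minimal sketch of that equivalent form (a direct desugaring of the bind version above):

```haskell
example :: MonadMeasure m => m Double
example = do
  x <- bernoulli 0.5
  if x then random else normal 0 1
```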

@@ -138,6 +145,7 @@ That said, it is often useful to think of probabilistic programs as specifying d

monad-bayes provides a function `score :: MonadMeasure m => Log Double -> m ()`. (**Note**: `Log Double` is a wrapper for `Double` which stores doubles as their logarithm, and does multiplication by addition of logarithms.)


```haskell
example :: MonadMeasure m => m Double
example = do
@@ -167,42 +175,7 @@

This describes a Poisson distribution in which all even values of the random variable are marginalized out.
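
A sketch of a model of that shape (the rate 5 is an arbitrary choice; `condition b` is equivalent to scoring with 1 when `b` holds and 0 otherwise):

```haskell
oddPoisson :: MonadMeasure m => m Int
oddPoisson = do
  n <- poisson 5
  condition (odd n)   -- gives zero weight to even draws
  return n
```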

<!-- Another use case is Bayesian inference as in:

<!-- The most intuitive way to understand `score` is to think of a probabilistic program as making a series of random choices which trace out a possible execution of the program. At any point in this series, we can interject a `score x` statement, where the value of `x` depends on the previous choices. This statement multiplies the weight of this "trace" by the score. -->

<!-- ```haskell
bayesianExample :: (Eq a, MonadMeasure m) => m a -> (a -> m b) -> (b -> m a)
bayesianExample prior likelihood b = do
a <- prior
b' <- likelihood a
condition (b==b')
return a
```

Note that operationally speaking, this approach is only going to work well for discrete distributions, since `b==b'` is going to be zero-measure in the continuous case. But in the discrete case, we could for example do: -->

<!-- ```haskell
example :: MonadMeasure
example = bayesianExample (bernoulli 0.5) (\x -> if x then bernoulli 0.8 else bernoulli 0.9) True
```
-->




<!-- ```haskell
example :: MonadMeasure m => m Bool
example = do
x <- normal 0 1
y <- normal 0 2
z <- normal 0 3
return (x > y)
```

Note that in this example, commenting out the line `z <- normal 0 3` would not change the distribution at all. **But**, there is no guarantee in theory that the inference method you use knows this. More generally, -->

<!-- **Not all ways of expressing denotationally equivalent distributions are equally useful in practice** -->

## Inference methods

@@ -267,7 +240,7 @@ which gives
[([1,2,3,4],0.5),([2,3,4,5],0.5)]
```
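
As a sketch of the general pattern, on a hypothetical model (assuming `enumerate` from `Control.Monad.Bayes.Enumerator`):

```haskell
import Control.Monad.Bayes.Class
import Control.Monad.Bayes.Enumerator (enumerate)

twoCoins :: MonadMeasure m => m Bool
twoCoins = do
  a <- bernoulli 0.5
  b <- bernoulli 0.5
  condition (a || b)   -- observe that at least one flip came up True
  return a

-- enumerate twoCoins gives True with probability 2/3 and False with 1/3
```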

### Near exact inference for continuous distributions
## Near exact inference for continuous distributions

Monad-Bayes does not currently support exact inference (via symbolic solving) for continuous distributions. However, it *does* support numerical integration. For example, for the distribution defined by

Expand All @@ -278,7 +251,7 @@ model = do
normal 0 (sqrt var)
```

you may run `probability (0, 1000) model` to obtain the probability in the range `(0,1000)`. As expected, this should be roughly {math}`0.5`, since the PDF of `model` is symmetric around {math}`0`.
you may run `probability (0, 1000) model` to obtain the probability in the range `(0,1000)`. As expected, this should be roughly $0.5$, since the PDF of `model` is symmetric around $0$.

You can also try `expectation model`, `variance model`, `momentGeneratingFunction model n` or `cdf model n`.

@@ -305,7 +278,7 @@ example = do
  if x then normal 0 1 else normal 1 2
```

`sampler example` will produce a sample from a Bernoulli distribution with {math}`p=0.5`, and if it is {math}`True`, return a sample from a standard normal, else from a normal with mean 1 and std 2.
`sampler example` will produce a sample from a Bernoulli distribution with $p=0.5$, and if it is `True`, return a sample from a standard normal, else from a normal with mean 1 and standard deviation 2.

`(replicateM n . sampler) example` will produce a list of `n` independent samples. However, it is recommended to instead do `(sampler . replicateM n) example`, which will create a new model (`replicateM n example`) consisting of `n` independent draws from `example`.
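
As a sketch, assuming a conditioning-free model (so that `sampler` applies) and the `MonadDistribution` class:

```haskell
import Control.Monad (replicateM)
import Control.Monad.Bayes.Class
-- `sampler` as exported by monad-bayes (the exact module varies by version)
import Control.Monad.Bayes.Sampler.Strict (sampler)

example' :: MonadDistribution m => m Double
example' = do
  x <- bernoulli 0.5
  if x then normal 0 1 else normal 1 2

-- n separate sampler runs
runsSeparately :: Int -> IO [Double]
runsSeparately n = (replicateM n . sampler) example'

-- one sampler run of a single batched model (recommended)
runsBatched :: Int -> IO [Double]
runsBatched n = (sampler . replicateM n) example'
```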

@@ -395,15 +368,15 @@ run = (sampler . mcmc (MCMCConfig {
proposal = SingleSiteMH})) example
```

produces {math}`5` unbiased samples from the posterior, by using single-site trace MCMC with the Metropolis-Hastings (MH) method. This means that the random walk is over execution traces of the probabilistic program, and the proposal distribution modifies a single random variable as a time, and then uses MH for the accept-reject criterion. For example, from the above you'd get:
produces $5$ unbiased samples from the posterior, by using single-site trace MCMC with the Metropolis-Hastings (MH) method. This means that the random walk is over execution traces of the probabilistic program; the proposal distribution modifies a single random variable at a time, and MH supplies the accept-reject criterion. For example, from the above you'd get:

```
[True,True,True,True,True]
```

The final element of the chain is the head of the list, so you can drop samples from the end of the list for burn-in.

### Piped MCMC
## Streaming MCMC

You can also run MCMC using `mcmcP`. This creates an infinite chain, expressed as a stream, using the corresponding type from the `pipes` library, a `Producer`. This is a very natural representation of a random walk in Haskell.

Expand Down Expand Up @@ -623,7 +596,7 @@ mixture1 point = do
return cluster
```

is a piece of code to infer whether an observed point was generated from a Gaussian of mean {math}`1` or {math}`5`. That is, `mixture1` is a conditional Bernoulli distribution over the mean given an observation. You're not going to be able to do much with `mixture1` though. Exact inference is impossible because of the sample from the normal, and as for sampling, there is zero probability of sampling the normal to exactly match the observed point, which is what the `condition` requires.
is a piece of code to infer whether an observed point was generated from a Gaussian of mean $1$ or $5$. That is, `mixture1` is a conditional Bernoulli distribution over the mean given an observation. You're not going to be able to do much with `mixture1` though. Exact inference is impossible because of the sample from the normal, and as for sampling, there is zero probability of sampling the normal to exactly match the observed point, which is what the `condition` requires.

However, the same conditional distribution is represented by
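
A sketch of what such a representation looks like, scoring by the likelihood of the observed point instead of conditioning on an exact match (the name `mixture2` and the use of `normalPdf` here are assumptions):

```haskell
mixture2 :: MonadMeasure m => Double -> m Bool
mixture2 point = do
  cluster <- bernoulli 0.5
  let mu = if cluster then 1 else 5
  -- soft conditioning: weight the trace by the normal density at the point
  factor (normalPdf mu 1 point)
  return cluster
```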

21 changes: 21 additions & 0 deletions docs/docs/tutorials.md
@@ -0,0 +1,21 @@
---
title: Tutorials
---

## [Introduction to Monad-Bayes](/notebooks/Introduction.html)

## [Sampling from a distribution](/notebooks/Sampling.html)

## [Bayesian models](/notebooks/Bayesian.html)

## [Markov Chain Monte Carlo](/notebooks/MCMC.html)

## [Sequential Monte Carlo](/notebooks/SMC.html)

## [Lazy Sampling](/notebooks/Lazy.html)

## [Advanced Inference Methods](/notebooks/AdvancedSampling.html)

<!-- ## [Advanced Inference Methods](../AdvancedSampling.html)

## [Building your own inference methods]() -->