From 035d82a10d8c9404ff9b5f2da18a65cac22b9321 Mon Sep 17 00:00:00 2001 From: "Benjamin T. Vincent" <inferencelab@gmail.com> Date: Thu, 25 Apr 2024 17:07:35 +0100 Subject: [PATCH 01/10] committing today's work --- docs/source/index.rst | 1 + docs/source/quasi_dags.ipynb | 396 +++++++++++++++++++++++++++++++++++ docs/source/references.bib | 25 +++ pyproject.toml | 1 + 4 files changed, 423 insertions(+) create mode 100644 docs/source/quasi_dags.ipynb diff --git a/docs/source/index.rst b/docs/source/index.rst index 4e3cc9c8..3e312037 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -141,6 +141,7 @@ Documentation outline :caption: Knowledge Base design_notation.md + quasi_dags.ipynb glossary.rst .. toctree:: diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb new file mode 100644 index 00000000..f0904b12 --- /dev/null +++ b/docs/source/quasi_dags.ipynb @@ -0,0 +1,396 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Causal DAGs for Quasi-Experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This page provides an overview of structural causal models for some of the most common quasi-experiments. It takes inspiration from a paper by {cite:t}`steiner2017graphical` and from the books by {cite:t}`cunningham2021causal` and {cite:t}`huntington2021effect`; readers are encouraged to consult these sources for more details." + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "tags": [ + "remove-input" + ] + }, + "outputs": [], + "source": [ + "import daft\n", + "\n", + "GRID_UNIT = 2.0\n", + "DPI = 200\n", + "NODE_EC = \"none\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Before we take a look at randomized controlled trials (RCTs) and quasi-experiments, let's first consider the concept of confounding. 
Confounding occurs when a variable (or variables) causally influence both the treatment and the outcome and is very common in observational studies. This can lead to biased estimates of the treatment effect (the causal effect of $Z \\rightarrow Y$). The following causal DAG illustrates the concept of confounding." + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": { + "tags": [ + "remove-input" + ] + }, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAATMAAAEMCAYAAACodFEmAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/H5lhTAAAACXBIWXMAAB7CAAAewgFu0HU+AAAQAElEQVR4nO3de2jV9R/H8ddxaqV4g5kXnJVGOZfzsnlJXIlZm5akKaywpHQ6/8myzMgmJBGZVkoUkbJRIKWpZavpzMVW2siQ8oLTLlNzapaZyJzT3b6/P/p58Ox6pmfne877+3zAwHPOd/Ku1tPP+5zt6HMcxxEARLl2bg8AAKFAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMAJhAzACYQMwAmEDMcM0uXbqkQYMGyefzBXw8/fTTzX5ecXGxYmJiGnxeQUFBmCaHRT7HcRy3h0D0Ki4uVkpKiurq6vz3+Xw+ffvtt0pJSWlw/eXLlzVs2DAdPnw44P6MjAytXbu2zeeFXZzMcF3Gjh2rBQsWBNznOI7mzJmjysrKBtcvW7asQcj69eunN998s03nhH2czHDdLl68qMTERJWWlgbcv2jRIq1cudJ/+6efftLo0aNVU1MTcF1eXp4mT54clllhFzFDSBQVFWnChAm6+sspJiZGxcXFGjVqlGpqapScnKx9+/YFfN6sWbP00UcfhXtcGMSaiZAYP3685s+fH3BfbW2tZs+eraqqKi1fvrxByHr37q3Vq1eHcUpYxskMIXPhwgXddddd+uOPPwLunzlzpjZu3KiqqqqA+z/77DNNmzYtnCPCMGKGkNqxY4ceeOCBFq9LT0/X+vXrwzARvIKYIeQyMjKUnZ3d5OM9e/bUwYMH1bNnzzBOBeuIGULu/PnzSkhI0MmTJxt9fP369UpPTw/zVLCOFwAQct26ddOECRMafaxz585BraFAaxEzhNx3332ndevWNfpYRUWFFi5cGOaJ4AWsmQipyspKJSYm6vfff2/2um3btiktLS1MU8ELOJkhpLKyshqErH379g2umzdvnsrLy8M1FjyAmCFkdu/e3eCbYH0+n7788kvFx8cH3F9WVqYXXnghjNPBOmKGkKiqqtLs2bMD3j1DkjIzM5WWlqacnBy1axf45bZmzRoVFhaGc0wYRswQEsuWLVNJSUnAfXFxcVqxYoUkacyYMQ2e+HccRxkZGbp48WLY5oRdvACA6/bzzz/7f5j8alu3btWkSZP8tysrKzV06FD99t
tvAdc988wz/Iwmrhsxw3WpqanRyJEjtXfv3oD7m3o3jF27dumee+4JeHeNdu3aaefOnRo7dmxbjwvDWDNxXV5//fUGIevVq5dWrVrV6PXjxo1r8LbadXV1mjNnji5dutRWY8IDOJnhmpWUlGj48OEN3g1j06ZNmj59epOfV1FRocTERB05ciTg/hdffFHLly9vk1lhHzEDYAJrJgATiBkAE4gZABOIGQATiBkAE4gZABOIGQATiBkAE4gZABOIGQATiBkAE4gZABOIGQATiBkAE4gZABOIGQATiBkAE4gZABOIGQATiBkAE4gZABOIGQATiBkAE4gZJEmnT59WZWWl22O02tGjR8Vf/QqJmOH/+vTpo06dOunjjz92e5SgOI6jadOmacCAAZo6darb4yACEDNo3rx5/l/PnDlT//77r4vTBOett97Sli1bJEm5ubnav3+/uwPBdT6HM7qnlZaW6vbbb/ffvvfee1VUVOTeQK3g8/kCbtfV1TW4D97Byczjrg6ZJBUWFro0SeuVlJQE3Gbd9DZi5mFXr5eStHv37qg62cTHxys9Pd1/m3XT21gzPSqa18v6WDchcTLzrGheL+tj3YREzDwp2tfL+lg3IbFmeo6l9bI+1k1v42TmMZbWy/pYN72NmHmItfWyPtZNb2PN9AjL62V9rJvexMnMIyyvl/WxbnoTMfMA6+tlfayb3sSaaZyX1sv6WDe9hZOZcV5aL+tj3fQWYmaY19bL+lg3vYU10ygvr5f1sW56Ayczo7y8XtbHuukNxMwgr6+X9bFuegNrpjGsl01j3bSNk5kxrJdNY920jZgZwnrZPNZN21gzjWC9DB7rpk2czIxgvQwe66ZNxMwA1svWYd20iTUzyrFeXjvWTVs4mUU51strx7ppCzGLYqyX14d10xbWzCjFehk6rJs2cDKLUqyXocO6aQMxi0Ksl6HFumkDa2YEO3PmjMrKyjRixAj/fayXbae5ddNxHH399ddKTU11YzQEgZNZBMvOztaoUaO0dOlSVVVVSWK9bEtNrZunTp3SlClTlJaWxoktkjmISDU1Nc6tt97qSHIkOUOGDHFSUlL8tyU5u3fvdntMc9LT0wP+HWdlZTndu3f3354/f77bI6IJrJkRKi8vTw899FCTj7Netp3mnn/s3LmzTp06pa5du4ZxIgSDNTNCvf/++80+/vbbb4dpEm9xHEcvv/xyk49XVFRo3bp1YZwIweJkFoGOHj2qgQMHqrn/NDExMVqyZImysrLUsWPHME5n16lTp5SZmamvvvqq2esSEhJ04MABXkGOMJzMItCaNWuaDZkk1dbW6tVXX1VycrJOnz4dpsnsKigoUEJCQoshk6SDBw9q165dYZgKrUHMIszly5eVnZ0d9PWpqanq1atXG07kDSNHjtSgQYOCvr6lpwEQfsQswmzevFlnzpwJ6tpFixZpxYoVrDsh0K1bN+Xn52vMmDFBXb9p0yb99ddfbTwVWoOYRZhg/8QnZKHXmqBVV1crJycnDFMhWLwAEEEOHDigxMTEFq8jZG3r/PnzSktL0w8//NDsdf3799eRI0cUExMTpsnQHE5mESSYUxkha3vBntCOHz+ubdu2hWkqtISTWYQoLy9X3759deHChSavIWThFcwJbdKkSdq6dWsYp0JTOJlFiHXr1hGyCBPMCS0/P19Hjx4N41RoCjGLAI7jNLtiEjL3tBQ0x3H0wQcfhHkqNIY1MwLs2rVLKSkpjT5GyCJDcytnbGysTpw4oRtuuMGFyXAFJ7MI0NSpjJBFjuZOaP/88482bdrkwlS4Giczl/3999/q16+fqqurA+4nZJGpqRPa2LFj9f3337s0FSROZq7LyckhZFGkqRNacXExb9zoMmLmotra2gZPHhOyyNdU0Ph5TXcRMxf9+uuvOnnypP82IYsejQWtsLCwxXc7QdvhOTOX7d+/X0899ZQmTJhAyKLQlefQ7rzzTq1atUo9evRweyTPImYRoKamRj
ExMYQsStXU1Kh9+/Zuj+F5xAyACTxnBsAEYgbABGIGwARiBsAEYgbABGIGwARiBsAEYgbABGIGwARiBsAEYgbABGIGwARiBsAEYgbABGIGwARiBsAEYlbPlbeuvtaP1atXu/2PgChy7tw5denSRT6fT3FxcaqpqWnxc2prazVp0iT/19wnn3wShkkjHzGrZ8+ePdf1+UOGDAnRJPCCHj16aO7cuZKkEydOBPWXCT///PPKz8+XJGVlZemxxx5r0xmjBW+bXc+RI0d08eLFoK4tLy9Xenq6ysrKJEkjRozQzp071alTp7YcEcaUlZVp4MCBqq6u1ujRoxv8BcNXW7t2rebNmydJmj59ujZu3MjfHXGFg2tSWVnpjB8/3pHkSHLi4+OdM2fOuD0WotSsWbP8X0vFxcWNXlNYWOh06NDBkeQMHz7cqaioCPOUkY018xpUV1drxowZKioqkiTddtttKigoUGxsrLuDIWotXrzYf8Jq7HnX0tJSzZgxQ9XV1erdu7dyc3PZAOohZq1UV1enJ554Qnl5eZKkvn37qqCgQH379nV5MkSzhIQETZ48WZK0efNmHT9+3P/Y+fPnNWXKFJ09e1Y33nijvvjiC/Xr18+tUSMWMWulzMxMbdiwQZIUGxurHTt2aMCAAS5PBQsWL14s6b9XK999913/rx999FEdOnRIkpSTk6NRo0a5NmNEc3vPjSbPPfec/3mNrl27Onv27HF7JBgzZswYR5LTvXt358KFC86CBQv8X3NLly51e7yIxquZQVq2bJleeeUVSVKnTp20fft2jRs3zt2hPOjZZ59VbGyskpKSlJSUpJtvvtntkULq888/1yOPPCJJuu+++/TNN99I4pXLYBCzIKxevVoLFy6UJHXs2FG5ublKTU11eSpvGjRokH755Rf/7bi4OH/YLATOcRzFx8cH/DPyLT/BIWYtyMnJUUZGhhzHUUxMjDZs2KDp06e7PZZn1Y9ZY6I9cNnZ2crIyJAk9enTRz/++CNP+AehvdsDRLJPP/1Uc+fOleM48vl8ys7OJmRRoKysTGVlZdqyZYv/vmgK3MCBA/2/zszMJGTBcu3ZugiXl5fn/wZFSc4777zj9kjNeumll/yz8hHcR1xcnDN16lRnx44dbv/nC7Bq1Sr/jFu2bHF7nKhBzBpRVFTk3HTTTf4vqNdee83tkVrkdhii+SPSXiV88skn/bMdO3bM7XGiBt9nVs+ePXs0ZcoUVVZWSvrve3+WLFni8lRoS+3aRdb/Bnv37pX03w+h33LLLe4OE0V4zuwqBw8eVFpamsrLyyVJ8+fP1xtvvOHyVMHJz8/3v4xv2cqVK6/79/D5fLrjjjuUnJyspKQkPfjggyGYLDSqq6tVUlIiSRo6dKjL00QXYvZ/paWluv/++3X27FlJ0syZM/Xee++5PFXwUlNTPfHtIrm5uS2+mnm1+uFKSkrS8OHD1aVLlzac8todOnRIVVVVkqRhw4a5O0yUIWaSTp48qYkTJ+rPP/+UJD388MP68MMPI279QPOiLVyNubJiSsSstTwfs3PnzmnixIk6duyYpP++j2np0qU6fPhw0L9HXFycunXr1kYTojEWwtWYffv2+X9NzFrH8zHbvn17QLgOHz6s5OTkVv0excXFuvvuu0M9GhqRlZWl/v37mwhXY66czDp27KjBgwe7O0yU8XzMDhw4cF2fHxMTw5+gYfT444+7PUKbunIyGzx4sDp06ODyNNGFH2cCYALPcAMwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMI
GYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEwgZgBMIGYATCBmAEw4X95pcLi6dHvaAAAAABJRU5ErkJggg==", + "text/plain": [ + "<Figure size 267.717x228.346 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", + "\n", + "pgm.add_node(\"z\", \"$Z$\", 1, 0)\n", + "pgm.add_node(\"x\", \"$\\mathbf{X}$\", 1.5, 0.75)\n", + "pgm.add_node(\"y\", \"$Y$\", 2, 0)\n", + "\n", + "pgm.add_edge(\"z\", \"y\")\n", + "pgm.add_edge(\"x\", \"y\")\n", + "pgm.add_edge(\"x\", \"z\")\n", + "\n", + "pgm.render();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Randomized controlled trials (RCTs) are considered the gold standard for estimating causal effects. One reason for this is that we (as experimenters) intervene in the system by randomly assigning subjects to treatment groups. This ensures that the treatment is independent of any confounding variables. Importantly, this act of intervention breaks the causal link from the confounders $\\mathbf{X}$ to the treatment $Z$. The following causal DAG illustrates the structure of an RCT." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "tags": [ + "remove-input" + ] + }, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAdEAAAEMCAYAAACbY4xqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/H5lhTAAAACXBIWXMAAB7CAAAewgFu0HU+AAASY0lEQVR4nO3daWxUZR+G8XvaQkqxNBjQglQUEkJZCrVIkRRCAKWFICoIGgQUyuIHVEAwKiTywYhgAiEaXllliYJFLGBZ0ipFSCOGsNi0rAWkRSQGCUKn0O28HwiTblOmD505s1y/ZJLpzBnzN9P24nl6ZsZhWZYlAADQZGF2DwAAQKAiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAAsqdO3fUvXt3ORyOWpfZs2c3+ri8vDyFh4fXe1xOTo6PJkcwcliWZdk9BAA0RV5engYNGqTq6mrXbQ6HQwcPHtSgQYPqHX/37l317dtXp0+frnV7enq61qxZ4/V5EbxYiQIIOAMHDtQ777xT6zbLsjRt2jSVlZXVO37x4sX1AtqpUyd98cUXXp0TwY+VKICA5HQ6lZCQoKKiolq3v//++1q2bJnr62PHjik5OVmVlZW1jsvKytLIkSN9MiuCFxEFELByc3M1dOhQ1fw1Fh4erry8PPXv31+VlZXq16+fTp48WetxkydP1saNG309LoIQ27kAAtaQIUM0a9asWrdVVVVp6tSpKi8v15IlS+oFNDY2VitWrPDhlAhmrEQBBLTbt2+rV69e+vPPP2vdPnHiRGVkZKi8vLzW7Tt27NDLL7/syxERxIgogICXnZ2tF1544YHHTZgwQVu3bvXBRAgVRBRAUEhPT9e6devc3t++fXsVFBSoffv2PpwKwY6IAggKN2/eVM+ePXXlypUG79+6dasmTJjg46kQ7DixCEBQiImJ0dChQxu8r3Xr1h5t9wJNRUQBBIVff/1VW7ZsafC+0tJSzZkzx8cTIRSwnQsg4JWVlSkhIUHnz59v9Li9e/cqNTXVR1MhFLASBRDwFi5cWC+gERER9Y6bMWOGbt265auxEAKIKICAduTIkXpvnuBwOLR7927Fx8fXur24uFjz58/34XQIdkQUQMAqLy/X1KlTa32aiyTNnDlTqampWr9+vcLCav+aW716tQ4cOODLMRHEiCiAgLV48WIVFhbWui0uLk5Lly6VJA0YMKDeCUWWZSk9PV1Op9NncyJ4cWIRgIB0/Phx15vM17Rnzx6lpaW5vi4rK1OfPn107ty5Wse9++67vIcuHhoRBRBwKisr9eyzz+rEiRO1bnf36SyHDx/W4MGDa33aS1hYmA4dOqSBAwd6e1wEMbZzAQSczz77rF5AH3/8cS1fvrzB41NSUjR79uxat1VXV2vatGm6c+eOt8ZECGAlCiCgFBYWKjExsd6ns2zfvl1jx451+7jS0lIlJCTowoULtW7/4IMPtGTJEq/MiuBHRA
EAMMR2LgAAhogoAACGiCgAAIaIKAAAhogoAACGiCgAAIaIKAAAhogoAACGiCgAAIaIKAAAhogoAACGiCgAAIaIKAAAhogoAACGiCgAAIaIKAAAhogoAACGiCgAAIaIKAAAhogoAACGiCgAAIaIKAAAhogoAACGiCgANIFlWbp48aLdY8BPEFEA8NCJEycUFhamLl26yOl02j0O/IDDsizL7iEAIBA4HA7X9fDwcFVWVto4DfwBK1EA8NDq1atd16uqqrRp0yYbp4E/YCUKAE1QczUqSaWlpYqKirJpGtiNlSgANMF///1X6+s2bdrYNAn8AREFgCaIjo5mWxcubOcCgAG2dSGxEgUAI2zrQiKiAGCEbV1IbOcCwENhWze0sRIFgIfAtm5oI6IA8BDY1g1tbOcCQDNgWzc0sRIFgGbAtm5oIqIA0AzY1g1NbOcCQDNiWze0sBIFgGbEtm5oIaIA0IzY1g0tbOcCgBewrRsaWIkCgBewrRsaiCgAeAHbuqGB7VwA8CK2dYMbK1EA8CK2dYMbEQUAL2JbN7ixnQsAPsC2bnBiJQoAPsC2bnAiogDgA2zrBie2cwHAh9jWDS6sRAHAh9jWDS5EFAB8iG3d4MJ2LgDYgG3d4MBKFABswLZucCCiAGADT7Z1S0pKVFBQ4OvR0ARs5wKAjRra1m3VqpU2bNigOXPmKDU1Vdu2bbNpOjwIEQUAG926daveVm5aWpr27t0rSYqIiFBxcbFiY2PtGA8PwHYuANio7rauJFdAJamyslJr16719VjwECtRALBZSUmJ4uLi3N4fFxenCxcuKCIiwodTwROsRAHAJpZlacOGDerVq1ejxxUXFysrK8tHU6EpiCgA2KCiokIvvviipk6dqps3bz7w+FWrVvlgKjQVEQUAG7Ro0ULjx4+vd3auO/v371dRUZGXp0JTEVEAsMmkSZO0ceNGj0P6v//9z8sToak4sQgAbLZ582ZNmTJFD/p1/Oijj6qkpEStWrXy0WR4EFaiAGAzT1ek//77rzIyMnw0FTxBRAHAD3gaUk4w8i9s5wKAH/Fka/fYsWNKTEz04VRwh5UoAPgRT1akrEb9BytRAPBDja1Io6Ki9NdffykmJsaGyVATK1EA8EONrUidTme9j02DPViJAoAfc7cijY+PV0FBgcevMYV3sBIFAD/mbkV66tQpHTx40KapcB8RBQA/5y6knGBkP7ZzASBA1N3a5QO77cdKFAACRN0VaWVlpQ4dOmTzVKGNiAJAALkf0s6dOysrK0uvvvqq3SOFNLZzASAAVVZWKiIiwu4xQh4RBQDAENu5AAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqIAABgiogAAGCKiAAAYIqKNWLVqlRwOR4OX1q1bq3v37po1a5ZOnTpl96gIcEuXLnX7vebJZcWKFXb/LyCA3LhxQ9HR0XI4HIqLi1NlZeUDH1NVVaW0tDTX99x3333ng0n9HxFtxIkTJ9ze53Q6debMGX399ddKTEzUtm3bfDcYgs7Ro0cf6vG9e/dupkkQCtq2bavp06dLkkpKSrR9+/YHPmbevHnat2+fJGnhwoV6/fXXvTpjoHBYlmXZPYS/Sk5O1u+//66YmBgdPnzYdXt5ebmKioq0YsUK5eXlSZJatWqlc+fO6YknnrBrXASwCxcuyOl0enTsrVu3NGHCBBUXF0uSnnnmGR06dEhRUVHeHBFBpri4WF27dlVFRYWSk5P122+/uT12zZo1mjFjhiRp7NixysjIkMPh8NWo/s1Cg6qqqqyoqChLkpWSkuL2mAEDBliSLEnWsmXLfDwlQk1ZWZk1ZMgQ1/dcfHy89c8//9g9FgLU5MmTXd9LeXl5DR5z4MABq0WLFpYkKzEx0SotLfXxlP6N7Vw3zpw541oZJCQkNHhMWFiY3n77bdfXBQUFPpkNoamiokLjxo1Tbm
6uJOnpp59WTk6O2rVrZ+9gCFgLFixwrSgb+rt6UVGRxo0bp4qKCsXGxmrXrl3seNRBRN2o+ffQxv7e1LlzZ9d1T/44D5iorq7WpEmTlJWVJUnq2LGjcnJy1LFjR5snQyDr2bOnRo4cKUn64YcfdPnyZdd9N2/e1OjRo3X9+nVFRkZq586d6tSpk12j+i0i6kbNiLpbiUrStWvXXNeffvppb46EEDZz5kzXyWvt2rVTdna2unTpYvNUCAYLFiyQdO/s2y+//NJ1/bXXXnO98mD9+vXq37+/bTP6MyLqRs2I9urVy+1xmZmZrutjxozx4kQIVfPmzdPatWslSW3atNG+ffvUo0cPm6dCsBg8eLAGDBgg6d4JRKWlpZo7d67rTNxFixZxJm4jODvXjdjYWF27dk1PPfWULl682OAxmZmZGjt2rKqrqzVu3DhlZGT4eMrQ895776ldu3ZKSkpSUlKSHnvsMbtH8qrFixfrk08+kSRFRUVp//79SklJsXeoEJOTk6PMzEzX91yPHj0UERFh91jN6scff9Qrr7wiSRo2bJh+/vlnSZyJ6wki2oC///5bHTp0kCSNHj1au3btct139+5dnT17Vhs2bNDKlStVVVWllJQU7dmzR9HR0XaNHDK6d++uM2fOuL6Oi4tz/XILtrCuWLFCc+bMkSS1bNlSu3bt0ogRI2yeKvRs2rRJU6ZMcX0dGRmpPn36qF+/fkETVsuyFB8fX+tni5dOeSZwn3UvOn78uOv67t273f4rLCkpSVOnTtWMGTMC+gcokBUXF6u4uLjWtnowhHX9+vWaO3euJCk8PFzffvstAfUTd+7c0ZEjR3TkyBHXbYEeVofDofnz5ys9PV2S1KFDB+3cuZOAeiAwnmEfa+ydimq6ffu20tLSAuYHJVQEeli///57TZ8+XZZlyeFwaN26dRo7dqzdY6ERwRDWrl27uq7PnDmTM3E9ZeeLVP3V+PHjXS9Azs3NtfLz8638/HzryJEj1ubNm63ExETX/YMGDbJ7XMuyLOvDDz90zcTFs0tcXJz10ksvWdnZ2XY/fS5ZWVmuF7ZLslauXGn3SG45nU7bn8NAu0RGRlrJycnW7Nmz7X766lm+fLlrzszMTLvHCRhEtAHdunWzJFnt2rVr8P6ysjKrZ8+erm+4o0eP+njC+uz+5RDIl0WLFtn99FmWZVm5ublWq1atXHN9+umndo/UqBs3btj+3AXqJSIiwu6nr54333zTNd+lS5fsHidg8BKXOpxOp86fPy9JSkxMbPCYyMhILVy40PX1li1bfDIbvCMszP4fg6NHj2r06NEqKyuTdO+1ex999JHNUyGU3P8zVtu2bWu9iQwa55+b8zY6efKkqqurJUl9+/Z1e9yYMWP0yCOP6Pbt29qxY4eWL1/uowkbtm/fPtdp6cFs2bJlD/3fcDgc6tatm+tvVaNGjWqGycwVFBQoNTVVt27dkiTNmjVLn3/+ua0zeaJ169aaP3++3WN4XXZ2tsfnSTQmJibG9bfRfv36PfxgzaiiokKFhYWSpD59+tg8TWAhonXU/GFxtxKV7n1qy/Dhw5WZmanLly/rjz/+aPSdjbxtxIgRIXH25q5du2qdhv8gdYOZlJSkxMREv3k5UlFRkZ5//nldv35dkjRx4kR99dVXNk/lmRYtWmjp0qV2j+F1dV/i4omawbwfzS5duvjt6y1PnTql8vJySY0vHlAfEa2jZkQf9M00atQo1xmgu3fvtjWi8P9g1nXlyhUNHz5cV69elXRvd+Obb77xi+1leC7QgtmQpvzeQ21EtI7730xRUVHq1q1bo8eOHDlSDodDlmXpp59+0scff+yDCSEFXjDrunHjhoYPH65Lly5JuvcmEosWLdLp06c9/m/ExcUpJibGSxOiIcEQzIacPHnSdZ2INg3vWFRDdXW1oqOj5XQ6H/ghtfclJSXp2LFjCgsL09WrV/32tYfBYsuWLXryyScDKpgN2bp160O/H2leXp6ee+
65ZpoI7uTn56uwsDBogtmQYcOG6ZdfflHLli11+/ZttWjRwu6RAgYr0RrOnj3r+gxRT/81NmrUKB07dkzV1dXKysrSW2+95cUJ8cYbb9g9QrPIz89/qMeHh4ezYvCR3r17N/pxiMHg/kq0R48eBLSJWIkCAGCIMxgAADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADBERAEAMEREAQAwREQBADD0f1uMy4xvJGA6AAAAAElFTkSuQmCC", + "text/plain": [ + "<Figure size 425.197x228.346 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", + "\n", + "pgm.add_node(\"r\", \"$R$\", 0, 0)\n", + "pgm.add_node(\"z\", \"$Z$\", 1, 0)\n", + "pgm.add_node(\"x\", \"$\\mathbf{X}$\", 1.5, 0.75)\n", + "pgm.add_node(\"y\", \"$Y$\", 2, 0)\n", + "\n", + "pgm.add_edge(\"r\", \"z\")\n", + "pgm.add_edge(\"z\", \"y\")\n", + "pgm.add_edge(\"x\", \"y\")\n", + "\n", + "pgm.render();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The new variable $R$ represents the random assignment of units to the treatment group. So now $Z$ is entirely causally influenced by $R$, and not by any other variables. This means that the treatment effect $Z \\rightarrow Y$ can be estimated without bias." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Instrumental Variables\n", + "\n", + "In quasi-experiments, we cannot randomly assign subjects to treatment groups. So confounders $\\mathbf{X}$ will still influence treatment assignment. In the instrumental variable (IV) approach, the causal effect of $Z \\rightarrow Y$ is identifiable if we have an IV that causally influences the treatment $Z$ but not the outcome $Y$." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "tags": [ + "remove-input" + ] + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "<Figure size 598.425x303.15 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", + "\n", + "pgm.add_node(\"iv\", \"$IV$\", 0, 0)\n", + "pgm.add_node(\"z\", \"$Z$\", 1, 0)\n", + "pgm.add_node(\"y\", \"$Y$\", 2, 0)\n", + "pgm.add_node(\"x\", \"$\\mathbf{X}$\", 1.5, 0.75)\n", + "pgm.add_edge(\"iv\", \"z\")\n", + "pgm.add_edge(\"x\", \"z\")\n", + "pgm.add_edge(\"x\", \"y\")\n", + "pgm.add_edge(\"z\", \"y\")\n", + "\n", + "pgm.render();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Readers are referred to {cite:t}`steiner2017graphical` for a more in-depth discussion of the IV approach from the causal DAG and SCM perspective." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{note}\n", + "TODO: Explain the intuition behind how the IV approach works.\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Interrupted Time Series\n", + "\n", + "A causal DAG for interrupted time series is given by {cite:t}`huntington2021effect`, though that book refers to it as [Event Studies](https://theeffectbook.net/ch-EventStudies.html). These kinds of studies are suited to situations where an intervention is made at a given point in time and any causal effect is assumed to be lasting rather than transient. Here is the causal DAG. Note that $\\text{time}$ represents everything that changes over time, such as the time index and any time-varying predictor variables." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": { + "tags": [ + "remove-input" + ] + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "<Figure size 425.197x267.717 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", + "\n", + "pgm.add_node(\"a\", \"after\\ntreatment\", -1, 0)\n", + "pgm.add_node(\"z\", \"$Z$\", 0, 0)\n", + "pgm.add_node(\"y\", \"$Y$\", 1, 0)\n", + "pgm.add_node(\"t\", \"time\", 0, 1)\n", + "\n", + "pgm.add_edge(\"a\", \"z\")\n", + "pgm.add_edge(\"t\", \"a\")\n", + "pgm.add_edge(\"t\", \"y\")\n", + "pgm.add_edge(\"z\", \"y\")\n", + "\n", + "pgm.render();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What we want to understand is the causal effect of the treatment upon the outcome, $Z \\rightarrow Y$. But there is a back-door path between $Z$ and $Y$ that makes this hard: $Z \\leftarrow \\text{after treatment} \\leftarrow \\text{time} \\rightarrow Y$.\n", + "\n", + "The approach taken is to use only the pre-treatment data to predict what would have happened in the absence of treatment (i.e. the counterfactual). If we can assume that, in the absence of the treatment, the pre-treatment pattern would have continued unchanged, then this counterfactual estimate will be unbiased and we can estimate the treatment effect by comparing the observed (post-treatment) data with the counterfactual." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Difference in Differences\n", + "\n", + ":::{warning}\n", + "This section, including the DAG is a work in progress.\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": 93, + "metadata": { + "tags": [ + "remove-input" + ] + }, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAATMAAAEzCAYAAABdWOReAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/H5lhTAAAACXBIWXMAAB7CAAAewgFu0HU+AAAbmklEQVR4nO3deXCU9R3H8c8mJCEhkN1AMaYkIExBkgCBcBjRhiPIIFBgoAUrCM5EpDPFIgiKyiVQQKpDWyx1OEwLgigKhOEo9yEqhwMaEWQ4AiEcUhIoOSDHfvsH7mM22c1ukt19nv3t5zWzM2Gf3fBNeHjzXHkwiYiAiMjPBek9ABGRJzBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlJCwMVs7969aNq0KYYNGwYR0XscIvKQgIvZxo0bkZ+fj02bNiE/P1/vcYjIQxroPYCv/eEPf8CJEyfw5JNPomnTpnqPQ0QeYhJF97XOnj2Ljz76CDNnztR7FCLyAWV3MxcuXAir1ar3GETkI0rGbMeOHcjMzNR7DCLyIeVitmfPHvz2t7/lmUqiAKNUzN59912MHDkSJSUlAB7saprNZpjNZnzwwQewWq3YsWMHhg0bhl/96lfV3n/9+nW88cYbiI6ORk5ODgBg5cqVSEhIQKNGjfDkk0/ixIkT2uu//vprDBo0CE2aNEFMTAxmzJjhdNe2uLgY8+fPR3JyMqKiomA2m9G/f38cOHDA898I8pri4mLMmTMHCQkJaNy4MSwWC37zm9/g2LFj1V578+ZNLFq0CK1bt0ZmZibu3r2L5557Dk2aNMHgwYNx//597bUVFRXIzMzEr3/9a8TGxqJx48bo0qULFi9ebPc6ALBYLDCZTNqj8l7IokWL7Jb16tXL7r3nz5/HtGnT0KxZM+Tk5KCoqAgvv/wyHn74YURGRqJfv34Ovxa/IApKS0sTADJr1iztuaNHj0r//v2lQYMGAkBatmypLbtz545kZGRIWFiYABAAcuHCBRk/frw0atRIYmJitOebN28uN2/elC1btkjDhg2lRYsW0rBhQ235/Pnzq81z5coVSUpKknnz5smdO3ektLRUli9fLmFhYRIUFCQffPCB978pVG8FBQXSqVMniYyMlK1bt0pFRYUcP35cWrVqJSaTScLDwyUqKkosFoskJydLdHS0tl6sWrVKBgwYIJGRkdpze/bsEZEH61///v2lbdu2cvToURERycvLk5EjRwoASUpKkhs3btjNsnXrVu3zVF1/8vPzpVu3bgJA0tLStOfGjBljt45///330qNH
D2natKk0b95ce75hw4ayf/9+r38/PS1gYmYzderUajErLy+XvLw8+fvf/679gY4bN07ee+89uXfvnoiIbNq0SUwmkwCQsWPHyoABA+T7778XEZGSkhIZNGiQAJCHHnrI7vezWq2Smpoqr7zySrVZZs6cqa08eXl5nvsGkFc8//zzAkBef/11u+c3btwoAKRBgwZy7tw57flr165JSEiIAJDHH39cPvnkE8nPz5cXX3xRhg4dKoWFhSIiMnr0aAkKCpLs7Gy7z1tRUSE9e/YUANKjRw8pLy+3Wx4bG+swZiIi06dPt4uZzZYtW7R1fNiwYbJy5UqxWq0iIrJ27VoJDQ0VABIfHy/379+v67dKFwEXs3/84x/VYmaTnZ2t/UEfOHCg2vLU1FQBIL179662Yu3fv197b35+vvZ8VlaWAJAzZ85U+3zbt2/X3rNkyZLaf6HkM/fv39e2wDdu3Gi3zGq1SlRUlMMtc9tW/ZgxYxx+3kOHDgkA6d69u8Plhw8fttu6q6xly5ZOYzZr1iyHMTt37pz2+VavXl3tfXPnztWWr1u3zuFMRqXUMTN3NGzY0OmyiIgI7eP4+Phqy1u1aqUtCw4OtlsWGxurfVxYWKh9vHHjRgBAjx49tON3tsfvfvc7hIWFISwsDBcuXKjT10O+kZ+fj3v37gEATCaT3TKTyYRHHnkEAJCbm2u3LDQ0FADQp08fh5935cqVAIBOnTo5XP74449r692///3vug1fSeX19oknnqi2/I9//CPCwsIAAPv376/37+dLAfcTAFVXxMqCgmpuu23FdCQkJET7uKKiQvs4OzsbwIOTCzWFlIytefPmaNSoEYqKinDx4sVqy+Wns+cPPfSQ3fM1rW8AcPDgQQBAVFSU09d069YNOTk5OHnyZC2nrj2z2Yzk5GQcOXKkWpiNLuC2zHzt9u3bAIC8vDx9B6F6CQoKwgsvvAAA+Ne//mV36c+dO3dw7tw5BAcHY+TIkbX6vLb1ori42OlrbFv9Vc9qektcXBwA+N3lTYyZl9l2XV1tslfemiNjWrhwIQYNGoSTJ09i8uTJKCgowI0bN5CRkYHi4mL85S9/Qfv27Wv1ORs1agQADrf2bMLDwwH8HBlvs+2BVN3KNDrGzMvatWsHAPjb3/6G8vJyh6+5fPky5s2b58uxqA7CwsLw1ltvISEhAUePHkWbNm3QsWNHFBcXY+fOnZg0aVKtP2fXrl0BAF999ZXT9ePu3bsAgN69e9s936DBg6NENf3YXk1bV86W3bhxAwCQmprq9L1GpGTMbMe+bAds9TR48GAAwLfffouJEydWW4GsViteeuklpKen6zEe1UJ2djb69OmDzZs34/Dhw8jPz8eNGzewdevWOv/5jR07FgBQUFCALVu2OHzN6dOnAQDjxo2ze75x48YAHhyPrerSpUsAgNLSUqe/d1FRUbXnSktL8fXXX6NRo0YYPny46y/AQJSMme0P+fPPP0dFRQXOnTuHt956C8DPxyZsPyVQWeXnHIXQ9i+noxWk8m5i5c8zatQoJCYmAgD++c9/Ii0tDZ9++im++eYbZGVl4amnnkJZWRl69uxZ66+TfGvChAkoKirCsWPH8O233+LMmTM4e/Yszp8/j7y8PIfHtGzrjLN7540aNUo7q/jGG29UW+9yc3Nx6NAhjB07Fo899pjdskcffRTAg7OctmNvV69exZgxY7QTT+fPn8f9+/cdbr05OvSxcuVK3L59G9OnT/e/W2TpeV2It/z5z3+2u2I/JiZGcnJy5N69e5Kenq4t27dvn/Yeq9WqXZsDQObOnatdTCgikpubK7/85S8FgLRq1UquXLli99558+Zp750zZ45UVFRoy8+ePSvx8fHa8sqPdu3ayY8//uiT7wvVT5cuXRz+GdoeoaGhMnjwYLl8+bKIiOzYsUOCgoIEgLRv314uXbrk8PP++OOPkpiYKACkX79+2oW33333nXTu3FkGDhwod+/erfa+Xbt2aRdy
h4SESHx8vISFhUlmZqbduty6dWt55513RETk4sWL2vPR0dHy4Ycfyv3796W8vFzWrl0rERERMnLkSLv1118oGbPi4mJ59tlnJSIiQrp37y7Hjh2Tffv2aT/KVPnRt29fKS8vt/sxD9sjLCxMLl26JOPHj6+2zGQyydixY+XSpUvaVdNV31vZf//7X5k8ebK0atVKQkNDJS4uTiZNmmR3gS0Z27Vr1yQ+Pl46duwoMTExEhERocWq8iMpKUmSk5MdBq/qTw/YFBcXy7x58yQpKUnCw8OlRYsWkpaWJqtXr64xLKtXr5Y2bdpIeHi4PPHEE9o/0LNmzZLExERZtmyZ9pMGIvYx2759uwwcOFCioqLEbDZLt27dZMWKFX4ZMhERZW/OSORpb7/9NgoKCrBgwYJqy8rLy1FQUIADBw5gzJgxOHr0KDp06KDDlDXLycnRLvC9ePGidkGuCgLuolmiuti2bRsWLFiAq1evOlzeoEED/OIXv8CIESOwYMEClxdgk+fxO07kgtVqxYQJExASEuIyUsePH0dpaSkSEhJ8NB3ZMGZELvzvf//D1atXcfPmTaSlpWH79u3Vzjrevn0b77//PoYMGYIVK1a4/DEmvVQ+4+rojL4/Y8yIXDCbzVi6dCnCwsJw5MgRPP3004iMjERcXBzatm2L2NhYREdHY8GCBdi8eTN69Oih98gOVVRUYO3atdqv16xZo9RPnvAEAJGbLl68iGXLlmHnzp3a9VtNmzZFcnIyhg4dinHjxml3nDCaPXv2YMCAASgrK7N7PiQkBAcPHqx2DZs/YsyISAnczSQiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzAxs9ejQWLVqk9xj0k8TEROzcuVPvMcgJ3jbboH7/+99j3bp12q9fffVVHachi8WC27dvo3///vjPf/6Dp556Su+RqAreAsiARKTa7ZkXLlzIoOnEFjKbhIQEnDp1Sr+ByCHGzKBu374Ni8Vi9xyD5ntVQ2axWJz+h76kLx4zMyiz2YyCggK751577TUeQ/Mhhsy/MGYGxqDphyHzP4yZwTFovseQ+SfGzA8waL7DkPkvxsxPMGjex5D5N8bMjzBo3sOQ+T/GzM8waJ7HkKmBMfNDDJrnMGTq4EWzbsjJycGGDRsAAKmpqejZs6fOEz3AC2vrx+ghW7NmDa5fvw6TyYQpU6boPY7hMWZumDZtGhYvXqz92kjfMgatboweMgAwmUzax0Za54yKu5l+jructecPIaPaY8wUwKC5jyFTF2OmCAbNNYZMbYyZQhg05xgy9TFmimHQqmPIAgNjpiAG7WcMWeBgzBTFoDFkgYYxU1ggB40hCzyMmeICMWgMWWBizAJAIAWNIQtcjFmACISgMWSBjTELICoHjSEjxizAqBg0howAxiwgqRQ0hoxsGLMApULQGDKqjDELYP4cNIaMqmLMApw/Bo0hI0cYM/KroDFk5AxjRgD8I2gMGdWEMSONkYPGkJErjBnZMWLQGDJyB2NG1RgpaAwZuYsxI4eMEDSGjGqDMSOn9AwaQ0a1xZhRjfQIGkNGdcGYkUu+DBpDRnXFmJFbfBE0hozqgzEjt3kzaAwZ1RdjRrXijaAxZOQJjBnVmieDxpCRpzBmVCeeCBpDRp7EmFGd1SdoDBl5GmNG9VKXoDFk5A2MGdVbbYLGkJG3MGbkEe4EjSEjb2LMyGNqChpDRt7GmJFHOQsaQ0bexpiRxzkKmg1DRt7CmJFXmM1mREVFVXv+1Vdf1WEaCgSMGXmFxWLBnTt3qj2v9y24SV2MGXlc1YP9ZrPZbjmDRt7AmJFHOTprWVBQoPstuEl9jBl5TE2XXxjh/xQgtTFm5BHuXEfGoJE3MWZUb7W5IJZBI29hzKhe6nJlP4NG3sCYUZ3V50eUGDTyNMaM6sQTP2vJoJEnMWZUa578oXEGjTyFMaNa8cbdLxg08gTGjNzm
zdv4MGhUX4wZucUX9yNj0Kg+GDNyyZc3VmTQqK4YM6qRHneIZdCoLhgzckrPW10zaFRbjBk5ZIR79jNoVBsN9B7AiPbu3YusrCzt1ytWrLBbPmnSJO3jpKQkZGRk+Go0nzBCyGxsQbNYLNpzr732GgC17lorIpg6dSrKy8sdLq+8zgHAtGnTEBsb64PJ/IhQNVlZWQLArcfUqVP1HtejzGaz3ddnsVj0HklERAoKCqp97xcuXKj3WB7VoUMHt9a54OBgKS4u1ntcw+FupgMpKSluv7Zr165enMS3jLRFVlUg7HK6u94lJiYiPDzcy9P4H8bMgdjYWMTExLj12tqEz8iMHDIb1YPm7rqkyjrnaYyZE+6sMGazGa1bt/bBNN7lDyGzUTlojFn9MGZOuLPCdOnSBSaTyQfTeI8/hcxG1aB16tQJQUGu/0oyZo4xZk64cyzM31cqfwyZjYpBi4iIQGJiYo2vCQ4ORqdOnXw0kX9hzJxwJ1T+fPDfn0Nmo2LQXK13PPjvHGPmhDsnAfx1y0yFkNmoFjRX65S/rnO+wJjVoKYVx18P/qsUMhuVgsaY1R1jVoOaVhx/PPivYshsVAmaq5MAjJlzjFkNajom5m8rlcohs1EhaDWdBODB/5oxZjWoKVj+dPA/EEJmo0LQnK13PPhfM8asBjWdBPCXLbNACpmNvwfN2brlL+ucXhgzFxytQEY7+P/xxx9j79691Z4PxJDZ1CZoCxcuxKVLl3w1mkuMWR3p/ZPuRjdz5sxqdy3o06eP3mNprl27JhaLRcLDw2XPnj3a80a9+4WvubrbxqxZswSApKeni9Vq1XHSnxUVFUlQUFC1ub/88ku9RzM0xswFR7cDMsptf6xWqwwZMkSbyxY0hsyes6DZQmZ7vP/++3qPqql6OyDe9sc1xsyFvLy8an8R1q9fr/dYIiKyZs0al/e+CvSQ2TgKWtVHZGSk5OTk6D2qiIiMGzfObraOHTvqPZLh8ZiZC45OAhjh2MX169cxceLEGl8TSMfIXHF0DK2qwsJCZGRkQER8NJVzVdcxI6xzRseYuaHyimSEg/8iggkTJrj8y7lhwwYfTeQfzGYzpkyZUuNrdu/ejeXLl/toIucYs9pjzNxQeUUywpX/a9euxebNm12+btCgQQ7Pcgaq2bNn45133nH5uilTpuh+drPqTwIwZq4xZm7o1asXunfvjpSUFAwaNEjXWdzZvbQpKSlh0H4ye/ZszJkzx63XGmF3MyIiAsOHD0dKSgq6d+/OK//dwP+dyQ29e/fGkSNH9B7D7d3LysLDw3Hr1i0vTmV85eXlKCoqQlBQEKxWq1vvse1ujh8/3svTOffxxx/r9nv7I5MY4WgnueXDDz/E6NGj3X790KFDsWzZMrf/PwPVffnll3j++efxww8/uPX6yMhIfPfdd2jZsqWXJyNP4G6mn6jN7mV0dDTWrl2Lzz77jCGrJDU1FSdOnMArr7zi1u2pjbC7Se5jzPxAbXYvhw4dilOnTuGZZ57R/USFEYWHh2Px4sX4/PPP0a5dO5evN8rZTXKNu5l+wJ3dy+joaCxduhSjRo1ixNxUUlKCmTNn4t13363xWBp3N/0DY2Zw169fR0JCQo1bZTw2Vj/uHEtLT0/Hzp07+Q+FgXE308Bc7V7y2JhnuHMsjbubxsctMwOrafeSW2PeUdNWGnc3jY1bZgbl7Owlt8a8q6atNJ7dNDbGzICc7V7yTKVv1HTGk7ubxsWYGVBhYSGys7O1X3NrTB/OttIOHz6s41TkDI+ZGdTly5fRu3dvdOzYkcfGDMB2LC0pKQnr1q1DSEiI3iNRFYyZgd29exeRkZHcpTSIkpISNGjQgCEzKMaMiJTAY2ZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjVsXbb78Nk8lU58eSJUv0/hLIjxQUFKBx48YwmUyIi4tDeXm5y/dUVFRgwIAB
2jq3bt06H0xqfIxZFcePH6/X+zt06OChSSgQWCwWvPDCCwCAK1euYMOGDS7fM2XKFOzYsQMA8Oabb+KZZ57x6oz+grcAquLChQsoLi5267V3797FyJEjkZubCwDo0qULDh06hIiICG+OSIrJzc1FmzZtUFZWhh49euCrr75y+trly5dj/PjxAIDhw4fjk08+4f3ubITqpKSkRHr16iUABIC0b99ebt68qfdY5Keee+45bV364osvHL5m3759EhISIgCkc+fOUlRU5OMpjY27mXVQVlaGESNGYP/+/QCARx55BLt370azZs30HYz81rRp07QtLEfHXc+fP48RI0agrKwMMTExyMrK4h5AFYxZLVmtVowZMwZbt24FAMTGxmL37t2IjY3VeTLyZ4mJiXj66acBAJ9++ikuX76sLbtz5w4GDx6MW7duoWHDhti8eTNatGih16iGxZjV0osvvoj169cDAJo1a4Zdu3ahdevWOk9FKpg2bRqAB2crly5dqn08atQonD59GgCwatUqdO/eXbcZDU3v/Vx/MnnyZO24RpMmTeT48eN6jxRw/vSnP8ncuXNl27ZtcuPGDb3H8bjHHntMAIjZbJbCwkJ56aWXtHVuxowZeo9naIyZm2bPnq2tVBEREXLo0CG9RwpI7dq10/4cAEhcXJwMHTpUmcB99tln2tfWt29f7ePhw4eL1WrVezxD46UZbliyZAlefvllAEBoaCiysrLQv39/nacKTI8++ih++OGHGl8TFxeHlJQUu0fz5s19NGH9iAjat29v9zXykh/3MGYurFq1ChkZGRARBAcHY/369Rg+fLjeYwUsd2LmiD8FbuXKlcjIyAAAPPzwwzh69CgP+LtDz81Co1u/fr0EBQUJADGZTJKZman3SE5Nnz7dbveLD9cP2y7qrl279P7js7Nv3z5txtmzZ+s9jt/g2Uwntm3bhtGjR8NqtQIA/vrXv2Ls2LE6T+XcggUL9B7B7+Tm5mLTpk04ePCg3qPYOXnypPZxcnKybnP4G8bMgQMHDmgXKALA/PnzMXHiRJ2nIm8JCjLWX4NvvvlG+5gxc18DvQcwmuPHj2Pw4MEoKSkB8ODan9dff13nqVzbsWMH9uzZo/cYXrd48eJ6fw6TyYS2bduia9euSElJwcCBAz0wmefYtswsFgtatmyp7zB+hCcAKjl16hTS0tJw69YtAMCECROwbNkynaeiymp7AqBquFJSUtC5c2c0btzYi1PWXVlZGSIjI1FaWopevXph3759eo/kN7hl9pPz58+jX79+WsieffZZvPfeezpPRbXhb+Fy5PTp0ygtLQXAXczaYswA5OXlIT09HdeuXQMADBkyBJmZmYY7lkI/UyFcjvDgf90FfMwKCgqQnp6OnJwcAA92Y2bMmIEzZ864/Tni4uIQFRXlpQmpsjfffBPx8fFKhMsRHvyvu4A/ZvbRRx/V+06dX3zxBVJTUz00EQWyvn37Yu/evQgNDUVhYSFCQkL0HslvBPx+VHZ2dr3eHxwczH9ByWNsW2YJCQkMWS0F/JYZEakh4LfMiEgNjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREp4f/x46BQsXZs+AAAAABJRU5ErkJggg==", + 
"text/plain": [ + "<Figure size 267.717x267.717 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", + "\n", + "pgm.add_node(\"z\", \"$Z$\", 0, 0)\n", + "pgm.add_node(\"y\", \"$Y$\", 1, 0)\n", + "pgm.add_node(\"t\", \"time\", 0, 1)\n", + "pgm.add_node(\"g\", \"group\", 1, 1)\n", + "\n", + "pgm.add_edge(\"t\", \"z\")\n", + "pgm.add_edge(\"t\", \"y\")\n", + "pgm.add_edge(\"g\", \"z\")\n", + "pgm.add_edge(\"g\", \"y\")\n", + "pgm.add_edge(\"z\", \"y\")\n", + "\n", + "pgm.render();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Readers are referred to {cite:t}`huntington2021effect` for more details." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Regression Discontinuity\n", + " \n", + "The causal graph for the regression discontinuity design is shown below (left). $A$ is a continuous running variable which determines the treatment assignment $A \\rightarrow Z$. Assignment is based on a cutoff value $a_c$. The running variable may also influence the outcome $A \\rightarrow Y$. The running variable may also be associated with a set of variables $\\mathbf{X}$ that influence the outcome, $A - - - - \\mathbf{X}$." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": { + "tags": [ + "remove-input" + ] + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "<Figure size 834.646x393.701 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", + "\n", + "# data generating graph\n", + "pgm.add_node(\"a\", \"$A$\", 0, 1)\n", + "pgm.add_node(\"z\", \"$Z$\", 0, 0)\n", + "pgm.add_node(\"x\", \"$\\mathbf{X}$\", 1, 1)\n", + "pgm.add_node(\"y\", \"$Y$\", 1, 0)\n", + "pgm.add_edge(\"a\", \"z\")\n", + "pgm.add_edge(\"a\", \"y\")\n", + "pgm.add_edge(\n", + " \"a\",\n", + " \"x\",\n", + " plot_params={\"ec\": \"grey\", \"lw\": 1.5, \"ls\": \":\", \"head_length\": 0, \"head_width\": 0},\n", + ")\n", + "pgm.add_edge(\"z\", \"y\")\n", + "pgm.add_edge(\"x\", \"y\")\n", + "pgm.add_text(0, 1.3, \"Data generating graph\")\n", + "\n", + "# limiting graph\n", + "x_offset = 2\n", + "pgm.add_node(\"a2\", r\"$A \\rightarrow a_c$\", 0 + x_offset, 1)\n", + "pgm.add_node(\"z2\", \"$Z$\", 0 + x_offset, 0)\n", + "pgm.add_node(\"x2\", \"$\\mathbf{X}$\", 1 + x_offset, 1)\n", + "pgm.add_node(\"y2\", \"$Y$\", 1 + x_offset, 0)\n", + "pgm.add_edge(\"a2\", \"z2\")\n", + "pgm.add_edge(\"z2\", \"y2\")\n", + "pgm.add_edge(\"x2\", \"y2\")\n", + "pgm.add_text(x_offset, 1.3, \"Limiting graph\")\n", + "\n", + "pgm.render();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The causal effect of $Z \\rightarrow Y$ is identified by comparing the outcome for units just above and just below the cutoff value, $A \\rightarrow a_c$.\n", + "\n", + "Readers are referred to {cite:t}`steiner2017graphical` and {cite:t}`cunningham2021causal` who discuss limiting graphs in more detail. 
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## References\n", + ":::{bibliography}\n", + ":filter: docname in docnames\n", + ":::" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "CausalPy", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.8" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/docs/source/references.bib b/docs/source/references.bib index 08034e5a..93acf8a8 100644 --- a/docs/source/references.bib +++ b/docs/source/references.bib @@ -76,3 +76,28 @@ @book{shadish_cook_cambell_2002 year={2002}, publisher={Houghton Mifflin Boston, MA} } + +@article{steiner2017graphical, + title={Graphical models for quasi-experimental designs}, + author={Steiner, Peter M and Kim, Yongnam and Hall, Courtney E and Su, Dan}, + journal={Sociological methods \& research}, + volume={46}, + number={2}, + pages={155--188}, + year={2017}, + publisher={SAGE Publications Sage CA: Los Angeles, CA} +} + +@book{cunningham2021causal, + title={Causal inference: The mixtape}, + author={Cunningham, Scott}, + year={2021}, + publisher={Yale university press} +} + +@book{huntington2021effect, + title={The effect: An introduction to research design and causality}, + author={Huntington-Klein, Nick}, + year={2021}, + publisher={Chapman and Hall/CRC} +} diff --git a/pyproject.toml b/pyproject.toml index 8a2d980c..a4dcaa48 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -28,6 +28,7 @@ requires-python = ">=3.10" # https://packaging.python.org/discussions/install-requires-vs-requirements/ dependencies = [ "arviz>=0.14.0", + "daft", "graphviz", "ipython!=8.7.0", "matplotlib>=3.5.3", From 17e0f4d81359e49435b5c9965d49f866ca99990f Mon Sep 17 00:00:00 2001 From: "Benjamin T. 
Vincent" <inferencelab@gmail.com> Date: Fri, 26 Apr 2024 17:29:32 +0100 Subject: [PATCH 02/10] updates to IV and DID --- docs/source/quasi_dags.ipynb | 77 ++++++++++++++++++++++++++---------- 1 file changed, 57 insertions(+), 20 deletions(-) diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index f0904b12..475f49d9 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -16,7 +16,17 @@ }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import daft\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "code", + "execution_count": 2, "metadata": { "tags": [ "remove-input" @@ -24,7 +34,8 @@ }, "outputs": [], "source": [ - "import daft\n", + "ff = \"times new roman\"\n", + "plt.rcParams[\"font.family\"] = ff\n", "\n", "GRID_UNIT = 2.0\n", "DPI = 200\n", @@ -40,7 +51,7 @@ }, { "cell_type": "code", - "execution_count": 85, + "execution_count": 3, "metadata": { "tags": [ "remove-input" @@ -81,7 +92,7 @@ }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 4, "metadata": { "tags": [ "remove-input" @@ -132,7 +143,7 @@ }, { "cell_type": "code", - "execution_count": 69, + "execution_count": 5, "metadata": { "tags": [ "remove-input" @@ -141,9 +152,9 @@ "outputs": [ { "data": { - "image/png": "", + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAdEAAAEMCAYAAACbY4xqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/H5lhTAAAACXBIWXMAAB7CAAAewgFu0HU+AAAVMElEQVR4nO3deWwU9f/H8de0RWsVEBRFpRqKolAoVrEgEaN4FFAjChENCkZBPIJGEa/QKBo8UCMqoilQFK2iKJ5osQgIpoIh0qrlUFGwiLdISg96zfcPfp0f23P3w+7O7szzkTTZnZ3dvMzWvvi897O7lm3btgAAQMgS3A4AAEC8okQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUKAAAhihRAAAMUaIAABiiRAEAMESJAgBgiBIFAMAQJQoAgCFKFAAAQ5QoAACGKFEAAAxRogAAGKJEAQAwRIkCAGCIEgUAwBAlCgCAIUoUAABDlCgAAIYoUQAADFGiAAAYokQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUKAAAhihRAAAMUaIAABiiRAEAMESJAgBgiBIFAMAQJQoAgCFKFAAAQ5QoAACGKFEAAAxRogAAGKJEAQAwRIkCAGCIEgUAwBAlCgCAIUoUAABDlCgAAIYoUQAADFGiAAAYokQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUKAAAhihRAAAMUaIAABiiRAEAMESJAgBgiBIFAMAQJQogrlRXV+u0006TZVkBP1OmTGnzfkVFRUpMTGx2vxUrVkQpObzIsm3bdjsEAISiqKhIQ4cOVUNDg3PMsix9/vnnGjp0aLPz9+3bp9NPP11btmwJOD5x4kTNmzcv4nnhXaxEAcSdIUOG6Pbbbw84Ztu2brzxRlVVVTU7f8aMGc0KtEePHnrqqacimhPex0oUQFyqrKxURkaGtm3bFnD87rvv1pNPPulc//rrrzVo0CDV1dUFnLds2TKNHDkyKlnhXZQogLi1evVqDRs2TAf+GUtMTFRRUZGysrJUV1engQMHqqSkJOB+48eP1yuvvBLtuPAgxrkA4tZ5552nm2++OeBYfX29brjhBtXU1Ojxxx9vVqDdu3fX7Nmzo5gSXsZKFEBc27t3r/r166cdO3YEHB83bpyWLFmimpqagONLly7VFVdcEc2I8DBKFEDcKyws1MUXX9zueWPHjtXixYujkAh+QYkC8ISJEydqwYIFrd7erVs3lZaWqlu3blFMBa+jRAF4wp49e5Senq5ff/21xdsXL16ssWPHRjkVvI6NRQA8oXPnzho2bFiLtx1++OFBjXuBUFGiADxhzZo1eu2111q8raKiQnfeeWeUE8EPGOcCiHtVVVXKyMjQjz/+2OZ5n3zyiYYPHx6lVPADVqIA4t706dObFWhSUlKz82666SaVl5dHKxZ8gBIFENfWr1/f7MMTLMvShx9+qD59+gQcLysr07Rp06KYDl5HiQKIWzU1NbrhhhsCvs1FkiZPnqzhw4crLy9PCQmBf+Zyc3O1atWqaMaEh1GiAOLWjBkztGnTpoBjqampmjVrliRp8ODBzTYU2batiRMnqrKyMmo54V1sLAIQlzZu3Oh8yPyBPv74Y40YMcK5XlVVpQEDBuiHH34IOO+OO+7gM3Rx0ChRAHGnrq5OZ511loqLiwOOt/btLF988YXOPffcgG97SUhI0Nq1azVkyJBIx4WHMc4FEHcee+yxZgV67LHH6plnnmnx/HPOOUdTpkwJONbQ0KAbb7xR1dXVkYoJH2AlCiCubNq0SZmZmc2+neXtt9/W6NGjW71fRUWFMjIy9NNPPwUcv/fee/X4449HJCu8jxIFAMAQ41wAAAxRogAAGKJEAQAwRIkCAGCIEgUAwBAlCgCAIUoUAABDlCgAAIYoUQAADFGiAAAYokQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUKAAAhihRAAAMUaIAABiiRAEAMESJAgBgiBIFAMA
QJQoAgCFKFAAAQ5QoANf8/PPPbkcImW3bcZkbkUGJAnBFfn6+0tLS1LlzZ1VXV7sdJyjFxcVKSEhQWlqaKisr3Y6DGGDZtm27HQKAv1RVVSklJcW5fsopp+j77793MVFwLMtyLicmJqqurs7FNIgFrEQBRN2RRx4ZcH3Dhg3uBAlRbm6uc7m+vl6LFi1yMQ1iAStRAFGVn5+va6+91rk+d+5c3XLLLS4mCs2Bq1FJqqioCFhVw18oUQBR03SMK+3fqBNPysvL1alTJ+c6Y11/Y5wLIGqajnH37NnjTpCD0LFjR8a6cLASBRAV8T7GbYqxLiRKFEAUeGGM2xRjXUiMcwFEgRfGuE0x1oXEShRAhHltjNsUY11/o0QBRIwXx7hNMdb1N8a5ACLGi2Pcphjr+hsrUQAR4fUxblOMdf2JEgUQdn4Y4zbFWNefGOcCCDs/jHGbYqzrT6xEAYSV38a4TTHW9RdKFEDY+HGM2xRjXX9hnAsgbPw4xm2Ksa6/sBIFEBZ+H+M2xVjXHyhRAAeNMW5zjHX9gXEugIPGGLc5xrr+wEoUwEFhjNs2xrreRokCMMYYt32Mdb2NcS4AY4xx28dY19tYiQIwwhg3NIx1vYkSBRAyxrihY6zrTYxzAYSsS5cuAdcZ47aPsa43sRIFEBLGuAeHsa63UKIAgsYY9+Ax1vUWxrkAgsYY9+Ax1vUWVqIAgsIYN7wY63oDJQqgXYxxw4+xrjcwzgUQYOXKlaqpqQk4xhg3/IIZ6+7cuVOlpaXRjoYQUKIAHHv37tWoUaOUlZWl4uJiSfvHuPv27XPOmTt3bsAKCuYmTZoUcH3ChAmqrKyUbdvKy8tTenq6Hn74YZfSIRiMcwE4cnNzNXnyZElSUlKS7r33Xs2cOTPgHP5khFfTsa4kjRgxQp988omk/c9DWVmZunfv7kY8tIOVKABJ+8tx7ty5zvW6urpmBcoYN/yajnUlOQUq7X8e5s+fH+1YCBIlCkCStG7dOpWUlLR6e0ZGhg477LAoJvKPESNGtHl7bm4um45iFCUKQJICVqEt+eabb5SVldVm0SI0tm1r4cKF6tevX5vnlZWVadmyZVFKhVDwmigA/f333zrhhBOa7cptSVJSkp599lndeuutUUjmXbW1tbryyiv10UcfBXV+dna2CgoKIpwKoWIlCkALFy4MqkAlqXv37rr44osjnMj7OnTooKuuuqrZhy60Zvny5dq2bVuEUyFUlCjgcw0NDXrppZeCOrdHjx5atWqVTj755Ain8ofrrrtOr7zyStBFGuzzhOhhnAv4XEFBQbsbWyQKNJJeffVVTZgwod23D3Xt2lU7d+5kg1cMYSUK+Fx7G4okCjTSgl2R/vvvv1qyZEmUUiEYrEQBH9uxY4fS0tLU0NDQ6jkUaPQEsyIdPHiwvvzyyyimQltYiQI+lpubS4HGkGBWpOvWrdPGjRujmAptoUQBn6qpqWnzk3AoUHcEU6QvvvhiFBOhLZQo4FPvvvuu/vzzzxZvo0Dd1V6R5ufn8xGMMYISBXyqtQ1FFGhsaKtIKysrm31tGtzBxiLAh0pLS1v8qDkKNPa0ttmoT58+Ki0tDfo9pogMVqKAD7X0mhoFGptaW5Fu3rxZn3/+uUup0IgSBXxm7969zUaBFGhsa61I2WDkPkoU8JnXX39d5eXlznUKND60VKRLly7V77//7mIqUKKAz6xcudK5TIHGl6ZFWldXp7Vr17qcyt/YWAT4TG1trWbOnKlFixbp008/pUDj0KuvvqqcnBzNnTtXI0eOdDuOr1GigE/V1dUpKSnJ7RgwxPMXGyhRAAAM8ZooAACGKFEAAAxRogAAGKJEAQAwRIkCAGCIEgUAwBAlCgCAIUoUAABDlCgAAIYoUQAADFGiAAAYokQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUqKQXX3xRlmXJsizl5+c7x88//3xZlqUuXbqE/Jjjx493HvPtt98OZ1x40KxZs5zfF5O
f2bNnu/2fgDiye/dudezYUZZlKTU1VXV1de3ep76+XiNGjHB+5954440oJI19lKik4uJi5/Lpp5/uXO7Xr58k6b///tOuXbuCfrySkhKnjIcMGaIxY8aEJSe8a8OGDQd1//79+4cpCfygS5cumjRpkiRp586dQf1Df+rUqSooKJAkTZ8+Xddcc01EM8YLy7Zt2+0Qbhs0aJC++uorJScna+/evUpMTJQkvfTSS7rlllskSStWrNAFF1wQ1ONlZ2fr008/lSR9+eWXGjx4cGSCwzN++uknVVZWBnVueXm5xo4dq7KyMknSGWecobVr1yolJSWSEeExZWVl6tWrl2prazVo0CCtW7eu1XPnzZunm266SZI0evRoLVmyRJZlRStqbLN9rr6+3k5JSbEl2QMHDgy4bc2aNbYkW5L93HPPBfV4hYWFzn3Gjh0bicjwsaqqKvu8885zfsf69Olj//XXX27HQpwaP36887tUVFTU4jmrVq2yO3ToYEuyMzMz7YqKiiinjG2+H+du3brVWQEcOMqVpPT0dOfypk2b2n0s27Z1zz33SJIOPfRQPfbYY+ELCt+rra3VmDFjtHr1aklSz549tWLFCh199NHuBkPcuueee5wVZUuvq2/btk1jxoxRbW2tunfvrg8++ICJRxO+L9EDXw/NzMwMuK1r16467rjjJEmbN29u97Hy8/O1ceNGSdLtt9+unj17hi8ofK2hoUHXXXedli1bJkk6/vjjtWLFCh1//PEuJ0M8S09P18iRIyVJ77zzjn755Rfntj179uiyyy7TP//8o+TkZL3//vvq0aOHW1FjFiXayqaiRo2r0fZWovv27VNOTo4k6aijjtIDDzwQtozA5MmT9eabb0qSjj76aBUWFiotLc3lVPCCxulZfX295syZ41y++uqrncVDXl6esrKyXMsYyyjR/ytRy7KUkZHR7PbGEv3rr7/0999/t/o4c+bM0fbt2yVJDz74oI488shwR4VPTZ06VfPnz5ckderUSQUFBerbt6/LqeAV5557rrP5cd68eaqoqNBdd93l7MTNyclhJ25b3H5R1m3HHnusLck+5ZRTWrw9NzfXeeF9zZo1LZ6ze/duu2vXrrYku3fv3nZNTU0kI/vaHXfcYT/yyCP2xx9/bP/xxx9ux4m4hx56yPn9S0lJsdeuXet2JN8pLCy0b7vtNjsvL88uKSmxa2tr3Y4UdkuXLnV+zy644ALn8ujRo+2Ghga348W0JLfKOxb8/vvv+uOPPyS1PMqV/v+9otL+ke7QoUObnfPoo4/q33//lSQ98cQT6tChQ/jDQpJUUFCgrVu3OtdTU1N15plnBvwcc8wxLiYMn9mzZ+uhhx6SJB1yyCFaunSpzjnnHHdD+dCuXbv0wgsvONeTk5M1YMAADRw40Pmd69u3r5KS4vfP6ahRo3Tqqadq69at+uyzzyTtf+vUokWLeCtLO+L3WQ+Dxk1AUvNNRY0O3KHb0uaisrIyPf/885L2j0VGjRoV3pBoU1lZmcrKyvTee+85x7xQrHl5ebrrrrskSYmJiXr99deVnZ3tcipIUnV1tdavX6/169c7x+K9WC3L0rRp0zRx4kRJ0nHHHaf333+fnbhBiI9nOELa21Qk7X8NqkePHtq5c2eLm4umT5+u6upqWZalp59+OkJJEYp4L9a33npLkyZNkm3bsixLCxYs0OjRo92OhTZ4oVh79erlXJ48eTI7cYPl9jzZTVdddZUz+9+1a1er52VnZ9uS7BNOOCHgeElJiZ2QkGBLsseNGxfpuG26//77nf8WfoL7SU1NtUeNGmUXFha6+twdaNmyZc4b2xXCh3y4obKy0vXnMN5+kpOT7UGDBtlTpkxx++lr5plnnnFyvvfee27HiRu+LtHevXvbkuxjjjmmzfOmTp3q/HLt2bPHOd5YrsnJyfaOHTsiHbdNbv9xiOefnJwcV5+7RqtXr7YPO+wwJ9fMmTPdjtSm3bt3u/7cxetPUlKS209
fM9dff72Tb/v27W7HiRu+fYtLZWWlfvzxR0mtj3IbtfS66MqVK7V8+XJJ0p133qkTTzwxMkERcQkJ7v9vsGHDBl122WWqqqqStP+9e7zXGNHU+PJWly5ddNJJJ7kbJo7E5nA+CkpKStTQ0CCp9U1FjZp+/F9WVpamTZsmSerWrZvuu+++yAUNUkFBgbOrzsuefPLJg34My7LUu3dv57WqSy65JAzJzJWWlmr48OEqLy+XJN1888164oknXM0UjMMPP9z5/8DLCgsLA/ZPmOrcubPz2ujAgQMPPlgY1dbWOns+BgwY4HKa+OLbEg1mU1Gjvn37yrIs2batzZs364033tDXX38tSZoxY4Y6deoUwaTByc7O9sXuzQ8++CDgLS7taVqYZ555pjIzM9WxY8cIpgzetm3bdNFFF+mff/6RJI0bNy7g7RSxrEOHDpo1a5bbMSJu0aJFmjBhQkj3ObAwG0szLS0tZt8usnnzZtXU1Ehq/+8hAlGiav+X5ogjjtBJJ52k7du3q7i42PnuvT59+jjfyQf3xXphNvXrr7/qwgsv1G+//SZJuvzyy/Xyyy/HxHgZwYu3wmxJKH8PEcj3JZqSkqLevXu3e356erq2b9+uwsJC59isWbNidru618VbYTa1e/duXXjhhc5HRZ522mnKycnRli1bgn6M1NRUde7cOUIJ0RIvFGZLSkpKnMuUaGh82QANDQ367rvvJEn9+/cP6l/+6enpzjdoSNKwYcN06aWXRiwjWjZ9+nSdeOKJcVWYLVm+fHlAYW7ZsiXk18mKiop09tlnhzsamsjMzNTixYs9U5gtaVxUHHLIIXwuc4h8WaLff/99q98h2poDNxclJCToqaeeikQ0tOPaa691O0JYfPvttwd1/8TERFYMUdK/f3/179/f7RgR1bgS7du3Lx9bGiLLtm3b7RAAAMQjdjAAAGCIEgUAwBAlCgCAIUoUAABDlCgAAIYoUQAADFGiAAAYokQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUKAAAhihRAAAMUaIAABiiRAEAMESJAgBgiBIFAMAQJQoAgCFKFAAAQ5QoAACGKFEAAAxRogAAGKJEAQAwRIkCAGCIEgUAwBAlCgCAIUoUAABDlCgAAIYoUQAADFGiAAAYokQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUKAAAhihRAAAMUaIAABiiRAEAMESJAgBgiBIFAMAQJQoAgCFKFAAAQ5QoAACGKFEAAAxRogAAGKJEAQAwRIkCAGCIEgUAwBAlCgCAIUoUAABDlCgAAIYoUQAADFGiAAAYokQBADBEiQIAYIgSBQDAECUKAIAhShQAAEOUKAAAhihRAAAMUaIAABiiRAEAMESJAgBgiBIFAMAQJQoAgKH/AcCXPLig+oBWAAAAAElFTkSuQmCC", "text/plain": [ - "<Figure size 598.425x303.15 with 1 Axes>" + "<Figure size 425.197x228.346 with 1 Axes>" ] }, "metadata": {}, @@ -165,6 +176,18 @@ "pgm.render();" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{note}\n", + "The assumptions embodied in the DAG are:\n", + "1. The IV is independent of the confounders $\\mathbf{X}$.\n", + "2. The IV causally influences the treatment $Z$.\n", + "3. 
The IV does not causally influence the outcome $Y$, other than through the treatment $Z$.\n", + ":::" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -187,12 +210,12 @@ "source": [ "## Interrupted Time Series\n", "\n", - "A causal DAG for interrupted time series is given by {cite:t}`huntington2021effect`, though that book refers to it as [Event Studies](https://theeffectbook.net/ch-EventStudies.html). These kinds of studies are suited to situations where an intervention is made at a given point in time and any causal effect is assumed to have a lasting (not a transient) effect. Here's the causal DAG - note that $\\text{time}$ represents all the things changing over time such as the time index as well as time-varying predictor variables." + "A causal DAG for interrupted time series is given in Chapter 17 of {cite:t}`huntington2021effect`, though that book refers to it as [Event Studies](https://theeffectbook.net/ch-EventStudies.html). These kinds of studies are suited to situations where an intervention is made at a given point in time and any causal effect is assumed to have a lasting (not a transient) effect. Here's the causal DAG - note that $\\text{time}$ represents all the things changing over time such as the time index as well as time-varying predictor variables." 
] }, { "cell_type": "code", - "execution_count": 99, + "execution_count": 6, "metadata": { "tags": [ "remove-input" @@ -248,7 +271,7 @@ }, { "cell_type": "code", - "execution_count": 93, + "execution_count": 7, "metadata": { "tags": [ "remove-input" @@ -257,9 +280,9 @@ "outputs": [ { "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAATMAAAEzCAYAAABdWOReAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/H5lhTAAAACXBIWXMAAB7CAAAewgFu0HU+AAAbmklEQVR4nO3deXCU9R3H8c8mJCEhkN1AMaYkIExBkgCBcBjRhiPIIFBgoAUrCM5EpDPFIgiKyiVQQKpDWyx1OEwLgigKhOEo9yEqhwMaEWQ4AiEcUhIoOSDHfvsH7mM22c1ukt19nv3t5zWzM2Gf3fBNeHjzXHkwiYiAiMjPBek9ABGRJzBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlJCwMVs7969aNq0KYYNGwYR0XscIvKQgIvZxo0bkZ+fj02bNiE/P1/vcYjIQxroPYCv/eEPf8CJEyfw5JNPomnTpnqPQ0QeYhJF97XOnj2Ljz76CDNnztR7FCLyAWV3MxcuXAir1ar3GETkI0rGbMeOHcjMzNR7DCLyIeVitmfPHvz2t7/lmUqiAKNUzN59912MHDkSJSUlAB7saprNZpjNZnzwwQewWq3YsWMHhg0bhl/96lfV3n/9+nW88cYbiI6ORk5ODgBg5cqVSEhIQKNGjfDkk0/ixIkT2uu//vprDBo0CE2aNEFMTAxmzJjhdNe2uLgY8+fPR3JyMqKiomA2m9G/f38cOHDA898I8pri4mLMmTMHCQkJaNy4MSwWC37zm9/g2LFj1V578+ZNLFq0CK1bt0ZmZibu3r2L5557Dk2aNMHgwYNx//597bUVFRXIzMzEr3/9a8TGxqJx48bo0qULFi9ebPc6ALBYLDCZTNqj8l7IokWL7Jb16tXL7r3nz5/HtGnT0KxZM+Tk5KCoqAgvv/wyHn74YURGRqJfv34Ovxa/IApKS0sTADJr1iztuaNHj0r//v2lQYMGAkBatmypLbtz545kZGRIWFiYABAAcuHCBRk/frw0atRIYmJitOebN28uN2/elC1btkjDhg2lRYsW0rBhQ235/Pnzq81z5coVSUpKknnz5smdO3ektLRUli9fLmFhYRIUFCQffPCB978pVG8FBQXSqVMniYyMlK1bt0pFRYUcP35cWrVqJSaTScLDwyUqKkosFoskJydLdHS0tl6sWrVKBgwYIJGRkdpze/bsEZEH61///v2lbdu2cvToURERycvLk5EjRwoASUpKkhs3btjNsnXrVu3zVF1/8vPzpVu3bgJA0tLStOfGjBljt45///330qNHD2natKk0b95ce75hw4ayf/9+r38/PS1gYmYzderUajErLy+XvLw8+fvf/
679gY4bN07ee+89uXfvnoiIbNq0SUwmkwCQsWPHyoABA+T7778XEZGSkhIZNGiQAJCHHnrI7vezWq2Smpoqr7zySrVZZs6cqa08eXl5nvsGkFc8//zzAkBef/11u+c3btwoAKRBgwZy7tw57flr165JSEiIAJDHH39cPvnkE8nPz5cXX3xRhg4dKoWFhSIiMnr0aAkKCpLs7Gy7z1tRUSE9e/YUANKjRw8pLy+3Wx4bG+swZiIi06dPt4uZzZYtW7R1fNiwYbJy5UqxWq0iIrJ27VoJDQ0VABIfHy/379+v67dKFwEXs3/84x/VYmaTnZ2t/UEfOHCg2vLU1FQBIL179662Yu3fv197b35+vvZ8VlaWAJAzZ85U+3zbt2/X3rNkyZLaf6HkM/fv39e2wDdu3Gi3zGq1SlRUlMMtc9tW/ZgxYxx+3kOHDgkA6d69u8Plhw8fttu6q6xly5ZOYzZr1iyHMTt37pz2+VavXl3tfXPnztWWr1u3zuFMRqXUMTN3NGzY0OmyiIgI7eP4+Phqy1u1aqUtCw4OtlsWGxurfVxYWKh9vHHjRgBAjx49tON3tsfvfvc7hIWFISwsDBcuXKjT10O+kZ+fj3v37gEATCaT3TKTyYRHHnkEAJCbm2u3LDQ0FADQp08fh5935cqVAIBOnTo5XP74449r692///3vug1fSeX19oknnqi2/I9//CPCwsIAAPv376/37+dLAfcTAFVXxMqCgmpuu23FdCQkJET7uKKiQvs4OzsbwIOTCzWFlIytefPmaNSoEYqKinDx4sVqy+Wns+cPPfSQ3fM1rW8AcPDgQQBAVFSU09d069YNOTk5OHnyZC2nrj2z2Yzk5GQcOXKkWpiNLuC2zHzt9u3bAIC8vDx9B6F6CQoKwgsvvAAA+Ne//mV36c+dO3dw7tw5BAcHY+TIkbX6vLb1ori42OlrbFv9Vc9qektcXBwA+N3lTYyZl9l2XV1tslfemiNjWrhwIQYNGoSTJ09i8uTJKCgowI0bN5CRkYHi4mL85S9/Qfv27Wv1ORs1agQADrf2bMLDwwH8HBlvs+2BVN3KNDrGzMvatWsHAPjb3/6G8vJyh6+5fPky5s2b58uxqA7CwsLw1ltvISEhAUePHkWbNm3QsWNHFBcXY+fOnZg0aVKtP2fXrl0BAF999ZXT9ePu3bsAgN69e9s936DBg6NENf3YXk1bV86W3bhxAwCQmprq9L1GpGTMbMe+bAds9TR48GAAwLfffouJEydWW4GsViteeuklpKen6zEe1UJ2djb69OmDzZs34/Dhw8jPz8eNGzewdevWOv/5jR07FgBQUFCALVu2OHzN6dOnAQDjxo2ze75x48YAHhyPrerSpUsAgNLSUqe/d1FRUbXnSktL8fXXX6NRo0YYPny46y/AQJSMme0P+fPPP0dFRQXOnTuHt956C8DPxyZsPyVQWeXnHIXQ9i+noxWk8m5i5c8zatQoJCYmAgD++c9/Ii0tDZ9++im++eYbZGVl4amnnkJZWRl69uxZ66+TfGvChAkoKirCsWPH8O233+LMmTM4e/Yszp8/j7y8PIfHtGzrjLN7540aNUo7q/jGG29UW+9yc3Nx6NAhjB07Fo899pjdskcffRTAg7OctmNvV69exZgxY7QTT+fPn8f9+/cdbr05OvSxcuVK3L59G9OnT/e/W2TpeV2It/z5z3+2u2I/JiZGcnJy5N69e5Kenq4t27dvn/Yeq9WqXZsDQObOnatdTCgikpubK7/85S8FgLRq1UquXLli99558+Zp750zZ45UVFRoy8+ePSvx8fHa8sqPdu3ayY8//uiT7wvVT5cuXRz+GdoeoaGhMnjwYLl8+bKIiOzYsUOCgoIEgLRv314uXbrk8PP++OOPkpiYKACkX79+2oW33333nXTu3FkGDhwod+/erfa+Xbt2aRdyh4SESHx8vISFhUlmZqbduty6dWt55513RETk4sWL2vPR0dHy4Ycfyv379
6W8vFzWrl0rERERMnLkSLv1118oGbPi4mJ59tlnJSIiQrp37y7Hjh2Tffv2aT/KVPnRt29fKS8vt/sxD9sjLCxMLl26JOPHj6+2zGQyydixY+XSpUvaVdNV31vZf//7X5k8ebK0atVKQkNDJS4uTiZNmmR3gS0Z27Vr1yQ+Pl46duwoMTExEhERocWq8iMpKUmSk5MdBq/qTw/YFBcXy7x58yQpKUnCw8OlRYsWkpaWJqtXr64xLKtXr5Y2bdpIeHi4PPHEE9o/0LNmzZLExERZtmyZ9pMGIvYx2759uwwcOFCioqLEbDZLt27dZMWKFX4ZMhERZW/OSORpb7/9NgoKCrBgwYJqy8rLy1FQUIADBw5gzJgxOHr0KDp06KDDlDXLycnRLvC9ePGidkGuCgLuolmiuti2bRsWLFiAq1evOlzeoEED/OIXv8CIESOwYMEClxdgk+fxO07kgtVqxYQJExASEuIyUsePH0dpaSkSEhJ8NB3ZMGZELvzvf//D1atXcfPmTaSlpWH79u3Vzjrevn0b77//PoYMGYIVK1a4/DEmvVQ+4+rojL4/Y8yIXDCbzVi6dCnCwsJw5MgRPP3004iMjERcXBzatm2L2NhYREdHY8GCBdi8eTN69Oih98gOVVRUYO3atdqv16xZo9RPnvAEAJGbLl68iGXLlmHnzp3a9VtNmzZFcnIyhg4dinHjxml3nDCaPXv2YMCAASgrK7N7PiQkBAcPHqx2DZs/YsyISAnczSQiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzAxs9ejQWLVqk9xj0k8TEROzcuVPvMcgJ3jbboH7/+99j3bp12q9fffVVHachi8WC27dvo3///vjPf/6Dp556Su+RqAreAsiARKTa7ZkXLlzIoOnEFjKbhIQEnDp1Sr+ByCHGzKBu374Ni8Vi9xyD5ntVQ2axWJz+h76kLx4zMyiz2YyCggK751577TUeQ/Mhhsy/MGYGxqDphyHzP4yZwTFovseQ+SfGzA8waL7DkPkvxsxPMGjex5D5N8bMjzBo3sOQ+T/GzM8waJ7HkKmBMfNDDJrnMGTq4EWzbsjJycGGDRsAAKmpqejZs6fOEz3AC2vrx+ghW7NmDa5fvw6TyYQpU6boPY7hMWZumDZtGhYvXqz92kjfMgatboweMgAwmUzax0Za54yKu5l+jructecPIaPaY8wUwKC5jyFTF2OmCAbNNYZMbYyZQhg05xgy9TFmimHQqmPIAgNjpiAG7WcMWeBgzBTFoDFkgYYxU1ggB40hCzyMmeICMWgMWWBizAJAIAWNIQtcjFmACISgMWSBjTELICoHjSEjxizAqBg0howAxiwgqRQ0hoxsGLMApULQGDKqjDELYP4cNIaMqmLMApw/Bo0hI0cYM/KroDFk5AxjRgD8I2gMGdWEMSONkYPGkJErjBnZMWLQGDJyB2NG1RgpaAwZuYsxI4eMEDSGjGqDMSOn9AwaQ0a1xZhRjfQIGkNGdcGYkUu+DBpDRnXFmJFbfBE0hozqgzEjt3kzaAwZ1RdjRrXijaAxZOQJjBnVmieDxpCRpzBmVCeeCBpDRp7EmFGd1SdoDBl5GmNG9VKXoDFk5A2MGdVbbYLGkJG3MGbkEe4EjSEjb2LMyGNqChpDRt7GmJFHOQsaQ0bexpiRxzkKmg1DRt7CmJFXmM1mREVFVXv+1Vdf1WEaCgSMGXmFxWLBnTt3qj2v9y24SV2MGXlc1YP9ZrPZbjmDRt7AmJFHOTprWVBQoPstuEl9jBl5TE2XXxjh/xQgtTFm5BHuXEfGoJE3MWZUb7W5IJZBI29hzKhe6nJlP4NG3sCYUZ3V50eUGDTyNMaM6sQTP2vJoJEnMWZUa578oXEGjTyFMaNa8cbdLxg08gTGjNzmzdv4MGhUX4wZucUX9yNj0Kg+GDNyyZc3VmTQqK4YM6qRHneIZdCoLhgzc
krPW10zaFRbjBk5ZIR79jNoVBsN9B7AiPbu3YusrCzt1ytWrLBbPmnSJO3jpKQkZGRk+Go0nzBCyGxsQbNYLNpzr732GgC17lorIpg6dSrKy8sdLq+8zgHAtGnTEBsb64PJ/IhQNVlZWQLArcfUqVP1HtejzGaz3ddnsVj0HklERAoKCqp97xcuXKj3WB7VoUMHt9a54OBgKS4u1ntcw+FupgMpKSluv7Zr165enMS3jLRFVlUg7HK6u94lJiYiPDzcy9P4H8bMgdjYWMTExLj12tqEz8iMHDIb1YPm7rqkyjrnaYyZE+6sMGazGa1bt/bBNN7lDyGzUTlojFn9MGZOuLPCdOnSBSaTyQfTeI8/hcxG1aB16tQJQUGu/0oyZo4xZk64cyzM31cqfwyZjYpBi4iIQGJiYo2vCQ4ORqdOnXw0kX9hzJxwJ1T+fPDfn0Nmo2LQXK13PPjvHGPmhDsnAfx1y0yFkNmoFjRX65S/rnO+wJjVoKYVx18P/qsUMhuVgsaY1R1jVoOaVhx/PPivYshsVAmaq5MAjJlzjFkNajom5m8rlcohs1EhaDWdBODB/5oxZjWoKVj+dPA/EEJmo0LQnK13PPhfM8asBjWdBPCXLbNACpmNvwfN2brlL+ucXhgzFxytQEY7+P/xxx9j79691Z4PxJDZ1CZoCxcuxKVLl3w1mkuMWR3p/ZPuRjdz5sxqdy3o06eP3mNprl27JhaLRcLDw2XPnj3a80a9+4WvubrbxqxZswSApKeni9Vq1XHSnxUVFUlQUFC1ub/88ku9RzM0xswFR7cDMsptf6xWqwwZMkSbyxY0hsyes6DZQmZ7vP/++3qPqql6OyDe9sc1xsyFvLy8an8R1q9fr/dYIiKyZs0al/e+CvSQ2TgKWtVHZGSk5OTk6D2qiIiMGzfObraOHTvqPZLh8ZiZC45OAhjh2MX169cxceLEGl8TSMfIXHF0DK2qwsJCZGRkQER8NJVzVdcxI6xzRseYuaHyimSEg/8iggkTJrj8y7lhwwYfTeQfzGYzpkyZUuNrdu/ejeXLl/toIucYs9pjzNxQeUUywpX/a9euxebNm12+btCgQQ7Pcgaq2bNn45133nH5uilTpuh+drPqTwIwZq4xZm7o1asXunfvjpSUFAwaNEjXWdzZvbQpKSlh0H4ye/ZszJkzx63XGmF3MyIiAsOHD0dKSgq6d+/OK//dwP+dyQ29e/fGkSNH9B7D7d3LysLDw3Hr1i0vTmV85eXlKCoqQlBQEKxWq1vvse1ujh8/3svTOffxxx/r9nv7I5MY4WgnueXDDz/E6NGj3X790KFDsWzZMrf/PwPVffnll3j++efxww8/uPX6yMhIfPfdd2jZsqWXJyNP4G6mn6jN7mV0dDTWrl2Lzz77jCGrJDU1FSdOnMArr7zi1u2pjbC7Se5jzPxAbXYvhw4dilOnTuGZZ57R/USFEYWHh2Px4sX4/PPP0a5dO5evN8rZTXKNu5l+wJ3dy+joaCxduhSjRo1ixNxUUlKCmTNn4t13363xWBp3N/0DY2Zw169fR0JCQo1bZTw2Vj/uHEtLT0/Hzp07+Q+FgXE308Bc7V7y2JhnuHMsjbubxsctMwOrafeSW2PeUdNWGnc3jY1bZgbl7Owlt8a8q6atNJ7dNDbGzICc7V7yTKVv1HTGk7ubxsWYGVBhYSGys7O1X3NrTB/OttIOHz6s41TkDI+ZGdTly5fRu3dvdOzYkcfGDMB2LC0pKQnr1q1DSEiI3iNRFYyZgd29exeRkZHcpTSIkpISNGjQgCEzKMaMiJTAY2ZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjVsXbb78Nk8lU58eSJUv0/hLIjxQUFKBx48YwmUyIi4tDeXm5y/dUVFRgwIAB2jq3bt06H0xqfIxZFcePH6/X+zt06OChSSgQWCwWvPDCCwCAK1euYMOGD
S7fM2XKFOzYsQMA8Oabb+KZZ57x6oz+grcAquLChQsoLi5267V3797FyJEjkZubCwDo0qULDh06hIiICG+OSIrJzc1FmzZtUFZWhh49euCrr75y+trly5dj/PjxAIDhw4fjk08+4f3ubITqpKSkRHr16iUABIC0b99ebt68qfdY5Keee+45bV364osvHL5m3759EhISIgCkc+fOUlRU5OMpjY27mXVQVlaGESNGYP/+/QCARx55BLt370azZs30HYz81rRp07QtLEfHXc+fP48RI0agrKwMMTExyMrK4h5AFYxZLVmtVowZMwZbt24FAMTGxmL37t2IjY3VeTLyZ4mJiXj66acBAJ9++ikuX76sLbtz5w4GDx6MW7duoWHDhti8eTNatGih16iGxZjV0osvvoj169cDAJo1a4Zdu3ahdevWOk9FKpg2bRqAB2crly5dqn08atQonD59GgCwatUqdO/eXbcZDU3v/Vx/MnnyZO24RpMmTeT48eN6jxRw/vSnP8ncuXNl27ZtcuPGDb3H8bjHHntMAIjZbJbCwkJ56aWXtHVuxowZeo9naIyZm2bPnq2tVBEREXLo0CG9RwpI7dq10/4cAEhcXJwMHTpUmcB99tln2tfWt29f7ePhw4eL1WrVezxD46UZbliyZAlefvllAEBoaCiysrLQv39/nacKTI8++ih++OGHGl8TFxeHlJQUu0fz5s19NGH9iAjat29v9zXykh/3MGYurFq1ChkZGRARBAcHY/369Rg+fLjeYwUsd2LmiD8FbuXKlcjIyAAAPPzwwzh69CgP+LtDz81Co1u/fr0EBQUJADGZTJKZman3SE5Nnz7dbveLD9cP2y7qrl279P7js7Nv3z5txtmzZ+s9jt/g2Uwntm3bhtGjR8NqtQIA/vrXv2Ls2LE6T+XcggUL9B7B7+Tm5mLTpk04ePCg3qPYOXnypPZxcnKybnP4G8bMgQMHDmgXKALA/PnzMXHiRJ2nIm8JCjLWX4NvvvlG+5gxc18DvQcwmuPHj2Pw4MEoKSkB8ODan9dff13nqVzbsWMH9uzZo/cYXrd48eJ6fw6TyYS2bduia9euSElJwcCBAz0wmefYtswsFgtatmyp7zB+hCcAKjl16hTS0tJw69YtAMCECROwbNkynaeiymp7AqBquFJSUtC5c2c0btzYi1PWXVlZGSIjI1FaWopevXph3759eo/kN7hl9pPz58+jX79+WsieffZZvPfeezpPRbXhb+Fy5PTp0ygtLQXAXczaYswA5OXlIT09HdeuXQMADBkyBJmZmYY7lkI/UyFcjvDgf90FfMwKCgqQnp6OnJwcAA92Y2bMmIEzZ864/Tni4uIQFRXlpQmpsjfffBPx8fFKhMsRHvyvu4A/ZvbRRx/V+06dX3zxBVJTUz00EQWyvn37Yu/evQgNDUVhYSFCQkL0HslvBPx+VHZ2dr3eHxwczH9ByWNsW2YJCQkMWS0F/JYZEakh4LfMiEgNjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREpgTEjIiUwZkSkBMaMiJTAmBGREhgzIlICY0ZESmDMiEgJjBkRKYExIyIlMGZEpATGjIiUwJgRkRIYMyJSAmNGREpgzIhICYwZESmBMSMiJTBmRKQExoyIlMCYEZESGDMiUgJjRkRKYMyISAmMGREp4f/x46BQsXZs+AAAAABJRU5ErkJggg==", + "image/png": "", "text/plain": [ - "<Figure size 
267.717x267.717 with 1 Axes>" + "<Figure size 582.677x275.591 with 1 Axes>" ] }, "metadata": {}, @@ -269,16 +292,30 @@ "source": [ "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", "\n", + "x_offset = 2\n", + "# time back door\n", "pgm.add_node(\"z\", \"$Z$\", 0, 0)\n", "pgm.add_node(\"y\", \"$Y$\", 1, 0)\n", "pgm.add_node(\"t\", \"time\", 0, 1)\n", - "pgm.add_node(\"g\", \"group\", 1, 1)\n", - "\n", + "# pgm.add_node(\"g\", \"group\", 1, 1)\n", "pgm.add_edge(\"t\", \"z\")\n", "pgm.add_edge(\"t\", \"y\")\n", - "pgm.add_edge(\"g\", \"z\")\n", - "pgm.add_edge(\"g\", \"y\")\n", + "# pgm.add_edge(\"g\", \"z\")\n", + "# pgm.add_edge(\"g\", \"y\")\n", "pgm.add_edge(\"z\", \"y\")\n", + "pgm.add_text(0, 1.3, \"Time back door\")\n", + "\n", + "# DAG for DiD\n", + "pgm.add_node(\"z2\", \"$Z$\", 0 + x_offset, 0)\n", + "pgm.add_node(\"y2\", \"$Y$\", 1 + x_offset, 0)\n", + "pgm.add_node(\"t2\", \"time\", 0 + x_offset, 1)\n", + "pgm.add_node(\"g2\", \"group\", 1 + x_offset, 1)\n", + "pgm.add_edge(\"t2\", \"z2\")\n", + "pgm.add_edge(\"t2\", \"y2\")\n", + "pgm.add_edge(\"g2\", \"z2\")\n", + "pgm.add_edge(\"g2\", \"y2\")\n", + "pgm.add_edge(\"z2\", \"y2\")\n", + "pgm.add_text(x_offset, 1.3, \"DAG for DiD\")\n", "\n", "pgm.render();" ] @@ -287,7 +324,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Readers are referred to {cite:t}`huntington2021effect` for more details." + "Readers are referred to Chapter 18 of {cite:t}`huntington2021effect` for more details." ] }, { @@ -301,7 +338,7 @@ }, { "cell_type": "code", - "execution_count": 83, + "execution_count": 8, "metadata": { "tags": [ "remove-input" @@ -310,9 +347,9 @@ "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ - "<Figure size 834.646x393.701 with 1 Axes>" + "<Figure size 582.677x275.591 with 1 Axes>" ] }, "metadata": {}, From 68b59d0a0c7be661185f7fca858763923ea6201b Mon Sep 17 00:00:00 2001 From: "Benjamin T. 
Vincent" <inferencelab@gmail.com> Date: Mon, 29 Apr 2024 17:00:24 +0100 Subject: [PATCH 03/10] expand and improve multiple sections --- docs/source/quasi_dags.ipynb | 109 +++++++++++++++++++---------------- 1 file changed, 59 insertions(+), 50 deletions(-) diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index 475f49d9..b1f6b68e 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -11,13 +11,17 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This page provides an overview of structural causal models for some of the most common quasi-experiments. It takes inspiration from a paper by {cite:t}`steiner2017graphical`, and the books by {cite:t}`cunningham2021causal` and {cite:t}`huntington2021effect`, and readers are encouraged to consult these sources for more details." + "This page provides an overview of causal Directed Acyclic Graphs (DAGs) for some of the most common quasi-experiments. It takes inspiration from a paper by {cite:t}`steiner2017graphical`, and the books by {cite:t}`cunningham2021causal` and {cite:t}`huntington2021effect`, and readers are encouraged to consult these sources for more details." ] }, { "cell_type": "code", "execution_count": 1, - "metadata": {}, + "metadata": { + "tags": [ + "remove-input" + ] + }, "outputs": [], "source": [ "import daft\n", @@ -46,7 +50,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Before we take a look at randomized controlled trials (RCTs) and quasi-experiments, let's first consider the concept of confounding. Confounding occurs when a variable (or variables) causally influence both the treatment and the outcome and is very common in observational studies. This can lead to biased estimates of the treatment effect (the causal effect of $Z \\rightarrow Y$). The following causal DAG illustrates the concept of confounding." 
+ "Before we take a look at randomized controlled trials (RCTs) and quasi-experiments, let's first consider the concept of confounding. Confounding occurs when a variable (or variables) causally influences both the treatment and the outcome, and it is very common in observational studies. This can lead to biased estimates of the treatment effect (the causal effect of $Z \\rightarrow Y$). The following causal DAG illustrates the concept of confounding. Note that the confounder is written as a vector because there may be multiple confounding variables, for example $\\mathbf{X} = (x_1, x_2, x_3)$." ] }, { @@ -87,7 +91,20 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Randomized controlled trials (RCTs) are considered the gold standard for estimating causal effects. One reason for this is that we (as experimenters) intervene in the system by randomly assigning subjects to treatment groups. This ensures that the treatment is independent of any confounding variables. Importantly, this act of intervention breaks the causal link of the confounders $\\mathbf{X}$ upon the treatment $Y$. The following causal DAG illustrates the structure of an RCT." + "One way to tell that our estimate of the causal relationship $Z \\rightarrow Y$ may be biased is the presence of a backdoor path, $Z \\leftarrow \\mathbf{X} \\rightarrow Y$. This path type is known as a \"fork\". Because $\\mathbf{X}$ is a common cause of $Z$ and $Y$, any observed statistical relation between $Z$ and $Y$ may be due to the confounding effect of $\\mathbf{X}$. \n", + "\n", + "Backdoor paths are problematic because they introduce _statistical associations_ between variables that do not reflect the true causal relationships, potentially leading to biased causal estimates. For example, if we ran a regression of the form `y ~ z`, and observed a main effect of $Z$ on $Y$, we would have no way of knowing if this represents a true causal impact of $Z$ on $Y$, or if it is due to the confounding effect of $\\mathbf{X}$. 
\n", + "\n", + "One approach is to \"close the backdoor path\" by conditioning on the confounding variables. Practically, this could involve including the confounders $\\mathbf{X}$ as covariates in a regression model such as `y ~ z + x₁ + x₂ + x₃`. Without going into the reasons here, the coefficient for the main effect of $Z$ would now be an unbiased estimate of the _causal_ effect of $Z \\rightarrow Y$.\n", + "\n", + "However, unless we are very sure that we have accurate measures of _all_ confounding variables (maybe there is an $x_4$ that we don't know about or couldn't measure), it is still possible that our estimate of the causal effect is biased." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This leads us to Randomized Controlled Trials (RCTs), which are considered the gold standard for estimating causal effects. One reason for this is that we (as experimenters) intervene in the system by assigning units to treatment by {term}`random assignment`. Because of this intervention, any causal influence of the confounders upon the treatment $\\mathbf{X} \\rightarrow Z$ is broken - treatment is now solely determined by the randomisation process, $R \\rightarrow Z$. The following causal DAG illustrates the structure of an RCT." ] }, { @@ -129,7 +146,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The new variable $R$ represents the random assignment of units to the treatment group. So now $Z$ is entirely causally influenced by $R$, and not by any other variables. This means that the treatment effect $Z \\rightarrow Y$ can be estimated without bias." + "The new variable $R$ represents the random assignment of units to the treatment group. This means that the treatment effect $Z \\rightarrow Y$ can be estimated without bias." ] }, { @@ -180,28 +197,18 @@ "cell_type": "markdown", "metadata": {}, "source": [ - ":::{note}\n", - "The assumptions embodied in the DAG are:\n", - "1. The IV is independent of the confounders $\\mathbf{X}$.\n", - "2. 
The IV causally influences the treatment $Z$.\n", - "3. The IV does not causally influence the outcome $Y$, other than through the treatment $Z$.\n", - ":::" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Readers are referred to {cite:t}`steiner2017graphical` for a more in-depth discussion of the IV approach from the causal DAG and SCM perspective." + "Let's try to get some intuition for why having the $IV$ helps:\n", + "* $\\mathbf{X}$ is a confounder because it influences both $Z$ and $Y$.\n", + "* But the $IV$ helps overcome this confounding because it is not influenced by $\\mathbf{X}$.\n", + "* Any association between the $IV$ and $Y$ must arise through the treatment $Z$.\n", + "* This means that the $IV$ can be used to estimate the causal effect of $Z \\rightarrow Y$, without being confounded by $\\mathbf{X}$. Informally, the $IV$ causes some variation in the treatment $Z$ that is not due to $\\mathbf{X}$, and this variation can be used to estimate the causal effect of $Z \\rightarrow Y$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - ":::{note}\n", - "TODO: Explain the intuition behind how the IV approach works.\n", - ":::" + "Readers are referred to {cite:t}`steiner2017graphical,cunningham2021causal` or {cite:t}`huntington2021effect` for a more in-depth discussion of the IV approach from the causal DAG perspective." ] }, { @@ -210,7 +217,7 @@ "source": [ "## Interrupted Time Series\n", "\n", - "A causal DAG for interrupted time series is given in Chapter 17 of {cite:t}`huntington2021effect`, though that book refers to it as [Event Studies](https://theeffectbook.net/ch-EventStudies.html). These kinds of studies are suited to situations where an intervention is made at a given point in time and any causal effect is assumed to have a lasting (not a transient) effect. 
Here's the causal DAG - note that $\\text{time}$ represents all the things changing over time such as the time index as well as time-varying predictor variables." + "A causal DAG for interrupted time series is given in Chapter 17 of {cite:t}`huntington2021effect`, though uses the [Event Study](https://theeffectbook.net/ch-EventStudies.html) label. These kinds of studies are suited to situations where an intervention is made at a given point in time at which we move from untreated to treated. Typically, we consider situations where there are a 'decent' number of observations over time. Here's the causal DAG - note that $\\text{time}$ represents all the things changing over time such as the time index as well as time-varying predictor variables." ] }, { @@ -255,7 +262,9 @@ "source": [ "What we want to understand is the causal effect of the treatment upon the outcome, $Z \\rightarrow Y$. But we have a back door path between $Z$ and $Y$ which will make this hard, $Z \\leftarrow \\text{after treatment} \\leftarrow \\text{time} \\rightarrow Y$.\n", "\n", - "The approach taken is to use the pre-treatment data only to create a prediction of what would have happened in the absence of treatment (i.e. the counterfactual). If we can assume that in the absence of the treatment, nothing would have changed, then this counterfactual estimate will be unbiased and we can estimate the treatment effect by comparing the observed (post-treatment) data with the counterfactual." + "The approach taken is:\n", + "1. Use the pre-treatment data only to create a prediction of what would have happened in the absence of treatment (i.e. the counterfactual). Splitting the dataset like this breaks the back door by removing any variation in $\\text{after treatment}$, all values are 0.\n", + "2. 
If we can assume that in the absence of the treatment, nothing would have changed, then this counterfactual estimate will be unbiased and we can estimate the treatment effect by comparing the observed (post-treatment) data (where all values of $\\text{after treatment}$ are 1) with the counterfactual. " ] }, { @@ -264,14 +273,12 @@ "source": [ "## Difference in Differences\n", "\n", - ":::{warning}\n", - "This section, including the DAG is a work in progress.\n", - ":::" + "Difference in Difference studies involve comparing the change in outcomes over time between a treatment and control group. The causal DAG for this is given in Chapter 18 of {cite:t}`huntington2021effect`:" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 8, "metadata": { "tags": [ "remove-input" @@ -280,9 +287,9 @@ "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ - "<Figure size 582.677x275.591 with 1 Axes>" + "<Figure size 267.717x267.717 with 1 Axes>" ] }, "metadata": {}, @@ -292,31 +299,15 @@ "source": [ "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", "\n", - "x_offset = 2\n", - "# time back door\n", "pgm.add_node(\"z\", \"$Z$\", 0, 0)\n", "pgm.add_node(\"y\", \"$Y$\", 1, 0)\n", "pgm.add_node(\"t\", \"time\", 0, 1)\n", - "# pgm.add_node(\"g\", \"group\", 1, 1)\n", + "pgm.add_node(\"g\", \"group\", 1, 1)\n", "pgm.add_edge(\"t\", \"z\")\n", "pgm.add_edge(\"t\", \"y\")\n", - "# pgm.add_edge(\"g\", \"z\")\n", - "# pgm.add_edge(\"g\", \"y\")\n", + "pgm.add_edge(\"g\", \"z\")\n", + "pgm.add_edge(\"g\", \"y\")\n", "pgm.add_edge(\"z\", \"y\")\n", - "pgm.add_text(0, 1.3, \"Time back door\")\n", - "\n", - "# DAG for DiD\n", - "pgm.add_node(\"z2\", \"$Z$\", 0 + x_offset, 0)\n", - "pgm.add_node(\"y2\", \"$Y$\", 1 + x_offset, 0)\n", - "pgm.add_node(\"t2\", \"time\", 0 + x_offset, 1)\n", - "pgm.add_node(\"g2\", \"group\", 1 + x_offset, 1)\n", - "pgm.add_edge(\"t2\", \"z2\")\n", - "pgm.add_edge(\"t2\", \"y2\")\n", - 
"pgm.add_edge(\"g2\", \"z2\")\n", - "pgm.add_edge(\"g2\", \"y2\")\n", - "pgm.add_edge(\"z2\", \"y2\")\n", - "pgm.add_text(x_offset, 1.3, \"DAG for DiD\")\n", - "\n", "pgm.render();" ] }, @@ -324,7 +315,16 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Readers are referred to Chapter 18 of {cite:t}`huntington2021effect` for more details." + "Readers are referred to Chapter 18 of {cite:t}`huntington2021effect` for more discussion on the causal DAG for Difference in Differences studies." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{warning}\n", + "This section is unfinished\n", + ":::" ] }, { @@ -338,7 +338,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 9, "metadata": { "tags": [ "remove-input" @@ -398,6 +398,15 @@ "Readers are referred to {cite:t}`steiner2017graphical` and {cite:t}`cunningham2021causal` who discuss limiting graphs in more detail. " ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{warning}\n", + "This section is unfinished\n", + ":::" + ] + }, { "cell_type": "markdown", "metadata": {}, From d5d61dfedd9caa362a81c726c4e421a46556c8b1 Mon Sep 17 00:00:00 2001 From: "Benjamin T. 
Vincent" <inferencelab@gmail.com> Date: Mon, 29 Apr 2024 17:10:50 +0100 Subject: [PATCH 04/10] move daft into an optional dependencies --- pyproject.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pyproject.toml b/pyproject.toml index a4dcaa48..f72f8a01 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -28,7 +28,6 @@ requires-python = ">=3.10" # https://packaging.python.org/discussions/install-requires-vs-requirements/ dependencies = [ "arviz>=0.14.0", - "daft", "graphviz", "ipython!=8.7.0", "matplotlib>=3.5.3", @@ -53,6 +52,7 @@ dependencies = [ dev = ["pathlib", "pre-commit", "twine", "interrogate"] docs = [ "ipykernel", + "daft", "linkify-it-py", "myst-nb", "pathlib", From 447970b917c6735dea3387801c2e0c0785454639 Mon Sep 17 00:00:00 2001 From: "Benjamin T. Vincent" <inferencelab@gmail.com> Date: Tue, 30 Apr 2024 11:30:37 +0100 Subject: [PATCH 05/10] perfecting the ITS section --- docs/source/quasi_dags.ipynb | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index b1f6b68e..a57e3806 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -217,7 +217,7 @@ "source": [ "## Interrupted Time Series\n", "\n", - "A causal DAG for interrupted time series is given in Chapter 17 of {cite:t}`huntington2021effect`, though uses the [Event Study](https://theeffectbook.net/ch-EventStudies.html) label. These kinds of studies are suited to situations where an intervention is made at a given point in time at which we move from untreated to treated. Typically, we consider situations where there are a 'decent' number of observations over time. Here's the causal DAG - note that $\\text{time}$ represents all the things changing over time such as the time index as well as time-varying predictor variables." 
+ "A causal DAG for interrupted time series quasi-experiments is given in Chapter 17 of {cite:t}`huntington2021effect`, though there they are labelled as [Event Studies](https://theeffectbook.net/ch-EventStudies.html). These kinds of studies are suited to situations where an intervention is made at a given point in time at which we move from untreated to treated. Typically, we consider situations where there are a 'decent' number of observations over time. Here's the causal DAG - note that $\\text{time}$ represents all the things changing over time such as the time index as well as time-varying predictor variables." ] }, { @@ -262,9 +262,13 @@ "source": [ "What we want to understand is the causal effect of the treatment upon the outcome, $Z \\rightarrow Y$. But we have a back door path between $Z$ and $Y$ which will make this hard, $Z \\leftarrow \\text{after treatment} \\leftarrow \\text{time} \\rightarrow Y$.\n", "\n", - "The approach taken is:\n", - "1. Use the pre-treatment data only to create a prediction of what would have happened in the absence of treatment (i.e. the counterfactual). Splitting the dataset like this breaks the back door by removing any variation in $\\text{after treatment}$, all values are 0.\n", - "2. If we can assume that in the absence of the treatment, nothing would have changed, then this counterfactual estimate will be unbiased and we can estimate the treatment effect by comparing the observed (post-treatment) data (where all values of $\\text{after treatment}$ are 1) with the counterfactual. " + ":::{note}\n", + "Below is an attempt to explain one way that we can deal with this, though it is a bit of a brain-twister and can take some time to get your head around. Thanks to Nick Huntington-Klein for some clarification in [this twitter thread](https://twitter.com/inferencelab/status/1783882438063661374).\n", + ":::\n", + "\n", + "One approach we can use is:\n", + "1. 
We want to close the backdoor path, and one way to do this is to split the dataset into two parts: pre-treatment and post-treatment. By fitting a model only to the pre-treatment data, we have removed any variation in $\\text{after treatment}$ (all values are $0$), so there is now no variation in $Z$ caused by $\\text{time}$. This is one way to close a backdoor path, and means that a model fitted to this data (e.g. $Y_{\\text{pre}} \\sim f(\\text{time}_{\\text{pre}})$) will not be biased by the backdoor path.\n", + "2. However, our goal is to estimate the causal effects of the treatment $Z \\rightarrow Y$, but we have just removed any variation in $Z$ and it does not appear in the aforementioned model, $Y_{\\text{pre}} \\sim f(\\text{time}_{\\text{pre}})$, so our work is not done. One way to deal with this is to use the model to predict what would have happened in the post-treatment era if no treatment had been given. If we make the assumption that nothing would have changed in the absence of treatment, then this will be an unbiased estimate of the counterfactual. By comparing the counterfactual with the observed post-treatment data, we can estimate the treatment effect $Z \\rightarrow Y$. By focussing only on the post-treatment data we are looking at empirical outcomes $Y_\\text{post}$ which are affected by treatment $Z = 1$, but have closed the back door because all $\\text{after treatment} = 1$. The final comparison (subtraction) between the counterfactual estimate and the observed post-treatment data gives us the estimated treatment effect." ] }, { From 33408919e555ddb72a0df4686df888be231d1033 Mon Sep 17 00:00:00 2001 From: "Benjamin T. 
Vincent" <inferencelab@gmail.com> Date: Tue, 30 Apr 2024 12:16:44 +0100 Subject: [PATCH 06/10] finish the DID explanation --- docs/source/quasi_dags.ipynb | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index a57e3806..e2304790 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -319,16 +319,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Readers are referred to Chapter 18 of {cite:t}`huntington2021effect` for more discussion on the causal DAG for Difference in Differences studies." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - ":::{warning}\n", - "This section is unfinished\n", - ":::" + ":::{note}\n", + "For our explanation below, we will assume we are dealing with the simplest case of a two-group, two-time period design, the so-called \"classical\" 2$\\times$2 difference-in-differences design. \n", + ":::\n", + "\n", + "Our goal is to estimate the causal effect of the treatment on the outcome, $Z \\rightarrow Y$, but now we have _two_ backdoor paths:\n", + "1. $Z \\leftarrow \\text{time} \\rightarrow Y$\n", + "2. $Z \\leftarrow \\text{group} \\rightarrow Y$\n", + "\n", + "From a regression point of view, both $\\text{time}$ and $\\text{group}$ are binary variables. In this situation, treatment is given to the treatment group ($\\text{group}=1$) at time $\\text{time}=1$.\n", + "\n", + "The causal effect of the treatment upon the outcome is typically estimated by fitting a regression model of the form `y ~ time + group + time:group`. The interaction term `time:group` captures the causal effect of $Z \\rightarrow Y$. \n", + "\n", + "We can note that this interaction term $\\text{time} \\times \\text{group}$ encodes the values of $Z$, which as we said above, is equal to 1 for only the treatment group at time 1. 
So another way to think about the inclusion of an interaction effect is that we are simply conditioning on all the observed data ($Z$, $\\text{time}$, $\\text{group}$, $Y$) to estimate the causal effect of $Z \\rightarrow Y$." ] }, { From 3ebe0b87ce732fe173e2f62230527085aae740ff Mon Sep 17 00:00:00 2001 From: "Benjamin T. Vincent" <inferencelab@gmail.com> Date: Tue, 30 Apr 2024 12:22:08 +0100 Subject: [PATCH 07/10] synthetic control placeholder --- docs/source/quasi_dags.ipynb | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index e2304790..84081401 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -334,6 +334,17 @@ "We can note that this interaction term $\\text{time} \\times \\text{group}$ encodes the values of $Z$, which as we said above, is equal to 1 for only the treatment group at time 1. So another way to think about the inclusion of an interaction effect is that we are simply conditioning on all the observed data ($Z$, $\\text{time}$, $\\text{group}$, $Y$) to estimate the causal effect of $Z \\rightarrow Y$." ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Synthetic Control\n", + "\n", + ":::{warning}\n", + "While many texts cover the synthetic control method, they typically do not provide a causal DAG-based treatment. So this section is pending - we hope to update it soon.\n", + ":::" + ] + }, { "cell_type": "markdown", "metadata": {}, From 943508ade58d220e918ce1fa1f219e831c60d947 Mon Sep 17 00:00:00 2001 From: "Benjamin T. 
Vincent" <inferencelab@gmail.com> Date: Tue, 30 Apr 2024 14:39:57 +0100 Subject: [PATCH 08/10] finish RDD section --- docs/source/quasi_dags.ipynb | 29 +++++++++++++---------------- 1 file changed, 13 insertions(+), 16 deletions(-) diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index 84081401..c6f88b05 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -351,12 +351,12 @@ "source": [ "## Regression Discontinuity\n", " \n", - "The causal graph for the regression discontinuity design is shown below (left). $A$ is a continuous running variable which determines the treatment assignment $A \\rightarrow Z$. Assignment is based on a cutoff value $a_c$. The running variable may also influence the outcome $A \\rightarrow Y$. The running variable may also be associated with a set of variables $\\mathbf{X}$ that influence the outcome, $A - - - - \\mathbf{X}$." + "The regression discontinuity design is similar to the interrupted time series design, but rather than the treatment being applied at a specific point in time, treatment is based on a cutoff value $\\lambda$ along some running variable $RV$. This running variable could be a test score, age, spatial location, etc. The running variable may also influence the outcome $RV \\rightarrow Y$. The running variable may also be associated with a set of variables $\\mathbf{X}$ that influence the outcome, $RV - - - - \\mathbf{X} \\rightarrow Y$." 
] }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 12, "metadata": { "tags": [ "remove-input" @@ -365,7 +365,7 @@ "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "<Figure size 582.677x275.591 with 1 Axes>" ] @@ -378,7 +378,7 @@ "pgm = daft.PGM(dpi=DPI, grid_unit=GRID_UNIT, node_ec=NODE_EC)\n", "\n", "# data generating graph\n", - "pgm.add_node(\"a\", \"$A$\", 0, 1)\n", + "pgm.add_node(\"a\", \"$RV$\", 0, 1)\n", "pgm.add_node(\"z\", \"$Z$\", 0, 0)\n", "pgm.add_node(\"x\", \"$\\mathbf{X}$\", 1, 1)\n", "pgm.add_node(\"y\", \"$Y$\", 1, 0)\n", @@ -395,7 +395,7 @@ "\n", "# limiting graph\n", "x_offset = 2\n", - "pgm.add_node(\"a2\", r\"$A \\rightarrow a_c$\", 0 + x_offset, 1)\n", + "pgm.add_node(\"a2\", r\"$RV \\rightarrow \\lambda$\", 0 + x_offset, 1)\n", "pgm.add_node(\"z2\", \"$Z$\", 0 + x_offset, 0)\n", "pgm.add_node(\"x2\", \"$\\mathbf{X}$\", 1 + x_offset, 1)\n", "pgm.add_node(\"y2\", \"$Y$\", 1 + x_offset, 0)\n", @@ -411,18 +411,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The causal effect of $Z \\rightarrow Y$ is identified by comparing the outcome for units just above and just below the cutoff value, $A \\rightarrow a_c$.\n", + "We can see from the data generating graph (left) that the $RV$ is a confounding variable as it influences both the treatment $Z$ and the outcome $Y$. \n", "\n", - "Readers are referred to {cite:t}`steiner2017graphical` and {cite:t}`cunningham2021causal` who discuss limiting graphs in more detail. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - ":::{warning}\n", - "This section is unfinished\n", - ":::" + "If we tried to identify the causal effect of $Z \\rightarrow Y$ by conditioning on the running variable ($RV=rv$), we would eliminate any variation in $Z$ or $Y$ caused by $RV$. 
And because $Z$ is constant for any given value of $RV$, then the $Z \\rightarrow Y$ path would disappear and we could not estimate the causal effect.\n", + "\n", + "Identification of the causal effect of $Z \\rightarrow Y$ is done with a limiting graph (left). The $RV$ node is replaced by a subset of the data where $RV$ is close to the cutoff value $\\lambda$, hence the name \"limiting graph\" and the symbol $RV \\rightarrow \\lambda$.\n", + "\n", + "In the limit, this eliminates variation in the running variable and so breaks the $RV \\rightarrow Y$ path. The causal effect of $Z \\rightarrow Y$ can be estimated by comparing the outcomes of units just above and just below the cutoff value $\\lambda$.\n", + "\n", + "Readers are referred to {cite:t}`steiner2017graphical` and [Chapter 6](https://mixtape.scunning.com/06-regression_discontinuity) of {cite:t}`cunningham2021causal`, who discuss limiting graphs in more detail. Chapter 20 of {cite:t}`huntington2021effect` also covers regression discontinuity designs, but presents a simplified (and non-kosher, in his own words) causal DAG." ] }, { From 2b31999b3f98c10c383207ab19bb4b282e986d49 Mon Sep 17 00:00:00 2001 From: "Benjamin T. Vincent" <inferencelab@gmail.com> Date: Sun, 5 May 2024 10:09:53 +0100 Subject: [PATCH 09/10] left -> right --- docs/source/quasi_dags.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index c6f88b05..6c2b942a 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -415,7 +415,7 @@ "\n", "If we tried to identify the causal effect of $Z \\rightarrow Y$ by conditioning on the running variable ($RV=rv$), we would eliminate any variation in $Z$ or $Y$ caused by $RV$. 
And because $Z$ is constant for any given value of $RV$, then the $Z \\rightarrow Y$ path would disappear and we could not estimate the causal effect.\n", "\n", - "Identification of the causal effect of $Z \\rightarrow Y$ is done with a limiting graph (left). The $RV$ node is replaced by a subset of the data where $RV$ is close to the cutoff value $\\lambda$, hence the name \"limiting graph\" and the symbol $RV \\rightarrow \\lambda$.\n", + "Identification of the causal effect of $Z \\rightarrow Y$ is done with a limiting graph (right). The $RV$ node is replaced by a subset of the data where $RV$ is close to the cutoff value $\\lambda$, hence the name \"limiting graph\" and the symbol $RV \\rightarrow \\lambda$.\n", "\n", "In the limit, this eliminates variation in the running variable and so breaks the $RV \\rightarrow Y$ path. The causal effect of $Z \\rightarrow Y$ can be estimated by comparing the outcomes of units just above and just below the cutoff value $\\lambda$.\n", "\n", From 245982d3d70c8476a1ff531c31b429e3ad5e6f69 Mon Sep 17 00:00:00 2001 From: "Benjamin T. Vincent" <inferencelab@gmail.com> Date: Sun, 5 May 2024 10:27:57 +0100 Subject: [PATCH 10/10] add admonition box about the parallel trends assumption + add glossary term --- docs/source/glossary.rst | 3 +++ docs/source/quasi_dags.ipynb | 6 +++++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/docs/source/glossary.rst b/docs/source/glossary.rst index c6aedbb2..6f31530d 100644 --- a/docs/source/glossary.rst +++ b/docs/source/glossary.rst @@ -53,6 +53,9 @@ Glossary One-group posttest-only design A design where a single group is exposed to a treatment and assessed on an outcome measure. There is no pretest measure or comparison group. + Parallel trends assumption + An assumption made in difference in differences designs that the trends (over time) in the outcome variable would have been the same between the treatment and control groups in the absence of the treatment. 
+ Panel data Time series data collected on multiple units where the same units are observed at each time point. diff --git a/docs/source/quasi_dags.ipynb b/docs/source/quasi_dags.ipynb index 6c2b942a..dff18e53 100644 --- a/docs/source/quasi_dags.ipynb +++ b/docs/source/quasi_dags.ipynb @@ -331,7 +331,11 @@ "\n", "The causal effect of the treatment upon the outcome is typically estimated by fitting a regression model of the form `y ~ time + group + time:group`. The interaction term `time:group` captures the causal effect of $Z \\rightarrow Y$. \n", "\n", - "We can note that this interaction term $\\text{time} \\times \\text{group}$ encodes the values of $Z$, which as we said above, is equal to 1 for only the treatment group at time 1. So another way to think about the inclusion of an interaction effect is that we are simply conditioning on all the observed data ($Z$, $\\text{time}$, $\\text{group}$, $Y$) to estimate the causal effect of $Z \\rightarrow Y$." + "We can note that this interaction term $\\text{time} \\times \\text{group}$ encodes the values of $Z$, which as we said above, is equal to 1 for only the treatment group at time 1. So another way to think about the inclusion of an interaction effect is that we are simply conditioning on all the observed data ($Z$, $\\text{time}$, $\\text{group}$, $Y$) to estimate the causal effect of $Z \\rightarrow Y$.\n", + "\n", + ":::{warning}\n", + "Achieving an unbiased estimate is strongly dependent upon the {term}`parallel trends assumption`. That is, we assume that the treatment and control groups would have followed the same trajectory (over time) in the absence of treatment. This is a strong assumption and should be carefully considered when interpreting the results of a difference-in-differences study. In the case of the classic 2$\\times$2 design we cannot assess the validity of this assumption empirically, so it is important to consider the plausibility of this assumption in the context of the particular example. 
\n", + ":::" ] }, {