From 3c411598d34b019a251a14a36145dcb8d945980e Mon Sep 17 00:00:00 2001
From: David <daved@alum.mit.edu>
Date: Tue, 25 Apr 2017 10:44:34 +0000
Subject: [PATCH] Added sections on embeddings for medical ontologies and
 causal inference (#339)

This build is based on
https://github.com/greenelab/deep-review/commit/22c54f0e6e1d7530c34b1a5000cff8abb1365d13.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/deep-review/builds/225555590
https://travis-ci.org/greenelab/deep-review/jobs/225555591

[ci skip]

The full commit message that triggered this build is copied below:

Added sections on embeddings for medical ontologies and causal inference (#339)

* Add reference to Multi-task Prediction of Disease Onsets from Longitudinal Lab Tests in Electronic Health Records data set.

* Added in the anchor and learn framework.  This isn't strictly deep learning, so I don't know if it should be included, but it is relevant to the section.

* Added in references to neural embeddings in medical coding

* Added causal inference references

* Changed pmid to doi

* Changed PMC id to regular PubMed id.
---
 all-sections.md         |  37 +-
 bibliography.bib        |  48 +++
 bibliography.json       | 794 ++++++++++++++++++++++++++++++++++++++++
 citations.json          | 788 +++++++++++++++++++++++++++++++++++++++
 processed-citations.tsv |   8 +
 5 files changed, 1671 insertions(+), 4 deletions(-)

diff --git a/all-sections.md b/all-sections.md
index c1a2df78..d9d8fcb0 100644
--- a/all-sections.md
+++ b/all-sections.md
@@ -316,6 +316,22 @@ This indicates a potential strength of deep methods. It may be possible to
 repurpose features from task to task, improving overall predictions as the field
 tackles new challenges.
 
+Several authors have created reusable feature sets for medical terminologies using
+neural embeddings, as popularized by word2vec [@1GhHIDxuW]. This approach
+was first used on free text medical notes by De Vine et al.
+[@XQtuRkTU] with results equal to or better than those of traditional methods.
+Y. Choi et al. [@1qa47hoP] built embeddings of standardized
+terminologies, such as ICD and NDC, used in widely available administrative
+claims data. By learning terminologies for different entities in the same
+vector space, they can potentially find relationships between different
+domains (e.g. drugs and the diseases they treat). Medical claims data does not
+have the natural document structure of clinical notes, and this issue was
+addressed by E. Choi et al. [@TwvauiTv], who built
+embeddings using a multi-layer network architecture which mimics the structure
+of claims data. While promising, difficulties in evaluating the quality of
+these kinds of features and variations in clinical coding practices remain as
+challenges to using them in practice.
+
 Identifying consistent subgroups of individuals and individual health
 trajectories from clinical tests is also an active area of research. Approaches
 inspired by deep learning have been used for both unsupervised feature
@@ -331,9 +347,10 @@ scale analysis of an electronic health records system found that a deep
 denoising autoencoder architecture applied to the number and co-occurrence of
 clinical test events, though not the results of those tests, constructed
 features that were more useful for disease prediction than other existing
-feature construction methods [@WrNCJ9sO].  Taken together, these
-results support the potential of unsupervised feature construction in this
-domain. However, numerous challenges including data integration (patient
+feature construction methods [@WrNCJ9sO].  Razavian et al.
+[@c6MfDdWP] used a set of 18 common lab tests to predict disease onset
+using both CNN and LSTM architectures and demonstrated an improvement over baseline
+regression models. However, numerous challenges including data integration (patient
 demographics, family history, laboratory tests, text-based patient records,
 image analysis, genomic data) and better handling of streaming temporal data
 with many features, will need to be overcome before we can fully assess the
@@ -410,7 +427,9 @@ making methodological choices that either reduce the need for labeled examples
 or that use transformations to training data to increase the number of times it
 can be used before overfitting occurs. For example, the unsupervised and
 semi-supervised methods that we've discussed reduce the need for labeled
-examples [@5x3uMSKi]. The adversarial training example
+examples [@5x3uMSKi]. The anchor and learn framework
+[@A9JeoGV8] uses expert knowledge to identify high-confidence
+observations from which labels can be inferred. The adversarial training example
 strategies that we've mentioned can reduce overfitting, if transformations are
 available that preserve the meaningful content of the data while transforming
 irrelevant features [@Xxb4t3zO]. While adversarial training examples
@@ -1262,6 +1281,16 @@ interpretability of deep learning models, fitting deep models to limited and
 heterogeneous data, and integrating complex predictive models into a dynamic
 clinical environment.
 
+A critical challenge in moving from prediction to treatment recommendations
+is the necessity to establish a causal relationship for a recommendation.
+Causal inference is often framed in terms of counterfactual questions
+[@cpNVdlL7]. Johansson et al. [@173ftiSzF] use deep neural
+networks to create representation models for covariates that capture nonlinear
+effects and show significant performance improvements over existing models. In
+a less formal approach, Kale et al. [@FUIfIdE] first create a deep neural
+network to model clinical time series and then analyze the relationship of the
+hidden features to the output using a causal approach.
+
 #### Applications
 
 ##### Trajectory Prediction for Treatment
diff --git a/bibliography.bib b/bibliography.bib
index 2a32b6e2..6069477a 100644
--- a/bibliography.bib
+++ b/bibliography.bib
@@ -1690,3 +1690,51 @@ @article{xl1ijigK
  year = {2017}
 }
 
+
+@article{173ftiSzF,
+ abstract = {Observational studies are rising in importance due to the widespread
+accumulation of data in fields such as healthcare, education, employment and
+ecology. We consider the task of answering counterfactual questions such as, "Would this patient have lower blood sugar had she received a different
+medication?". We propose a new algorithmic framework for counterfactual
+inference which brings together ideas from domain adaptation and representation
+learning. In addition to a theoretical justification, we perform an empirical
+comparison with previous approaches to causal inference from observational
+data. Our deep learning algorithm significantly outperforms the previous
+state-of-the-art.},
+ archiveprefix = {arXiv},
+ author = {Fredrik D. Johansson and Uri Shalit and David Sontag},
+ eprint = {1605.03661v2},
+ file = {1605.03661v2.pdf},
+ link = {http://arxiv.org/abs/1605.03661v2},
+ month = {May},
+ primaryclass = {stat.ML},
+ title = {Learning Representations for Counterfactual Inference},
+ year = {2016}
+}
+
+
+@article{c6MfDdWP,
+ abstract = {Disparate areas of machine learning have benefited from models that can take
+raw data with little preprocessing as input and learn rich representations of
+that raw data in order to perform well on a given prediction task. We evaluate
+this approach in healthcare by using longitudinal measurements of lab tests, one of the more raw signals of a patient's health state widely available in
+clinical data, to predict disease onsets. In particular, we train a Long
+Short-Term Memory (LSTM) recurrent neural network and two novel convolutional
+neural networks for multi-task prediction of disease onset for 133 conditions
+based on 18 common lab tests measured over time in a cohort of 298K patients
+derived from 8 years of administrative claims data. We compare the neural
+networks to a logistic regression with several hand-engineered, clinically
+relevant features. We find that the representation-based learning approaches
+significantly outperform this baseline. We believe that our work suggests a new
+avenue for patient risk stratification based solely on lab results.},
+ archiveprefix = {arXiv},
+ author = {Narges Razavian and Jake Marcus and David Sontag},
+ eprint = {1608.00647v3},
+ file = {1608.00647v3.pdf},
+ link = {http://arxiv.org/abs/1608.00647v3},
+ month = {Aug},
+ primaryclass = {cs.LG},
+ title = {Multi-task Prediction of Disease Onsets from Longitudinal Lab Tests},
+ year = {2016}
+}
+
diff --git a/bibliography.json b/bibliography.json
index 85ad31c8..a63109c9 100644
--- a/bibliography.json
+++ b/bibliography.json
@@ -25550,6 +25550,744 @@
     ],
     "id": "ppGS5h4v"
   },
+  {
+    "indexed": {
+      "date-parts": [
+        [
+          2017,
+          4,
+          2
+        ]
+      ],
+      "date-time": "2017-04-02T04:48:24Z",
+      "timestamp": 1491108504137
+    },
+    "reference-count": 0,
+    "publisher": "American Psychological Association (APA)",
+    "issue": "5",
+    "content-domain": {
+      "domain": [],
+      "crossmark-restriction": false
+    },
+    "short-container-title": [
+      "Journal of Educational Psychology"
+    ],
+    "published-print": {
+      "date-parts": [
+        [
+          1974
+        ]
+      ]
+    },
+    "DOI": "10.1037/h0037350",
+    "type": "article-journal",
+    "created": {
+      "date-parts": [
+        [
+          2006,
+          6,
+          8
+        ]
+      ],
+      "date-time": "2006-06-08T01:08:10Z",
+      "timestamp": 1149728890000
+    },
+    "page": "688-701",
+    "source": "Crossref",
+    "is-referenced-by-count": 1615,
+    "title": "Estimating causal effects of treatments in randomized and nonrandomized studies.",
+    "prefix": "10.1037",
+    "volume": "66",
+    "author": [
+      {
+        "given": "Donald B.",
+        "family": "Rubin",
+        "affiliation": []
+      }
+    ],
+    "member": "15",
+    "container-title": "Journal of Educational Psychology",
+    "original-title": [],
+    "deposited": {
+      "date-parts": [
+        [
+          2011,
+          8,
+          23
+        ]
+      ],
+      "date-time": "2011-08-23T13:47:16Z",
+      "timestamp": 1314107236000
+    },
+    "score": 1.0,
+    "subtitle": [],
+    "short-title": [],
+    "issued": {
+      "date-parts": [
+        [
+          1974
+        ]
+      ]
+    },
+    "references-count": 0,
+    "alternative-id": [
+      "1975-06502-001"
+    ],
+    "URL": "http://dx.doi.org/10.1037/h0037350",
+    "relation": {},
+    "issn-type": [
+      {
+        "value": "0022-0663",
+        "type": "print"
+      }
+    ],
+    "subject": [
+      "Education",
+      "Developmental and Educational Psychology"
+    ],
+    "id": "cpNVdlL7"
+  },
+  {
+    "indexed": {
+      "date-parts": [
+        [
+          2017,
+          4,
+          1
+        ]
+      ],
+      "date-time": "2017-04-01T03:11:56Z",
+      "timestamp": 1491016316606
+    },
+    "reference-count": 31,
+    "publisher": "Oxford University Press (OUP)",
+    "issue": "4",
+    "content-domain": {
+      "domain": [],
+      "crossmark-restriction": false
+    },
+    "short-container-title": [
+      "J Am Med Inform Assoc"
+    ],
+    "published-print": {
+      "date-parts": [
+        [
+          2016,
+          7
+        ]
+      ]
+    },
+    "DOI": "10.1093/jamia/ocw011",
+    "type": "article-journal",
+    "created": {
+      "date-parts": [
+        [
+          2016,
+          4,
+          24
+        ]
+      ],
+      "date-time": "2016-04-24T00:18:19Z",
+      "timestamp": 1461457099000
+    },
+    "page": "731-740",
+    "source": "Crossref",
+    "is-referenced-by-count": 2,
+    "title": "Electronic medical record phenotyping using the anchor and learn framework",
+    "prefix": "10.1093",
+    "volume": "23",
+    "author": [
+      {
+        "given": "Yoni",
+        "family": "Halpern",
+        "affiliation": []
+      },
+      {
+        "given": "Steven",
+        "family": "Horng",
+        "affiliation": []
+      },
+      {
+        "given": "Youngduck",
+        "family": "Choi",
+        "affiliation": []
+      },
+      {
+        "given": "David",
+        "family": "Sontag",
+        "affiliation": []
+      }
+    ],
+    "member": "286",
+    "published-online": {
+      "date-parts": [
+        [
+          2016,
+          4,
+          23
+        ]
+      ]
+    },
+    "container-title": "Journal of the American Medical Informatics Association",
+    "original-title": [],
+    "deposited": {
+      "date-parts": [
+        [
+          2017,
+          1,
+          23
+        ]
+      ],
+      "date-time": "2017-01-23T19:48:54Z",
+      "timestamp": 1485200934000
+    },
+    "score": 1.0,
+    "subtitle": [],
+    "short-title": [],
+    "issued": {
+      "date-parts": [
+        [
+          2016,
+          4,
+          23
+        ]
+      ]
+    },
+    "references-count": 31,
+    "alternative-id": [
+      "10.1093/jamia/ocw011"
+    ],
+    "URL": "http://dx.doi.org/10.1093/jamia/ocw011",
+    "relation": {},
+    "issn-type": [
+      {
+        "value": "1067-5027",
+        "type": "print"
+      },
+      {
+        "value": "1527-974X",
+        "type": "electronic"
+      }
+    ],
+    "subject": [
+      "Health Informatics"
+    ],
+    "id": "A9JeoGV8"
+  },
+  {
+    "indexed": {
+      "date-parts": [
+        [
+          2017,
+          3,
+          31
+        ]
+      ],
+      "date-time": "2017-03-31T16:59:20Z",
+      "timestamp": 1490979560540
+    },
+    "publisher-location": "New York, New York, USA",
+    "reference-count": 10,
+    "publisher": "ACM Press",
+    "license": [
+      {
+        "URL": "http://www.acm.org/publications/policies/copyright_policy#Background",
+        "start": {
+          "date-parts": [
+            [
+              2014,
+              4,
+              7
+            ]
+          ],
+          "date-time": "2014-04-07T00:00:00Z",
+          "timestamp": 1396828800000
+        },
+        "delay-in-days": 96,
+        "content-version": "vor"
+      }
+    ],
+    "content-domain": {
+      "domain": [],
+      "crossmark-restriction": false
+    },
+    "short-container-title": [],
+    "published-print": {
+      "date-parts": [
+        [
+          2014
+        ]
+      ]
+    },
+    "DOI": "10.1145/2567948.2577348",
+    "type": "paper-conference",
+    "created": {
+      "date-parts": [
+        [
+          2016,
+          2,
+          5
+        ]
+      ],
+      "date-time": "2016-02-05T19:44:31Z",
+      "timestamp": 1454701471000
+    },
+    "source": "Crossref",
+    "is-referenced-by-count": 19,
+    "title": "Learning semantic representations using convolutional neural networks for web search",
+    "prefix": "10.1145",
+    "author": [
+      {
+        "given": "Yelong",
+        "family": "Shen",
+        "affiliation": [
+          {
+            "name": "Kent State University, Kent, OH, USA"
+          }
+        ]
+      },
+      {
+        "given": "Xiaodong",
+        "family": "He",
+        "affiliation": [
+          {
+            "name": "Microsoft, Redmond, WA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Jianfeng",
+        "family": "Gao",
+        "affiliation": [
+          {
+            "name": "Microsoft, Redmond, WA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Li",
+        "family": "Deng",
+        "affiliation": [
+          {
+            "name": "Microsoft, Redmond, WA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Grégoire",
+        "family": "Mesnil",
+        "affiliation": [
+          {
+            "name": "University of Montréal, Montréal, Canada"
+          }
+        ]
+      }
+    ],
+    "member": "320",
+    "container-title": "Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion",
+    "original-title": [],
+    "deposited": {
+      "date-parts": [
+        [
+          2016,
+          12,
+          6
+        ]
+      ],
+      "date-time": "2016-12-06T21:46:47Z",
+      "timestamp": 1481060807000
+    },
+    "score": 1.0,
+    "subtitle": [],
+    "short-title": [],
+    "issued": {
+      "date-parts": [
+        [
+          2014
+        ]
+      ]
+    },
+    "references-count": 10,
+    "URL": "http://dx.doi.org/10.1145/2567948.2577348",
+    "relation": {
+      "cites": []
+    },
+    "id": "1qa47hoP"
+  },
+  {
+    "indexed": {
+      "date-parts": [
+        [
+          2017,
+          4,
+          1
+        ]
+      ],
+      "date-time": "2017-04-01T02:45:31Z",
+      "timestamp": 1491014731095
+    },
+    "publisher-location": "New York, New York, USA",
+    "reference-count": 17,
+    "publisher": "ACM Press",
+    "license": [
+      {
+        "URL": "http://www.acm.org/publications/policies/copyright_policy#Background",
+        "start": {
+          "date-parts": [
+            [
+              2014,
+              11,
+              3
+            ]
+          ],
+          "date-time": "2014-11-03T00:00:00Z",
+          "timestamp": 1414972800000
+        },
+        "delay-in-days": 306,
+        "content-version": "vor"
+      }
+    ],
+    "content-domain": {
+      "domain": [],
+      "crossmark-restriction": false
+    },
+    "short-container-title": [],
+    "published-print": {
+      "date-parts": [
+        [
+          2014
+        ]
+      ]
+    },
+    "DOI": "10.1145/2661829.2661974",
+    "type": "paper-conference",
+    "created": {
+      "date-parts": [
+        [
+          2014,
+          11,
+          7
+        ]
+      ],
+      "date-time": "2014-11-07T17:10:54Z",
+      "timestamp": 1415380254000
+    },
+    "source": "Crossref",
+    "is-referenced-by-count": 4,
+    "title": "Medical Semantic Similarity with a Neural Language Model",
+    "prefix": "10.1145",
+    "author": [
+      {
+        "given": "Lance",
+        "family": "De Vine",
+        "affiliation": []
+      },
+      {
+        "given": "Guido",
+        "family": "Zuccon",
+        "affiliation": []
+      },
+      {
+        "given": "Bevan",
+        "family": "Koopman",
+        "affiliation": []
+      },
+      {
+        "given": "Laurianne",
+        "family": "Sitbon",
+        "affiliation": []
+      },
+      {
+        "given": "Peter",
+        "family": "Bruza",
+        "affiliation": []
+      }
+    ],
+    "member": "320",
+    "container-title": "Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management - CIKM '14",
+    "original-title": [],
+    "deposited": {
+      "date-parts": [
+        [
+          2016,
+          12,
+          7
+        ]
+      ],
+      "date-time": "2016-12-07T02:44:26Z",
+      "timestamp": 1481078666000
+    },
+    "score": 1.0,
+    "subtitle": [],
+    "short-title": [],
+    "issued": {
+      "date-parts": [
+        [
+          2014
+        ]
+      ]
+    },
+    "references-count": 17,
+    "URL": "http://dx.doi.org/10.1145/2661829.2661974",
+    "relation": {
+      "cites": []
+    },
+    "id": "XQtuRkTU"
+  },
+  {
+    "indexed": {
+      "date-parts": [
+        [
+          2017,
+          4,
+          1
+        ]
+      ],
+      "date-time": "2017-04-01T19:33:26Z",
+      "timestamp": 1491075206096
+    },
+    "publisher-location": "New York, New York, USA",
+    "reference-count": 39,
+    "publisher": "ACM Press",
+    "license": [
+      {
+        "URL": "http://www.acm.org/publications/policies/copyright_policy#Background",
+        "start": {
+          "date-parts": [
+            [
+              2016,
+              8,
+              13
+            ]
+          ],
+          "date-time": "2016-08-13T00:00:00Z",
+          "timestamp": 1471046400000
+        },
+        "delay-in-days": 225,
+        "content-version": "vor"
+      }
+    ],
+    "funder": [
+      {
+        "DOI": "10.13039/100006785",
+        "name": "Google",
+        "doi-asserted-by": "publisher",
+        "award": []
+      },
+      {
+        "DOI": "10.13039/100000001",
+        "name": "National Science Foundation",
+        "doi-asserted-by": "publisher",
+        "award": [
+          "1418511"
+        ]
+      },
+      {
+        "name": "Samsung Scholarship",
+        "award": []
+      },
+      {
+        "name": "Centers for Disease Control and Prevention",
+        "award": []
+      },
+      {
+        "name": "Children's Healthcare of Atlanta",
+        "award": []
+      },
+      {
+        "name": "UCB",
+        "award": []
+      }
+    ],
+    "content-domain": {
+      "domain": [],
+      "crossmark-restriction": false
+    },
+    "short-container-title": [],
+    "published-print": {
+      "date-parts": [
+        [
+          2016
+        ]
+      ]
+    },
+    "DOI": "10.1145/2939672.2939823",
+    "type": "paper-conference",
+    "created": {
+      "date-parts": [
+        [
+          2016,
+          8,
+          8
+        ]
+      ],
+      "date-time": "2016-08-08T18:33:46Z",
+      "timestamp": 1470681226000
+    },
+    "source": "Crossref",
+    "is-referenced-by-count": 1,
+    "title": "Multi-layer Representation Learning for Medical Concepts",
+    "prefix": "10.1145",
+    "author": [
+      {
+        "given": "Edward",
+        "family": "Choi",
+        "affiliation": [
+          {
+            "name": "Georgia Institute of Technology, Atlanta, GA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Mohammad Taha",
+        "family": "Bahadori",
+        "affiliation": [
+          {
+            "name": "Georgia Institute of Technology, Atlanta, GA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Elizabeth",
+        "family": "Searles",
+        "affiliation": [
+          {
+            "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Catherine",
+        "family": "Coffey",
+        "affiliation": [
+          {
+            "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Michael",
+        "family": "Thompson",
+        "affiliation": [
+          {
+            "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+          }
+        ]
+      },
+      {
+        "given": "James",
+        "family": "Bost",
+        "affiliation": [
+          {
+            "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Javier",
+        "family": "Tejedor-Sojo",
+        "affiliation": [
+          {
+            "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+          }
+        ]
+      },
+      {
+        "given": "Jimeng",
+        "family": "Sun",
+        "affiliation": [
+          {
+            "name": "Georgia Institute of Technology, Atlanta, GA, USA"
+          }
+        ]
+      }
+    ],
+    "member": "320",
+    "container-title": "Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '16",
+    "original-title": [],
+    "deposited": {
+      "date-parts": [
+        [
+          2016,
+          12,
+          7
+        ]
+      ],
+      "date-time": "2016-12-07T04:54:12Z",
+      "timestamp": 1481086452000
+    },
+    "score": 1.0,
+    "subtitle": [],
+    "short-title": [],
+    "issued": {
+      "date-parts": [
+        [
+          2016
+        ]
+      ]
+    },
+    "references-count": 39,
+    "URL": "http://dx.doi.org/10.1145/2939672.2939823",
+    "relation": {
+      "cites": []
+    },
+    "id": "TwvauiTv"
+  },
+  {
+    "source": "PMC",
+    "accessed": {
+      "date-parts": [
+        [
+          2017,
+          4,
+          25
+        ]
+      ]
+    },
+    "id": "FUIfIdE",
+    "title": "Causal Phenotype Discovery via Deep Networks",
+    "author": [
+      {
+        "family": "Kale",
+        "given": "David C."
+      },
+      {
+        "family": "Che",
+        "given": "Zhengping"
+      },
+      {
+        "family": "Bahadori",
+        "given": "Mohammad Taha"
+      },
+      {
+        "family": "Li",
+        "given": "Wenzhe"
+      },
+      {
+        "family": "Liu",
+        "given": "Yan"
+      },
+      {
+        "family": "Wetzel",
+        "given": "Randall"
+      }
+    ],
+    "container-title-short": "AMIA Annu Symp Proc",
+    "container-title": "AMIA Annual Symposium Proceedings",
+    "publisher": "American Medical Informatics Association",
+    "issued": {
+      "date-parts": [
+        [
+          2015
+        ]
+      ]
+    },
+    "page": "677-686",
+    "volume": "2015",
+    "PMID": "26958203",
+    "PMCID": "PMC4765623",
+    "type": "article-journal"
+  },
   {
     "abstract": "After a more than decade-long period of relatively little research activity in the area of recurrent neural networks, several new developments will be reviewed here that have allowed substantial progress both in understanding and in technical solutions towards more efficient training of recurrent networks. These advances have been motivated by and related to the optimization issues surrounding deep learning. Although recurrent networks are extremely powerful in what they can in principle represent in terms of modelling sequences,their training is plagued by two aspects of the same issue regarding the learning of long-term dependencies. Experiments reported here evaluate the use of clipping gradients, spanning longer time ranges with leaky integration, advanced momentum techniques, using more powerful output probability models, and encouraging sparser gradients to help symmetry breaking and credit assignment. The experiments are performed on text and music data and show off the combined effects of these techniques in generally improving both training and test error.",
     "author": [
@@ -27624,5 +28362,61 @@
     },
     "title": "Generating multi-label discrete electronic health records using generative adversarial networks",
     "type": "article-journal"
+  },
+  {
+    "abstract": "Observational studies are rising in importance due to the widespread accumulation of data in fields such as healthcare, education, employment and ecology. We consider the task of answering counterfactual questions such as, “Would this patient have lower blood sugar had she received a different medication?”. We propose a new algorithmic framework for counterfactual inference which brings together ideas from domain adaptation and representation learning. In addition to a theoretical justification, we perform an empirical comparison with previous approaches to causal inference from observational data. Our deep learning algorithm significantly outperforms the previous state-of-the-art.",
+    "author": [
+      {
+        "family": "Johansson",
+        "given": "Fredrik D."
+      },
+      {
+        "family": "Shalit",
+        "given": "Uri"
+      },
+      {
+        "family": "Sontag",
+        "given": "David"
+      }
+    ],
+    "id": "173ftiSzF",
+    "issued": {
+      "date-parts": [
+        [
+          2016,
+          5
+        ]
+      ]
+    },
+    "title": "Learning representations for counterfactual inference",
+    "type": "article-journal"
+  },
+  {
+    "abstract": "Disparate areas of machine learning have benefited from models that can take raw data with little preprocessing as input and learn rich representations of that raw data in order to perform well on a given prediction task. We evaluate this approach in healthcare by using longitudinal measurements of lab tests, one of the more raw signals of a patient’s health state widely available in clinical data, to predict disease onsets. In particular, we train a Long Short-Term Memory (LSTM) recurrent neural network and two novel convolutional neural networks for multi-task prediction of disease onset for 133 conditions based on 18 common lab tests measured over time in a cohort of 298K patients derived from 8 years of administrative claims data. We compare the neural networks to a logistic regression with several hand-engineered, clinically relevant features. We find that the representation-based learning approaches significantly outperform this baseline. We believe that our work suggests a new avenue for patient risk stratification based solely on lab results.",
+    "author": [
+      {
+        "family": "Razavian",
+        "given": "Narges"
+      },
+      {
+        "family": "Marcus",
+        "given": "Jake"
+      },
+      {
+        "family": "Sontag",
+        "given": "David"
+      }
+    ],
+    "id": "c6MfDdWP",
+    "issued": {
+      "date-parts": [
+        [
+          2016,
+          8
+        ]
+      ]
+    },
+    "title": "Multi-task prediction of disease onsets from longitudinal lab tests",
+    "type": "article-journal"
   }
 ]
\ No newline at end of file
diff --git a/citations.json b/citations.json
index 20d49e31..5fb39c57 100644
--- a/citations.json
+++ b/citations.json
@@ -27272,5 +27272,793 @@
     "standard_citation": "arxiv:1703.06490v1",
     "bibtex": "@article{xl1ijigK,\n abstract = {Access to electronic health records (EHR) data has motivated computational\nadvances in medical research. However, various concerns, particularly over\nprivacy, can limit access to and collaborative use of EHR data. Sharing\nsynthetic EHR data could mitigate risk. In this paper, we propose a new\napproach, medical Generative Adversarial Network (medGAN), to generate\nrealistic synthetic EHRs. Based on an input EHR dataset, medGAN can generate\nhigh-dimensional discrete variables (e.g., binary and count features) via a\ncombination of an autoencoder and generative adversarial networks. We also\npropose minibatch averaging to efficiently avoid mode collapse, and increase\nthe learning efficiency with batch normalization and shortcut connections. To\ndemonstrate feasibility, we showed that medGAN generates synthetic EHR datasets\nthat achieve comparable performance to real data on many experiments including\ndistribution statistics, predictive modeling tasks and medical expert review.},\n archiveprefix = {arXiv},\n author = {Edward Choi and Siddharth Biswal and Bradley Malin and Jon Duke and Walter F. Stewart and Jimeng Sun},\n eprint = {1703.06490v1},\n file = {1703.06490v1.pdf},\n link = {http://arxiv.org/abs/1703.06490v1},\n month = {Mar},\n primaryclass = {cs.LG},\n title = {Generating Multi-label Discrete Electronic Health Records using\nGenerative Adversarial Networks},\n year = {2017}\n}\n\n",
     "citation_id": "xl1ijigK"
+  },
+  "arxiv:1605.03661": {
+    "source": "arxiv",
+    "identifer": "1605.03661",
+    "standard_citation": "arxiv:1605.03661",
+    "bibtex": "@article{173ftiSzF,\n abstract = {Observational studies are rising in importance due to the widespread\naccumulation of data in fields such as healthcare, education, employment and\necology. We consider the task of answering counterfactual questions such as, \"Would this patient have lower blood sugar had she received a different\nmedication?\". We propose a new algorithmic framework for counterfactual\ninference which brings together ideas from domain adaptation and representation\nlearning. In addition to a theoretical justification, we perform an empirical\ncomparison with previous approaches to causal inference from observational\ndata. Our deep learning algorithm significantly outperforms the previous\nstate-of-the-art.},\n archiveprefix = {arXiv},\n author = {Fredrik D. Johansson and Uri Shalit and David Sontag},\n eprint = {1605.03661v2},\n file = {1605.03661v2.pdf},\n link = {http://arxiv.org/abs/1605.03661v2},\n month = {May},\n primaryclass = {stat.ML},\n title = {Learning Representations for Counterfactual Inference},\n year = {2016}\n}\n\n",
+    "citation_id": "173ftiSzF"
+  },
+  "arxiv:1608.00647": {
+    "source": "arxiv",
+    "identifer": "1608.00647",
+    "standard_citation": "arxiv:1608.00647",
+    "bibtex": "@article{c6MfDdWP,\n abstract = {Disparate areas of machine learning have benefited from models that can take\nraw data with little preprocessing as input and learn rich representations of\nthat raw data in order to perform well on a given prediction task. We evaluate\nthis approach in healthcare by using longitudinal measurements of lab tests, one of the more raw signals of a patient's health state widely available in\nclinical data, to predict disease onsets. In particular, we train a Long\nShort-Term Memory (LSTM) recurrent neural network and two novel convolutional\nneural networks for multi-task prediction of disease onset for 133 conditions\nbased on 18 common lab tests measured over time in a cohort of 298K patients\nderived from 8 years of administrative claims data. We compare the neural\nnetworks to a logistic regression with several hand-engineered, clinically\nrelevant features. We find that the representation-based learning approaches\nsignificantly outperform this baseline. We believe that our work suggests a new\navenue for patient risk stratification based solely on lab results.},\n archiveprefix = {arXiv},\n author = {Narges Razavian and Jake Marcus and David Sontag},\n eprint = {1608.00647v3},\n file = {1608.00647v3.pdf},\n link = {http://arxiv.org/abs/1608.00647v3},\n month = {Aug},\n primaryclass = {cs.LG},\n title = {Multi-task Prediction of Disease Onsets from Longitudinal Lab Tests},\n year = {2016}\n}\n\n",
+    "citation_id": "c6MfDdWP"
+  },
+  "doi:10.1037/h0037350": {
+    "source": "doi",
+    "identifer": "10.1037/h0037350",
+    "standard_citation": "doi:10.1037/h0037350",
+    "citeproc": {
+      "indexed": {
+        "date-parts": [
+          [
+            2017,
+            4,
+            2
+          ]
+        ],
+        "date-time": "2017-04-02T04:48:24Z",
+        "timestamp": 1491108504137
+      },
+      "reference-count": 0,
+      "publisher": "American Psychological Association (APA)",
+      "issue": "5",
+      "content-domain": {
+        "domain": [],
+        "crossmark-restriction": false
+      },
+      "short-container-title": [
+        "Journal of Educational Psychology"
+      ],
+      "published-print": {
+        "date-parts": [
+          [
+            1974
+          ]
+        ]
+      },
+      "DOI": "10.1037/h0037350",
+      "type": "article-journal",
+      "created": {
+        "date-parts": [
+          [
+            2006,
+            6,
+            8
+          ]
+        ],
+        "date-time": "2006-06-08T01:08:10Z",
+        "timestamp": 1149728890000
+      },
+      "page": "688-701",
+      "source": "Crossref",
+      "is-referenced-by-count": 1615,
+      "title": "Estimating causal effects of treatments in randomized and nonrandomized studies.",
+      "prefix": "10.1037",
+      "volume": "66",
+      "author": [
+        {
+          "given": "Donald B.",
+          "family": "Rubin",
+          "affiliation": []
+        }
+      ],
+      "member": "15",
+      "container-title": "Journal of Educational Psychology",
+      "original-title": [],
+      "deposited": {
+        "date-parts": [
+          [
+            2011,
+            8,
+            23
+          ]
+        ],
+        "date-time": "2011-08-23T13:47:16Z",
+        "timestamp": 1314107236000
+      },
+      "score": 1.0,
+      "subtitle": [],
+      "short-title": [],
+      "issued": {
+        "date-parts": [
+          [
+            1974
+          ]
+        ]
+      },
+      "references-count": 0,
+      "alternative-id": [
+        "1975-06502-001"
+      ],
+      "URL": "http://dx.doi.org/10.1037/h0037350",
+      "relation": {},
+      "issn-type": [
+        {
+          "value": "0022-0663",
+          "type": "print"
+        }
+      ],
+      "subject": [
+        "Education",
+        "Developmental and Educational Psychology"
+      ],
+      "id": "cpNVdlL7"
+    },
+    "citation_id": "cpNVdlL7"
+  },
+  "doi:10.1093/jamia/ocw011": {
+    "source": "doi",
+    "identifer": "10.1093/jamia/ocw011",
+    "standard_citation": "doi:10.1093/jamia/ocw011",
+    "citeproc": {
+      "indexed": {
+        "date-parts": [
+          [
+            2017,
+            4,
+            1
+          ]
+        ],
+        "date-time": "2017-04-01T03:11:56Z",
+        "timestamp": 1491016316606
+      },
+      "reference-count": 31,
+      "publisher": "Oxford University Press (OUP)",
+      "issue": "4",
+      "content-domain": {
+        "domain": [],
+        "crossmark-restriction": false
+      },
+      "short-container-title": [
+        "J Am Med Inform Assoc"
+      ],
+      "published-print": {
+        "date-parts": [
+          [
+            2016,
+            7
+          ]
+        ]
+      },
+      "DOI": "10.1093/jamia/ocw011",
+      "type": "article-journal",
+      "created": {
+        "date-parts": [
+          [
+            2016,
+            4,
+            24
+          ]
+        ],
+        "date-time": "2016-04-24T00:18:19Z",
+        "timestamp": 1461457099000
+      },
+      "page": "731-740",
+      "source": "Crossref",
+      "is-referenced-by-count": 2,
+      "title": "Electronic medical record phenotyping using the anchor and learn framework",
+      "prefix": "10.1093",
+      "volume": "23",
+      "author": [
+        {
+          "given": "Yoni",
+          "family": "Halpern",
+          "affiliation": []
+        },
+        {
+          "given": "Steven",
+          "family": "Horng",
+          "affiliation": []
+        },
+        {
+          "given": "Youngduck",
+          "family": "Choi",
+          "affiliation": []
+        },
+        {
+          "given": "David",
+          "family": "Sontag",
+          "affiliation": []
+        }
+      ],
+      "member": "286",
+      "published-online": {
+        "date-parts": [
+          [
+            2016,
+            4,
+            23
+          ]
+        ]
+      },
+      "container-title": "Journal of the American Medical Informatics Association",
+      "original-title": [],
+      "deposited": {
+        "date-parts": [
+          [
+            2017,
+            1,
+            23
+          ]
+        ],
+        "date-time": "2017-01-23T19:48:54Z",
+        "timestamp": 1485200934000
+      },
+      "score": 1.0,
+      "subtitle": [],
+      "short-title": [],
+      "issued": {
+        "date-parts": [
+          [
+            2016,
+            4,
+            23
+          ]
+        ]
+      },
+      "references-count": 31,
+      "alternative-id": [
+        "10.1093/jamia/ocw011"
+      ],
+      "URL": "http://dx.doi.org/10.1093/jamia/ocw011",
+      "relation": {},
+      "issn-type": [
+        {
+          "value": "1067-5027",
+          "type": "print"
+        },
+        {
+          "value": "1527-974X",
+          "type": "electronic"
+        }
+      ],
+      "subject": [
+        "Health Informatics"
+      ],
+      "id": "A9JeoGV8"
+    },
+    "citation_id": "A9JeoGV8"
+  },
+  "doi:10.1145/2567948.2577348": {
+    "source": "doi",
+    "identifer": "10.1145/2567948.2577348",
+    "standard_citation": "doi:10.1145/2567948.2577348",
+    "citeproc": {
+      "indexed": {
+        "date-parts": [
+          [
+            2017,
+            3,
+            31
+          ]
+        ],
+        "date-time": "2017-03-31T16:59:20Z",
+        "timestamp": 1490979560540
+      },
+      "publisher-location": "New York, New York, USA",
+      "reference-count": 10,
+      "publisher": "ACM Press",
+      "license": [
+        {
+          "URL": "http://www.acm.org/publications/policies/copyright_policy#Background",
+          "start": {
+            "date-parts": [
+              [
+                2014,
+                4,
+                7
+              ]
+            ],
+            "date-time": "2014-04-07T00:00:00Z",
+            "timestamp": 1396828800000
+          },
+          "delay-in-days": 96,
+          "content-version": "vor"
+        }
+      ],
+      "content-domain": {
+        "domain": [],
+        "crossmark-restriction": false
+      },
+      "short-container-title": [],
+      "published-print": {
+        "date-parts": [
+          [
+            2014
+          ]
+        ]
+      },
+      "DOI": "10.1145/2567948.2577348",
+      "type": "paper-conference",
+      "created": {
+        "date-parts": [
+          [
+            2016,
+            2,
+            5
+          ]
+        ],
+        "date-time": "2016-02-05T19:44:31Z",
+        "timestamp": 1454701471000
+      },
+      "source": "Crossref",
+      "is-referenced-by-count": 19,
+      "title": "Learning semantic representations using convolutional neural networks for web search",
+      "prefix": "10.1145",
+      "author": [
+        {
+          "given": "Yelong",
+          "family": "Shen",
+          "affiliation": [
+            {
+              "name": "Kent State University, Kent, OH, USA"
+            }
+          ]
+        },
+        {
+          "given": "Xiaodong",
+          "family": "He",
+          "affiliation": [
+            {
+              "name": "Microsoft, Redmond, WA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Jianfeng",
+          "family": "Gao",
+          "affiliation": [
+            {
+              "name": "Microsoft, Redmond, WA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Li",
+          "family": "Deng",
+          "affiliation": [
+            {
+              "name": "Microsoft, Redmond, WA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Grégoire",
+          "family": "Mesnil",
+          "affiliation": [
+            {
+              "name": "University of Montréal, Montréal, Canada"
+            }
+          ]
+        }
+      ],
+      "member": "320",
+      "container-title": "Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion",
+      "original-title": [],
+      "deposited": {
+        "date-parts": [
+          [
+            2016,
+            12,
+            6
+          ]
+        ],
+        "date-time": "2016-12-06T21:46:47Z",
+        "timestamp": 1481060807000
+      },
+      "score": 1.0,
+      "subtitle": [],
+      "short-title": [],
+      "issued": {
+        "date-parts": [
+          [
+            2014
+          ]
+        ]
+      },
+      "references-count": 10,
+      "URL": "http://dx.doi.org/10.1145/2567948.2577348",
+      "relation": {
+        "cites": []
+      },
+      "id": "1qa47hoP"
+    },
+    "citation_id": "1qa47hoP"
+  },
+  "doi:10.1145/2661829.2661974": {
+    "source": "doi",
+    "identifer": "10.1145/2661829.2661974",
+    "standard_citation": "doi:10.1145/2661829.2661974",
+    "citeproc": {
+      "indexed": {
+        "date-parts": [
+          [
+            2017,
+            4,
+            1
+          ]
+        ],
+        "date-time": "2017-04-01T02:45:31Z",
+        "timestamp": 1491014731095
+      },
+      "publisher-location": "New York, New York, USA",
+      "reference-count": 17,
+      "publisher": "ACM Press",
+      "license": [
+        {
+          "URL": "http://www.acm.org/publications/policies/copyright_policy#Background",
+          "start": {
+            "date-parts": [
+              [
+                2014,
+                11,
+                3
+              ]
+            ],
+            "date-time": "2014-11-03T00:00:00Z",
+            "timestamp": 1414972800000
+          },
+          "delay-in-days": 306,
+          "content-version": "vor"
+        }
+      ],
+      "content-domain": {
+        "domain": [],
+        "crossmark-restriction": false
+      },
+      "short-container-title": [],
+      "published-print": {
+        "date-parts": [
+          [
+            2014
+          ]
+        ]
+      },
+      "DOI": "10.1145/2661829.2661974",
+      "type": "paper-conference",
+      "created": {
+        "date-parts": [
+          [
+            2014,
+            11,
+            7
+          ]
+        ],
+        "date-time": "2014-11-07T17:10:54Z",
+        "timestamp": 1415380254000
+      },
+      "source": "Crossref",
+      "is-referenced-by-count": 4,
+      "title": "Medical Semantic Similarity with a Neural Language Model",
+      "prefix": "10.1145",
+      "author": [
+        {
+          "given": "Lance",
+          "family": "De Vine",
+          "affiliation": []
+        },
+        {
+          "given": "Guido",
+          "family": "Zuccon",
+          "affiliation": []
+        },
+        {
+          "given": "Bevan",
+          "family": "Koopman",
+          "affiliation": []
+        },
+        {
+          "given": "Laurianne",
+          "family": "Sitbon",
+          "affiliation": []
+        },
+        {
+          "given": "Peter",
+          "family": "Bruza",
+          "affiliation": []
+        }
+      ],
+      "member": "320",
+      "container-title": "Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management - CIKM '14",
+      "original-title": [],
+      "deposited": {
+        "date-parts": [
+          [
+            2016,
+            12,
+            7
+          ]
+        ],
+        "date-time": "2016-12-07T02:44:26Z",
+        "timestamp": 1481078666000
+      },
+      "score": 1.0,
+      "subtitle": [],
+      "short-title": [],
+      "issued": {
+        "date-parts": [
+          [
+            2014
+          ]
+        ]
+      },
+      "references-count": 17,
+      "URL": "http://dx.doi.org/10.1145/2661829.2661974",
+      "relation": {
+        "cites": []
+      },
+      "id": "XQtuRkTU"
+    },
+    "citation_id": "XQtuRkTU"
+  },
+  "doi:10.1145/2939672.2939823": {
+    "source": "doi",
+    "identifer": "10.1145/2939672.2939823",
+    "standard_citation": "doi:10.1145/2939672.2939823",
+    "citeproc": {
+      "indexed": {
+        "date-parts": [
+          [
+            2017,
+            4,
+            1
+          ]
+        ],
+        "date-time": "2017-04-01T19:33:26Z",
+        "timestamp": 1491075206096
+      },
+      "publisher-location": "New York, New York, USA",
+      "reference-count": 39,
+      "publisher": "ACM Press",
+      "license": [
+        {
+          "URL": "http://www.acm.org/publications/policies/copyright_policy#Background",
+          "start": {
+            "date-parts": [
+              [
+                2016,
+                8,
+                13
+              ]
+            ],
+            "date-time": "2016-08-13T00:00:00Z",
+            "timestamp": 1471046400000
+          },
+          "delay-in-days": 225,
+          "content-version": "vor"
+        }
+      ],
+      "funder": [
+        {
+          "DOI": "10.13039/100006785",
+          "name": "Google",
+          "doi-asserted-by": "publisher",
+          "award": []
+        },
+        {
+          "DOI": "10.13039/100000001",
+          "name": "National Science Foundation",
+          "doi-asserted-by": "publisher",
+          "award": [
+            "1418511"
+          ]
+        },
+        {
+          "name": "Samsung Scholarship",
+          "award": []
+        },
+        {
+          "name": "Centers for Disease Control and Prevention",
+          "award": []
+        },
+        {
+          "name": "Children's Healthcare of Atlanta",
+          "award": []
+        },
+        {
+          "name": "UCB",
+          "award": []
+        }
+      ],
+      "content-domain": {
+        "domain": [],
+        "crossmark-restriction": false
+      },
+      "short-container-title": [],
+      "published-print": {
+        "date-parts": [
+          [
+            2016
+          ]
+        ]
+      },
+      "DOI": "10.1145/2939672.2939823",
+      "type": "paper-conference",
+      "created": {
+        "date-parts": [
+          [
+            2016,
+            8,
+            8
+          ]
+        ],
+        "date-time": "2016-08-08T18:33:46Z",
+        "timestamp": 1470681226000
+      },
+      "source": "Crossref",
+      "is-referenced-by-count": 1,
+      "title": "Multi-layer Representation Learning for Medical Concepts",
+      "prefix": "10.1145",
+      "author": [
+        {
+          "given": "Edward",
+          "family": "Choi",
+          "affiliation": [
+            {
+              "name": "Georgia Institute of Technology, Atlanta, GA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Mohammad Taha",
+          "family": "Bahadori",
+          "affiliation": [
+            {
+              "name": "Georgia Institute of Technology, Atlanta, GA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Elizabeth",
+          "family": "Searles",
+          "affiliation": [
+            {
+              "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Catherine",
+          "family": "Coffey",
+          "affiliation": [
+            {
+              "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Michael",
+          "family": "Thompson",
+          "affiliation": [
+            {
+              "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+            }
+          ]
+        },
+        {
+          "given": "James",
+          "family": "Bost",
+          "affiliation": [
+            {
+              "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Javier",
+          "family": "Tejedor-Sojo",
+          "affiliation": [
+            {
+              "name": "Children's Healthcare of Atlanta, Atlanta, GA, USA"
+            }
+          ]
+        },
+        {
+          "given": "Jimeng",
+          "family": "Sun",
+          "affiliation": [
+            {
+              "name": "Georgia Institute of Technology, Atlanta, GA, USA"
+            }
+          ]
+        }
+      ],
+      "member": "320",
+      "container-title": "Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '16",
+      "original-title": [],
+      "deposited": {
+        "date-parts": [
+          [
+            2016,
+            12,
+            7
+          ]
+        ],
+        "date-time": "2016-12-07T04:54:12Z",
+        "timestamp": 1481086452000
+      },
+      "score": 1.0,
+      "subtitle": [],
+      "short-title": [],
+      "issued": {
+        "date-parts": [
+          [
+            2016
+          ]
+        ]
+      },
+      "references-count": 39,
+      "URL": "http://dx.doi.org/10.1145/2939672.2939823",
+      "relation": {
+        "cites": []
+      },
+      "id": "TwvauiTv"
+    },
+    "citation_id": "TwvauiTv"
+  },
+  "pmid:26958203": {
+    "source": "pmid",
+    "identifer": "26958203",
+    "standard_citation": "pmid:26958203",
+    "citeproc": {
+      "source": "PMC",
+      "accessed": {
+        "date-parts": [
+          [
+            2017,
+            4,
+            25
+          ]
+        ]
+      },
+      "id": "FUIfIdE",
+      "title": "Causal Phenotype Discovery via Deep Networks",
+      "author": [
+        {
+          "family": "Kale",
+          "given": "David C."
+        },
+        {
+          "family": "Che",
+          "given": "Zhengping"
+        },
+        {
+          "family": "Bahadori",
+          "given": "Mohammad Taha"
+        },
+        {
+          "family": "Li",
+          "given": "Wenzhe"
+        },
+        {
+          "family": "Liu",
+          "given": "Yan"
+        },
+        {
+          "family": "Wetzel",
+          "given": "Randall"
+        }
+      ],
+      "container-title-short": "AMIA Annu Symp Proc",
+      "container-title": "AMIA Annual Symposium Proceedings",
+      "publisher": "American Medical Informatics Association",
+      "issued": {
+        "date-parts": [
+          [
+            2015
+          ]
+        ]
+      },
+      "page": "677-686",
+      "volume": "2015",
+      "PMID": "26958203",
+      "PMCID": "PMC4765623",
+      "type": "article-journal"
+    },
+    "citation_id": "FUIfIdE"
   }
 }
\ No newline at end of file
diff --git a/processed-citations.tsv b/processed-citations.tsv
index c66e76c0..af0e0f0e 100644
--- a/processed-citations.tsv
+++ b/processed-citations.tsv
@@ -7,12 +7,14 @@ text	citation	standard_citation	citation_id
 @arxiv:1511.02386	arxiv:1511.02386	arxiv:1511.02386	15lbUf0as
 @arxiv:1602.00357	arxiv:1602.00357	arxiv:1602.00357	HRXii6Ni
 @arxiv:1602.05629	arxiv:1602.05629	arxiv:1602.05629	TaPZBxYS
+@arxiv:1605.03661	arxiv:1605.03661	arxiv:1605.03661	173ftiSzF
 @arxiv:1605.07723	arxiv:1605.07723	arxiv:1605.07723	5Il3kN32
 @arxiv:1606.00931	arxiv:1606.00931	arxiv:1606.00931	1FE0F2pQ
 @arxiv:1606.05718	arxiv:1606.05718	arxiv:1606.05718	mbEp6jNr
 @arxiv:1606.08813v3	arxiv:1606.08813v3	arxiv:1606.08813v3	7yE9K08a
 @arxiv:1607.00133	arxiv:1607.00133	arxiv:1607.00133	ucHUOABT
 @arxiv:1607.07519	arxiv:1607.07519	arxiv:1607.07519	Ohd1Q9Xw
+@arxiv:1608.00647	arxiv:1608.00647	arxiv:1608.00647	c6MfDdWP
 @arxiv:1608.02158	arxiv:1608.02158	arxiv:1608.02158	qXdO2aMm
 @arxiv:1609.02943	arxiv:1609.02943	arxiv:1609.02943	ULSPV0rh
 @arxiv:1610.05820	arxiv:1610.05820	arxiv:1610.05820	1HbRTExaU
@@ -48,6 +50,7 @@ text	citation	standard_citation	citation_id
 @doi:10.1021/acs.molpharmaceut.5b00982	doi:10.1021/acs.molpharmaceut.5b00982	doi:10.1021/acs.molpharmaceut.5b00982	1VZjheOA
 @doi:10.1021/acs.molpharmaceut.6b00248	doi:10.1021/acs.molpharmaceut.6b00248	doi:10.1021/acs.molpharmaceut.6b00248	EMDwvRGb
 @doi:10.1021/ci500340n	doi:10.1021/ci500340n	doi:10.1021/ci500340n	16FEYidu2
+@doi:10.1037/h0037350	doi:10.1037/h0037350	doi:10.1037/h0037350	cpNVdlL7
 @doi:10.1038/nature14539	doi:10.1038/nature14539	doi:10.1038/nature14539	BeijBSRE
 @doi:10.1038/nature16961	doi:10.1038/nature16961	doi:10.1038/nature16961	2gn6PKkv
 @doi:10.1038/nbt.3313	doi:10.1038/nbt.3313	doi:10.1038/nbt.3313	yXqhuueV
@@ -75,6 +78,7 @@ text	citation	standard_citation	citation_id
 @doi:10.1093/bioinformatics/btu703	doi:10.1093/bioinformatics/btu703	doi:10.1093/bioinformatics/btu703	15E5yG1Ho
 @doi:10.1093/bioinformatics/btu791	doi:10.1093/bioinformatics/btu791	doi:10.1093/bioinformatics/btu791	7atXz0r
 @doi:10.1093/bioinformatics/btv472	doi:10.1093/bioinformatics/btv472	doi:10.1093/bioinformatics/btv472	kqjqFesT
+@doi:10.1093/jamia/ocw011	doi:10.1093/jamia/ocw011	doi:10.1093/jamia/ocw011	A9JeoGV8
 @doi:10.1093/nar/gkq747	doi:10.1093/nar/gkq747	doi:10.1093/nar/gkq747	QlbXLqH
 @doi:10.1093/nar/gku1058	doi:10.1093/nar/gku1058	doi:10.1093/nar/gku1058	12aqvAgz6
 @doi:10.1097/00000542-199004000-00024	doi:10.1097/00000542-199004000-00024	doi:10.1097/00000542-199004000-00024	nPRpl05n
@@ -97,6 +101,9 @@ text	citation	standard_citation	citation_id
 @doi:10.1136/amiajnl-2013-001935	doi:10.1136/amiajnl-2013-001935	doi:10.1136/amiajnl-2013-001935	11OyzMl87
 @doi:10.1142/9789813207813_0050	doi:10.1142/9789813207813_0050	doi:10.1142/9789813207813_0050	qe90c1CL
 @doi:10.1142/9789814644730_0014	doi:10.1142/9789814644730_0014	doi:10.1142/9789814644730_0014	PBiRSdXv
+@doi:10.1145/2567948.2577348	doi:10.1145/2567948.2577348	doi:10.1145/2567948.2577348	1qa47hoP
+@doi:10.1145/2661829.2661974	doi:10.1145/2661829.2661974	doi:10.1145/2661829.2661974	XQtuRkTU
+@doi:10.1145/2939672.2939823	doi:10.1145/2939672.2939823	doi:10.1145/2939672.2939823	TwvauiTv
 @doi:10.1158/1078-0432.CCR-13-0583	doi:10.1158/1078-0432.CCR-13-0583	doi:10.1158/1078-0432.ccr-13-0583	pEIw87Mp
 @doi:10.1186/1758-2946-5-30	doi:10.1186/1758-2946-5-30	doi:10.1186/1758-2946-5-30	M1EW8Rfl
 @doi:10.1186/s12859-015-0845-0	doi:10.1186/s12859-015-0845-0	doi:10.1186/s12859-015-0845-0	18lqFDKRR
@@ -119,6 +126,7 @@ text	citation	standard_citation	citation_id
 @doi:10.3389/fgene.2014.00342	doi:10.3389/fgene.2014.00342	doi:10.3389/fgene.2014.00342	ppGS5h4v
 @pmid:21347133	pmid:21347133	pmid:21347133	y9ONtSZ9
 @pmid:24159271	pmid:24159271	pmid:24159271	11sli93ov
+@pmid:26958203	pmid:26958203	pmid:26958203	FUIfIdE
 @pmid:27134610	pmid:27134610	pmid:27134610	4rTluXLs
 @tag:Abe	doi:10.1101/gr.634603	doi:10.1101/gr.634603	1HhqhBwrM
 @tag:Alipanahi2015_predicting	doi:10.1038/nmeth.3547	doi:10.1038/nmeth.3547	2UI1BZuD