Hello, this is winnerineast. I believe a better future lies in humans plus machines, and I am working to make that happen. Here is my inventory of all kinds of knowledge I have collected from the internet, none of it requiring any sign-in.
Special and equal thanks to everyone whose information I leverage or borrow; I simply append his/her name at the end of the entry.
- General Machine Learning Topics
- 5 Skills You Need to Become a Machine Learning Engineer
- A Learning Sabbatical focused on Machine Learning
- Algorithms
- 10 Machine Learning Algorithms Explained to an ‘Army Soldier’
- 10 Machine Learning Terms Explained in Simple English
- A Tour of Machine Learning Algorithms
- Comparing supervised learning algorithms
- Machine Learning Algorithms: A collection of minimal and clean implementations of machine learning algorithms
- The 10 Algorithms Machine Learning Engineers Need to Know
- Top 10 data mining algorithms in plain English
- Are you a self-taught machine learning engineer? If yes, how did you do it & how long did it take you?
- Awesome Machine Learning
- Comprehensive list of data science resources
- CreativeAi's Machine Learning
- DigitalMind's Artificial Intelligence resources
- How can one become a good machine learning engineer?
- How I wrote my first Machine Learning program in 3 days
- How to become a Data Scientist in 6 months: A hacker’s approach to career planning
- Interview Machine Learning Engineer Questions
- 121 Essential Machine Learning Questions & Answers
- 21 Must-Know Data Science Interview Questions and Answers
- 40 Interview Questions asked at Startups in Machine Learning / Data Science
- Collection of Machine Learning Interview Questions
- How To Prepare For A Machine Learning Interview
- Machine Learning Engineer interview questions
- Popular Machine Learning Interview Questions
- Top 50 Machine learning Interview questions & Answers
- What are some common Machine Learning interview questions?
- What are the best interview questions to evaluate a machine learning researcher?
- Learning Path: Your mentor to become a machine learning expert
- Machine Learning Blog by Brian McFee
- Machine Learning in a Week
- Machine Learning in a Year
- You Too Can Become a Machine Learning Rock Star! No PhD
- NLP
- General
- Blog Post on Deep Learning, NLP, and Representations
- Blog Post on NLP Tutorial
- Deep Learning, NLP, and Representations
- Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations
- Natural Language Processing Blog by Hal Daumé III
- Presentation slides for MLN tutorial
- Presentation slides for QA applications of MLNs
- Presentation slides
- Relation Extraction with Matrix Factorization and Universal Schemas
- Towards a Formal Distributional Semantics: Simulating Logical Calculi with Tensors
- Word2Vec
- Sentiment Analysis
- Word Vectors
- General
- Learning machine learning: hints and clues
- 5 Techniques To Understand Machine Learning Algorithms Without the Background in Mathematics
- Can I learn and get a job in Machine Learning without studying CS Master and PhD?
- Don’t Break The Chain
- How do I get a job in Machine Learning as a software programmer who self-studies Machine Learning, but never has a chance to use it at work?
- How do I learn machine learning?
- How to learn on your own
- I think the best way for a practice-focused methodology is something like a 'practice, learning, practice' loop: students first come to existing projects, with problems and solutions in hand (practice), to get familiar with the traditional methods of the area and perhaps with its methodology. After gaining some elementary experience, they can go to the books and study the underlying theory, which will guide their more advanced practice later and enhance their toolbox for solving practical problems. Studying theory also deepens their understanding of that elementary experience and helps them acquire advanced experience more quickly.
- Kaggle knowledge competitions
- Learning How to Learn
- Nam Vu - Top-down learning path: machine learning for software engineers
- There are two sides to machine learning:
- What if I’m Not Good at Mathematics
- What is the difference between Data Analytics, Data Analysis, Data Mining, Data Science, Machine Learning, and Big Data?
- What skills are needed for machine learning jobs?
- General Machine Learning Topics
- A Visual Introduction to Machine Learning
- A Gentle Guide to Machine Learning
- Deep Learning - A Non-Technical Introduction
- Applied Machine Learning with Machine Learning Mastery
- Inky Machine Learning
- Introduction to Machine Learning for Developers
- How do you explain Machine Learning and Data Mining to non-Computer-Science people?
- Machine Learning Algorithms Mini-Course
- Machine Learning basics for a newbie
- Machine learning: an in-depth, non-technical guide
- Video Series
- Machine Learning for Hackers
- Fresh Machine Learning
- Machine Learning Recipes with Josh Gordon
- Everything You Need to know about Machine Learning in 30 Minutes or Less
- A Friendly Introduction to Machine Learning
- Nuts and Bolts of Applying Deep Learning - Andrew Ng
- BigML Webinar
- mathematicalmonk's Machine Learning tutorials
- Machine learning in Python with scikit-learn
- My playlist – Top YouTube Videos on Machine Learning, Neural Network & Deep Learning
- 16 New Must Watch Tutorials, Courses on Machine Learning
- DeepLearning.TV
- Learning To See
- Machine learning is fun
- Machine Learning for Programmers
- Machine Learning: Under the hood. A blog post that explains the principles of machine learning in layman's terms; simple and clear.
- Python Machine Learning Mini-Course
- The Machine Learning Mastery Method
- What is machine learning, and how does it work?
- MOOC
- Udacity’s Intro to Machine Learning
- Udacity’s Supervised, Unsupervised & Reinforcement
- Machine Learning Foundations: A Case Study Approach
- Coursera’s Machine Learning
- Machine Learning Distilled
- BigML training
- Coursera’s Neural Networks for Machine Learning
- Taught by Geoffrey Hinton, a pioneer in the field of neural networks
- Machine Learning - CS - Oxford University
- Creative Applications of Deep Learning with TensorFlow
- Intro to Descriptive Statistics
- Intro to Inferential Statistics
- Resources
- Learn Machine Learning in a Single Month
- The Non-Technical Guide to Machine Learning & Artificial Intelligence
- Machine Learning for Software Engineers on Hacker News
- Machine Learning for Developers
- Machine Learning Advice for Developers
- Machine Learning For Complete Beginners
- Getting Started with Machine Learning: For absolute beginners and fifth graders
- How to Learn Machine Learning: The Self-Starter Way
- Machine Learning Self-study Resources
- Level-Up Your Machine Learning
- Enough Machine Learning to Make Hacker News Readable Again
- Dive into Machine Learning
- Machine Learning courses in Universities
- NLP
- Kyunghyun Cho's NLP course at NYU
- Stanford Natural Language Processing - intro NLP course with videos. No deep learning, but a good primer on traditional NLP.
- Stanford CS 224D: Deep Learning for NLP class
- Richard Socher (2016). Class with syllabus and slides. Videos: 2015 lectures / 2016 lectures
- Michael Collins - one of the best NLP teachers. Check out the material for the courses he teaches.
- Intro to Natural Language Processing on Coursera by U of Michigan
- Intro to Artificial Intelligence course on Udacity which also covers NLP
- Deep Learning for Natural Language Processing (2015 classes) by Richard Socher
- Deep Learning for Natural Language Processing (2016 classes) by Richard Socher. Updated to make use of TensorFlow. Note that some lectures are missing (lecture 9, and lectures 12 onwards).
- Natural Language Processing - a Coursera course that ran only in 2013. The videos are not available at the moment, but Mike Collins is a great professor and his notes and lectures are very good.
- Statistical Machine Translation - a Machine Translation course with great assignments and slides.
- Natural Language Processing SFU - course by Prof Anoop Sarkar on Natural Language Processing. Good notes and some good lectures on YouTube about HMMs.
- Udacity Deep Learning - Deep Learning course on Udacity (using TensorFlow) which covers a section on using deep learning for NLP tasks (covering Word2Vec, RNNs and LSTMs).
- NLTK with Python 3 for Natural Language Processing by Harrison Kinsley (sentdex). Good tutorials with NLTK code implementations.
- Mikolov et al. 2013. Performs well on word similarity and analogy tasks. Expands on the famous example: King – Man + Woman = Queen (a runnable sketch of this query follows this list).
- Yoav Goldberg
- Quoc V. Le
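The famous analogy above is easy to try with gensim's word2vec implementation. A minimal sketch (assuming gensim 4.x; the toy corpus is only illustrative, so load pre-trained vectors or a large corpus for meaningful results):

```python
# Sketch of the "King - Man + Woman ≈ Queen" query with gensim word2vec.
# The toy corpus below is far too small for good vectors; it just shows the API.
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "man", "walks", "in", "the", "city"],
    ["the", "woman", "walks", "in", "the", "city"],
]

# gensim 4.x API; older gensim 3.x used `size=` instead of `vector_size=`.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=100)

# Which word is to "woman" as "king" is to "man"?
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```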
- General Machine Learning Topics
- NLP
- Memory networks are implemented in MemNN. Attempts to solve tasks of reasoning, attention and memory.
- Pre-trained word embeddings for the WSJ corpus by Koc AI-Lab
- HLBL language model by Turian
- Real-valued vector "embeddings" by Dhillon
- Improving Word Representations Via Global Context And Multiple Word Prototypes by Huang
- TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
- Node.js and JavaScript - Node.js Libraries for NLP
- Twitter-text - A JavaScript implementation of Twitter's text processing library
- Knwl.js - A Natural Language Processor in JS
- Retext - Extensible system for analyzing and manipulating natural language
- NLP Compromise - Natural Language processing in the browser
- Natural - general natural language facilities for node
- Python - Python NLP Libraries (a short usage sketch follows this list)
- Scikit-learn: Machine learning in Python
- Natural Language Toolkit (NLTK)
- Pattern - A web mining module for the Python programming language. It has tools for natural language processing, machine learning, and more.
- TextBlob - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
- YAlign - A sentence aligner, a friendly tool for extracting parallel sentences from comparable corpora.
- jieba - Chinese Words Segmentation Utilities.
- SnowNLP - A library for processing Chinese text.
- KoNLPy - A Python package for Korean natural language processing.
- Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit)
- BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
- PyNLPl - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for FoLiA, but also ARPA language models, Moses phrasetables, GIZA++ alignments.
- python-ucto - Python binding to ucto (a unicode-aware rule-based tokenizer for various languages)
- python-frog - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
- python-zpar - Python bindings for ZPar, a statistical part-of-speech tagger, constituency parser, and dependency parser for English.
- colibri-core - Python binding to a C++ library for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
- spaCy - Industrial strength NLP with Python and Cython.
- PyStanfordDependencies - Python interface for converting Penn Treebank trees to Stanford Dependencies.
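A minimal sketch of two of the Python libraries above working together, NLTK for tokenization and scikit-learn for a bag-of-words classifier (assuming `pip install nltk scikit-learn` and NLTK's punkt data; the toy corpus is illustrative only):

```python
# NLTK: tokenization of raw text.
import nltk
nltk.download("punkt", quiet=True)  # one-time download of the tokenizer model
print(nltk.word_tokenize("NLTK splits raw text into tokens."))

# scikit-learn: a tiny bag-of-words sentiment classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["great movie", "terrible film", "loved it", "hated it"]
labels = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)
print(clf.predict(["loved this movie"]))  # -> ['pos'] on this toy data
```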
- C++ - C++ NLP Libraries
- MIT Information Extraction Toolkit - C, C++, and Python tools for named entity recognition and relation extraction
- CRF++ - Open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data & other Natural Language Processing tasks.
- CRFsuite - An implementation of Conditional Random Fields (CRFs) for labeling sequential data (the model both CRF libraries implement is sketched after this list).
- BLLIP Parser - BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
- colibri-core - C++ library, command line tools, and Python binding for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
- ucto - Unicode-aware regular-expression based tokenizer for various languages. Tool and C++ library. Supports FoLiA format.
- libfolia - C++ library for the FoLiA format
- frog - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
- MeTA - ModErn Text Analysis, a C++ data science toolkit that facilitates mining big text data.
- Mecab (Japanese)
- Mecab (Korean)
- Moses
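For reference, CRF++ and CRFsuite above both implement the linear-chain conditional random field of Lafferty et al. (2001), which models a label sequence y given an observation sequence x as:

```latex
% Linear-chain CRF: f_k are feature functions over adjacent labels and the
% input, \lambda_k are learned weights, and Z(x) normalizes over all label
% sequences.
P(y \mid x) = \frac{1}{Z(x)}
  \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Big)
```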
- Java - Java NLP Libraries
- Stanford NLP
- OpenNLP
- ClearNLP
- Word2vec in Java
- ReVerb - Web-Scale Open Information Extraction
- OpenRegex - An efficient and flexible token-based regular expression language and engine.
- CogcompNLP - Core libraries developed in the U of Illinois' Cognitive Computation Group.
- Scala - Scala NLP Libraries
- Saul - Library for developing NLP systems, including built-in modules like SRL, POS, etc.
- Clojure - Clojure NLP Libraries
- Clojure-openNLP - Natural Language Processing in Clojure (opennlp)
- Inflections-clj - Rails-like inflection library for Clojure and ClojureScript
- Beginner Books
- Practical Books
- Machine Learning for Hackers
- Python Machine Learning
- Programming Collective Intelligence: Building Smart Web 2.0 Applications
- Machine Learning: An Algorithmic Perspective, Second Edition
- Introduction to Machine Learning with Python: A Guide for Data Scientists
- Data Mining: Practical Machine Learning Tools and Techniques, Third Edition
- Teaching material
- Machine Learning in Action
- Reactive Machine Learning Systems (MEAP)
- An Introduction to Statistical Learning
- Building Machine Learning Systems with Python
- Learning scikit-learn: Machine Learning in Python
- Probabilistic Programming & Bayesian Methods for Hackers
- Probabilistic Graphical Models: Principles and Techniques
- Machine Learning: Hands-On for Developers and Technical Professionals
- Learning from Data
- Reinforcement Learning: An Introduction (2nd Edition)
- Machine Learning with TensorFlow (MEAP)
- General Machine Learning Topics
- A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks [OpenReview]
- A Way out of the Odyssey: Analyzing and Combining Recent Insights for LSTMs [arXiv]
- b-GAN: Unified Framework of Generative Adversarial Networks [OpenReview]
- Can Active Memory Replace Attention? [arXiv]
- Capacity and Learnability in Recurrent Neural Networks [OpenReview]
- Categorical Reparameterization with Gumbel-Softmax [arXiv]
- Deep Information Propagation [OpenReview]
- Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models [arXiv]
- Fully Character-Level Neural Machine Translation without Explicit Segmentation [arXiv]
- Importance Sampling with Unequal Support [arXiv]
- Incremental Sequence Learning [arXiv]
- Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning [arXiv]
- Learning to Protect Communications with Adversarial Neural Cryptography [arXiv]
- Professor Forcing: A New Algorithm for Training Recurrent Networks [arXiv]
- Overcoming catastrophic forgetting in neural networks [arXiv]
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer [OpenReview]
- Quasi-Recurrent Neural Networks [arXiv]
- Structured Attention Networks [OpenReview]
- Unrolled Generative Adversarial Networks [OpenReview]
- Using Fast Weights to Attend to the Recent Past [arXiv]
- A Diversity-Promoting Objective Function for Neural Conversation Models [arXiv]
- A Neural Attention Model for Abstractive Sentence Summarization [arXiv]
- A Neural Conversational Model [arXiv]
- A Neural Network Approach to Context-Sensitive Generation of Conversational Responses [arXiv]
- A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification [arXiv]
- Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models [arXiv]
- Adaptive Computation Time for Recurrent Neural Networks [arXiv]
- An Actor-Critic Algorithm for Sequence Prediction [arXiv]
- Associative Long Short-Term Memory [arXiv]
- Attention with Intention for a Neural Network Conversation Model [arXiv]
- Attention-over-Attention Neural Networks for Reading Comprehension [arXiv]
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [arXiv]
- Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex [arXiv]
- Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models [arXiv]
- Building Machines That Learn and Think Like People [arXiv]
- Character-Aware Neural Language Models [arXiv]
- Character-level Convolutional Networks for Text Classification [arXiv]
- Contextual LSTM (CLSTM) models for Large scale NLP tasks [arXiv]
- Deep Knowledge Tracing [arXiv]
- Deep Networks with Stochastic Depth [arXiv]
- Distilling the Knowledge in a Neural Network [arXiv]
- Distributed Representations of Sentences and Documents [arXiv]
- Document Embedding with Paragraph Vectors [arXiv]
- Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers [arXiv]
- End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning [arXiv]
- End-To-End Memory Networks [arXiv]
- Exploring the Limits of Language Modeling [arXiv]
- Generating Sentences from a Continuous Space [arXiv]
- Grammar as a Foreign Language [arXiv]
- Hierarchical Multiscale Recurrent Neural Networks [arXiv]
- Incorporating Copying Mechanism in Sequence-to-Sequence Learning [arXiv]
- Latent Predictor Networks for Code Generation [arXiv]
- Layer Normalization [arXiv]
- Learning Online Alignments with Continuous Rewards Policy Gradient [arXiv]
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation [arXiv]
- Learning to Execute [arXiv]
- Learning to Translate in Real-time with Neural Machine Translation [arXiv]
- Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism [arXiv]
- Natural Language Comprehension with the EpiReader [arXiv]
- Neural Machine Translation by Jointly Learning to Align and Translate [arXiv]
- Neural Machine Translation with Recurrent Attention Modeling [arXiv]
- Neural Responding Machine for Short-Text Conversation [arXiv]
- Neural Turing Machines [arXiv]
- On the Properties of Neural Machine Translation: Encoder-Decoder Approaches [arXiv]
- On Using Very Large Target Vocabulary for Neural Machine Translation [arXiv]
- Pointer Networks [arXiv]
- Pointer Sentinel Mixture Models [arXiv]
- Progressive Neural Networks [arXiv]
- Recurrent Memory Network for Language Modeling [arXiv]
- Recurrent Models of Visual Attention [arXiv]
- Recurrent Neural Machine Translation [arXiv]
- Recurrent Neural Network Regularization [arXiv]
- ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks [arXiv]
- Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves [arXiv]
- SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient [arXiv]
- Sequence to Sequence Learning with Neural Networks [arXiv]
- Sequence-Level Knowledge Distillation [arXiv]
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention [arXiv]
- Skip-Thought Vectors [arXiv]
- Spatial Transformer Networks [arXiv]
- SQuAD: 100,000+ Questions for Machine Comprehension of Text [arXiv]
- Text Understanding from Scratch [arXiv]
- Training Very Deep Networks [arXiv]
- WebNav: A New Large-Scale Task for Natural Language based Sequential Decision Making [arXiv]
- Wide & Deep Learning for Recommender Systems [arXiv]
- A Character-level Decoder without Explicit Segmentation for Neural Machine Translation [arXiv]
- A Convolutional Neural Network for Modelling Sentences [arXiv]
- A Fast Unified Model for Parsing and Sentence Understanding [arXiv]
- A guide to convolution arithmetic for deep learning [arXiv]
- A Network-based End-to-End Trainable Task-oriented Dialogue System [arXiv]
- A New Method to Visualize Deep Neural Networks [arXiv]
- A Persona-Based Neural Conversation Model [arXiv]
- A Primer on Neural Network Models for Natural Language Processing [arXiv]
- A Roadmap towards Machine Intelligence [arXiv]
- A Semisupervised Approach for Language Identification based on Ladder Networks [arXiv]
- A Survey: Time Travel in Deep Learning Space: An Introduction to Deep Learning Models and How Deep Learning Models Evolved from the Initial Ideas [arXiv]
- Adversarially Learned Inference [arXiv]
- Architectural Complexity Measures of Recurrent Neural Networks [arXiv]
- Ask Me Anything: Dynamic Memory Networks for Natural Language Processing [arXiv]
- Aspect Level Sentiment Classification with Deep Memory Network [arXiv]
- Attend, Infer, Repeat: Fast Scene Understanding with Generative Models [arXiv]
- Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures [arXiv]
- AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge [arXiv]
- Bag of Tricks for Efficient Text Classification [arXiv]
- Benefits of depth in neural networks [arXiv]
- BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 [arXiv]
- Bitwise Neural Networks [arXiv]
- Character-based Neural Machine Translation [arXiv]
- COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images [arXiv]
- Cognitive Science in the era of Artificial Intelligence: A roadmap for reverse-engineering the infant language-learner [arXiv]
- Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search [arXiv]
- Colorful Image Colorization [arXiv]
- Concrete Problems in AI Safety [arXiv]
- Conditional Image Generation with PixelCNN Decoders [arXiv]
- Connecting Generative Adversarial Networks and Actor-Critic Methods [arXiv]
- Context-Dependent Word Representation for Neural Machine Translation [arXiv]
- Conversational Contextual Cues: The Case of Personalization and History for Response Ranking [arXiv]
- Convolutional Neural Networks for Sentence Classification [arXiv]
- Correlational Neural Networks [arXiv]
- Coverage-based Neural Machine Translation [arXiv]
- Data Programming: Creating Large Training Sets, Quickly [arXiv]
- Deconstructing the Ladder Network Architecture [arXiv]
- Decoupled Neural Interfaces using Synthetic Gradients [arXiv]
- Deep API Learning [arXiv]
- Deep Learning without Poor Local Minima [arXiv]
- Deep Neural Networks for YouTube Recommendations [paper]
- Deep Portfolio Theory [arXiv]
- Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation [arXiv]
- Deep Reinforcement Learning Discovers Internal Models [arXiv]
- Deep Reinforcement Learning for Dialogue Generation [arXiv]
- Deeply-Fused Nets [arXiv]
- DeViSE: A Deep Visual-Semantic Embedding Model [pub]
- Dialog-based Language Learning [arXiv]
- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning [arXiv]
- Dynamic Capacity Networks [arXiv]
- Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes [arXiv]
- Effective Use of Word Order for Text Categorization with Convolutional Neural Networks [arXiv]
- Efficient Estimation of Word Representations in Vector Space [arXiv]
- End-to-End Reinforcement Learning of Dialogue Agents for Information Access [arXiv]
- End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF [arXiv]
- Energy-based Generative Adversarial Network [arXiv]
- Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks [arXiv]
- Exploiting Similarities among Languages for Machine Translation [arXiv]
- Extraction of Salient Sentences from Labelled Documents [arXiv]
- Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation [arXiv]
- FractalNet: Ultra-Deep Neural Networks without Residuals [arXiv]
- Gated-Attention Readers for Text Comprehension [arXiv]
- Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus [arXiv]
- Generating images with recurrent adversarial networks [arXiv]
- Generative Adversarial Networks [arXiv]
- Going Deeper with Convolutions [arXiv]
- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation [arXiv]
- Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [arXiv]
- Hierarchical Memory Networks [arXiv]
- Higher Order Recurrent Neural Networks [arXiv]
- How NOT To Evaluate Your Dialogue System [arXiv]
- HyperNetworks [arXiv]
- Improved Techniques for Training GANs [arXiv]
- Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs [arXiv]
- Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning [arXiv]
- Improving the Robustness of Deep Neural Networks via Stability Training [arXiv]
- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [arXiv]
- Iterative Alternating Neural Attention for Machine Reading [arXiv]
- Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition [arXiv]
- Key-Value Memory Networks for Directly Reading Documents [arXiv]
- Language to Logical Form with Neural Attention [arXiv]
- Learning Discriminative Features via Label Consistent Neural Network [arXiv]
- Learning Distributed Representations of Sentences from Unlabelled Data [arXiv]
- Learning End-to-End Goal-Oriented Dialog [arXiv]
- Learning Language Games through Interaction [arXiv]
- Learning Longer Memory in Recurrent Neural Networks [arXiv]
- Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention [arXiv]
- Learning Simple Algorithms from Examples [arXiv]
- Learning to Compose Neural Networks for Question Answering [arXiv]
- Learning to learn by gradient descent by gradient descent [arXiv]
- Learning to Transduce with Unbounded Memory [arXiv]
- Learning Word Segmentation Representations to Improve Named Entity Recognition for Chinese Social Media [arXiv]
- Listen, Attend and Spell [arXiv]
- Long Short-Term Memory-Networks for Machine Reading [arXiv]
- Machine Comprehension Using Match-LSTM and Answer Pointer [arXiv]
- Maxout Networks [arXiv]
- Memory-Efficient Backpropagation Through Time [arXiv]
- Memory-enhanced Decoder for Neural Machine Translation [arXiv]
- Model-Free Episodic Control [arXiv]
- Movie Description [arXiv]
- MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition [arXiv]
- Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss [arXiv]
- Multiple Object Recognition with Visual Attention [arXiv]
- Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation [arXiv]
- Multi-Task Cross-Lingual Sequence Tagging from Scratch [arXiv]
- Natural Language Processing (almost) from Scratch [arXiv]
- Net2Net: Accelerating Learning via Knowledge Transfer [arXiv]
- Neural Architectures for Fine-grained Entity Type Classification [arXiv]
- Neural Architectures for Named Entity Recognition [arXiv]
- Neural GPUs Learn Algorithms [arXiv]
- Neural Language Correction with Character-Based Attention [arXiv]
- Neural Net Models for Open-Domain Discourse Coherence [arXiv]
- Neural Network Translation Models for Grammatical Error Correction [arXiv]
- Neural Programmer: Inducing Latent Programs with Gradient Descent [arXiv]
- Neural Programmer-Interpreters [arXiv]
- Neural Random-Access Machines [arXiv]
- Neural Semantic Encoders [arXiv]
- Neural Variational Inference for Text Processing [arXiv]
- On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models [arXiv]
- On Multiplicative Integration with Recurrent Neural Networks [arXiv]
- One-Shot Generalization in Deep Generative Models [arXiv]
- One-shot Learning with Memory-Augmented Neural Networks [arXiv]
- Online and Offline Handwritten Chinese Character Recognition [arXiv]
- Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network [arXiv]
- PHOCNet: A Deep Convolutional Neural Network for Word Spotting in Handwritten Documents [arXiv]
- Pixel Recurrent Neural Networks [arXiv]
- Playing FPS Games with Deep Reinforcement Learning [arXiv]
- Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games [arXiv]
- Policy Networks with Two-Stage Training for Dialogue Systems [arXiv]
- Recurrent Batch Normalization [arXiv]
- Recurrent Dropout without Memory Loss [arXiv]
- Recurrent Highway Networks [arXiv]
- Recurrent Neural Network Grammars [arXiv]
- Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [arXiv]
- Regularizing RNNs by Stabilizing Activations [arXiv]
- Reinforcement Learning Neural Turing Machines [arXiv]
- ReSeg: A Recurrent Neural Network for Object Segmentation [arXiv]
- Residual Networks of Residual Networks: Multilevel Residual Networks [arXiv]
- Safe and Efficient Off-Policy Reinforcement Learning [arXiv]
- Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention [arXiv]
- Semi-Supervised Classification with Graph Convolutional Networks [arXiv]
- Semi-Supervised Learning with Ladder Networks [arXiv]
- Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction [arXiv]
- Sequence-to-Sequence Learning as Beam-Search Optimization [arXiv]
- Sequence-to-Sequence RNNs for Text Summarization [arXiv]
- Session-based Recommendations with Recurrent Neural Networks [arXiv]
- sk_p: a neural program corrector for MOOCs [arXiv]
- Smart Reply: Automated Response Suggestion for Email [arXiv]
- Stacked Approximated Regression Machine: A Simple Deep Learning Approach [arXiv]
- Stealing Machine Learning Models via Prediction APIs [arXiv]
- Survey on the attention based RNN model and its applications in computer vision [arXiv]
- Swivel: Improving Embeddings by Noticing What’s Missing [arXiv]
- Temporal Attention Model for Neural Machine Translation [arXiv]
- TensorFlow: A system for large-scale machine learning [arXiv]
- The IBM 2016 English Conversational Telephone Speech Recognition System [arXiv]
- The Inevitability of Probability: Probabilistic Inference in Generic Neural Networks Trained with Non-Probabilistic Feedback [arXiv]
- Towards an integration of deep learning and neuroscience [arXiv]
- Towards Deep Symbolic Reinforcement Learning [arXiv]
- Towards Principled Unsupervised Learning [arXiv]
- Training Recurrent Neural Networks by Diffusion [arXiv]
- Tree-structured composition in neural networks without tree-structured architectures [arXiv]
- Tutorial on Variational Autoencoders [arXiv]
- Understanding Deep Convolutional Networks [arXiv]
- Unsupervised Learning for Physical Interaction through Video Prediction [arXiv]
- Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles [arXiv]
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [arXiv]
- Very Deep Convolutional Networks for Large-Scale Image Recognition [arXiv]
- Very Deep Convolutional Networks for Natural Language Processing [arXiv]
- Virtual Adversarial Training for Semi-Supervised Text Classification [arXiv]
- Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations [arXiv]
- Visual Storytelling [arXiv]
- Visualizing and Understanding Convolutional Networks [arXiv]
- Visualizing and Understanding Neural Models in NLP [arXiv]
- WaveNet: A Generative Model For Raw Audio [arXiv]
- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks [arXiv]
- Wide Residual Networks [arXiv]
- WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia [arXiv]
- Zero-Resource Translation with Multi-Lingual Neural Machine Translation [arXiv]
- Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations [arXiv]
- Computer Vision
- Image-to-Image Translation with Conditional Adversarial Networks [arXiv]
- Lip Reading Sentences in the Wild [arXiv]
- Deep Residual Learning for Image Recognition [arXiv]
- Rethinking the Inception Architecture for Computer Vision [arXiv]
- Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks [arXiv]
- Video Pixel Networks [arXiv]
- Audio Processing
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin [arXiv]
- Reinforcement Learning
- Learning to reinforcement learn [arXiv]
- A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [arXiv]
- The Predictron: End-To-End Learning and Planning [OpenReview]
- Third-Person Imitation Learning [OpenReview]
- Generalizing Skills with Semi-Supervised Reinforcement Learning [OpenReview]
- Sample Efficient Actor-Critic with Experience Replay [OpenReview]
- Reinforcement Learning with Unsupervised Auxiliary Tasks [arXiv]
- Neural Architecture Search with Reinforcement Learning [OpenReview]
- Towards Information-Seeking Agents [OpenReview]
- Multi-Agent Cooperation and the Emergence of (Natural) Language [OpenReview]
- Improving Policy Gradient by Exploring Under-appreciated Rewards [OpenReview]
- Stochastic Neural Networks for Hierarchical Reinforcement Learning [OpenReview]
- Tuning Recurrent Neural Networks with Reinforcement Learning [OpenReview]
- RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning [arXiv]
- Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning [OpenReview]
- Learning to Perform Physics Experiments via Deep Reinforcement Learning [OpenReview]
- Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU [OpenReview]
- Learning to Compose Words into Sentences with Reinforcement Learning [OpenReview]
- Deep Reinforcement Learning for Accelerating the Convergence Rate [OpenReview]
- #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning [arXiv]
- Learning to Navigate in Complex Environments [arXiv]
- Unsupervised Perceptual Rewards for Imitation Learning [OpenReview]
- Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic [OpenReview]
- NLP
- General Topics
- Strategies for Training Large Vocabulary Neural Language Models [arXiv]
- Multilingual Language Processing From Bytes [arXiv]
- Learning Document Embeddings by Predicting N-grams for Sentiment Classification of Long Movie Reviews [arXiv]
- Target-Dependent Sentiment Classification with Long Short Term Memory [arXiv]
- Reading Text in the Wild with Convolutional Neural Networks [arXiv]
- Deep Reinforcement Learning with a Natural Language Action Space [arXiv]
- Sequence Level Training with Recurrent Neural Networks [arXiv]
- Teaching Machines to Read and Comprehend [arXiv]
- Semi-supervised Sequence Learning [arXiv]
- Multi-task Sequence to Sequence Learning [arXiv]
- Alternative structures for character-level RNNs [arXiv]
- Larger-Context Language Modeling [arXiv]
- A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding [arXiv]
- Towards Universal Paraphrastic Sentence Embeddings [arXiv]
- BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies [arXiv]
- Natural Language Understanding with Distributed Representation [arXiv]
- sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings [arXiv]
- LSTM-based Deep Learning Models for non-factoid answer selection [arXiv]
- Review Articles
- Deep Learning for Web Search and Natural Language Processing
- Probabilistic topic models
- Natural language processing: an introduction
- A unified architecture for natural language processing: Deep neural networks with multitask learning
- A Critical Review of Recurrent Neural Networks for Sequence Learning
- Deep parsing in Watson
- Online named entity recognition method for microtexts in social networking services: A case study of Twitter
- Word Vectors
- A Primer on Neural Network Models for Natural Language Processing Yoav Goldberg. October 2015. No new info; a 75-page summary of the state of the art.
- A neural probabilistic language model Bengio 2003. Seminal paper on word vectors.
- Efficient Estimation of Word Representations in Vector Space
- Distributed Representations of Words and Phrases and their Compositionality
- Linguistic Regularities in Continuous Space Word Representations
- Enriching Word Vectors with Subword Information
- Deep Learning, NLP, and Representations
- GloVe: Global vectors for word representation Pennington, Socher, Manning. 2014. Creates word vectors and relates word2vec to matrix factorizations. The evaluation section led to controversy, discussed by Yoav Goldberg. (The skip-gram objective these models build on is sketched after this list.)
- Infinite Dimensional Word Embeddings - new
- Skip Thought Vectors - word representation method
- Adaptive skip-gram - similar approach, with adaptive properties
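The skip-gram-with-negative-sampling objective behind word2vec (from the Mikolov et al. 2013 papers above): for each observed pair of input word w_I and context word w_O, maximize

```latex
% v and v' are the input and output embedding tables, \sigma is the logistic
% function, k is the number of negative samples, and P_n(w) is the noise
% distribution (the unigram distribution raised to the 3/4 power).
\log \sigma\big( {v'_{w_O}}^{\top} v_{w_I} \big)
  + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}
      \Big[ \log \sigma\big( -{v'_{w_i}}^{\top} v_{w_I} \big) \Big]
```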
- Named Entity Recognition
- Sentiment Analysis
- Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank Socher et al. 2013. Introduces the Recursive Neural Tensor Network and a dataset: the "sentiment treebank." Includes a demo site. Uses a parse tree (a minimal recursive-composition sketch follows this list).
- Distributed Representations of Sentences and Documents
- Deep Recursive Neural Networks for Compositionality in Language
- Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
- Semi-supervised Sequence Learning
- Bag of Tricks for Efficient Text Classification
- Adversarial Training Methods for Semi-Supervised Text Classification [arXiv]
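A numpy sketch of the recursive composition that the tree-structured models above build on (Socher et al.'s RNTN adds a tensor interaction term on top of this plain recursive layer; sizes and initialization here are illustrative only):

```python
import numpy as np

d = 4                                        # embedding dimension
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(d, 2 * d))   # composition matrix
b = np.zeros(d)

def compose(left, right):
    """Combine two child vectors into a parent vector, bottom-up."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Toy parse tree for "not (very good)".
vec = {w: rng.normal(size=d) for w in ["not", "very", "good"]}
phrase = compose(vec["very"], vec["good"])   # node for "very good"
root = compose(vec["not"], phrase)           # node for "not very good"

# A sentiment model would apply a softmax classifier at every tree node.
print(root)
```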
- Neural Machine Translation & Dialog
- A Convolutional Encoder Model for Neural Machine Translation [arXiv]
- A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
- A Neural Network Approach to Context-Sensitive Generation of Conversational Responses Sordoni et al. 2015. Generates responses to tweets. Uses the Recurrent Neural Network Language Model (RLM) architecture of (Mikolov et al., 2010). Source code: RNNLM Toolkit
- A Neural Conversation Model Vinyals, Le 2015. Uses LSTM RNNs to generate conversational responses. Uses the seq2seq framework. Seq2seq was originally designed for machine translation; it "translates" a single sentence, up to around 79 words, into a single-sentence response, and has no memory of previous dialog exchanges. Used in Google's Smart Reply feature for Inbox (a minimal encoder-decoder sketch follows this list).
- A Persona-Based Neural Conversation Model Li et al. 2016. Proposes persona-based models for handling the issue of speaker consistency in neural response generation. Builds on seq2seq.
- Addressing the Rare Word Problem in Neural Machine Translation (abstract)
- Attention with Intention for a Neural Network Conversation Model Yao et al. 2015. The architecture is three recurrent networks: an encoder, an intention network and a decoder.
- Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models Serban, Sordoni, Bengio et al. 2015. Extends the hierarchical recurrent encoder-decoder (HRED) neural network.
- Batch Policy Gradient Methods for Improving Neural Conversation Models [OpenReview]
- Context-Dependent Word Representation for Neural Machine Translation
- Cross-lingual Pseudo-Projected Expectation Regularization for Weakly Supervised Learning
- Deep Reinforcement Learning for Dialogue Generation Li et al. 2016. Uses reinforcement learning to generate diverse responses. Trains two agents to chat with each other. Builds on seq2seq.
- Deep learning for chatbots - an article summarizing the state of the art and the challenges for chatbots.
- Deep learning for chatbots, part 2 - implements a retrieval-based dialog agent using a dual-encoder LSTM with TensorFlow, based on the Ubuntu dataset [paper]; includes source code.
- Dialogue Learning With Human-in-the-Loop [OpenReview]
- Dual Learning for Machine Translation [arXiv]
- Effective Approaches to Attention-based Neural Machine Translation
- Generating Chinese Named Entity Data from a Parallel Corpus
- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
- Improving Neural Language Models with a Continuous Cache [OpenReview]
- Incorporating Copying Mechanism in Sequence-to-Sequence Learning Gu et al. 2016. Proposes CopyNet; builds on seq2seq.
- Iterative Refinement for Machine Translation [OpenReview]
- IXA pipeline: Efficient and Ready to Use Multilingual NLP tools
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (abstract)
- Learning through Dialogue Interactions [OpenReview]
- Neural Responding Machine for Short-Text Conversation Shang et al. 2015. Uses a Neural Responding Machine trained on the Weibo dataset. Achieves one-round conversations with 75% appropriate responses.
- Neural Machine Translation by jointly learning to align and translate Bahdanau, Cho 2014. "Comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation." Implements the attention mechanism. English-to-French demo
- Neural Machine Translation in Linear Time [arXiv]
- On Using Very Large Target Vocabulary for Neural Machine Translation
- Sequence to Sequence Learning with Neural Networks (NIPS presentation). Uses seq2seq to generate translations.
- Towards an automatic Turing test: Learning to evaluate dialogue responses [OpenReview]
- Unsupervised Pretraining for Sequence to Sequence Learning [arXiv]
- Vocabulary Selection Strategies for Neural Machine Translation [OpenReview]
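Since many of the papers above build on the same encoder-decoder idea, here is a minimal PyTorch sketch of seq2seq itself (my illustration, not any paper's code; real systems add attention, beam search and much larger models):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        # Encode the source sequence into a final hidden state...
        _, h = self.encoder(self.embed(src))
        # ...then condition the decoder on it (teacher forcing: the gold
        # target prefix is fed in during training).
        dec_out, _ = self.decoder(self.embed(tgt), h)
        return self.out(dec_out)             # per-step logits over the vocab

model = Seq2Seq(vocab_size=100)
src = torch.randint(0, 100, (2, 7))          # batch of 2 source sequences
tgt = torch.randint(0, 100, (2, 5))          # batch of 2 target prefixes
print(model(src, tgt).shape)                 # torch.Size([2, 5, 100])
```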
- Image Captioning
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Xu et al. 2015. Creates captions by feeding an image into a CNN, which feeds the hidden state of an RNN that generates the caption. At each time step the RNN outputs the next word and the next location to attend to, via a probability distribution over grid locations. Uses two types of attention, soft and hard: soft attention uses gradient descent and backprop and is deterministic; hard attention selects the element with the highest probability, uses reinforcement learning rather than backprop, and is stochastic (a sketch of the soft-attention step follows this list).
- Open source implementation in TensorFlow
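A numpy sketch of the soft-attention step described above: score each image grid location against the RNN state, turn the scores into probabilities, and read out an expected context vector (hard attention would instead sample a single location from `weights`). All sizes and parameters here are illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
features = rng.normal(size=(14 * 14, 512))   # CNN features, one row per grid cell
h = rng.normal(size=256)                     # current decoder RNN hidden state
W = rng.normal(scale=0.01, size=(512, 256))  # illustrative scoring parameters

scores = features @ (W @ h)      # relevance of each location to the state
weights = softmax(scores)        # soft attention: a distribution over locations
context = weights @ features     # expected context vector fed back to the RNN
print(weights.shape, context.shape)          # (196,) (512,)
```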
- Memory and Attention Models
- Memory Networks
- End-To-End Memory Networks Sukhbaatar et al. 2015 (a single memory-hop readout is sketched after this list).
- Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks Weston 2015. Classifies QA tasks like single factoid, yes/no, etc. Extends memory networks.
- Evaluating prerequisite qualities for learning end to end dialog systems Dodge et al. 2015. Tests Memory Networks on four tasks, including the Reddit dialog task. See the Jason Weston lecture on MemNN.
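A numpy sketch of a single memory "hop" from End-To-End Memory Networks (Sukhbaatar et al. 2015): match an embedded question against memory slots, then read out a weighted sum. The random vectors below are stand-ins for learned embeddings:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n_slots, d = 10, 20
m = rng.normal(size=(n_slots, d))   # input memory embeddings of the story sentences
c = rng.normal(size=(n_slots, d))   # output memory embeddings of the same sentences
u = rng.normal(size=d)              # embedded question

p = softmax(m @ u)                  # attention over memory slots
o = p @ c                           # read-out: weighted sum of output memories
answer_state = u + o                # fed to a final softmax, or to the next hop
print(p.shape, answer_state.shape)  # (10,) (20,)
```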
- Neural Turing Machines
- Olah and Carter blog on NTM
- Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets
- Reasoning, Attention and Memory (RAM) workshop at NIPS 2015; slides included.
- General NLP topics
- Neural autoencoder for paragraphs and documents - LSTM representation
- LSTM over tree structures
- Sequence to Sequence Learning - word vectors for machine translation
- Teaching Machines to Read and Comprehend - DeepMind paper
- Efficient Estimation of Word Representations in Vector Space
- Improving distributional similarity with lessons learned from word embeddings
- Low-Dimensional Embeddings of Logic
- Tutorial on Markov Logic Networks (based on this paper)
- Markov Logic Networks for Natural Language Question Answering
- Distant Supervision for Cancer Pathway Extraction From Text
- Privee: An Architecture for Automatically Analyzing Web Privacy Policies
- A Neural Probabilistic Language Model
- Template-Based Information Extraction without the Templates
- Retrofitting word vectors to semantic lexicons
- Unsupervised Learning of the Morphology of a Natural Language
- Natural Language Processing (Almost) from Scratch
- Computational Grounded Cognition: a new alliance between grounded cognition and computational modelling
- Learning the Structure of Biomedical Relation Extractions
- Relation extraction with matrix factorization and universal schemas
- The Unreasonable Effectiveness of Recurrent Neural Networks
- Statistical Language Models based on Neural Networks
- Slides from Google Talk
- General Topics