NLP-Projects

Natural Language Processing projects, which include concepts and scripts about:

Concepts

1. Attention

  • Attention == weighted averages
  • The attention review 1 and review 2 summarize the attention mechanism into several types:
    • Additive vs Multiplicative attention (see the scoring sketch after this list)
    • Self attention
    • Soft vs Hard attention
    • Global vs Local attention
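As a concrete illustration of "Attention == weighted averages" and of the additive vs. multiplicative split, here is a minimal NumPy sketch; the shapes, random weights, and names are assumptions for illustration, not code from this repository:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 8                                # hidden size (assumption)
query = np.random.randn(d)           # e.g. a decoder state
keys = np.random.randn(5, d)         # e.g. encoder states for a length-5 sequence
values = keys                        # values taken equal to keys here (assumption)

# Multiplicative (dot-product) attention: score = q . k
mult_scores = keys @ query

# Additive (Bahdanau-style) attention: score = v^T tanh(W_q q + W_k k)
W_q = np.random.randn(d, d)
W_k = np.random.randn(d, d)
v = np.random.randn(d)
add_scores = np.tanh(query @ W_q.T + keys @ W_k.T) @ v

# The attention output is a weighted average of the values
context = softmax(mult_scores) @ values
```

The only difference between the two variants is the scoring function; both end in the same softmax-weighted average of the values.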

2. CNNs, RNNs and Transformer

  • Parallelization [1]

    • RNNs
      • Why not good?
      • Each step's input depends on the previous step's output, so time steps cannot be computed in parallel
    • Solutions
      • Simple Recurrent Units (SRU)
        • The recurrence is element-wise, so each hidden-state dimension is updated independently and the heavy matrix multiplications can be parallelized across time steps
      • Sliced RNNs
        • Split the sequence into windows, run an RNN inside each window in parallel, then run another RNN over the window outputs
        • Structurally similar to CNNs
    • CNNs
      • Why good? (see the first sketch after this section)
      • Different windows of one filter are independent and can be computed in parallel
      • Different filters are independent and can be computed in parallel
  • Long-range dependency [1]

    • CNNs
      • Why not good?
      • A single convolution can only capture dependencies within its window
    • Solutions
      • Dilated CNNs
      • Deep CNNs (see the stacked-convolution sketch after this section)
        • N * [Convolution + skip-connection]
        • For example, with window size 3 and stride 1, the second convolution covers 5 words (i.e., 1-2-3, 2-3-4, 3-4-5)
    • Transformer > RNNs > CNNs
  • Position [1]

    • CNNs

      • Why not good?
      • Convolution preserves relative-order information, but max-pooling discards it
    • Solutions

      • Discard max-pooling and use deep CNNs with skip-connections instead
      • Add position embeddings, as in ConvS2S (see the position-embedding sketch after this section)
    • Transformer

      • Why not good?
      • In self-attention, one word attends to all other words and the resulting summary vector carries no relative position information
  • Semantic feature extraction [2]

    • Transformer > CNNs == RNNs
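The parallelization and long-range-dependency points above can be made concrete with a minimal NumPy sketch (toy sizes and random weights are assumptions, not repository code): the RNN loop has to advance one step at a time, every convolution window and every filter is independent, and stacking two window-3 convolutions lets each output depend on 5 input words:

```python
import numpy as np

T, d = 7, 4                          # sequence length, hidden size (assumption)
x = np.random.randn(T, d)

# RNN: step t needs h[t-1], so the loop cannot be parallelized over time.
W, U = np.random.randn(d, d), np.random.randn(d, d)
h = np.zeros(d)
for t in range(T):
    h = np.tanh(x[t] @ W + h @ U)

# CNN: every window and every filter is independent, so all of them can be
# computed at once (here as one matrix multiply over the stacked windows).
k = 3                                # window size
filters = np.random.randn(k * d, d)  # d filters, each over a flattened window
windows = np.stack([x[t:t + k].reshape(-1) for t in range(T - k + 1)])
conv1 = np.tanh(windows @ filters)   # first layer: each output sees 3 words

# Stacking a second window-3 convolution on conv1 makes each output depend on
# 5 original words (positions 1-2-3, 2-3-4 and 3-4-5 feed output 1 of layer 2).
windows2 = np.stack([conv1[t:t + k].reshape(-1) for t in range(len(conv1) - k + 1)])
conv2 = np.tanh(windows2 @ filters)
```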

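For the position point, here is a minimal sketch of adding position embeddings to token embeddings so that order information survives pooling or self-attention; the dimensions are assumptions, and the sinusoidal formula is the Transformer-style variant, whereas ConvS2S itself uses a learned position table:

```python
import numpy as np

T, d = 10, 16                              # sequence length, model size (assumption)
tokens = np.random.randn(T, d)             # stand-in token embeddings

pos = np.arange(T)[:, None]                # positions 0..T-1
i = np.arange(d)[None, :]                  # embedding dimensions
angles = pos / np.power(10000, (2 * (i // 2)) / d)
pe = np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

inputs = tokens + pe                       # order-aware inputs for the encoder
```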
3. Pattern of DL in NLP models [3]

  • Data

    • Preprocess
    • Pre-training (e.g., ELMo, BERT)
    • Multi-task learning
    • Transfer learning, ref_1, ref_2 (see the fine-tuning sketch after this list)
      • Use a source task/domain S to improve performance on a target task/domain T
      • If T has zero, one, or only a few labeled instances, we call it zero-shot, one-shot, or few-shot learning, respectively
  • Model

    • Encoder
      • CNNs, RNNs, Transformer
    • Structure
      • Sequential, Tree, Graph
  • Learning (changes the loss definition)

    • Adversarial learning
    • Reinforcement learning
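As a sketch of the transfer-learning pattern above, one common recipe is to reuse a pretrained source-task encoder and fine-tune only a new target-task head. The encoder, sizes, and checkpoint name below are hypothetical placeholders, not code from this repository:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):                # stands in for a pretrained encoder (assumption)
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
    def forward(self, x):
        _, h = self.rnn(self.emb(x))
        return h[-1]                     # sentence representation

encoder = Encoder()
# encoder.load_state_dict(torch.load("pretrained_source_task.pt"))  # hypothetical checkpoint

for p in encoder.parameters():           # freeze the source-task knowledge
    p.requires_grad = False

head = nn.Linear(64, 2)                  # new target-task classifier
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randint(0, 1000, (4, 12))      # toy batch of token ids
y = torch.randint(0, 2, (4,))            # toy target labels
loss = nn.functional.cross_entropy(head(encoder(x)), y)
loss.backward()
optimizer.step()
```

Unfreezing some or all encoder layers (possibly with a smaller learning rate) is the usual next step when the target task has enough labeled data.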

References

Awesome public apis

Awesome packages

Chinese

English

Future directions
