This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix LSTM and GRU layers gradient calculations #18203

Merged 6 commits on May 14, 2020

Commits on May 12, 2020

  1. Fix input gradient calculation for bidirectional LSTM

    For a bidirectional LSTM with number of layers > 2, the input gradient calculation was incorrect.
    The wrong results were caused by overwriting the y-derivative (dy) tensor with the
    calculated x-derivative (dx) tensor before the right-to-left layer could use dy for its own
    gradient calculation.
    The proposed fix uses additional workspace to avoid the overwrite.
    bgawrych committed May 12, 2020
    be2e87c
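The overwrite hazard described above can be sketched with a minimal numpy example. The shapes, weight matrices, and matmul-based "backward pass" here are hypothetical stand-ins, not MXNet's actual LSTM kernels; the point is only that writing dx into the dy buffer before the second direction reads dy corrupts the summed input gradient, and a separate workspace restores it:

```python
import numpy as np

rng = np.random.default_rng(0)
T, H = 5, 4
dy = rng.standard_normal((T, H))    # upstream gradient for one layer's output
w_l2r = rng.standard_normal((H, H)) # stand-in for the left-to-right cell's weights
w_r2l = rng.standard_normal((H, H)) # stand-in for the right-to-left cell's weights

# Reference: each direction reads the pristine dy; the results are summed into dx.
dx_ref = dy @ w_l2r.T + dy @ w_r2l.T

# Buggy: the left-to-right direction stores its dx in the dy buffer,
# so the right-to-left direction reads already-clobbered values.
buf = dy.copy()
buf[:] = buf @ w_l2r.T              # dx_l2r overwrites dy in place
dx_buggy = buf + buf @ w_r2l.T      # r2l now consumes dx_l2r instead of dy

# Fixed: extra workspace keeps dy intact until both directions are done.
tmp = dy @ w_l2r.T                  # dx_l2r goes to separate storage
dx_fixed = tmp + dy @ w_r2l.T

assert not np.allclose(dx_buggy, dx_ref)
assert np.allclose(dx_fixed, dx_ref)
```

The extra buffer costs one dx-sized tensor per layer, which is the trade the commit message alludes to with "additional space".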
  2. Fix gradient calculation for GRU

    For a GRU with number of layers > 2, the i2h_weight gradient for
    the middle layers (all except the first and last) was incorrect.
    The wrong calculations were caused by assigning the output pointer to
    the input instead of calculating a new input pointer.
    bgawrych committed May 12, 2020
    624e530
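The pointer mix-up can likewise be sketched in numpy. The flat workspace layout, stride arithmetic, and dgates-times-input form of the i2h_weight gradient are assumptions for illustration, not MXNet's exact buffer scheme; the sketch shows why reusing a middle layer's own output pointer as its input pointer yields a wrong weight gradient:

```python
import numpy as np

rng = np.random.default_rng(1)
T, H, L = 4, 3, 3                    # time steps, hidden size, layers
# Each stacked layer's output, laid out layer-by-layer in one flat buffer,
# as an RNN workspace typically stores them.
outs = [rng.standard_normal((T, H)) for _ in range(L)]
buf = np.concatenate([o.ravel() for o in outs])
stride = T * H

k = 1                                # a middle layer (neither first nor last)
dgates = rng.standard_normal((T, H)) # stand-in for layer k's gate gradients

out_ptr = k * stride                 # where layer k's own output starts
in_ptr = (k - 1) * stride            # layer k's input = previous layer's output

x_correct = buf[in_ptr:in_ptr + stride].reshape(T, H)
x_buggy = buf[out_ptr:out_ptr + stride].reshape(T, H)  # bug: output pointer reused as input

# dW_i2h accumulates dgates^T @ x, so the wrong x gives a wrong gradient.
dW_ref = dgates.T @ x_correct
dW_buggy = dgates.T @ x_buggy

assert np.allclose(x_correct, outs[k - 1])
assert not np.allclose(dW_buggy, dW_ref)
```

Only the middle layers are affected because layer 0 reads the network input and the top layer's pointer arithmetic happens to land correctly; in between, the input pointer must be recomputed per layer rather than copied from the output.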
  3. 7bb38c4
  4. Fix comments

    bgawrych committed May 12, 2020
    201f671
  5. df6e90c
  6. 34d8947