Update Sockeye to MXNet/Gluon 2.x #953

Closed
wants to merge 56 commits into from

Conversation

fhieber
Contributor

@fhieber fhieber commented Jun 9, 2021

This updates Sockeye to the APIs introduced in MXNet 2 (i.e. the mxnet/master branch).

Major changes with MX2:

  • The *Output operators (such as SoftmaxOutput) have been removed in MXNet v2. We are therefore required to use our secondary loss implementation, which does not use SoftmaxOutput.
  • MX2 introduces a new interface for HybridBlocks. Users no longer implement hybrid_forward, only forward, which uses Parameter class members directly (e.g. self.weight.data()) and the mx.nd namespace (no more 'F'). One consequence is that we can no longer use our pattern of implementing both forward and hybrid_forward to have shape-inspecting logic/computation in HybridBlocks. We resolve this by introducing the same pattern using __call__ and forward :)
  • MX numpy arrays do not implement __hash__, so we cannot use the lru_cache trick in OutputLayer. I worked around this by writing a hacky minimal 'cache' that hashes on the string representation of the array.
  • Simplified state caching in inference: numpy arrays can have dimensions of size 0, so we no longer require 'dummy states of size'. This can have a small positive effect on inference speed for self-attentional decoders, since we save a slice op.
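
The string-keyed cache mentioned above could look roughly like the sketch below. This is not the code from this PR: ReprKeyedCache, FakeArray and take_rows are hypothetical names, and FakeArray is only a stand-in for an array type that does not implement __hash__ (so the example runs without MXNet installed).

```python
from typing import Any, Callable, Dict


class ReprKeyedCache:
    """Tiny memoizer for a one-argument function whose argument is unhashable.

    Entries are keyed on str(arg), so two arguments with the same string
    representation share one cache entry.
    """

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self._fn = fn
        self._store: Dict[str, Any] = {}
        self.misses = 0  # number of times the wrapped function actually ran

    def __call__(self, arg: Any) -> Any:
        key = str(arg)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._fn(arg)
        return self._store[key]


class FakeArray:
    """Hypothetical stand-in for an array type without __hash__."""

    __hash__ = None  # makes instances unhashable, so functools.lru_cache would fail

    def __init__(self, values):
        self.values = list(values)

    def __str__(self) -> str:
        return str(self.values)


def take_rows(ids: FakeArray) -> tuple:
    # Placeholder for an expensive step, e.g. slicing output weights by ids.
    return tuple(v * 2 for v in ids.values)


cached_take_rows = ReprKeyedCache(take_rows)
```

One caveat with this design: if the array type abbreviates long arrays in its string form (as NumPy does by default for large arrays), two different arrays can map to the same key, so the key function needs care for large inputs.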

Known/Open Issues:

  • Sparse gradient updates for embeddings do not work for mx numpy arrays. This is disabled for now along with a logger warning. MXNet team is looking into building a workaround for this.

Changes:

  • Fixed a bug with predicted output lengths in beam search (missing a repeat op to scale to beam size) -- this bug is also in master currently.
  • Fixing imports due to changed locations of modules
  • removing prefix logic in all blocks for SockeyeModel (not done for loss.py yet)
  • hybrid_forward -> forward for HybridBlocks for SockeyeModel
  • simplified parameter sharing: one can simply assign parameters from another block.
  • Removed the PrintValue custom op
  • removed with self.name_scope() everywhere.
  • Changes to inference (hybrid blocks for beam search)
  • Switch to mx.np -- the new Numpy Array interface
  • Numpy indexing/broadcasting requires (x, 1) shapes for various auxiliary arrays in beam search (finished, inactive etc.)
  • Changed saving/loading logic of prepared data to conform with numpy standards.
  • Write more detailed changelog for 2.4.0.
  • check if np.copy is still needed in encoders -- it is
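
To illustrate the beam-size repeat and the (x, 1) shapes mentioned in the list above, here is a minimal sketch. It uses plain NumPy as a stand-in for mx.np, and all variable names and values are made up for illustration; it is not the beam search code from this PR.

```python
import numpy as np  # plain NumPy used here as a stand-in for mx.np

batch_size, beam_size, vocab_size = 2, 3, 4

# One predicted output length per source sentence: shape (batch_size,)
predicted_lengths = np.array([7, 4])

# Beam search tracks batch_size * beam_size hypotheses, so per-sentence values
# have to be repeated once per beam entry before being combined with them.
lengths_per_hyp = np.repeat(predicted_lengths, beam_size, axis=0)  # shape (6,)

# Auxiliary flags are kept as a (batch_size * beam_size, 1) column so that
# they broadcast across the vocabulary axis of the score matrix.
finished = np.array([0, 1, 0, 0, 0, 1]).reshape(-1, 1)  # shape (6, 1)
scores = np.ones((batch_size * beam_size, vocab_size))  # shape (6, 4)

# Zero out the scores of finished hypotheses; (6, 1) broadcasts against (6, 4).
masked_scores = np.where(finished == 1, 0.0, scores)
```

A flat array of shape (6,) would not broadcast against (6, 4) under NumPy rules, which is why the trailing dimension of size 1 is needed.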

Potential future work

  • Explore optimizing when and how to convert data back from MXNet numpy arrays to lists in inference (and whether that should be done on the inference device, GPUs, or on the CPU)
  • Explore optimizing writing to bucket arrays in chunks instead of for loops for multi-factored data
  • Lexically constrained decoding could probably be sped up/vectorized with all the operators/indexing now available from numpy

Related links:

Note: This PR changes the GitHub action for pull requests to install a nightly build of MX2.

Pull Request Checklist

  • Changes are complete (if posting work-in-progress code, prefix your pull request title with '[WIP]'
    until you can check this box).
  • Unit tests pass (pytest)
  • System tests pass (pytest test/system)
  • Passed code style checking (./style-check.sh)
  • You have considered writing a test
  • Updated major/minor version in sockeye/__init__.py. Major version bump if this is a backwards incompatible change.
  • Updated CHANGELOG.md

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@fhieber changed the title from "[DO NOT MERGE] Work in progress to update to MX2 APIs" to "Update Sockeye to MXNet/Guon 2.x" on Jul 11, 2021
@fhieber changed the title from "Update Sockeye to MXNet/Guon 2.x" to "Update Sockeye to MXNet/Gluon 2.x" on Jul 11, 2021
@fhieber closed this on Nov 3, 2021