Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Enable serializing/deserializing ndarrays in np_shape semantics #15090

Merged

Conversation

reminisce
Copy link
Contributor

Description

np_shape semantics was introduced to support future NumPy operators where scalar tensors and zero-size tensors are common to see. Due to the concern on the potential issues of backward compatibility when this semantics is enabled, such as different handling on scalar tensors w/ or w/o this semantics, serializing/deserializing was simply marked as unsupported when this semantics is enabled.

At the moment, DGL developers want to enable this semantics in their work to support zero-size tensors. Simply disabling serializing/deserializing ndarrays of all types: dense, sparse, zero-size, and scalars would make their unit tests fail in np_shape semantics.

After careful consideration, we decided to loosen the constraint to support serialization/deserialization in the semantics of np_shape for ndarrays satisfying ALL the following three conditions as it would be the same as handling future NumPy ndarrays.

  1. The storage type MUST be default type, i.e. this is a dense ndarray.
  2. The ndarray CANNOT be a zero-size ndarray, i.e. with shape like (2, 0, 3).
  3. The ndarray CANNOT be a scalar ndarray, i.e. with shape ().

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

src/ndarray/ndarray.cc Outdated Show resolved Hide resolved
@abhinavs95
Copy link
Contributor

@mxnet-label-bot add [NDArray]

@reminisce reminisce force-pushed the loosen_ndarray_serialization_constraint branch from ab68e7e to 610606b Compare May 31, 2019 06:36
@reminisce
Copy link
Contributor Author

@zheng-da Please test this PR in DGL to see if any test of saving/loading zero-size tensors is broken before merge.

@reminisce reminisce changed the title Loosen the constraint on serializing/deserializing ndarrays in np_shape semantics Enable serializing/deserializing ndarrays in np_shape semantics May 31, 2019
@@ -1580,13 +1581,20 @@ static const uint32_t NDARRAY_V1_MAGIC = 0xF993fac8;
/* magic number for ndarray version 2, with storage type */
static const uint32_t NDARRAY_V2_MAGIC = 0xF993fac9;

// magic number for ndarray version 3, with np shape semantics.
// The ndarray must be saved and loaded within np shape semantics.
static const uint32_t NDARRAY_V3_MAGIC = 0xF993faca;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any tests for handling legacy storage types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zheng-da
Copy link
Contributor

zheng-da commented Jun 1, 2019

@reminisce I checked with DGL. it works fine.

@zheng-da zheng-da merged commit e8a20fb into apache:master Jun 1, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
…he#15090)

* Loosen the contraint on serializing/deserializing ndarrays within the scope of np_shape

* Support save/load dense ndarrays in np_shape semantics
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants