
Allow inplace memory optimization for different data type #1696

Merged
6 commits merged into apache:master on Sep 15, 2018

Conversation

ZhennanQin
Contributor

@tqchen Sorry, I messed up the previous PR, so I closed it and recreated this clean one for review. For example, quantized relu may convert data in place from int8 to uint8. We should support this kind of operation.

@tqchen
Member

tqchen commented Sep 9, 2018

The patch will introduce bugs in cases such as an int8->float32 in-place conversion, because of the difference in memory size. If you want to make it work, we should at least check that the data type sizes are the same.

Usually in-place memory optimization is not that significant; we could even simply turn it off, as long as memory sharing is turned on. This is because the memory-sharing optimization will enable sharing of the int8 buffers with the next few layers anyway.

@ZhennanQin
Contributor Author

@tqchen Thanks for your comment. From my point of view, in-place memory optimization should only care about the total memory size, not the element size. If a pooling operator can convert 2x2 int8 to 1x1 float32, I don't see any problem with allowing in-place memory for it.

And in our case, it's not just an optimization to reduce memory usage, but a mandatory requirement from a special operator. For example, for an add operator C = A + B, our library will always place the result C into operand A. I guess the memory-sharing optimization can't figure out whether A or B should be reused in place.

@zhenhuaw-me
Contributor

@ZhennanQin I think @tqchen is not saying that in-place optimization is not needed; his point is (please correct me if I am going the wrong way :) @tqchen) that if you are going to optimize this scenario, a well-designed mechanism is needed, rather than a workaround-like change that merely looks like it works (with respect)...

btw, is it supposed to be data type in the title :)? And out of personal curiosity, are you from MLT?

@ZhennanQin ZhennanQin changed the title Allow inplace memory optimization for different date type Allow inplace memory optimization for different data type Sep 10, 2018
@ZhennanQin
Contributor Author

ZhennanQin commented Sep 10, 2018

@jackwish Thanks for pointing out the typo. Yes, I'm from the Intel MLT team, working on integrating MKL-DNN into MXNet. But I still don't understand why you think this is just a workaround. Is there any other way to describe such an in-place memory limitation from a computing library?

@zhenhuaw-me
Contributor

Quote @ZhennanQin

@jackwish Thanks for pointing out the typo. Yes, I'm from the Intel MLT team, working on integrating MKL-DNN into MXNet. But I still don't understand why you think this is just a workaround. Is there any other way to describe such an in-place memory limitation from a computing library?

Quote @tqchen

The patch will introduce bugs in cases such as an int8->float32 in-place conversion, because of the difference in memory size. If you want to make it work, we should at least check that the data type sizes are the same.

I am not sure which particular scenario (I mean which data types) you are trying to optimize. By a mandatory requirement from a special operator, are you trying to say that this in-place optimization is to make some underlying library happy? @ZhennanQin

@ZhennanQin
Contributor Author

ZhennanQin commented Sep 10, 2018

are you trying to say that this in-place optimization is to make some underlying library happy?

Yes. In other words, it's not an optimization but a requirement from the underlying library that needs to be satisfied, just like the instruction restrictions that register allocation must handle in the compiler field. We need some way to describe it to memory planning and the scheduler, to ensure that a certain operand and the output share the same memory, and that the overwritten operand won't be used anymore.

@zhenhuaw-me
Contributor

By describe it in some way to memory planning and scheduler, do you mean in TVM? I understand that the underlying library (MKL-DNN?) may have restrictions; however, personally, I think a change like this patch could be a bit aggressive in this context. Is it safe (regarding memory layout)?

@ZhennanQin
Contributor Author

is it safe (regarding memory layout)?

Currently, in-place optimization doesn't care about memory layout, and I don't think we need to care about it after removing the data type check. Memory layout may change after in-place computation, just like data layout.

Also, 'FInplaceOption' must be set explicitly by the developer for the operators that need it. It's the developer's responsibility to use it carefully and correctly, to ensure it doesn't break any rules a particular backend may have.

@tqchen
Member

tqchen commented Sep 10, 2018

While this may be OK for MKL-DNN, if we directly do an in-place optimization for an int8->fp32 cast, there will be a problem.

The following in-place cast code causes a bug: the 4-byte write to dfloat[0] overwrites dint8[1] through dint8[3] before they are read:

void* data = malloc(4 * sizeof(float));
int8_t* dint8 = static_cast<int8_t*>(data);  // C++ needs explicit casts from void*
float* dfloat = static_cast<float*>(data);

for (int i = 0; i < 4; ++i) {
   dfloat[i] = dint8[i];  // writing dfloat[0] clobbers dint8[1..3] before they are read
}

@yzhliu
Member

yzhliu commented Sep 10, 2018

It looks like in this specific case, simply checking the dtype size can satisfy both sides.

@ZhennanQin
Contributor Author

ZhennanQin commented Sep 11, 2018

if we directly do an in-place optimization for an int8->fp32 cast, there will be a problem

Firstly, we still check the memory size before doing the in-place optimization, so a direct int8 -> fp32 cast won't be done in place.

Secondly, if we define a new operator that casts the data type while keeping the memory size unchanged, then whether it can run in place depends on the algorithm it uses. E.g., if the new cast is defined as slice+cast, which only casts the first N/4 elements to the output (N is the element count), then we shouldn't do the in-place optimization for it, because of the bug you mentioned. But if the new cast is defined as pool(kernel=[2, 2], stride=2) + cast, then it can run in place.

So a SliceCast operator should do an extra data-type-size check when registering the 'FInplaceOption' attribute, like

.set_attr<nnvm::FInplaceOption>(
    "FInplaceOption", [](const NodeAttrs& attrs) {
      if (CheckDtypeSize())
        return std::vector<std::pair<int, int> >{{0, 0}};
      else
        return std::vector<std::pair<int, int> >{};
    })

Overall, the data type size check isn't mandatory for all in-place optimizations; it should be done when registering the 'FInplaceOption' attribute, if necessary. I understand that you're worried that existing code may rely on the data type size check, so I will add it to this PR to keep consistency.

@ZhennanQin
Contributor Author

@tqchen The code is refactored to use a helper function to check the data size. Please review.

@@ -13,6 +13,30 @@
namespace nnvm {
namespace pass {
namespace {
// Return bytes of data flag.
static int GetDTypeSize(int type_flag) {
@tqchen tqchen self-assigned this Sep 13, 2018
@tqchen tqchen added the status: need update need update based on feedbacks label Sep 13, 2018
@tqchen tqchen merged commit 4312660 into apache:master Sep 15, 2018
@tqchen tqchen added status: accepted and removed status: need update need update based on feedbacks labels Sep 15, 2018
@ptrendx ptrendx mentioned this pull request Feb 6, 2019