[Discussion] 1.5.0 Roadmap #14619
Hey, this is the MXNet Label Bot.
The changes since the 1.4.0 release that are already merged in the master branch will be included in the 1.5.0 release. The list can be found at: https://github.com/apache/incubator-mxnet/compare/v1.4.x...master?expand=1
Hi everyone, I've created the v1.5.x branch here: https://github.com/apache/incubator-mxnet/tree/v1.5.x
Thanks for starting this! I would like to include the exception handling fixes: #14397 (@anirudh2290), #14433 (@anirudh2290), #14575 (@arcadiaphy); these three should hopefully be merged by the end of next week. Also the conversion of FP32 models to mixed precision models (#14584), which should tentatively be in by the first week of May. In addition, I have some changes to the profiler to visualize GPU memory pooling and help make better decisions on the env variable choice. It is currently in a branch (https://github.com/anirudh2290/mxnet/tree/memory_profiler_poc2) and I intend to open a PR soon (next week).
- MKLDNN quantization PR
- FP32 optimization
- Documentation
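For context, here is a hedged sketch of the existing contrib quantization entry point that this MKLDNN work builds on; the checkpoint prefix is a placeholder, and exact arguments may differ by version:

```python
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# 'model' is a placeholder checkpoint prefix; calib_mode='none' skips
# calibration for brevity (entropy/naive calibration is used in practice).
sym, arg_params, aux_params = mx.model.load_checkpoint('model', 0)
qsym, qarg_params, qaux_params = quantize_model(
    sym=sym, arg_params=arg_params, aux_params=aux_params,
    ctx=mx.cpu(), calib_mode='none', quantized_dtype='int8')
```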
Some users pointed out useful features around matrix inversions, determinants, and log-determinants. I propose to add some small features to make these calculations easier: https://issues.apache.org/jira/projects/MXNET/issues/MXNET-1350?filter=allissues. Quoting that issue: these are relevant calculations, and some adjustments to the existing tools would help newcomers more easily leverage the existing work. I'm interested and willing to implement this feature. I'm quite busy at the moment but can likely finish this over a few days before mid-May. Thoughts?
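For illustration, a minimal sketch of how a log-determinant can already be computed with the existing linalg operators (potrf and sumlogdiag); the proposal would wrap this kind of boilerplate into friendlier helpers:

```python
import mxnet as mx

A = mx.nd.array([[4.0, 1.0],
                 [1.0, 3.0]])      # symmetric positive definite
L = mx.nd.linalg.potrf(A)          # Cholesky factorization: A = L @ L.T
# log|A| = 2 * sum(log(diag(L)))
logdet = 2.0 * mx.nd.linalg.sumlogdiag(L)
print(logdet.asscalar())
```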
Easily the biggest feature MXNet is lacking is higher order gradient support. There appears to be some work to get this going, but it's been a bit stagnant. The lack of strong support for this feature prohibits implementing a number of DL algorithms. Everything beyond this seems like a quality-of-life feature. I would offer to help on this front, but I won't have the time necessary to work it out; I list it here in hopes that others will answer the call.

Beyond that, I think having dynamic shape in symbols would be a nice feature.

On a smaller scale, it would be nice if Gluon had support for blocks that operate on keyword arguments. It's pretty easy to add support for that in a non-breaking way (I've done it in my own projects), but ideally this feature would be supported in other code like the data loader, which is currently fairly structured around tuples rather than dicts (which would pair with keyword args).

A nitpick I have is that when it comes to serialization, MXNet (Python) seems to assume you always want to write to a file, in that it requests a path to a file to serialize the data. This often isn't appropriate in production systems. It would be much nicer if MXNet simply took a file-like object or just returned bytes so you can do what you want with them.
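As an illustration of the serialization nitpick, a minimal workaround sketch (params_to_bytes is a hypothetical helper, not an MXNet API) that round-trips through a temporary file because Gluon's save_parameters only accepts a path:

```python
import os
import tempfile

def params_to_bytes(net):
    """Hypothetical helper (not an MXNet API): Gluon's save_parameters
    only accepts a file path, so round-trip through a temporary file to
    obtain raw bytes for, e.g., a blob store."""
    fd, path = tempfile.mkstemp(suffix='.params')
    os.close(fd)
    try:
        net.save_parameters(path)
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)
```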
Features I'd like to see for 1.5 include:
- AMP, if ready
- New TensorRT integration with subgraph API support and FP16
- NVTX ranges for easier GPU profiling

+1 to @pengzhao-intel on the MKLDNN work. I'd love to make use of these optimizations. +1 to @anirudh2290's 3 very useful improvements.
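For reference, a hedged sketch of the AMP training workflow as it was shaping up in contrib; names and details may change before release:

```python
import mxnet as mx
from mxnet import autograd, gluon
from mxnet.contrib import amp

# amp.init() patches the framework to run mixed-precision-safe ops in
# FP16; it should be called before the model is constructed.
amp.init()

net = gluon.nn.Dense(10)
net.initialize(ctx=mx.gpu())
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.01}, update_on_kvstore=False)
amp.init_trainer(trainer)  # enables dynamic loss scaling

x = mx.nd.ones((4, 8), ctx=mx.gpu())
with autograd.record():
    loss = net(x).sum()
    # scale the loss to avoid FP16 gradient underflow
    with amp.scale_loss(loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)
trainer.step(4)
```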
Any plan to simplify the compilation process on Windows?
Yes, we have a plan for MKLDNN on Windows and will fix it in 1.5. I will add it to my table.
Update parameters manually in the training loop.
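A minimal sketch of what this looks like today with plain SGD, assuming `net` is an initialized Gluon block and `x` an input batch:

```python
from mxnet import autograd

lr = 0.01
with autograd.record():
    loss = net(x).sum()
loss.backward()
for p in net.collect_params().values():
    if p.grad_req != 'null':
        # write the update in place on the parameter's storage
        p.data()[:] = p.data() - lr * p.grad()
```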
I'd like #14869 to go in; estimated time to complete: 05/10.
I desperately need higher order differentiation. Please make it possible. Thanks to everyone for all your contributions so far.
@mouryarishik @jmacglashan Hi, about higher order gradients: @apeforest and @larroy are actively working on this, and it will first be available in the master branch and nightly pip packages. Unfortunately, it won't make it into 1.5.0, as we plan to release soon. Stay tuned, thanks!
Should we formally deprecate amalgamation as all it does is lead people down a dead end? |
@aaronmarkham is it broken? |
@aaronmarkham so is there a tutorial that illustrates how to build libmxnet.so for mobile devices?
@szha My understanding is that it doesn't work. There are several open issues about it, but I haven't tried it out yet myself. |
@aaronmarkham That sounds like something that needs fixing. Not sure if it's enough reason to kill it, though.
Wouldn't it be better to have a preprocessor flag to achieve the same result? Cross compilation is solved. |
@mouryarishik could you give details about your use case? Thanks.
@larroy A lot of GAN models require second-order gradients for stabilised training.
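To make the requirement concrete, a hedged sketch of a WGAN-GP style gradient penalty; `D` (the critic) and `x_hat` (an interpolated batch) are assumed to exist, and the final backward only works once the operators `D` uses have higher order gradient support:

```python
import mxnet as mx
from mxnet import autograd

# The penalty term itself contains a gradient, so backpropagating
# through it requires second-order support in every operator of `D`.
x_hat.attach_grad()
with autograd.record():
    y = D(x_hat)
    # create_graph=True keeps this gradient differentiable
    g = autograd.grad(y, [x_hat], create_graph=True)[0]
    grad_norm = mx.nd.sqrt((g ** 2).sum(axis=1))
    penalty = ((grad_norm - 1) ** 2).mean()
penalty.backward()  # needs higher order gradients in D's operators
```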
Would it be possible to fix this Gluon serialization/deserialization bug (#12795) in the 1.5 release? It has been open for a long time (still not working in 1.4.1) and makes it hard to serialize Gluon graphs for some applications, e.g. in gluon-ts.
@mouryarishik We already have a few operators that support higher order gradients:
However, due to the current design of NNVM (the graph data structure that models the computation graph), higher order gradient support has to be implemented operator by operator. The good news is that once we move to NNVM 2.0 in the near future, higher order gradients will be supported automatically by NNVM. In the meantime, before NNVM is upgraded to 2.0, we plan to support higher order gradients in a limited number of operators. It would be great if you could identify the set of operators that are used in your model and require higher order gradient support; we will prioritize implementation for those operators. Thanks for your continuous support of and passion for MXNet.
I guess it depends on the GAN, as you could have any layer; if you want to use a GAN with convolutions, you need higher order support for conv...
@vafl duplicate name issue should have been fixed already. |
@szha In 1.4.1 the issue is still there (see the reproducible example in #12795). When you reuse any layer in a Gluon graph, the graph cannot be serialized and loaded anymore; you have to explicitly create a new layer and share the parameters (see the sketch below). I think what was pushed is a workaround for this issue for RNNs.
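A minimal sketch of that workaround, using Gluon's documented `params` argument for parameter sharing:

```python
from mxnet.gluon import nn

shared = nn.Dense(10)
# Instead of calling `shared` twice in the graph, build a second layer
# that shares its parameters via the `params` argument.
twin = nn.Dense(10, params=shared.params)
```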
@vafl yes, what I meant is that 1.5.0 will include the fix. If you use the nightly package of mxnet, you will see that the included code example passes correctly.
Adding a shape property to MXNet's Symbol would be great.
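For comparison, a small sketch of the current way to get an output shape, using infer_shape with concrete input shapes:

```python
import mxnet as mx

data = mx.sym.Variable('data')
fc = mx.sym.FullyConnected(data, num_hidden=10, name='fc1')
# infer_shape propagates concrete input shapes through the graph
arg_shapes, out_shapes, aux_shapes = fc.infer_shape(data=(32, 100))
print(out_shapes)  # [(32, 10)]
```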
@szha there are already lots of great proposals from the community.
See #15589 for the new roadmap discussion. |
Let's start a discussion here about the roadmap towards 1.5.0. We are looking for:
If you have any item that you'd like to propose to have in the roadmap, please do:
cc @apache/mxnet-committers