
[Discussion] 1.5.0 Roadmap #14619

Closed
szha opened this issue Apr 4, 2019 · 32 comments

Comments

@szha
Member

szha commented Apr 4, 2019

Let's start a discussion here about the roadmap towards 1.5.0. We are looking for:

  • New features that are useful to your research and development.
  • Improvements and patches to existing features.

If you have any item that you'd like to propose to have in the roadmap, please do:

  • Create (or locate an existing) issue/pull request for the item and note the issue/pull request number.
  • Comment in this issue with: 1) the issue number above, and 2) one sentence on what the item is about and why it's useful to you.
  • Indicate whether you'd be willing to help out on the item.
  • Share the ETA if you're driving the item and have a guesstimate of when it will be done.

cc @apache/mxnet-committers

@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Feature

@szha
Member Author

szha commented Apr 4, 2019

The changes since the 1.4.0 release that are already merged into the master branch will be included in the 1.5.0 release. The list can be found at: https://github.com/apache/incubator-mxnet/compare/v1.4.x...master?expand=1

@eric-haibin-lin
Member

Hi everyone, I've created the v1.5.x branch here: https://github.com/apache/incubator-mxnet/tree/v1.5.x
Until we have an agreement on the timeline and features, I will synchronize this branch with the master branch periodically. Once we have decided on the code freeze date, we will only cherry-pick required changes/features into the branch.

@anirudh2290
Member

Thanks for starting this! I would like to include the exception handling fixes: #14397 (@anirudh2290), #14433 (@anirudh2290), #14575 (@arcadiaphy). These three should hopefully be merged by the end of next week. Conversion of FP32 models to mixed precision models (#14584) should be in by the first week of May, tentatively. In addition, I have some changes to the profiler to visualize GPU memory pooling and help make better decisions on the environment variable choice. It is currently in a branch (https://github.com/anirudh2290/mxnet/tree/memory_profiler_poc2) and I intend to open a PR soon (next week).

@pengzhao-intel
Contributor

pengzhao-intel commented Apr 5, 2019

MKLDNN Quantization PRs

| Name | PR# | Status |
| --- | --- | --- |
| sum | #14614 | DONE |
| relu | #14604 | DONE |
| refactor requantize | #14608 | DONE |
| improve quantize | #14641 | DONE |
| conv + activation | #14819 | DONE |
| cache op | #14785, #14931 | DONE |
| quantization flow to support 0 shape (RNN, concat) | #15031 | DONE |
| New models (SSD COCO/RN18/MobileNet v2) | #14646, #14823 | DONE |

FP32 optimization

| Name | PR# | Status |
| --- | --- | --- |
| data loader for CPU | #14824 | DONE |
| transpose | #14545 | DONE |
| RNN refactor with NNVM | #14476 | DONE |
| reshape enhance | #14903 | DONE |
| sum 1d | #14914 | DONE |
| softmax 1d | #14818 | DONE |
| MKL Math (ERF, mean, etc.) | #14893 | DONE |
| MKLDNN RNN (vRNN, LSTM) | #14713 | DONE |
| Build (Windows/Linux) | #14740, #14743, dmlc/mshadow#374, #14829, #14877 | DONE |
| Update MKLDNN to 0.19 | #14783 | DONE |

Documentation

| Name | PR# | Status |
| --- | --- | --- |
| Windows build instructions | #14952 | DONE |
| MKLDNN OP | #14891 | DONE |

@zboldyga
Contributor

zboldyga commented Apr 5, 2019

Some users pointed out useful features around matrix inversions, determinants, and log determinants. I propose adding some small features to make these calculations easier: https://issues.apache.org/jira/projects/MXNET/issues/MXNET-1350?filter=allissues

#14360

These are relevant calculations, and some adjustments to the existing tools would help newcomers more easily leverage the existing work.

I'm interested and willing to implement this feature.

I'm quite busy at the moment, but can likely finish this over a few days before mid-May.
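
To make the motivation concrete, here is a minimal sketch of how a log determinant of a symmetric positive-definite matrix can be computed today through the existing Cholesky routines (linalg.potrf and linalg.sumlogdiag); the proposed helpers would just make calculations like this more direct. The matrix values are only illustrative.

```python
from mxnet import nd

# a small symmetric positive-definite matrix (illustrative values)
a = nd.array([[4.0, 1.0],
              [1.0, 3.0]])

# for an SPD matrix A with Cholesky factor L: log|A| = 2 * sum(log(diag(L)))
chol = nd.linalg.potrf(a)
log_det = 2.0 * nd.linalg.sumlogdiag(chol)
print(log_det.asscalar())
```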

Thoughts?

@jmacglashan

jmacglashan commented Apr 20, 2019

Easily the biggest feature MXNet is lacking is higher order gradient support. There appears to be some work to get this going, but it's been a bit stagnant. The lack of strong support for this feature prevents implementing a number of DL algorithms. Everything beyond this seems like a quality-of-life feature. I would offer to help on this front, but I won't have the time necessary to work it out. I list it here in hopes that others will answer the call.

Beyond that, I think having dynamic shape in symbols would be a nice feature.

On a smaller scale, I think it would be nice if Gluon had support for blocks that operate on keyword arguments. It's pretty easy to add support for that in a non-breaking way (and I've done it in my own projects), but ideally this feature would be supported in other code like the data loader, which is currently fairly structured around tuples rather than dicts (which would pair with keyword args).
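
A rough sketch of one non-breaking way to do this (the class names here are made up for illustration, not an existing Gluon API; Block.__call__ currently forwards only positional arguments, so the wrapper routes keyword arguments through to forward):

```python
from mxnet import nd
from mxnet.gluon import nn

class KwargsBlock(nn.Block):
    """Hypothetical wrapper: routes keyword arguments through to forward().
    Note this skips Block's forward hooks, which is fine for a sketch."""
    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)

class TwoInput(KwargsBlock):
    def __init__(self, **kwargs):
        super(TwoInput, self).__init__(**kwargs)
        self.dense = nn.Dense(4)

    def forward(self, query, context=None):
        x = query if context is None else nd.concat(query, context, dim=1)
        return self.dense(x)

net = TwoInput()
net.initialize()
out = net(nd.ones((2, 3)), context=nd.zeros((2, 3)))
```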

A nitpick I have is that when it comes to serialization, MXNet (Python) seems to assume you always want to write to a file, in that it requests a file path to serialize the data. This often isn't appropriate in production systems. It would be much nicer if MXNet simply took a file-like object or just returned bytes so you can do what you want with them.
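
Until then, one workaround sketch is to round-trip through a temporary file to get raw bytes (assuming a Gluon net and the save_parameters/load_parameters API):

```python
import os
import tempfile
from mxnet import gluon, nd

net = gluon.nn.Dense(2)
net.initialize()
net(nd.ones((1, 3)))  # run once so the parameters are created

# current APIs take file paths, so round-trip through a temporary file
# to obtain raw bytes for an arbitrary storage backend (S3, a database, ...)
with tempfile.NamedTemporaryFile(suffix='.params', delete=False) as f:
    path = f.name
net.save_parameters(path)
with open(path, 'rb') as f:
    blob = f.read()
os.remove(path)

# loading also has to go through a temporary file
with tempfile.NamedTemporaryFile(suffix='.params', delete=False) as f:
    f.write(blob)
    path = f.name
net.load_parameters(path)
os.remove(path)
```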

@KellenSunderland
Contributor

KellenSunderland commented Apr 21, 2019

Features I'd like to see for 1.5 include:

  • AMP, if ready
  • New TensorRT integration with subgraph API support and FP16
  • NVTX ranges for easier GPU profiling

+1 to @pengzhao-intel on the MKLDNN work; I'd love to make use of these optimizations. +1 to @anirudh2290's three very useful improvements.

@stereomatchingkiss

Any plan to simplify the compilation process on Windows?
Is there any document showing how to compile MXNet with MKLDNN support on Windows?

@pengzhao-intel
Contributor

> Any plan to simplify the compilation process on Windows?
> Is there any document showing how to compile MXNet with MKLDNN support on Windows?

Yes, we have a plan for MKLDNN on Windows and will fix it in 1.5. I will add it to my table.
@yinghu5 @NeoZhangJianyu

KellenSunderland unpinned this issue Apr 28, 2019
szha pinned this issue Apr 29, 2019
@shuokay
Contributor

shuokay commented May 6, 2019

Update parameters manually in the training loop.
#14735
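
For context, a minimal sketch of what a manual update currently looks like on Gluon parameters, bypassing gluon.Trainer (the model, data, and learning rate are placeholders):

```python
from mxnet import autograd, gluon, nd

net = gluon.nn.Dense(1)
net.initialize()
x = nd.random.uniform(shape=(4, 2))
y = nd.random.uniform(shape=(4, 1))

with autograd.record():
    loss = ((net(x) - y) ** 2).mean()
loss.backward()

# plain SGD step applied directly to the parameters
lr = 0.01
for param in net.collect_params().values():
    if param.grad_req != 'null':
        param.data()[:] -= lr * param.grad()
```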

@roywei
Member

roywei commented May 8, 2019

I'd like #14869 to go in; estimated time to complete is 05/10.

@stu1130
Contributor

stu1130 commented May 15, 2019

Dependency update PRs:

  • #14950 Update CI to use the latest cuDNN and fix the ARCH_MISMATCH error on M60 GPUs
  • #14887 CUDA 10.1 PyPI script
  • #14588 Update the NumPy version

@mouryarishik

mouryarishik commented May 15, 2019

I desperately need higher order differentiation. Please make it possible. Thanks to everyone for all your contributions so far.

@roywei
Member

roywei commented May 16, 2019

@mouryarishik @jmacglashan Hi, about higher order gradients: @apeforest and @larroy are actively working on this, and it will first be available in the master branch and the nightly pip packages. Unfortunately, it won't make it into 1.5.0 as we plan to release soon. Stay tuned, thanks!

@aaronmarkham
Contributor

Should we formally deprecate amalgamation as all it does is lead people down a dead end?

@szha
Member Author

szha commented May 17, 2019

@aaronmarkham is it broken?

@kohillyang

kohillyang commented May 19, 2019

@aaronmarkham so is there a tutorial that illustrates how to build libmxnet.so for mobile devices?

@aaronmarkham
Contributor

@szha My understanding is that it doesn't work. There are several open issues about it, but I haven't tried it out yet myself.
@kohillyang I'd love to see a guide for this using a recent build of MXNet. The closest we have is the amalgamation guide:
https://mxnet.incubator.apache.org/versions/master/faq/smart_device.html
If you try it out, please keep me posted - I'd be happy to get the guide updated with tips on getting it to work.

@szha
Member Author

szha commented May 20, 2019

@aaronmarkham that sounds like something that needs fixing. Not sure if it's enough reason to kill it, though.

@larroy
Contributor

larroy commented Jun 19, 2019

Wouldn't it be better to have a preprocessor flag to achieve the same result? Cross-compilation is solved.

@larroy
Contributor

larroy commented Jun 19, 2019

@mouryarishik could you give details about your use case? Thanks.

@mouryarishik

@larroy A lot of GAN models require second-order gradients for stabilised training.

@vafl
Contributor

vafl commented Jun 19, 2019

Would it be possible to fix this Gluon serialization/deserialization bug, #12795, in the 1.5 release?

It has been open for a long time (still not fixed in 1.4.1) and makes it hard to serialize Gluon graphs for some applications, e.g. in gluon-ts.

@apeforest
Contributor

@mouryarishik We already have a few operators that support higher order gradients:

elemwise_mul, log, log10, relu, FullyConnected, sin, cos, exp

However, due to the current design of NNVM (the graph data structure that models the computation graph), higher order gradient support has to be implemented operator by operator. The good news is that once we move to NNVM 2.0 in the near future, higher order gradients will be supported automatically by NNVM.

In the meantime, before NNVM is upgraded to 2.0, we plan to support higher order gradients in a limited set of operators. It would be great if you could identify the operators used in your model that require higher order gradient support; we will prioritize implementing those.
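
For reference, a minimal sketch of taking a second-order gradient with autograd for one of the operators listed above (log); the input values are only illustrative:

```python
from mxnet import autograd, nd

x = nd.array([1.0, 2.0, 3.0])
x.attach_grad()

with autograd.record():
    y = nd.log(x)  # log is one of the operators with higher order support
    # keep the first-order gradient as part of the graph so it can be
    # differentiated again
    dy_dx = autograd.grad(y, x, create_graph=True, retain_graph=True)[0]
dy_dx.backward()

print(x.grad)  # second derivative of log: -1/x^2
```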

Thanks for your continuous support and passion for MXNet.

@larroy
Contributor

larroy commented Jun 20, 2019

I guess it depends on the GAN, since you could have any layer; if you want to use a GAN with convolutions, you need higher order gradients for conv...

@szha
Member Author

szha commented Jun 20, 2019

@vafl the duplicate name issue should have been fixed already.

@vafl
Contributor

vafl commented Jun 20, 2019

@szha In 1.4.1 the issue is still there (see the reproducible example in #12795). When you reuse any layer in a Gluon graph, the graph cannot be serialized and loaded anymore. You have to explicitly create a new layer and share the parameters.

I think what was pushed was a workaround for this issue for RNNs.
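
A minimal sketch of the kind of reuse I mean (simplified, not the exact repro from #12795): the same child block is applied twice, and the exported graph could then not be loaded back.

```python
import mxnet as mx
from mxnet import nd
from mxnet.gluon import nn

class Reuse(nn.HybridBlock):
    def __init__(self, **kwargs):
        super(Reuse, self).__init__(**kwargs)
        self.dense = nn.Dense(4)

    def hybrid_forward(self, F, x):
        # the same child block is called twice
        return self.dense(self.dense(x))

net = Reuse()
net.initialize()
net.hybridize()
net(nd.ones((1, 4)))
net.export('reuse')  # writes reuse-symbol.json / reuse-0000.params

# loading the exported graph back is where the problem showed up
loaded = mx.gluon.SymbolBlock.imports('reuse-symbol.json', ['data'],
                                      'reuse-0000.params')
```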

@szha
Member Author

szha commented Jun 21, 2019

@vafl yes, what I meant is that 1.5.0 will include the fix. If you use the nightly package of MXNet, you will see that the code example included in the issue now passes correctly.

@vanewu

vanewu commented Jul 16, 2019

Adding a shape property to the MXNet Symbol would be great.
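
For context, shapes currently have to be queried through infer_shape rather than read off a property; a minimal sketch of the existing workflow (the network below is just an example):

```python
import mxnet as mx

data = mx.sym.Variable('data')
net = mx.sym.FullyConnected(data=data, num_hidden=10, name='fc1')

# today: query shapes explicitly, given the input shape
arg_shapes, out_shapes, aux_shapes = net.infer_shape(data=(32, 100))
print(out_shapes)  # [(32, 10)]
```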

@pengzhao-intel
Contributor

@szha there are already lots of great proposals from the community.
I think we need to create a new topic for the 1.6 roadmap :)

@szha
Member Author

szha commented Jul 18, 2019

See #15589 for the new roadmap discussion.
