v1.0 Stable Release TODO List #2944
Comments
I would propose Float16 support as an additional target.
For the optimization part, @tqchen and I are thinking about supporting throwing the optimizer into the computation graph, so less C++ will be needed.
Until we have RTC, that doesn't help much. You still need at least a 2x buffer.
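A minimal numpy sketch of the 2x-buffer point (names are illustrative, not MXNet API):

```python
import numpy as np

w = np.random.randn(1024, 1024).astype(np.float32)  # weights
g = np.random.randn(1024, 1024).astype(np.float32)  # gradient
lr = 0.01

# Update expressed as ordinary graph ops: the result lands in a fresh
# buffer, so old and new weights are alive at once (~2x weight memory).
w_new = w - lr * g

# In-place update: no second weight buffer. Note that even here numpy
# still materializes the lr * g temporary; only a fused kernel (hence
# the interest in RTC) avoids all intermediates.
w -= lr * g
```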
We may consider building the docs on EC2, then syncing back to Read the Docs, because the doc build fails with a timeout during compilation.
yes. or maybe just host from ec2
great!!
@vchuravy we may need to put more effort into int8 rather than fp16. From current info, int8 will be mainstream in the future.
@antinucleon Great to hear! The work @Godricly and I have been doing focused purely on making our operators support arbitrary DTypes. That should help the Int8 work as well? (This is off topic, but I would expect FixedPoint with Int8 rather than truly Int8?)
@vchuravy It is still being investigated by @winstywang. If you use int8 directly, there is no performance gain. But the official documentation mentions that for the new Titan X, int8 performance is 44 TOPS, almost 4 times that of fp32.
@vchuravy NV should have specific instructions for int8; currently, using int8 directly only brings a 25% performance gain according to our tests.
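On the FixedPoint-with-Int8 question, here is a minimal numpy sketch of what symmetric fixed-point int8 matrix multiplication could look like (illustrative only, not MXNet code; real speedups need hardware int8 instructions such as the ones mentioned above):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric fixed-point quantization of a float32 tensor to int8."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a, b):
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    # Accumulate in int32 to avoid overflow, then rescale to float32.
    acc = qa.astype(np.int32) @ qb.astype(np.int32)
    return acc.astype(np.float32) * (sa * sb)

a = np.random.randn(64, 128).astype(np.float32)
b = np.random.randn(128, 32).astype(np.float32)
print(np.abs(int8_matmul(a, b) - a @ b).max())  # small quantization error
```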
My suggestions are as follows:
Stochastic depth can be done with bucketing.
With NNVM we may enable fully dynamic execution.
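A minimal sketch of the bucketing idea for stochastic depth (`build_network` is a hypothetical stand-in, not MXNet API): sample which residual blocks survive each batch and treat the survival pattern as the bucket key.

```python
import random

NUM_BLOCKS = 8       # residual blocks in the network
SURVIVAL_PROB = 0.8  # per-block keep probability (stochastic depth)

def build_network(active_blocks):
    # Hypothetical stand-in for symbol construction: in real code this
    # would compose only the surviving residual blocks into a symbol.
    return "resnet_with_blocks_" + "_".join(map(str, active_blocks))

_executors = {}

def network_for_batch():
    # Sample which blocks survive this batch, and key the executor cache
    # on the pattern, just as bucketing keys on sequence length.
    key = tuple(random.random() < SURVIVAL_PROB for _ in range(NUM_BLOCKS))
    if key not in _executors:
        active = [i for i, keep in enumerate(key) if keep]
        _executors[key] = build_network(active)
    return _executors[key]

print(network_for_batch())
```

One caveat: the number of distinct survival patterns grows exponentially with depth, so in practice the cache would need to be bounded or the patterns drawn from a small fixed set.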
@piiswrong @leopd We need to move the doc building system to EC2. The Read the Docs build keeps failing because it runs out of time.
@antinucleon Is there any paper available right now on uint8 NNs? And what does NNVM stand for? I'm having a hard time searching for it.
Here are some thoughts about the docs:
Another thing I'd like to ask for is a refactor of LSTM, if possible.
I would vote for another issue which is very important for users:
The ResNet issue is caused by IO. Min has reproduced the exact result by using Torch IO.
I hope that for each of the issues raised, people can show up and assign, or self-assign, the issue, so we are moving forward effectively.
it's good to have a single page containing everything, but I totally agree that we can open an issue for each point and cite the links here.
@mli Yes. If someone wants to talk more about / start working on a task, feel free to open a new issue and link it here. Also assign it to milestone v1.0.
Also, we may consider treating warnings as errors in the future.
I'll list a roadmap for the Scala package this weekend.
@antinucleon May I ask what's wrong with IO that causes the performance drop?
For docs, I think a query of our GitHub issues with the keyword "how to" is a good source for a list of topics to potentially cover.
@piiswrong What does NNVM stand for?
@windywinter about NNVM: dmlc/MXNet.jl#115
@antinucleon, @jennyzhang0215 and I have implemented MemN2N and NTM and replicated the results in the papers; we may release the code after AAAI or WWW. I can send you the code now if you need it.
Is it ok to do some code optimization in NNVM? #3105
Thanks to all of DMLC for this great effort.
It's about time for a feature complete stable release.
We are in the process of a major refactor. While most changes are on the backend side and therefore should not significantly affect users, we do expect to break a few small things, and maybe compatibility with other language bindings.
So authors of the Julia, R, Scala, etc. packages, please stay tuned and adopt the new API. It should be a quick fix, and we will have a guide for the transition.
@thirdwing @pluskid @vchuravy @Ldpe2G
Transition Guide/List of Breaking Changes:
Developer
- Change mshadow::TBlob and mshadow::TShape to TBlob and TShape in your code.
- Use Storage::Get()->Alloc(size, Context::GPU()) to allocate memory on the current GPU instead.

User
If you were training networks with a BatchNormalization layer on CPU, or on GPU with cuDNN v4 or below, before Jul 5th, you may find your model outputting totally wrong results after loading it back for testing. The simplest fix is to load your .param files with ndarray.load, set all arrays whose key ends with '_gamma' to 1.0, and save them back (see the sketch below).
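A minimal sketch of that fix (the filename is a placeholder for your own checkpoint; keys saved by checkpointing may carry 'arg:'/'aux:' prefixes, which the endswith check still handles):

```python
import mxnet as mx

# 'model-0001.params' is a placeholder; point this at your own checkpoint.
params = mx.nd.load('model-0001.params')
for key in params:
    if key.endswith('_gamma'):
        params[key][:] = 1.0  # reset every BatchNorm scale parameter
mx.nd.save('model-0001.params', params)
```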
TODOs
- Use NCCL reduce and broadcast and fix the deadlock bug. (NCCL is problematic with our engine.)