Remove DataFlow.jl, use IRTools #30
I'm thinking about refactoring/forking ONNX.jl for stability and export of graphs (ultimately trying to interface with TF Lite). How would IRTools compare to https://github.com/phipsgabler/DynamicComputationGraphs.jl as a foundation for this work? I'm looking at ease of use, robustness, and accessibility of the code. The thought of working with IR feels intimidating, and DCG seems a bit closer in abstraction to the destination. I'd also be curious to hear whether @phipsgabler has any thoughts on this.
Well, I don't think DCG is stable enough yet. And I don't really know in what way you are using IR-transforming generated functions here. But maybe there's an intermediate solution, which I have already been speculating about implementing. Namely, when you think about the IR transformation code of DCG, it's basically a little, very specialized reimplementation of Cassette's `overdub` mechanism. It would certainly be possible to factor out the Cassette-like overloading part cleanly and implement DCG on top of it. Maybe the same principle works for ONNX as well, but I don't feel qualified to answer that.
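For reference, here is a minimal sketch of that kind of Cassette-style contextual overloading, closely following the tracing example from Cassette's documentation; `TraceCtx` is just an illustrative name, not anything from DCG or ONNX.jl:

```julia
using Cassette

# A context whose metadata field collects the call trace.
Cassette.@context TraceCtx

# Overloading `overdub` for our context records every call before either
# recursing into its body or falling back to a plain call (for primitives
# without Julia IR, e.g. intrinsics).
function Cassette.overdub(ctx::TraceCtx, f, args...)
    push!(ctx.metadata, (f, args...))
    if Cassette.canrecurse(ctx, f, args...)
        return Cassette.recurse(ctx, f, args...)
    else
        return Cassette.fallback(ctx, f, args...)
    end
end

trace = Any[]
Cassette.overdub(TraceCtx(metadata = trace), (x, y) -> x * y + y, 2.0, 3.0)
# `trace` now holds the flattened sequence of calls made during execution.
```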
But if I look at it more closely, ONNX only emits IR and does not need to change existing IR, right? In that case, DCG's purpose is just to record computation graphs of calls to existing functions, in the form of "extended Wengert lists".
@AriMKatz I was also quite intimidated by the output from IRTools and ended up with a non-metaprogramming, non-compiler-hacking solution which instead uses multiple dispatch (abuse), and this worked out for the cases I'm interested in. @phipsgabler Also, sorry for derailing the topic... I'm interested in the original issue, since my package will break the day DataFlow.jl breaks. Is there any plan to do this, and if not, will you accept a PR for it (given that it will probably touch almost the whole codebase)?
@DrChainsaw Unfortunately I had to leave the Julia world for the project which spurred my comment, so I won't be getting to this anytime soon... But I'd be excited to see it happen anyway.
@DrChainsaw But to get the ONNX graph, you'd just need (as far as I understand) to recurse once over the IR of the function you want to serialize, get all the original IR down to "primitives", and translate it -- a static computation graph. IRTracker (formerly DCG) will give you a new function with new IR that tracks the execution of the original function every time it is called -- a dynamic computation graph. As an analogy: you can just take a TensorFlow graph and analyze/transform its structure, since it's fixed over multiple executions. In contrast, you wouldn't want to do that with a PyTorch model, where you define a new, possibly different graph every time you execute something (which is similar to what IRTracker does). It's not that I want to avoid work; I just think that my project does not do the things needed for this use case :) But I can quickly throw together a recursive static tracker some time.
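To make the static half concrete, a small sketch using IRTools' documented `@code_ir`: it grabs a method's IR once and walks its statements, which is the starting point for "recurse down to primitives" (the recursion into non-primitive calls is only hinted at in a comment):

```julia
using IRTools
using IRTools: @code_ir

f(x) = 2x + one(x)

ir = @code_ir f(1.0)    # the lowered IR for this method, obtained once
for (v, st) in ir       # iterate over SSA-variable => statement pairs
    println(v, " = ", st.expr)
end

# To build a static graph one would inspect each `st.expr`, translate
# recognized primitives to ONNX nodes, and recurse into calls that
# aren't primitives (e.g. via `IRTools.meta` and `IRTools.IR`).
```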
@phipsgabler Understood. It was basically just the tracing functionality along with the context hooks (e.g. an OnnxContext which collects recognized ops in a protobuffer) that seemed applicable. I understand that tracing is not the main portion of what IRTracker does, and this (along with the immaturity warning) is why I chose to roll my own tracing mechanism in ONNXmutable. FYI, what PyTorch offers in terms of ONNX export for its dynamic part is basically what you describe: run the graph once, and whatever path is taken this time gets recorded. Not much else you can do, I guess, given that ONNX graphs are basically static. This is the same approach I have taken as well: run the function once with a tracer (AbstractProbe) as input, and whatever functions the tracer recognizes get recorded. ONNX has some support for control code (loops and conditions), and to make those work in my approach one would need to put them in a special function which is recognized by the tracing mechanism (i.e. a function like `branchcondition(x, condition, path1, path2)`). It would be awesome if one could build some IR-hacking tool which translates Julia control code to the ONNX equivalent. I'm very inexperienced with IR hacking, but to me this seems identical to language transpiling, which afaik is a very hard problem.
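A toy sketch of that dispatch-based tracing idea; `Probe` and the recorded op names are hypothetical stand-ins, not ONNXmutable's actual `AbstractProbe` API:

```julia
# Toy dispatch-based tracer: a probe value flows through the function,
# and overloaded methods record every operation they recognize.
struct Probe
    ops::Vector{Any}   # recorded (opname, argument) pairs, in execution order
end

# Overload the functions we want to recognize for Probe arguments.
function Base.:+(p::Probe, x)
    push!(p.ops, (:Add, x))
    return p
end

function Base.:*(x, p::Probe)
    push!(p.ops, (:Mul, x))
    return p
end

model(x) = 2 * x + 3

p = Probe(Any[])
model(p)
p.ops   # => [(:Mul, 2), (:Add, 3)] -- the recorded "graph" of this run
```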
Well, I must take back my hesitations :D Turns out I had a complete error in reasoning -- it's the recursion that poses a problem. We of course can't track IR recursively in a static form, since the selection of methods happens only at runtime. I implemented this, which made me realize my mistake. The dynamo records the calls as they happen at runtime, so a single run captures only one execution path. I think to really generalize this approach, one would need a concolic execution tool making sure every function is called with enough different arguments so that all branches are taken. If the one-path solution is enough, then IRTracker could in fact work for this purpose (and would be mature enough, I guess, at least compared to some ad-hoc IR-hacking code).
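For illustration, a minimal one-path runtime tracer along those lines, built from IRTools' documented `@dynamo`, `recurse!`, and `xcall`; the logging detail is an assumption for the sketch, not the exact linked implementation:

```julia
using IRTools
using IRTools: IR, @dynamo, recurse!, xcall

const seen = String[]

# `trace(f, xs...)` behaves like `f(xs...)`, but every reachable call is
# rewritten so it also passes back through `trace`, logging each method
# actually entered during this particular run.
@dynamo function trace(args...)
    ir = IR(args...)
    ir === nothing && return   # no Julia IR (intrinsics etc.): call as-is
    recurse!(ir)               # route nested calls back through `trace`
    # Log entry to this method; added after recurse! so it isn't rewritten.
    pushfirst!(ir, xcall(push!, seen, string(args[1])))
    return ir
end

g(x) = x + 1
f(x) = 2 * g(x)

trace(f, 3)   # `seen` now lists the methods entered along this one path
```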
Thanks a lot @phipsgabler. I'm currently pursuing other interests, but I will revisit your example and see if I can refactor my approach, if nothing else then for the sake of learning.
@DrChainsaw Have you seen https://github.com/MikeInnes/Mjolnir.jl/issues/1? It seems to have the right level of abstraction and functionality for refactoring this package and also your computational-graph packages. It uses an abstract tracing approach that can already return a graph of operations. It seems more powerful and general, yet still more user-friendly, than any other approach, as it's based on IRTools but higher level. I plan to play around with it some more in the coming days.
@AriMKatz It would also be awesome to clear out all those vertices from the NaiveXX packages and just work with generic Julia code (although I'm not sure the level of usefulness of those packages motivates the effort). Btw, if you are still looking into ONNX export, feel free to take whatever you need from ONNXmutable. That package should anyway have the exporter parts split out into a separate package, as they are not dependent on NaiveNASflux.
This seems like a description of an abstract interpreter, i.e. the tool that both Julia's type inference and Mjolnir use to handle both control flow and recursion.
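As a toy illustration of that idea only (emphatically not how Julia's inference or Mjolnir are implemented): evaluate expressions on a small lattice instead of on values, folding branches with known conditions and joining the two arms otherwise:

```julia
# Toy abstract interpreter over Julia `Expr`s, on the lattice
# known constant < known type < Any.
abstract type Abstract end
struct Const <: Abstract; val::Any end   # known constant
struct Ty <: Abstract; T::Type end       # known only up to type

widen(a::Const) = Ty(typeof(a.val))
widen(a::Ty)    = a
# Least upper bound: equal elements stay, else widen, else give up.
lub(a, b) = a == b ? a : widen(a) == widen(b) ? widen(a) : Ty(Any)

interp(e::Number, env) = Const(e)
interp(e::Symbol, env) = env[e]

function interp(e::Expr, env)
    if e.head == :call && e.args[1] == :+
        a, b = interp(e.args[2], env), interp(e.args[3], env)
        (a isa Const && b isa Const) && return Const(a.val + b.val)
        return Ty(Float64)               # toy transfer rule for `+`
    elseif e.head == :if                 # covers `cond ? t : f` too
        c = interp(e.args[1], env)
        c isa Const && return interp(c.val ? e.args[2] : e.args[3], env)
        # Unknown condition: analyze both branches and join the results.
        return lub(interp(e.args[2], env), interp(e.args[3], env))
    end
    error("unsupported expression: $e")
end

interp(:(x + 1), Dict(:x => Const(2.0)))     # Const(3.0): fully folded
interp(:(x + 1), Dict(:x => Ty(Float64)))    # Ty(Float64): abstract result
interp(:(c ? 1 : 0), Dict(:c => Ty(Bool)))   # Ty(Int64): branches joined
```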
Closing this now that we're using https://github.com/dfdx/Ghost.jl. CC @dfdx in case there are any useful insights in this thread.
IRTools is generally much more powerful and easier to use than DataFlow.jl, and, more importantly, it's maintained (DataFlow.jl is effectively deprecated at this point). It should be pretty easy to port over the parts of this package that use it.
One thing IRTools does not do as well just yet is printing of expressions, but that's coming along (FluxML/IRTools.jl#25), and since ONNX.jl doesn't have any control flow yet, it should be particularly straightforward.