Reorganizing optional package loading #918
Comments
I think this old discussion with @tbreloff about lazy loading is worth bringing up: Requires is a great solution that I think might be able to handle the lazy loading too. If backend initialization code (https://github.com/JuliaPlots/Plots.jl/blob/master/src/backends/gr.jl#L53) is in an https://github.com/JuliaPlots/Plots.jl/blob/master/src/backends.jl#L20 to # package_name(:gr) == :GR
$sym(; kw...) = (default(; kw...); @eval(:(import $(package_name(sym))); backend(Symbol($str))) so that way it's essentially doing using Plots
import GR plus the backend setup. That should make most of Plots.jl precompile just fine and it should be pretty fast? I think you'll need to make a dummy library for Plots.jl's internal Plotly though, since it's not the same as Plotly.jl. Maybe Requires can act on the importing of submodules (@MikeInnes ?)? |
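The lazy-activation idea above can be sketched roughly as follows, in current Julia syntax. This is only an illustration: `BACKEND_INITS`, `LOADED_BACKENDS`, `ACTIVE_BACKEND` and `activate!` are hypothetical names, not Plots.jl API, and the initializers are stand-ins for the real work (importing the backend package and running its setup code).

```julia
# Hypothetical registry: backend symbol => zero-argument initializer.
# A real initializer would import the backend package and include the glue
# code; these stand-ins only record that they ran.
const INIT_LOG = Symbol[]
const BACKEND_INITS = Dict{Symbol,Function}(
    :gr     => () -> push!(INIT_LOG, :gr),
    :pyplot => () -> push!(INIT_LOG, :pyplot),
)
const LOADED_BACKENDS = Set{Symbol}()
const ACTIVE_BACKEND = Ref{Symbol}(:none)

# Run the initializer only the first time a backend is requested,
# then mark that backend as the active one.
function activate!(name::Symbol)
    haskey(BACKEND_INITS, name) || throw(ArgumentError("unknown backend: $name"))
    if !(name in LOADED_BACKENDS)
        BACKEND_INITS[name]()
        push!(LOADED_BACKENDS, name)
    end
    ACTIVE_BACKEND[] = name
    return name
end

activate!(:gr)      # initializer runs
activate!(:gr)      # already loaded, initializer is skipped
activate!(:pyplot)  # initializer runs
```

Nothing backend-specific is touched until `activate!` is called, so the core package itself has no conditional code at load time.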
The I hope that I can finish the conditional modules infrastructure for 0.7, but I have no time to polish that due to JuliaCon. |
@vchuravy is 100% right but it's also worth pointing out that a lot of precompile is parsing. For dynamic code the type inference won't buy you much, so Requires isn't likely to add a huge overhead vs full precompilation. OTOH, I haven't generally used Requires for more backend-like integrations (Like Flux/TensorFlow), because if the user explicitly sets the active backend you can just load the required code then. If you need to do |
I just saw @StefanKarpinski 's talk on Pkg3, and it seems to me he said that conditional modules will not be coming, in the sense we need them here. This would seem to point to the reorganisation numbered "2" above. Any thoughts on this? |
I vote for option 2. I think it would solve a lot of problems and wouldn't affect usability, at least for me. It would also possibly make it clearer that some of the backends are less mature (e.g. maybe not tagged at all). |
Personally I'm in favor of option 2. I think the most straightforward solution is to have a package PlotsGR which extends some methods of Plots in order to display a plot that was created with Plots. The key question, in my view, is understanding how to split duties between the Plots and PlotsGR package. As of now, I think the situation is not ideal. IIUC, what happens is that Plots creates a plot object (which is basically an Array of Dicts with all the different attributes, one Dict per series), transforming a high-level command from the user into very specific, low-level details of how the plot should be drawn (ticks, margins, colors, line-widths and so on...). The backend code then has a monolithic This approach has the following issues:
What I believe is an extra source of confusion is the fact that not all What could potentially help would be the following: We could choose one example backend, for example GR, and decompose the
All of this logic would then live in Plots.jl, where one would have something like:
function display(sp::Subplot, w, h, viewport_canvas)
    set_font!(sp, ...)
    draw_grid!(sp, ...)
    ...
end
These small functions (in the example way of coding
function set_font!(sp::Subplot{GRBackend}, ...)
    ...
end
and would represent the "backend interface". Then, to update an existing plot, one could call one of these methods rather than displaying the whole thing. To develop a new backend, one would only need to implement all of these small functions. The drawback of this plan is that it is a lot of work, so maybe we could start some PlotsBase + PlotsGR repos (while leaving Plots alone) and see how far that gets. |
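A minimal runnable sketch of such a "backend interface", assuming the backend is carried as a type parameter of the subplot. `AbstractBackend`, `TextBackend`, `render!` and the fields of `Subplot` are hypothetical names chosen for illustration; they are not the actual Plots.jl or PlotsGR types.

```julia
abstract type AbstractBackend end

# Backend-independent description of a subplot; B selects the backend.
struct Subplot{B<:AbstractBackend}
    title::String
    grid::Bool
end

# Generic fallbacks: a backend that forgets a method fails with a clear error.
set_font!(out, sp::Subplot) = error("set_font! not implemented for $(typeof(sp))")
draw_grid!(out, sp::Subplot) = error("draw_grid! not implemented for $(typeof(sp))")

# Backend-agnostic driver, playing the role of `display` in the comment above.
function render!(out, sp::Subplot)
    set_font!(out, sp)
    sp.grid && draw_grid!(out, sp)
    return out
end

# A toy backend that records drawing commands as strings.
struct TextBackend <: AbstractBackend end
set_font!(out, sp::Subplot{TextBackend}) = push!(out, "font for $(sp.title)")
draw_grid!(out, sp::Subplot{TextBackend}) = push!(out, "grid")

ops = render!(String[], Subplot{TextBackend}("demo", true))
```

Under this layout a new backend only adds methods for the small functions, and updating one aspect of an existing plot means calling one of them directly rather than redrawing everything.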
@piever all you say sums up a lot of the problems I had with Plots.jl.
Well, I've done quite a bit of that work in MakiE already and would love to have more help and use this as a prototype to see how we can structure Plots.jl in the future! Let me know how I can get people started! Now that quite a lot of the basic functionality is working, I think MakiE is ripe for another thorough documentation step (especially for the internals)! |
I'm also happy to try out (and contribute to) this reorg idea in MakiE.jl. Things that would help me the most are:
The last point is basically what I was mentioning above. I believe it's crucial to have a backend independent representation of a plot object (in Plots.jl it is an array of dictionaries, I believe we'd want a richer structure if some attributes have to be linked to others): figuring out this representation would in my view be a major step forward. The next step would be figuring out how to structure a display function that goes hand in hand with the plot representation (i.e. where the "isolated components" of the plot have corresponding small update functions) in an abstract way, independent of GLVisualize. Finally, one would add the GLVisualize implementation of the "interface" functions used to display the plot. With this refactor I think MakiE would get much closer to the conceptual organization of Plots and it'd be much easier for Plots contributors to also contribute to MakiE. |
What I've been recommending is that:
This will all compile well and lets things run much smoother than dictionaries, yet it should make it easy for Plots.jl to specify logic at a high level and let the packages handle the specifics. |
I'd like to differ... I prototyped a completely typed pipeline in What needs to happen though is a proper definition of what attributes are expected, how they get transformed, where default values come from and a definition of invalid attributes - which is, as I remember it, a mess in Plots.jl right now. I will write more in the documentation about how I plan to structure things and the integration with Plots.jl! I have quite a few ideas about this and quite a bit is already implemented :) |
Platform-, language- and device independent graphics is always "a lot of work". We will contribute where we can, but we are still in the process of improving the deployment of GR for both Julia and Python, which IMO is the hardest job. |
@simon if you know the common type signatures you can force them to precompile though. |
@ChrisRackauckas yeah kind of, but the other points are still true as well. And precompilation doesn't actually do much, so unless we cache the binary representation, we will still have huge jit overheads. Because of that I don't see this as a high priority anymore... |
We could call a test suite of sorts in the top level of the module to force it to precompile a lot more. Of course it won't be type stable at the top level call either. I even recommended using a default global there. That doesn't matter. It's in the repeated calls further down. That's what I was advocating and it seems we agree except for type vs dict in the construction. I find the code in plots building a big dict harder to read anyways since there is no layout describing what should be in there. The fields could be left untyped or only typed when obvious, but I think structuring it into types has this readability advantage as well. |
That's exactly what I agree with! And I think I found a relatively nice way to work around that! |
I basically created a macro, that defines for a specific plot what an attribute should get converted to, what attributes are allowed and it does proper error handling for invalid attributes. It also documents all attributes semi automatically! |
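To make the idea concrete, here is a small function-based sketch of declaring allowed attributes, their defaults and their conversions in one place. It is not the macro described above (whose implementation is not shown in this thread); `SCATTER_ATTRIBUTES` and `resolve_attributes` are hypothetical names.

```julia
# Hypothetical attribute table: name => (default value, converter).
const SCATTER_ATTRIBUTES = Dict{Symbol,Tuple{Any,Function}}(
    :markersize => (4.0, x -> Float64(x)),
    :color      => (:black, x -> Symbol(x)),
)

# Start from the defaults, convert every user-supplied value, and reject
# attribute names that are not in the table.
function resolve_attributes(spec; kwargs...)
    resolved = Dict{Symbol,Any}(k => v[1] for (k, v) in spec)
    for (k, v) in kwargs
        if !haskey(spec, k)
            allowed = join(sort!(collect(keys(spec))), ", ")
            throw(ArgumentError("invalid attribute $k; allowed: $allowed"))
        end
        resolved[k] = spec[k][2](v)
    end
    return resolved
end

attrs = resolve_attributes(SCATTER_ATTRIBUTES; markersize = 6, color = "red")
```

Because the table is plain data, the same structure could also be used to generate attribute documentation.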
You can automatically generate precompile statements with SnoopCompile. But the first sentence of https://github.com/timholy/SnoopCompile.jl#userimgjl is quite important: what gets saved to |
This is an interesting discussion, and it will be interesting to see the improvements @SimonDanisch is making and whether they could be applied in Plots. But I'll say I like @ChrisRackauckas suggestion a lot. |
To add some feedback to this discussion, I gave a talk on Julia to my department (biology/ecology) yesterday, who largely use R and C. The startup time for Plots and its backends (and probably the general perception that gave of Julia) was singled out as the major drawback to switching to Julia for R converts. Using RCall plot() from Julia in the same demo was incomparably faster and really rubbed it in. I think fixing this is critical for general use of Julia by R users. |
I can totally understand that! If I wasn't already invested in Julia this would also be a deal breaker to me :P After all, I'm the kind of person that once switched from firefox to chrome because it started 0.5s faster :D ...which is why I'm trying to get |
One solution that is pretty close to working today is making Revise as good as it can be and keeping your Julia session open a long time (say, a week). At that point startup and JIT-compilation become basically irrelevant. Revise can't really work properly on 0.6, but as of yesterday 0.7 seems to be mostly "Revise-ready." Need to merge timholy/Revise.jl#49 and then it should work for basically any package (if it doesn't, it's a bug). Method deletion is also very, very close to working. That will leave changes to type definitions as the one reason you truly have to restart a Julia session. (Macros and generated functions might require some manual |
I already work like that with Atom. And while it's nice I still need to restart Julia more often than I'd like to (for whatever reason) - and every time really hurts! |
Atom/Juno is an amazing solution, but AFAICT it isn't as low-level as Revise. Just try Overall I've started thinking about it this way: in C, your development cost is
In julia, the numerical factor is much smaller (Julia is a much more efficient language to write code in than C), but unfortunately it may be If you haven't tried it yet (perhaps because of current flaws in Revise), for big projects like GLVisualize/MakiE I can't even begin to describe how much nicer it is. For my own use I am just about to throw 0.6 under the bus because Revise should be so much more robust on 0.7. Of course, there are then a lot of packages I will need to fix before I can get real work done. |
Well, static-julia is for shipping a package that you don't touch anymore - and I also want to get it going for offering Julia packages to other languages without having them to wait for the JIT. Maybe it's my fault, but I sadly have a pretty high fatal error rate (especially with OpenCL/CUDA/OpenGL - also stackoverflows sometimes terminate julia) and I also change my types relatively often when prototyping stuff... OpenGL also forces me to restart because of some messed up state of the horrible opengl state machine. Don't get me wrong, I think revise is a bliss for developers and I'm pretty sure I will start using it! But we will also need to target another audience at some point: people that will never touch package code and want to run They just want to write their self contained scripts calling package code and restart Julia as they please ;) |
Oh yes, for users more static compilation is the answer. Definitely 👍 👍 for that. |
Didn't sound grumpy to me. It's just that I'm not sure there's a good solution to the compile-time problem right now. I brought up the whole Revise thing because I think that keeping the same session running is IMO the best answer we have now, and likely to stay that way until at least Julia 1.1. To provide a little more detail: I suspect that the best hope for reducing time-to-first-plot would be through a well-chosen set of precompile statements. But these have pretty serious limitations: for it to make a difference, IIUC you have to "own" all the types and functions that are arguments to that call in your package. So saying With regards to optional dependencies, other people here have many more insights than I do, so I won't even venture a comment. Looking back again at the OP, I see that's your main concern in this issue, so apologies for derailing it. |
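For readers unfamiliar with precompile statements, a minimal sketch of what one looks like. `tick_positions` is a hypothetical stand-in for a hot function in a plotting pipeline; in a package the `precompile` call would sit at module top level so that it runs while the package is being precompiled.

```julia
# Hypothetical helper standing in for part of a plotting pipeline.
tick_positions(lo::Float64, hi::Float64, n::Int) = collect(range(lo, hi; length = n))

# Ask Julia to compile the method for this concrete signature ahead of the
# first call. `precompile` returns a Bool indicating whether it succeeded.
ok = precompile(tick_positions, (Float64, Float64, Int))

ticks = tick_positions(0.0, 1.0, 5)
```

As the comment above cautions, how much of this work survives in the package's cache depends on who owns the functions and types involved, so a statement like this is not guaranteed to reduce time-to-first-plot.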
Oh, one thought (sorry to still be at "reduce time-to-first-plot"): https://discourse.julialang.org/t/does-compile-time-depend-on-type-stability/6548/5?u=tim.holy |
Honestly that seems like it counts as a bug in those packages to me. If loading a certain combination of packages breaks things then issues are inevitable whether or not Plots works around it now. |
The issue seems to run pretty deep. For me, normal is:
julia> tic(); using Plots; toc()
elapsed time: 5.966643648 seconds
5.966643648
julia> tic(); p = plot(rand(10,10)); toc()
elapsed time: 9.649117342 seconds
9.649117342
julia> tic(); display(p); toc()
elapsed time: 5.783277532 seconds
5.783277532
adding
julia> tic(); using Plots; toc()
WARNING: using Plots.GR in module Main conflicts with an existing identifier.
elapsed time: 9.443139341 seconds
9.443139341
julia> tic(); p = plot(rand(10,10)); toc()
elapsed time: 9.0992447 seconds
9.0992447
julia> tic(); display(p); toc()
elapsed time: 4.139666058 seconds
4.139666058
adding
tl;dr: I don't think a re-org would give the change that I hoped, and a full-blown change (Makie.jl) is required unless there's a big upstream change to the way precompilation works. |
I found a way to lazily load a submodule on demand.
In file /MyPackage/src/MyPackage.jl:
__precompile__(true)
module MyPackage
...
# Allow Julia to find modules in the extensions/src directory (or any other directory)
function __init__()
local path = joinpath(Pkg.dir(), "MyPackage", "extensions", "src")
path in LOAD_PATH || push!(LOAD_PATH, path)
end
...
# No reference to MyPackageSubA whatsoever
...
end #module
In file /MyPackage/extensions/src/MyPackageSubA.jl:
__precompile__(true)
module MyPackageSubA
using MyPackage
...
export foo
foo() = println("Module MyPackageSubA loaded successfully.")
...
end #module
julia> using MyPackage
julia> foo()
UndefVarError: foo not defined
julia> using MyPackageSubA
julia> foo()
Module MyPackageSubA loaded successfully.
Julia currently cannot handle
if !isdefined(Main.MyPackage)
using MyPackage
end
let
local path = joinpath(Pkg.dir(), "MyPackage", "extensions", "src")
path in LOAD_PATH || push!(LOAD_PATH, path)
end
using SubA
# using MyPackage.SubA would be nicer, if this means that I can
# refer to unexported variables in SubA as follows
MyPackage.SubA.bar
So far I haven't encountered any gotchas yet. |
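The snippet above relies on `Pkg.dir()`, which belongs to the Julia version used in this thread. The same `LOAD_PATH` trick can be sketched for Julia 1.x roughly as below. It assumes that a directory on `LOAD_PATH` containing a file `Name.jl` makes `using Name` work; the module name `LazyDemoSub` and the temporary directory are purely illustrative.

```julia
# Create a throwaway directory holding one extension module.
dir = mktempdir()
write(joinpath(dir, "LazyDemoSub.jl"), """
module LazyDemoSub
export foo
foo() = "Module LazyDemoSub loaded successfully."
end
""")

# Make the directory visible to the package loader. Nothing is loaded yet.
dir in LOAD_PATH || push!(LOAD_PATH, dir)

# Load the extension only at the point where it is actually needed.
@eval using LazyDemoSub
msg = LazyDemoSub.foo()
```

Mutating `LOAD_PATH` is global state, which is the concern raised in the reply below about ahead-of-time compilation.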
Does that reduce the startup time? I'm not sure that can be AOT compiled if it has to change global load paths. |
@ChrisRackauckas I made the following changes to my code above (changed the |
I don't really understand the time to first plot results:
@time using Plots
@time pyplot(show=true)
@time plot(rand(10,10))
6.816656 seconds (6.19 M allocations: 349.942 MiB, 3.78% gc time)
6.892494 seconds (4.57 M allocations: 245.328 MiB, 1.28% gc time)
14.941858 seconds (9.96 M allocations: 526.248 MiB, 1.28% gc time)
vs.
@time using Plots
@time gr(show=true)
@time plot(rand(10,10))
6.827028 seconds (6.19 M allocations: 349.926 MiB, 3.87% gc time)
1.760695 seconds (1.76 M allocations: 93.075 MiB, 1.39% gc time)
12.434105 seconds (10.41 M allocations: 537.749 MiB, 1.55% gc time)
Here are the "pure" GR results:
@time using GR
@time plot(rand(10,10))
0.073958 seconds (34.38 k allocations: 2.758 MiB)
2.986878 seconds (2.32 M allocations: 123.543 MiB, 7.26% gc time) |
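When reading timings like these, it helps to separate compilation cost from run time by timing the same call twice in one session. A minimal sketch using `@elapsed`; `sum_of_squares` is a hypothetical workload, not part of Plots or GR.

```julia
# Hypothetical workload; any function not yet compiled in this session will do.
sum_of_squares(xs) = sum(x -> x^2, xs)

data = rand(1_000)
first_call  = @elapsed sum_of_squares(data)  # includes JIT compilation
second_call = @elapsed sum_of_squares(data)  # runs the already compiled code
println("first call: ", first_call, " s, second call: ", second_call, " s")
```

The gap between the two numbers is an estimate of the compilation overhead for that call, which is the part of time-to-first-plot that a longer-lived session avoids.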
That's what I am saying is probably the pipeline, since that's really the one thing introduced in there. Even when it's not doing much it still has to send dictionaries through it. We would want to AOT compile that on most types, which isn't possible with something like PackageCompiler.jl given the stuff Simon says in the README about globals and uninferred stuff. |
So should the focus really be on moving to a more type-oriented design for plot attributes (are named tuples inferrable here)? I appreciate the other efforts being made in the ecosystem, but it would be sad to have all of Plots be for nothing, all due to the time it takes to get to the first plot. |
NamedTuples would be inferrable, but I think Plots heavily uses the mutability of the dictionaries. I don't think Plots would be all for nothing, since if Makie has the right recipe system then a lot could carry over, and IMO the pipeline could use a rework to handle some of the issues it wasn't created with in mind. |
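The trade-off mentioned here can be illustrated with a small sketch (the attribute names are made up): a `NamedTuple` carries the type of every field, so the compiler knows what `attrs.linewidth` is, but it is immutable and updates create a new value; a `Dict{Symbol,Any}` can be mutated in place but every lookup is typed as `Any`.

```julia
# Typed, immutable attributes: updating means building a new NamedTuple.
defaults = (linewidth = 1.0, color = :blue, label = "")
attrs = merge(defaults, (linewidth = 2.5,))  # values on the right win

# Untyped, mutable attributes: updating happens in place.
dict_attrs = Dict{Symbol,Any}(:linewidth => 1.0, :color => :blue)
dict_attrs[:linewidth] = 2.5

lw_type_nt   = fieldtype(typeof(attrs), :linewidth)  # concrete field type
lw_type_dict = valtype(dict_attrs)                   # element type of the Dict
```

Here `defaults` is left untouched by `merge`, which is why code that relies on mutating a shared attribute dictionary would need restructuring to use this style.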
Well, MakiE is still at a stage where if all the julians start using it for their daily plotting needs, the repo will be swimming in angry issues in no time. Saying "Plots doesn't work anymore, Makie will take over" is much more likely to just kill Plots in the short term, and some other plotting package will take over long before MakiE is mature enough. 2D plotting is complex, in different ways than that most often encountered with 3D plotting. That's the point of @pkofod 's comment as I understand it - let's fix this if it's doable. |
I'm talking about development, not usage. I'm still using Plots, but I don't think it can keep developing in a way that can solve some of its longest standing issues, and some of these issues are quite important. Makie will probably take more than a year. |
So what do you suggest? |
Keep using and recommending Plots, while building/testing compilation/recipes/backend support on Makie. I am just assuming at this point that some things like font scaling, legends outside of the plot, etc. won't be fixed, but in general plots will be a good safe option to plot with for quite a long time. |
Who's going to want to maintain Plots though if everyone says it's a sinking ship? If Makie is the future (which is yet to be proven, though I admire the work and ambition) everyone should stop developing Plots and work on Makie instead (although it's unclear if that is feasible or if Makie is more of a one-person project). |
Maintaining what Plots.jl does well is much different from accommodating all of the feature requests that it has. I think it's perfectly fine to understand that Plots.jl probably won't implement all of the feature requests it has (and many seem extremely difficult), but it can be a bug-free stable plotting library which people can depend on. |
I'd tend to agree with @pkofod here: Makie still has a long way to go before becoming a full replacement (and it's unclear how Plots developers should contribute to it) and it seems like lazy loading of backends is not the limiting factor of start up time (and even if it were, we're willing to work around that by loading eagerly the GR backend). I'd say we should focus on giving the best possible experience with GR and see if there is anything we can do to reduce start up time (which tbh with GR is ~ 20 seconds, definitely not the end of the world especially given that with Revise or Juno there really isn't that much need to continuously restart Julia). Both using |
Here's my attempt of a reorg: #1598 |
Have you seen this: JuliaLang/julia#2025 (comment)? |
No I was not aware of this, thanks a lot for the link! |
I think so, but I haven't used it, nor do I really know much about this general problem. The Requires.jl PR by Tim was JuliaPackaging/Requires.jl#46, maybe there is more info there. |
OK, then I'll give that solution a try as well ... if only I would have known this a few days ago ... 🙂 |
@daschw do we want to keep this open? |
No, I think that's resolved. |
The two main issues plaguing Plots are, and have been for a long time, 1) slow loading time and time-to-first-plot, and 2) instabilities (world age, precompilation, the "Media" issue) due to conditional package loading. (There is a third, lesser issue: the big git history makes Plots incompatible with e.g. JuliaPro at the moment, but we hope to resolve that with Pkg3.) Addressing these two must be on our roadmap.
This issue is to discuss how to deal with the second, but the solution could also markedly improve the first. In a recent discourse thread (https://discourse.julialang.org/t/optional-dependencies-requires-jl/3294) involving some of the people who know a lot about Julia's package system, there were a number of different suggestions, that nicely sum up the various ideas that have been on the table.
There were also a number of other ideas, e.g. https://discourse.julialang.org/t/optional-dependencies-requires-jl/3294/9 and https://discourse.julialang.org/t/optional-dependencies-requires-jl/3294/24 , though I'm not completely sure how those would solve the issue.
Finally, @vchuravy has made a follow-up PR on the discourse discussion JuliaLang/julia#21743 which might deliver the infrastructure we need in 0.7 or 1.0.
I think resolving this should be a high priority.
cc @tbreloff @pkofod @jheinen @daschw @Evizero @ChrisRackauckas @pfitzseb