Curated list of models #716
Comments
Can you clarify what the "umbrella package" is? If the "umbrella package" is MLJ.jl, then I would definitely complain. I don't want What's wrong with asking users to install
For example, the ensemble functionality lives inside MLJ.jl. I would be quite annoyed if I had to install a whole bunch of unrelated packages just so I could use MLJ's ensemble functionality.
Now, on the other hand, if we first moved ALL of the functionality out of MLJ.jl into other repos, then I would have no problem adding a whole bunch of dependencies to MLJ.jl. But as long as there is functionality in MLJ.jl that is not available in another package (MLJBase.jl, etc.), then I am opposed to adding lots of dependencies to MLJ.jl.
So I guess the two options are:
For the record, MLJ is not intended to load any code, but it still has the ensemble.jl stuff. The plan has always been to remove this. There may be a few other small things too; I forget. Also, I very much like @DilumAluthge's proposal JuliaAI/MLJModels.jl#346 to address the beginner's problem. @juliohm What do you think?
Also, if you want to load a model directly (no macros), you can do:
julia> load_path("PCA")
"MLJMultivariateStatsInterface.PCA"
julia> load_path("RandomForestRegressor")
ERROR: ArgumentError: Ambiguous model name. Use pkg=... .
The model RandomForestRegressor is provided by these packages:
["DecisionTree", "ScikitLearn"].
Stacktrace:
[1] info(::String; pkg::Nothing) at /Users/anthony/.julia/packages/MLJModels/GyILf/src/model_search.jl:80
[2] load_path(::String; pkg::Nothing) at /Users/anthony/.julia/packages/MLJModels/GyILf/src/loading.jl:32
[3] load_path(::String) at /Users/anthony/.julia/packages/MLJModels/GyILf/src/loading.jl:32
[4] top-level scope at REPL[16]:1
julia> load_path("RandomForestRegressor", pkg="ScikitLearn")
"MLJScikitLearnInterface.RandomForestRegressor"
julia> using MLJScikitLearnInterface
julia> import MLJScikitLearnInterface.RandomForestRegressor
julia> RandomForestRegressor()
RandomForestRegressor(
    n_estimators = 100,
    criterion = "mse",
    max_depth = nothing,
    min_samples_split = 2,
    min_samples_leaf = 1,
    min_weight_fraction_leaf = 0.0,
    max_features = "auto",
    max_leaf_nodes = nothing,
    min_impurity_decrease = 0.0,
    bootstrap = true,
    oob_score = false,
    n_jobs = nothing,
    random_state = nothing,
    verbose = 0,
    warm_start = false,
    ccp_alpha = 0.0,
    max_samples = nothing) @245
I think my concern is twofold: (1) we still need manual intervention to get a new model into an existing session. This could be addressed with a prompt installation option yes/no triggered by
I fully support this idea. MLJ.jl would then provide a more user-friendly installation for users who are not writing packages but are writing ML pipelines to solve their problems with various models from a curated list. Advanced users seeking a more lightweight dependency for their own packages could use a subpackage of the MLJ.jl stack, such as MLJBase.jl or MLJModelInterface.jl, and possibly an MLJEnsemble.jl. In summary, one must always keep in mind two types of users:
I am opening this issue to discuss the possibility of a curated list of models.
Right now end-users are forced to rely on a non-trivial macro, @load, that fails depending on the scope (local vs. global) and can be considered advanced for newcomers.
My opinion is that a curated list should be the recommended workflow where users don't need to bother installing dependencies manually:
This curated list could be made a dependency of the umbrella package. I don't think users would complain about too many dependencies given that any modern ML pipeline nowadays runs dozens of models at least.
cc: @DilumAluthge