`XGBoostClassifier` can't be serialised. Add custom serialisation? #512

aviatesk · 2020-05-01T07:33:34Z

Describe the bug

deserialized XGBoostClassifier can't predict.

This only happens for MLJ machine interface: XGBoost.save(xgboost::Booser, fname) -> predict(Booster(fname), X) works as expected.

/cc @aviks I would like to ping in a case you're interested in this

To Reproduce

using MLJ

n = 100000
X = DataFrame(A = rand(n), B = categorical(rand(1:10, n)))
y = categorical(rand(0:2, n))

@load XGBoostClassifier
xgb_machine = machine(XGBoostClassifier(), X, y)
MLJ.fit!(xgb_machine; verbosity = 0)
MLJ.predict(xgb_machine, X) # <- works
MLJ.save("xgboost.jlso", xgb_machine)

###

xgb_machine′ = machine("xgboost.jlso")
MLJ.predict(xgb_machine′, X) # <- errors

the final line results in the following error:

RROR: Call to XGBoost C function XGBoosterPredict failed: [16:29:15] src/c_api/c_api.cc:963: DMatrix/Booster has not been intialized or has already been disposed.

Stack trace returned 0 entries:


Stacktrace:
 [1] error(::String, ::String, ::String, ::String) at ./error.jl:42
 [2] XGBoosterPredict(::Ptr{Nothing}, ::Ptr{Nothing}, ::Int32, ::UInt32, ::Array{UInt64,1}) at /Users/aviatesk/.julia/packages/XGBoost/q0cZR/src/xgboost_wrapper_h.jl:15
 [3] ...

Versions

versioninfo():

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
  JULIA_PROJECT = @.
  JULIA_EDITOR = atom  -a
  JULIA_NUM_THREADS = 4
  JULIA_PKG_DEVDIR = /Users/aviatesk/julia/packages

pkg] st

(data) pkg> st
Status `~/pigicon-root/pigicon/piginator.jl/data/Project.toml`
  [336ed68f] CSV v0.6.1
  [a93c6f00] DataFrames v0.20.2
  [1313f7d8] DataFramesMeta v0.5.0
  [682c06a0] JSON v0.21.0
  [7acf609c] LightGBM v0.2.1
  [add582a8] MLJ v0.11.2
  [a7f614a8] MLJBase v0.13.2
  [6ee0df7b] MLJLinearModels v0.3.2
  [e80e1ace] MLJModelInterface v0.2.2
  [d491faf4] MLJModels v0.9.9
  [1914dd2f] MacroTools v0.5.5
  [91a5bcdd] Plots v0.29.9
  [2913bbd2] StatsBase v0.33.0
  [f3b207a7] StatsPlots v0.14.5
  [009559a3] XGBoost v0.4.3

The text was updated successfully, but these errors were encountered:

OkonSamuel · 2020-05-01T22:51:18Z

cc @aviatesk . The culprit here is the mach.fitresult[1] returned after deserialization. Particularly the booster.handle Pointer is invalid. It is worth noting that XGBoost.save calls the libxgboost C library while MLJ makes use of JLSO library (it maybe that JLSO doesn't know how to deal with pointers??). @ablaom , @tlienart Maybe we should for this specfic case save using XGBoost library save??.

ablaom · 2020-05-02T02:43:14Z

@OkonSamuel Thanks for investigating!

Maybe we should for this specfic case save using XGBoost library save??.

This is possible. One overloads the fallbacks recalled below, somewhere in the code where XGBoost has its implementation of the MLJ model API:

MLJModelInterface.save(file, model, fitresult, report; kwargs...) =
    JLSO.save(file,
              :model => model,
              :fitresult => fitresult,
              :report => report; kwargs...)
function MLJModelInterface.restore(file; kwargs...)
    dict = JLSO.load(file)
    return dict[:model], dict[:fitresult], dict[:report]
end

The tricky part is that, for the classifier, the MLJ fitresult has an additional part - a (categorical) element of the target - which is not part of the XGBoost learned parameters. This is essential, to keep track of the categorical encoding and complete pool of classes. So that needs to be saved as well. This issue is likely to recur in other classifiers wanting to use a custom save, so probably worth finding an elegant solution.

yalwan-iqvia · 2020-09-15T11:04:07Z

I'm going to try this for LightGBM models -- I anticipated this issue and put some code in to deal with it but I never tested it in context of MLJ serialisation. If it works I'll come back and point at the code I have implemented within LightGBM.jl which the XGB maintainers could probably also make use of.

ablaom · 2020-09-29T22:05:44Z

@yalwan-iqvia

The tricky part is that, for the classifier, the MLJ fitresult has an additional part - a (categorical) element of the target - which is not part of the XGBoost learned parameters. This is essential, to keep track of the categorical encoding and complete pool of classes. So that needs to be saved as well.

Do you have any suggestions about how we combine the two bits of information - the XGBoost learned parameters, and the complete pool of classes (eg, the CategoricalValue stored in the fitresult) - into the one file? Or perhaps we have two files and the user just specifies the xgboost one, with the other created/deserialized "silently" ?

ablaom · 2020-09-30T00:47:06Z

Okay, I have discovered some flaws in the API on the deserializing restoring side. Working on a PR for that.

yalwan-iqvia · 2020-09-30T07:38:47Z

So for the LGBM I was only referring to the bit which causes the error as reported by user about a disposed model -- I made internal structs carry around serialised string version of model so even after its deepcopied elsewhere and the ptr is null then it can try to reload the string serialised model into memory before continuing upon attempting operations such as predict, etc. I didn't get around to trying this yet but I can point you at the code

Here's our test that this runs correctly after deepcopying:
https://github.com/IQVIA-ML/LightGBM.jl/blob/6384a4b63e5145de92968171337375459aeab3f0/test/ffi/base_utils.jl

Here's the bit where we make sure we still have a valid pointer after a deepcopy:
https://github.com/IQVIA-ML/LightGBM.jl/blob/master/src/wrapper.jl#L51

And finally, for cases which aren't covered by deepcopies somehow, we do this for safety
https://github.com/IQVIA-ML/LightGBM.jl/blob/master/src/fit.jl#L123

and in other places where we think we might continue after a SERde we use this:
https://github.com/IQVIA-ML/LightGBM.jl/blob/master/src/utils.jl#L79

ablaom · 2020-10-19T01:52:28Z

After JuliaRegistries/General#23206 is merged, update MLJModels and serialisation should now work.

ablaom changed the title ~~deserialized XGBoostClassifier can't predict~~ XGBoostClassifier can't be serialised. Add custom serialisation? May 8, 2020

ablaom self-assigned this Sep 29, 2020

ablaom mentioned this issue Oct 1, 2020

Tweaks to the model API for supporting machine serialization JuliaAI/MLJBase.jl#429

Merged

ablaom mentioned this issue Oct 19, 2020

Plsregressor plus more migration JuliaAI/MLJModels.jl#325

Merged

3 tasks

ablaom closed this as completed Oct 19, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`XGBoostClassifier` can't be serialised. Add custom serialisation? #512

`XGBoostClassifier` can't be serialised. Add custom serialisation? #512

aviatesk commented May 1, 2020

OkonSamuel commented May 1, 2020

ablaom commented May 2, 2020 •

edited

Loading

yalwan-iqvia commented Sep 15, 2020

ablaom commented Sep 29, 2020

ablaom commented Sep 30, 2020

yalwan-iqvia commented Sep 30, 2020 •

edited

Loading

ablaom commented Oct 19, 2020

XGBoostClassifier can't be serialised. Add custom serialisation? #512

XGBoostClassifier can't be serialised. Add custom serialisation? #512

Comments

aviatesk commented May 1, 2020

OkonSamuel commented May 1, 2020

ablaom commented May 2, 2020 • edited Loading

yalwan-iqvia commented Sep 15, 2020

ablaom commented Sep 29, 2020

ablaom commented Sep 30, 2020

yalwan-iqvia commented Sep 30, 2020 • edited Loading

ablaom commented Oct 19, 2020

`XGBoostClassifier` can't be serialised. Add custom serialisation? #512

`XGBoostClassifier` can't be serialised. Add custom serialisation? #512

ablaom commented May 2, 2020 •

edited

Loading

yalwan-iqvia commented Sep 30, 2020 •

edited

Loading