Predict new data without training. #7

Open
niknoproblems opened this issue Apr 13, 2016 · 14 comments

Comments

@niknoproblems

Can I predict on new data with an already trained model, or do I always have to call the "run" method?

@jfloff
Owner

jfloff commented Apr 13, 2016

Do you mean using a previous training model (using the save_model flag)?

@niknoproblems
Author

exactly

@jfloff
Owner

jfloff commented Apr 16, 2016

Sorry, I have yet to implement that, since I have never had a use for it myself.

Just to get a feel for it: how do you envision such an interface? When I started looking into this, I felt I would probably need to split pywFM.run into pywFM.train and pywFM.predict, and add a pywFM.load_model that can load a trained model. The problem is that this would probably hurt performance, since we would need to run two different libfm commands: one with the save_model flag and another with the load_model flag.

An alternative would be separate pywFM.train_model and pywFM.run_model methods that train and run a model, respectively.
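
To make the "two libfm commands" concern concrete, here is a rough sketch of the argument lists the two phases might use. The helper names, file names, and hyperparameter values are hypothetical and not part of pywFM. The flags themselves are libFM command-line options, but using `-iter 0` together with `-load_model` to predict without further training is an assumption that this thread does not confirm.

```python
from typing import List


def build_train_cmd(libfm_bin: str, train_path: str, test_path: str,
                    model_path: str, dim: str = "1,1,8",
                    num_iter: int = 100) -> List[str]:
    """Hypothetical helper: arguments for a run that trains and saves a model."""
    return [
        libfm_bin,
        "-task", "r",              # regression
        "-method", "sgd",          # SGD is used here purely as an example
        "-learn_rate", "0.01",
        "-dim", dim,
        "-iter", str(num_iter),
        "-train", train_path,
        "-test", test_path,
        "-save_model", model_path,
    ]


def build_predict_cmd(libfm_bin: str, train_path: str, test_path: str,
                      model_path: str, out_path: str,
                      dim: str = "1,1,8") -> List[str]:
    """Hypothetical helper: arguments for a run that loads a model and predicts."""
    return [
        libfm_bin,
        "-task", "r",
        "-method", "sgd",
        "-learn_rate", "0.01",
        "-dim", dim,               # must match the dimensions used for training
        "-iter", "0",              # assumption: zero iterations leaves the loaded model as-is
        "-train", train_path,      # assumption: libFM still expects a training file
        "-test", test_path,        # the new data to predict on
        "-load_model", model_path,
        "-out", out_path,          # predictions are written here
    ]


train_cmd = build_train_cmd("libFM", "train.libfm", "valid.libfm", "fm.model")
predict_cmd = build_predict_cmd("libFM", "train.libfm", "new.libfm",
                                "fm.model", "preds.txt")
```

Each list could then be passed to `subprocess.run(cmd, check=True)`. Only the first command pays the training cost; the second would be the one repeated for each new batch of data.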

@niknoproblems
Author

niknoproblems commented Apr 17, 2016

I think the first approach, with train and predict methods, is more standard and cleaner, like in sklearn. Without that feature, many ML techniques such as stacking and blending become much less trivial.
About performance: yes, we would have to run two libfm commands, but that only hurts the training phase; at prediction time you only need to load the model and predict.

@jfloff
Owner

jfloff commented Apr 18, 2016

I'm also leaning towards that approach, since it matches one of my todo points:

Improve the save_model / load_model so we can have a more defined init-fit-predict cycle (perhaps we could inherit from sklearn.BaseEstimator)

I have a little time this weekend and will start working on this in a branch (it will break backward compatibility, so I'll bump the version). Feel free to submit changes as well.
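
For readers unfamiliar with the init-fit-predict cycle mentioned in the todo, a minimal pure-Python sketch of the convention might look like this. The class name `FMSketch`, the `trainer` callable, and `mean_trainer` are all hypothetical stand-ins; nothing here calls libFM or reflects pywFM's actual code. Inheriting from sklearn's `BaseEstimator` would additionally supply `get_params` and `set_params`.

```python
from typing import Callable, List, Optional, Sequence

Row = Sequence[float]
Predictor = Callable[[Row], float]
Trainer = Callable[[Sequence[Row], Sequence[float], int, int], Predictor]


class FMSketch:
    """Hypothetical estimator that follows the init-fit-predict convention."""

    def __init__(self, num_factors: int = 8, num_iter: int = 100,
                 trainer: Optional[Trainer] = None):
        # __init__ only stores hyperparameters; no training happens here.
        self.num_factors = num_factors
        self.num_iter = num_iter
        self.trainer = trainer

    def fit(self, X: Sequence[Row], y: Sequence[float]) -> "FMSketch":
        if self.trainer is None:
            raise ValueError("a trainer callable is required")
        # The trailing underscore marks state that is learned during fit.
        self.predictor_ = self.trainer(X, y, self.num_factors, self.num_iter)
        return self  # returning self allows FMSketch(...).fit(X, y).predict(X_new)

    def predict(self, X: Sequence[Row]) -> List[float]:
        if not hasattr(self, "predictor_"):
            raise RuntimeError("call fit before predict")
        # No retraining: the stored predictor is applied to every new row.
        return [self.predictor_(row) for row in X]


def mean_trainer(X, y, num_factors, num_iter) -> Predictor:
    """Stand-in for a real libFM training run: always predicts the training mean."""
    mean = sum(y) / len(y)
    return lambda row: mean


model = FMSketch(trainer=mean_trainer).fit([[1.0, 0.0], [0.0, 1.0]], [2.0, 4.0])
new_predictions = model.predict([[1.0, 1.0], [0.0, 0.0], [1.0, 0.0]])
```

The point of the split is that `predict` can be called any number of times on new data after a single `fit`, which is what stacking and blending rely on.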

@niknoproblems
Author

Sorry, I hadn't seen your todo.
Thank you very much for the upcoming work.

@felixmaximilian

Hi @jfloff,
any progress in that direction? I just ran into the issue mentioned by @NickFlamel, and it simply makes your (very cool) wrapper unusable in a production environment.
By the way, I tried the example from Rendle that is also in your README, but the prediction is very bad. I guess this is because we don't have much data, but that kind of makes the example unsuitable ^^.

@jfloff
Owner

jfloff commented Jun 30, 2016

I'm sorry, I haven't had time to dedicate to improving this. I realise that this feature would really improve running several different predictions, and I really want to improve it, but if I'm going to do it, I will inherit from sklearn.BaseEstimator right from the start (which takes a little bit more work).

I have a deadline for Monday. After that I'll dig into this, I promise! :)

The example is just to show how the API works, and what's the flow of libfm :)

@jfloff
Owner

jfloff commented Jul 4, 2016

It seems that predicting without a new training run is not really supported at the moment. The functionality is not at 100% (e.g. it does not work for MCMC). I've also taken a look at the libFM source code, but without much success. The documentation is also lacking for the save_model and load_model functions.

I'm going out on a limb and pinging @thierry-silbermann, since he was responsible for save_model and load_model in libFM. Could you give us some insight into how we should proceed?

@jilljenn

Hi, here is how we could proceed to make a predict method:

https://github.com/jilljenn/TF-recomm/blob/master/forward.py#L22

Where the pickled elements are those:

https://github.com/jilljenn/TF-recomm/blob/master/fm_mangaki.py#L39

@jfloff
Owner

jfloff commented Feb 19, 2018

Want to try submitting a PR for this?

@jilljenn

Yes. It will look like this.

https://github.com/mangaki/mangaki/pull/549/files#diff-2b98b5dc82ffbac20dd8c88ce88d6b5cR65

I don't know why I sometimes had to use .A1 (which converts a matrix to a flat ndarray) and sometimes not.

Can you consider casting model.weights and model.pairwise_interactions to NumPy arrays?
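
As an illustration of why the casting helps, here is a sketch of a predict function that computes the standard second-order factorization machine equation from stored parameters. The function name `fm_predict` and the example values are made up. Mapping `w` to model.weights and `V` to model.pairwise_interactions follows the comment above, while the global bias argument and the (features × factors) layout of `V` are assumptions. Normalising the inputs with `np.asarray(...).ravel()` means the caller no longer has to decide whether `.A1` is needed.

```python
import numpy as np


def fm_predict(X, global_bias, weights, pairwise_interactions):
    """Second-order FM prediction from already trained parameters (sketch).

    X: (n_samples, n_features) dense array-like
    weights: n_features values, flat or shaped (1, n_features)
    pairwise_interactions: (n_features, n_factors) array-like (assumed layout)
    """
    X = np.asarray(X, dtype=float)
    # asarray drops the np.matrix subclass and ravel flattens a (1, p) row,
    # so flat arrays, nested lists and matrices all end up as shape (p,).
    w = np.asarray(weights, dtype=float).ravel()
    V = np.asarray(pairwise_interactions, dtype=float)

    linear = X @ w
    # sum_{i<j} <v_i, v_j> x_i x_j, computed in O(n_features * n_factors)
    # as 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2].
    interactions = 0.5 * (((X @ V) ** 2) - ((X ** 2) @ (V ** 2))).sum(axis=1)
    return global_bias + linear + interactions


X_new = [[1, 0, 1],
         [0, 1, 1]]
V = [[1, 0],
     [0, 1],
     [1, 1]]
# The weights are passed as a (1, 3) row, the shape a matrix row would have.
predictions = fm_predict(X_new, 0.5, [[1, 2, 3]], V)
```

For the first row the linear part is 1 + 3 = 4 and the only active pair contributes ⟨v₁, v₃⟩ = 1, so with the bias of 0.5 the prediction is 5.5; the second row gives 6.5 by the same reasoning.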

@jfloff
Owner

jfloff commented Feb 19, 2018

I don't see any problem with that.

@jilljenn

5 years later, I finally made a scikit-learn estimator:
https://github.com/jilljenn/ktm/blob/master/fm.py#L25

I will improve it over the next few days, and then I can copy it into your repo.
