Understanding MCMC and n_more_iter #37

Closed
merrellb opened this issue Feb 1, 2016 · 15 comments

merrellb commented Feb 1, 2016

MCMC seems to be sensitive to the number of iterations between fit_predict calls. With n_more_iter=1, the RMSE reaches its optimum of 0.884 at step 27 and then degrades. With n_more_iter=10, the RMSE is 0.860 and still improving after 10 calls (equivalent to 100 iterations with a step of 1). The documentation mentions "We can warm_start every fastFM model which allows us to calculate custom statistics during the model fitting process efficiently", which seems to suggest this sensitivity shouldn't be present.

Code excerpts and results below:

from fastFM import mcmc
from sklearn.metrics import mean_squared_error
import numpy as np

fm = mcmc.FMRegression(n_iter=0, rank=10)
fm.fit_predict(X_train, y_train, X_test)  # initialize the model without drawing samples
for i in range(100):
    y_pred = fm.fit_predict(X_train, y_train, X_test, n_more_iter=1)
    y_pred[y_pred > 5] = 5  # clip predictions to the rating range
    y_pred[y_pred < 1] = 1
    print(i, np.sqrt(mean_squared_error(y_pred, y_test)))
0 1.04720819915
1 0.97778708587
2 0.948017085861
3 0.93420488937
4 0.927061672571
5 0.922935100294
6 0.920257539721
7 0.918207455438
8 0.916209819939
9 0.913894249208
10 0.911193471613
11 0.908216022258
12 0.905165877765
13 0.902210412052
14 0.899446393313
15 0.896925703595
16 0.894664102177
17 0.892659400427
18 0.890901483694
19 0.889378630713
20 0.888077107425
21 0.886984550257
22 0.886090970488
23 0.88538765354
24 0.884863438543
25 0.884506506975
26 0.884306239634
27 0.884247226983
28 0.884317448473
29 0.884505712397
30 0.884800603791
31 0.885189631326
32 0.885661840403
33 0.886204648292
34 0.886803484806
35 0.8874511923
36 0.888143165709
37 0.888869261405
38 0.889633023158
39 0.890425723873
40 0.891229955936
41 0.892040331705
42 0.892863900969
43 0.893707463792
44 0.894571302645
45 0.895459171301
46 0.896382680339
47 0.897345726293
48 0.8983542138
49 0.899390885256
50 0.900449139443
51 0.90153494656
52 0.902640912205
53 0.903777024948
54 0.904930834814
55 0.906096794935
56 0.907275765877
57 0.9084718919
58 0.909667324754
59 0.910873466425
60 0.912084967428
61 0.913287204773
62 0.914474364239
63 0.915653122817
64 0.916826161945
65 0.917994888944
66 0.919162248375
67 0.920329277189
68 0.92149676688
69 0.922659246729
70 0.923810308867
71 0.924949669371
72 0.926074311282
73 0.927179794147
74 0.928264040074
75 0.929326802974
76 0.930376434434
77 0.9314191654
78 0.932455845704
79 0.933482129994
80 0.93449735718
81 0.935486636427
82 0.936467682892
83 0.937443712518
84 0.938410321158
85 0.939356840954
86 0.94027321806
87 0.941148746045
88 0.942005000857
89 0.9428498981
90 0.943684777878
91 0.944508612458
92 0.945317518167
93 0.946115254746
94 0.946898829726
95 0.947669244301
96 0.948434140619
97 0.949193065287
98 0.949947628213
99 0.950688137695
fm = mcmc.FMRegression(n_iter=0, rank=10)
fm.fit_predict(X_train, y_train, X_test)
for i in range(10):
    y_pred = fm.fit_predict(X_train, y_train, X_test, n_more_iter=10)
    y_pred[y_pred > 5] = 5
    y_pred[y_pred < 1] = 1
    print(i, np.sqrt(mean_squared_error(y_pred, y_test)))
0 0.911849673248
1 0.902846141012
2 0.89065879739
3 0.880330818455
4 0.874373355886
5 0.870324418211
6 0.866544031989
7 0.863735153323
8 0.861829622252
9 0.860483981533
ibayer (Owner) commented Feb 6, 2016

mcmc.fit_predict returns the prediction averaged over all MCMC samples, i.e. over the per-iteration predictions. This

    y_pred[y_pred > 5] = 5
    y_pred[y_pred < 1] = 1

causes the model to average the clipped predictions.

It's better to use a copy in your comparison:

    y_pred = fm.fit_predict(X_train, y_train, X_test, n_more_iter=10)
    y_pred_tmp = np.copy(y_pred)
    y_pred_tmp[y_pred_tmp > 5] = 5
    y_pred_tmp[y_pred_tmp < 1] = 1
    print(i, np.sqrt(mean_squared_error(y_pred_tmp, y_test)))

merrellb (Author) commented Feb 6, 2016

Are you saying that the clipping of the predictions actually impacts the underlying model? I tried using the copy as you suggest and I seem to get the same results.

Even if this is a potential problem, how does this relate to the issue at hand? I don't see how this explains the discrepancy between running the model 100 times with a step of 1 vs. running it 10 times with a step of 10.

ibayer (Owner) commented Feb 7, 2016

Are you saying that the clipping of the predictions actually impacts the underlying model?

Yes, for mcmc the (n+1)-th prediction depends on the n-th prediction. So if you change the n-th prediction by clipping...

I don't see how this explains the discrepancy between running the model 100 times with a step of 1 vs running it 10 times with a step of 10.

Well, the mean over 100 clipped predictions doesn't have to be the same as the mean over 10 clipped predictions, even if the underlying number of iterations is the same.

You could simply run the experiment without clipping as a test, then you know if clipping causes differences or not.

I can look into this if you narrow the issue down and provide a Short, Self Contained, Correct Example (SSCCE).

merrellb (Author) commented Feb 7, 2016

I am not really seeing much (if any) difference by clipping the copy vs the original predictions. I had assumed the predictions provided by fit_predict were already a copy and not a mutable reference to the internal predictions used by the model.
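
One way to check that assumption would be something like the following sketch (np.shares_memory is a plain NumPy utility; the fm/X_train/X_test variables are the ones from the excerpts above, so this is an illustration, not a tested snippet):

    fm = mcmc.FMRegression(n_iter=0, rank=10)
    pred_a = fm.fit_predict(X_train, y_train, X_test)
    pred_b = fm.fit_predict(X_train, y_train, X_test, n_more_iter=1)

    # True would mean both calls hand back views of the same buffer, so
    # clipping y_pred in place could feed back into the running average.
    print(np.shares_memory(pred_a, pred_b))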

Here is an example I've created. Please let me know if it meets your needs. Thanks!

https://gist.github.com/merrellb/1d5e1b9c2c2c03870c4d

ibayer (Owner) commented Feb 7, 2016

Thanks for the example.

  1. Remove the clipping (since it doesn't make a difference).
  2. Reduce the dataset size as much as possible (a small artificial example is best). Ideally it would be something
    one can check by hand. You could also try sklearn.datasets.make_regression (see the sketch below).
  3. Can you do the same comparison with the als solver? The mcmc solver uses much of the als code.
    If this is a bug in the warm start code then it should also be visible in an als warm start comparison. als is much easier to debug.
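
A minimal sketch of what such an artificial setup could look like (sizes, rank and random_state are arbitrary placeholders, not a tested reproduction):

    import numpy as np
    import scipy.sparse as sp
    from sklearn.datasets import make_regression
    from fastFM import mcmc

    # small artificial regression problem
    X, y = make_regression(n_samples=200, n_features=20, noise=1.0, random_state=0)
    X = sp.csc_matrix(X)  # fastFM expects a scipy.sparse matrix
    X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]

    fm = mcmc.FMRegression(n_iter=0, rank=2, random_state=0)
    fm.fit_predict(X_train, y_train, X_test)
    for i in range(10):
        y_pred = fm.fit_predict(X_train, y_train, X_test, n_more_iter=1)
        print(i, np.sqrt(np.mean((y_pred - y_test) ** 2)))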

merrellb (Author) commented Feb 8, 2016

  1. Done.
  2. I will see if I can get the same issue to show up using make_regression. If not, I will see if I can pare MovieLens down small enough to embed in the example.
  3. So far I have not been able to recreate this with the als solver. Ten steps of 1 seem to yield the same result as one step of 10 (roughly the comparison sketched below).
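
A minimal sketch of such an als warm-start check (reusing the X_train/y_train/X_test variables from above; als.FMRegression and the n_more_iter argument to fit are the documented fastFM warm-start API, but the parameters here are placeholders):

    import numpy as np
    from fastFM import als

    # one call adding 10 iterations
    fm_a = als.FMRegression(n_iter=0, rank=2, random_state=0)
    fm_a.fit(X_train, y_train)
    fm_a.fit(X_train, y_train, n_more_iter=10)

    # ten calls adding 1 iteration each
    fm_b = als.FMRegression(n_iter=0, rank=2, random_state=0)
    fm_b.fit(X_train, y_train)
    for _ in range(10):
        fm_b.fit(X_train, y_train, n_more_iter=1)

    # with a correct warm start both should give (nearly) identical predictions
    print(np.allclose(fm_a.predict(X_test), fm_b.predict(X_test)))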

Could the same reasons that we are unable to separate fit and predict with mcmc (seeming loss of state?) be impacting warm starts? I must admit I don't have a great understanding of mcmc.

ibayer (Owner) commented Feb 8, 2016

I can reproduce your results. Great, a really small example can even go in the test suite later. Good to know that als works, that narrows things down. Can you compare against restart too?

Could the same reasons that we are unable to separate fit and predict (seeming loss of state?) be impacting warm starts?

That's likely. I'll take a closer look later.

merrellb (Author) commented Feb 8, 2016

Can you explain in relative layman's terms the differences with mcmc that prevent good results when splitting fit and predict? It seems that if we have enough state to warm-start the algorithm we should have enough state to predict independently of fitting.

I'm not quite sure what you mean by "restart". I modified my script to explore the one other variation I see and the results were unremarkable. The slightly modified script and its output illustrating this are at:

https://gist.github.com/merrellb/8086a166916a7353f896

ibayer (Owner) commented Feb 9, 2016

Can you explain in relative layman's terms the differences with mcmc that prevent good results when splitting fit and predict? It seems that if we have enough state to warm-start the algorithm we should have enough state to predict independently of fitting.

MCMC returns the mean over the predictions from every iteration. The model parameters themselves are too expensive to keep for every draw, so the mean is calculated on a running basis.
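
As a toy illustration of that running mean (plain NumPy, not the fastFM code itself):

    import numpy as np

    # incremental mean over per-iteration predictions
    running_mean = np.zeros(3)
    per_iter_preds = [np.array([1.0, 2.0, 3.0]),
                      np.array([2.0, 2.0, 2.0]),
                      np.array([3.0, 2.0, 1.0])]
    for n, pred in enumerate(per_iter_preds, start=1):
        running_mean += (pred - running_mean) / n

    print(running_mean)  # identical to np.mean(per_iter_preds, axis=0)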

I'm not quite sure what you mean by "restart". I modified my script to explore the one other variation I see and the results were unremarkable.

You got it right, it's your "coldstart".

The slightly modified script and its output illustrating this is at:

Great, I think I know now what's wrong!
The formula for the mean updates is only correct for n_more_iter=1.

See:
https://github.com/ibayer/fastFM-core/blob/master/src/ffm_als_mcmc.c#L255
https://github.com/ibayer/fastFM-core/blob/master/src/ffm_utils.c#L54
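
Roughly speaking (a toy Python sketch for illustration only, not the actual C code): an update of the form mean += (pred - mean) / n is only valid if the counter n grows by one per update; folding in several new per-iteration predictions at once would have to weight by the counts, e.g.:

    def fold_chunk(running_mean, n_old, chunk_preds):
        # chunk_preds: the per-iteration predictions produced by one warm-start call
        n_new = n_old + len(chunk_preds)
        return (running_mean * n_old + sum(chunk_preds)) / n_new, n_new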

PR welcome, I won't have time to fix this before the weekend.

merrellb (Author) commented

That is great you've narrowed down the issue, although I am a bit confused by "the formula for the mean updates is only correct for n_more_iter=1." Isn't n_more_iter=1 the value that my "hot" example uses to illustrate the problem?

Wish I could help on the C side of things but my skills are more than a decade rusty. Definitely happy to help where I can on the Python side of things.

ibayer (Owner) commented Feb 14, 2016

I have just merged a fix for the bad mcmc performance that you observed when using the n_more_iter parameter. I was wrong in my guess about what causes this issue. Please let me know if my PR fixes your issue.
New binaries are available:
https://github.com/ibayer/fastFM/releases/tag/v0.2.4

merrellb (Author) commented

My initial testing of the OSX Python 3.5 wheel seems to confirm the fix (some random variation but hot, warm and cold all perform similarly). Thanks!

One very minor issue is that the filename doesn't quite match up with the GitHub version tag.

ibayer (Owner) commented Feb 14, 2016

Great, I have fixed the file names too.

ibayer (Owner) commented Feb 15, 2016

Can we close this issue?
