Small differences in prediction between operating systems #57

schuemie · 2022-03-31T07:15:20Z

Just trying to understand: we're fitting a not-so-large logistic regression (for a propensity score) using the exact same data on two platforms (Windows and Linux). The fitted model coefficients are identical, but there are tiny differences in the predicted propensity scores (PS). The maximum difference between the PS is 9.99e-16. (Ironically, this leads to different PS matching, which in turn leads to larger differences in the effect size estimate.) There's no sampling before fitting the model, so we're calling Cyclops' predict() on the same data used to fit the model.

Repeat runs on the same OS produce the exact same result, so results are reproducible in that sense. We compared the output of .Machine in R, and the only difference we see is sizeof.long = 4 on Windows and sizeof.long = 8 on Linux.

Any thoughts on what could explain these differences?

msuchard · 2022-03-31T13:21:10Z

Now, this is very interesting!

Is the model fit with cross-validation? In that case the PRNG may differ between machines, and the observed coefficients may look the same but actually differ slightly.

Otherwise, I'll need to explore the code-base a bit for a better idea. In terms of a solution, we could round the predicted scores, say, to the nearest 1E-10 before matching.
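To illustrate the rounding idea, here is a minimal Python sketch (not Cyclops or CohortMethod code). The scores, the `nearest_control` helper, and the `round_scores` helper are all hypothetical. It shows how a perturbation of about 1e-15 can flip a nearest-neighbor match when two candidates are tied, and how rounding to 10 decimal places restores the same match.

```python
def nearest_control(treated_ps, control_ps):
    """Return the index of the control whose score is closest to treated_ps.

    Ties go to the lowest index, because min() returns the first minimum.
    """
    return min(range(len(control_ps)),
               key=lambda i: abs(control_ps[i] - treated_ps))


def round_scores(scores, digits=10):
    """Round every score to `digits` decimal places (1e-10 by default)."""
    return [round(s, digits) for s in scores]


treated = 0.5
controls_a = [0.25, 0.75]          # hypothetical scores from platform A
controls_b = [0.25, 0.75 - 1e-15]  # same scores with a ~1e-15 perturbation

# Without rounding, the tiny perturbation breaks the tie differently.
match_a = nearest_control(treated, controls_a)
match_b = nearest_control(treated, controls_b)

# After rounding, both platforms see the same scores and the same tie.
rounded_a = nearest_control(treated, round_scores(controls_a))
rounded_b = nearest_control(treated, round_scores(controls_b))

print(match_a, match_b, rounded_a, rounded_b)
```

Rounding makes disagreement much less likely but cannot rule it out: a score that sits within the perturbation of a rounding boundary can still round to two different values.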

schuemie · 2022-04-01T11:37:58Z

Yes, we're using cross-validation, and in fact there's a difference of 3.469447e-16 in the optimal hyperparameter! But as far as R is concerned, the fitted coefficients are identical (using the '==' operator). Does Cyclops use a higher precision internally?
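As a side note on the '==' check, here is a small Python sketch (Python floats are the same IEEE 754 doubles that R uses for numerics; the values are illustrative and unrelated to the actual model). It shows that two doubles can look identical at 7 significant digits, R's default print precision, and still fail an exact equality test. So coefficients that pass '==' are equal at double precision, which is stronger than merely printing the same. It does not say anything about what precision Cyclops uses internally.

```python
import math

a = 0.1 + 0.2  # differs from 0.3 in the last bit of the double
b = 0.3

# Formatting with 7 significant digits hides the difference.
same_when_printed = f"{a:.7g}" == f"{b:.7g}"

# Exact comparison of the doubles does not hide it.
exactly_equal = a == b

# Size of the gap, in units in the last place (math.ulp needs Python 3.9+).
gap_in_ulps = abs(a - b) / math.ulp(b)

print(same_when_printed, exactly_equal, gap_in_ulps)
```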

Since we're actually running on simulated data (Synpuf), I can share the actual patient-level data with you. I'll send an e-mail.

schuemie · 2022-04-01T11:40:09Z

(BTW, rounding the propensity scores sounds like an excellent idea)

schuemie mentioned this issue Apr 20, 2022

Round propensity scores to increase reproducibility OHDSI/CohortMethod#120

Closed
