Unexpected determinancy from `rng` functions across model runs #568

cmgoold · 2022-06-14T09:20:38Z

Summary:

I'm generating random numbers as part of a simulation-based calibration. I'm finding that consecutive runs return the same number from the rng calls, with the value only changing every ~3-5 runs. This doesn't happen in CmdStan, so I think it's localised to cmdstanpy.

Description:

Here's an example Stan model called normal-rng.stan:

/* normal-rng.stan */
transformed data{
  real x_;
  x_ = std_normal_rng();
}

generated quantities{
  real x = x_;
  // to check if the same behaviour occurs in GQs
  real x_gq = std_normal_rng();
}

And some sample code to run the model:

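from cmdstanpy import CmdStanModel

mod = CmdStanModel(stan_file="normal-rng.stan")

# Run the sampler 10 times; no seed is passed, so each run should differ
fits = [mod.sample(data={}) for _ in range(10)]

# First draw of x from each run
print([fit.stan_variables()["x"][0] for fit in fits])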

# Example output
[
0.627496, -0.21658899999999995, -0.21658899999999995, -0.21658899999999995,
-0.0290533, -0.0290533, -0.0290533, -0.0290533,
-0.022001700000000006, -0.022001700000000006
]

Current Version:

cmdstanpy version 1.0.0 and cmdstan version 2.29.2

WardBrian · 2022-06-14T13:42:54Z

Thanks for reporting this @cmgoold! This is really odd. It looks like different random seeds are being used:

python test.py 2>&1 | grep "seed ="
  seed = 33865
  seed = 19553
  seed = 82406
  seed = 30985
  seed = 19331
  seed = 29939
  seed = 21069
  seed = 91468
  seed = 14912
  seed = 67544

Okay, odd. If we set the output folder to `.` and look at the CSVs generated, the answer starts to pop up:

ls -l1 | grep csv
-rw-rw-r--  1 brian brian   23586 Jun 14 09:38 normal-rng-20220614093838_1.csv
-rw-rw-r--  1 brian brian   23570 Jun 14 09:38 normal-rng-20220614093838_2.csv
-rw-rw-r--  1 brian brian   23576 Jun 14 09:38 normal-rng-20220614093838_3.csv
-rw-rw-r--  1 brian brian   23559 Jun 14 09:38 normal-rng-20220614093838_4.csv
-rw-rw-r--  1 brian brian   22581 Jun 14 09:38 normal-rng-20220614093839_1.csv
-rw-rw-r--  1 brian brian   22628 Jun 14 09:38 normal-rng-20220614093839_2.csv
-rw-rw-r--  1 brian brian   22604 Jun 14 09:38 normal-rng-20220614093839_3.csv
-rw-rw-r--  1 brian brian   22580 Jun 14 09:38 normal-rng-20220614093839_4.csv

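There are only 8 CSV files, not the 40 we'd expect from 10 runs of 4 chains each. The timestamps start to give away why: they only have one-second resolution, so runs that start within the same second get the same file names and are being overwritten by each other! Every fit that points at the same files then reads back the same draws.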
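To make the collision concrete, here is a minimal sketch that only mimics the naming pattern visible in the listing above (model name, a timestamp to the second, and the chain ID). The helper `csv_name` and the 200 ms spacing between runs are assumptions for illustration, not cmdstanpy's actual code or measured timings.

```python
from datetime import datetime, timedelta


def csv_name(model_name, started_at, chain_id):
    """Hypothetical helper: build a name like normal-rng-20220614093838_1.csv."""
    stamp = started_at.strftime("%Y%m%d%H%M%S")  # one-second resolution
    return f"{model_name}-{stamp}_{chain_id}.csv"


first_start = datetime(2022, 6, 14, 9, 38, 38)

# Ten runs of four chains each, with runs assumed to start 200 ms apart.
names = [
    csv_name("normal-rng", first_start + timedelta(milliseconds=200 * run), chain)
    for run in range(10)
    for chain in range(1, 5)
]
unique_names = set(names)

print(len(names), "files requested,", len(unique_names), "distinct file names")
```

With these assumed timings the ten runs fall into just two distinct seconds, so 40 requested files collapse onto 8 names, the same shape as the directory listing.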

If we change your script to

fits = [mod.sample(data={}, output_dir='.', chain_ids=i*4 + 1) for i in range(10)]

to ensure each run has a unique filename, it works:

[0.226037, 1.11019, -0.362359, -0.512991, -0.624099, -1.81625, -1.64503, 0.568551, 1.17152, 1.43194]
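As a quick check on why `i*4 + 1` avoids the clash, the sketch below lists the chain IDs each run would end up with. It assumes that an integer `chain_ids` is taken as the ID of the first chain and that the remaining chains are numbered sequentially from there; `chain_ids_for_run` is a hypothetical helper, not part of cmdstanpy.

```python
CHAINS_PER_RUN = 4  # matches the four CSV files per run in the listing


def chain_ids_for_run(run_index, chains_per_run=CHAINS_PER_RUN):
    """Hypothetical helper: chain IDs used by run `run_index` under the workaround."""
    first_id = run_index * chains_per_run + 1
    return list(range(first_id, first_id + chains_per_run))


all_ids = [cid for run in range(10) for cid in chain_ids_for_run(run)]

print(chain_ids_for_run(0))  # IDs for the first run
print(chain_ids_for_run(9))  # IDs for the last run
print(len(all_ids) == len(set(all_ids)))  # True when no ID is shared between runs
```

Because the chain ID is part of the file name, non-overlapping IDs give non-overlapping file names even when two runs share a timestamp.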

We often get issues reported that end up being these file clobbering issues. I wonder if we could fix that on the cmdstanpy side. Thoughts @mitzimorris? We could generate new temp directories per-run.

cmgoold · 2022-06-14T14:01:19Z

Thanks for the quick reply @WardBrian! That makes sense -- thanks for tracking down the issue.

mitzimorris · 2022-06-14T14:08:14Z

we could certainly generate new temp directories per-run - temp is temp.
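For anyone stuck on an affected version, the same idea can be sketched on the user side: give every run its own freshly created directory. This is only an illustration of the approach using the standard library, not the change that went into cmdstanpy; `fresh_output_dir` and the `stan-run-` prefix are made-up names.

```python
import os
import tempfile


def fresh_output_dir(prefix="stan-run-"):
    """Hypothetical helper: create and return a new, uniquely named directory."""
    return tempfile.mkdtemp(prefix=prefix)


# One directory per run; mkdtemp guarantees each name is unique.
run_dirs = [fresh_output_dir() for _ in range(10)]
print(len(set(run_dirs)), "distinct directories created")

# Assumed usage with the model from this issue (not executed here):
# fits = [mod.sample(data={}, output_dir=fresh_output_dir()) for _ in range(10)]
```

Since each run writes into a different directory, identical timestamps in the file names no longer matter. Directories made with `mkdtemp` are not removed automatically, so they need to be cleaned up once the fits are no longer needed.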

WardBrian mentioned this issue Jun 14, 2022: Make a subdirectory of the temporary directory per-runset #569 (merged)

WardBrian closed this as completed in #569 Jun 14, 2022
