Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad interaction between num_chains and fixed_param sampler #495

Closed
mitzimorris opened this issue Nov 12, 2021 · 21 comments
Closed

Bad interaction between num_chains and fixed_param sampler #495

mitzimorris opened this issue Nov 12, 2021 · 21 comments
Assignees
Labels
bug Something isn't working

Comments

@mitzimorris
Copy link
Member

Summary:

somewhere late in PR #492 we introduced a bug.
possibly around changing the logic w/r/t behavoir when user tries to explicitly run chains in parallel?

Description:

the following call to sample, given a model which was compiled with stan_threads==True

datagen_fit = datagen_model.sample(iter_sampling=100000, chains=7, parallel_chains=2)

sampler tries to run multi-threaded - resulting CmdStan call is this:

DEBUG:cmdstanpy:running CmdStan, num_threads: 2
DEBUG:cmdstanpy:CmdStan args: ['/Users/mitzi/github/stan-dev/cmdstanpy/test/data/datagen_poisson_glm', 'id=1', 'random', 'seed=27365', 'output', 'file=/var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmpn5muntyf/datagen_poisson_glm-20211112112133.csv', 'method=sample', 'num_samples=100000', 'algorithm=hmc', 'adapt', 'engaged=1', 'num_chains=7']

which craps out because now there are only 2 output CSV files, not the expected 7. investigate logic and fix.

Additional Information:

Provide any additional information here.

Current Version:

@mitzimorris mitzimorris added the bug Something isn't working label Nov 12, 2021
@mitzimorris mitzimorris self-assigned this Nov 12, 2021
@mitzimorris
Copy link
Member Author

arguably, we didn't introduce a bug, fixing the logic uncovered a bug.
either way - more unit tests needed.

@WardBrian
Copy link
Member

WardBrian commented Nov 12, 2021

I'm unable to recreate this as described. Using the example model:

import os
from cmdstanpy import cmdstan_path, CmdStanModel

bernoulli_dir = os.path.join(cmdstan_path(), 'examples', 'bernoulli')
stan_file = os.path.join(bernoulli_dir, 'bernoulli.stan')

import os
os.remove(os.path.join(bernoulli_dir, 'bernoulli'))
model = CmdStanModel(stan_file=stan_file, cpp_options={'STAN_THREADS':'true'})
print(model)

data = { "N" : 10, "y" : [0,1,0,0,0,0,0,0,0,1] }

print(model.exe_info()) # confirms stan_threads is true

fit = model.sample(data=data, chains=7, parallel_chains=2, iter_sampling=100000)

print(fit.runset.csv_files) # 7 files

This runs for me without a problem using develop and 2.28.1

@mitzimorris
Copy link
Member Author

testing on develop on my machine - model compiled 2.28.1:

{'stan_version_major': '2',
 'stan_version_minor': '28',
 'stan_version_patch': '1',
 'STAN_THREADS': 'true',
 'STAN_MPI': 'false',
 'STAN_OPENCL': 'false',
 'STAN_NO_RANGE_CHECKS': 'false',
 'STAN_CPP_OPTIMS': 'false'}

maybe there are 7 files, but somehow, it can't parse one of them:

ValueError: bad Stan CSV file /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmpn5muntyf/datagen_poisson_glm-20211112112133_2.csv, expected 100000 draws, found 0

@WardBrian
Copy link
Member

Can you cat that file or inspect it manually?

@WardBrian
Copy link
Member

Ah, this is because of that incredibly annoying fixed_param behavior

@mitzimorris
Copy link
Member Author

Untitled

@WardBrian
Copy link
Member

The bernoulli_datagen model runs in fixed_param mode due to that ... interesting... design choice in CmdStan. And, fixed_param doesn't work with num_chains, so it only runs one chain.

But, we can't tell a-priori if it will run fixed_param, so we might be cooked. stan-dev/cmdstan#1054 should be merged as a top priority IMO

@WardBrian
Copy link
Member

You can verify this using the command line interface:
./bernoulli_datagen sample num_chains=7 data file=data.json will run with fixed_params and (silently) ignore num_chains. Ugh

@WardBrian
Copy link
Member

Do you want to open a CmdStan issue or shall I? Even after stan-dev/cmdstan#1054 this will probably still happen if the user specifically requests fixed_params, so it should be noted.

As for what we do in the interfaces... I really don't know if more unit tests helps us when CmdStan itself has bad behavior, and I don't think we can do much other than try to create a nicer error message instructing users to set force... to True (which resolves this) for fixed_param samplers?

@WardBrian WardBrian changed the title num_chains logic error w/ force_one_process_per_chain arg Bad interaction between num_chains and fixed_param sampler Nov 12, 2021
@mitzimorris
Copy link
Member Author

it would be great if you do this. I will go back to banging my head against the compile logic.

as for CmdStan, the sooner we build a different CLI, the better.

@mitzimorris
Copy link
Member Author

we can do a better job of handling models with no parameters by extending the checks in exe_info and calling other methods on the executable - just as is done in CmdStan.

@WardBrian
Copy link
Member

Fair, but if the behavior is about to be removed from CmdStan I think bending over backwards for it is probably not worth it

@mitzimorris
Copy link
Member Author

mitzimorris commented Nov 12, 2021

https://github.com/stan-dev/cmdstan/blob/3a3ef24c6c7d00da78d64fdb8ccacecdc3c5eee0/src/cmdstan/command.hpp#L693

this would make life a whole lot easier sooner - or is this not exposed the way that info is?

@WardBrian
Copy link
Member

WardBrian commented Nov 12, 2021

I don't think there's any way to get that information out of the executable itself. I suppose you could try to grep the .hpp file to get the value of num_params_r__, but I'd be pretty strongly opposed to doing that as stanc doesn't guarantee that level of compatibility - the name could change whenever.

This might be a philosophical debate but ultimately I think there's a limit to how much an interface can fix the problems of the thing it is wrapping, and this is probably that barrier. Upstream changes in CmdStan and trying to produce better error messages are better than a lot of extra (and very hard to test) logic. I think we're nearing the point where it's a win when the code in this repo gets smaller, not larger

@WardBrian
Copy link
Member

WardBrian commented Nov 12, 2021

The solution we should do for now, IMO, is add a check to these lines

cmdstanpy/cmdstanpy/model.py

Lines 1467 to 1473 in 3cbfae4

if 'running fixed_param sampler' in console:
sampler_args = runset._args.method_args
assert isinstance(
sampler_args, SamplerArgs
) # make the typechecker happy
sampler_args.fixed_param = True

If one_process is true and num_chains > 1, throw an error there telling the user fixed_param doesn't support it and they should set force_one_process_per_chain to True

Edit: Actually, we probably need to do this in sample, not in _run_cmdstan, but the idea is the same.

@mitzimorris
Copy link
Member Author

agreed to points above. I can add these in the next PR.

@mitzimorris
Copy link
Member Author

mitzimorris commented Nov 12, 2021

I don't think there's any way to get that information out of the executable itself.

how does the info keyword work? it's nowhere mentioned in the arg parser code. wtf?
@rok-cesnovar - how did we hack this into the exe?

@rok-cesnovar
Copy link
Member

It's done the same way the help command works, which also isn't a classic argument: https://github.com/stan-dev/cmdstan/blob/da4be4530793a23e021fc94b7976513b5cde2816/src/cmdstan/arguments/argument_parser.hpp#L91
That is, it's not the cleanest code I have ever written :)

I see you are talking about obtaining the number of parameters. You can do that via stanc3 and the --info flag. It returns a JSON file. That doesn't help you when the CmdStanModel object was made from the exe_file, but I would say that is the less common case.

@rok-cesnovar
Copy link
Member

@mitzimorris
Copy link
Member Author

That is, it's not the cleanest code I have ever written :)

thanks, I looked there, but not closely enough. the code you wrote is fine.

@mitzimorris
Copy link
Member Author

. This is also what cmdstanr does: https://github.com/stan-dev/cmdstanr/blob/master/R/model.R#L861

OK, will do something similar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants