
Restrict scikit-learn version to <1.2 #231

Merged

Conversation

bfhealy
Collaborator

@bfhealy bfhealy commented Sep 11, 2023

This PR restricts the scikit-learn version in requirements.txt to <1.2 (in addition to >=1.0.2). Assuming the tests pass with the different package version, this resolves #173. The .joblib files loaded when setting --ztf-uncertainties and --ztf-sampling were likely generated using an older version of scikit-learn, raising unusual errors when attempting to load them with newer versions of the package. A longer-term solution will be to regenerate those files with newer code so we don't need to constrain the package version.
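
For reference, the pin described above corresponds to a requirements.txt constraint of the form below (a sketch, not the literal diff from this PR):

    scikit-learn>=1.0.2,<1.2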

@tylerbarna
Collaborator

Does this have any impact on the version of numpy that gets installed? I've encountered a couple of warnings about scikit-learn/numpy version compatibility when setting up a new nmma environment.

@mcoughlin
Member

@bfhealy @tylerbarna I will check with Ari about getting these updated, but for now, this seems good.

@mcoughlin mcoughlin self-requested a review September 11, 2023 17:21
Member

@mcoughlin mcoughlin left a comment

LGTM

@mcoughlin mcoughlin merged commit 272fb9d into nuclear-multimessenger-astronomy:main Sep 11, 2023
@tylerbarna
Collaborator

@bfhealy do you know if there are any compatibility issues with numpy/pandas?

@bfhealy
Collaborator Author

bfhealy commented Sep 14, 2023

@tylerbarna So far I haven't encountered any. I set up a new environment today and didn't get any related warnings/errors.

@tylerbarna
Collaborator

@bfhealy interesting. What Python version did you set up your environment with, and did you use pip or conda to install parallel-bilby?

@tylerbarna
Collaborator

I'm getting a new, fun error related to pandas when trying to generate lightcurves from an injection of about 100. I was able to resolve the version warning by pinning numpy to 1.22.4 (at least on Python 3.10), but I hit the following error:

No injection files provided, will generate injection based on the prior file provided only
17:19 bilby_pipe INFO    : Created injection file ./lightcurves/nugent-hyper.json
Traceback (most recent call last):
  File "/home/tbarna/anaconda3/envs/nmma_env/bin/light_curve_generation", line 8, in <module>
    sys.exit(main())
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/create_lightcurves.py", line 333, in main
    data = create_light_curve_data(
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/injection.py", line 74, in create_light_curve_data
    ztfuncer = load(f)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/joblib/numpy_pickle.py", line 648, in load
    obj = _unpickle(fobj)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/joblib/numpy_pickle.py", line 577, in _unpickle
    obj = unpickler.load()
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/pickle.py", line 1590, in load_reduce
    stack[-1] = func(*args)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
    return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)
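
For context, the failure above happens inside a plain joblib.load of a pickle that embeds pandas internals written by an older pandas (that is what the new_block TypeError suggests). A minimal sketch of the same call with a version check, for reproducing the problem; the filename here is hypothetical:

    import joblib
    import pandas
    import sklearn

    # Versions matter here: the pickled object embeds pandas/scikit-learn internals
    print("pandas", pandas.__version__, "| scikit-learn", sklearn.__version__)

    # Raises the TypeError above when the .joblib file was written by a much older pandas
    with open("ZTF_uncertainties.joblib", "rb") as f:  # hypothetical filename
        ztfuncer = joblib.load(f)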

@bfhealy
Collaborator Author

bfhealy commented Sep 14, 2023

@tylerbarna I used Python 3.10 and conda to install parallel-bilby (version 2.0.2). My numpy version is 1.24.3, and pandas is 2.1.0.

Could you please share the commands you ran so I can give them a try?

@tylerbarna
Collaborator

@bfhealy here's the repo; it should work by just running the script with the nmma environment active:
https://github.com/tylerbarna/nmma-model-recovery

@tylerbarna
Collaborator

tylerbarna commented Sep 15, 2023

@bfhealy were you able to try running it? It occurs to me that it depends on another repo for the priors: https://github.com/tylerbarna/dsmma_kn_23/tree/main/priors

I just pushed a commit that should include a copy of the priors inside the nmma-model-recovery repo, removing that dependency.

@bfhealy
Collaborator Author

bfhealy commented Sep 15, 2023

@tylerbarna Thanks for adding the priors. While running your script I encountered the same TypeError you shared above. I don't get the error if I comment out --ztf-uncertainties, so I think this once again has to do with the pickled ZTF-related files we plan to update. For now I was able to get things working by downgrading pandas to 1.5.3 (pip install 'pandas<2.0').

@tylerbarna
Collaborator

@bfhealy would you mind pulling one more time and running a basic analysis on one of the generated lightcurves I pushed, something like

light_curve_analysis --data lightcurves/nugent-hyper_0.json --model nugent-hyper --prior priors/nugent-hyper.prior --remove-nondetections --trigger-time 44244

I've been encountering an issue with my ev.dat being empty and haven't been able to figure out if it's an issue with the environment or something in the way I'm generating the lightcurves

@bfhealy
Collaborator Author

bfhealy commented Sep 15, 2023

@tylerbarna I ran that command and sampling completed successfully. I got an ev.dat file that's 2MB in size. Perhaps a fresh environment installation would help?

I do get several warnings about a change in prior name:
Warning: the 'KNtimeshift' parameter is deprecated as of nmma 0.0.19, please update your prior to use 'timeshift' instead
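
In case it helps, that warning only asks for a rename in the .prior file. A hedged example of the edit, using bilby-style prior syntax; the distribution and bounds here are made up for illustration, not taken from the actual prior:

    # before
    KNtimeshift = Uniform(minimum=-0.1, maximum=0.1, name='KNtimeshift')
    # after
    timeshift = Uniform(minimum=-0.1, maximum=0.1, name='timeshift')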

@tylerbarna
Collaborator

@bfhealy very odd, I've encountered this same issue on two different systems now, both MSI and my local PC (WSL).

12:06 bilby INFO    : Using temporary file /tmp/tmpq58ivb8q
 *****************************************************
 MultiNest v3.10
 Copyright Farhan Feroz & Mike Hobson
 Release Jul 2015

 no. of live points = 2048
 dimensionality =    4
 resuming from previous job
 *****************************************************
 Starting MultiNest
Acceptance Rate:                        1.000000
Replacements:                               2048
Total Samples:                              2048
Nested Sampling ln(Z):            **************
12:06 bilby INFO    : Overwriting outdir/pm_injection/ with /tmp/tmpq58ivb8q/
 ln(ev)=  -7.0874296015377425E-016 +/-   5.8827365954340079E-010
 Total Likelihood Evaluations:         2048
 Sampling finished. Exiting MultiNest
  analysing data from /tmp/tmpq58ivb8q/.txt
12:06 bilby INFO    : Overwriting outdir/pm_injection/ with /tmp/tmpq58ivb8q/
/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/pymultinest.py:193: UserWarning: genfromtxt: Empty input file: "outdir/pm_injection//ev.dat"
  dead_points = np.genfromtxt(dir_ + "/ev.dat")
Traceback (most recent call last):
  File "/home/tbarna/anaconda3/envs/nmma_env/bin/light_curve_analysis", line 8, in <module>
    sys.exit(main())
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/analysis.py", line 909, in main
    analysis(args)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/analysis.py", line 655, in analysis
    result = bilby.run_sampler(
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/__init__.py", line 234, in run_sampler
    result = sampler.run_sampler()
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/base_sampler.py", line 97, in wrapped
    output = method(self, *args, **kwargs)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/pymultinest.py", line 178, in run_sampler
    self.result.nested_samples = self._nested_samples
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/pymultinest.py", line 201, in _nested_samples
    np.vstack([dead_points, live_points]).copy(),
  File "<__array_function__ internals>", line 180, in vstack
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/numpy/core/shape_base.py", line 282, in vstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 0 and the array at index 1 has size 7
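
The ValueError itself is a downstream symptom of the empty ev.dat: bilby reads the dead points back with np.genfromtxt and stacks them against the live points. A minimal sketch of the shape mismatch (the 7 columns come from the error message; the live-point count is assumed):

    import numpy as np

    # np.genfromtxt on an empty ev.dat returns an empty 1-D array (plus the UserWarning above)
    dead_points = np.array([])
    # stand-in for the live points bilby reads back; 7 columns as in the error message
    live_points = np.ones((2048, 7))

    # vstack promotes dead_points to shape (1, 0), which cannot be concatenated
    # with a (2048, 7) array along axis 0 -> the ValueError reported above
    np.vstack([dead_points, live_points])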

@bfhealy
Collaborator Author

bfhealy commented Sep 15, 2023

@tylerbarna Hmm, since the issue is with the sampling I wonder if it has to do with pymultinest. There was a recent release of version 2.12 on PyPI. Maybe it's worth upgrading if you haven't already?

@tylerbarna
Collaborator

@bfhealy just checked, looks like pymultinest is already on version 2.12

@tylerbarna
Collaborator

here's the output I'm getting from conda list. @bfhealy could you print out your versions so we can figure out which packages might be on different versions?

@mcoughlin
Member

> resuming from previous job

This "resuming from previous job" seems pretty weird.
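
If the stale resume state is the culprit (an assumption; the thread doesn't confirm it), clearing the sampler's output directory before rerunning forces MultiNest to start fresh rather than resume. A minimal sketch, with the path taken from the log above:

    import shutil

    # Remove the previous MultiNest output so the next run cannot resume from it
    shutil.rmtree("outdir/pm_injection", ignore_errors=True)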

Successfully merging this pull request may close these issues.

light_curve_generation ztf sampling flag causing error (and question regarding lightcurve file structure)