
Restore GalSim unit tests #534

Closed
gostevehoward opened this issue Jan 18, 2017 · 9 comments

@gostevehoward
Collaborator

Right now, three of the five cases exercised in test_galsim_benchmarks.jl are commented out because they're not working. I think I had all the benchmark cases working at some point, but they weren't covered by unit tests, so (as was inevitable) they got broken, and when cases broke they simply got commented out (broken-window theory). We should get all five cases working again, we shouldn't comment them out in the future, and maybe we should add more cases to the unit test.

This will also be helped by adding better summary output to the GalSim benchmark script (probably as a result of #510). The point is to get people to notice when their changes degrade GalSim benchmark performance; currently that just doesn't happen.
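One way to keep known-failing cases visible in test_galsim_benchmarks.jl rather than commenting them out would be to mark them with @test_broken from Julia's Test standard library, which reports them as Broken and fails loudly the moment they start passing again. A rough sketch only; the run_case stub below stands in for the real benchmark harness and isn't code from the repo:

using Base.Test  # `using Test` on newer Julia versions

# Stub standing in for the real harness: pretend it returns true when a case's
# inferred parameters match the GalSim ground truth within tolerance.
run_case(name) = name in ("simple_star", "star_with_noise")

@testset "GalSim benchmarks" begin
    @test run_case("simple_star")
    @test run_case("star_with_noise")
    # A known-broken case stays in the file and is reported as Broken; it turns
    # into a test error as soon as it unexpectedly starts passing.
    @test_broken run_case("galaxy_with_noise")
end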

@rgiordan

@gostevehoward
Collaborator Author

For what it's worth, I am seeing the simple_star benchmark working just fine on current master; are you seeing otherwise?

│ Row │ label         │ field                     │ ground_truth │ single_inferred │ joint_inferred │ error_sds  │
├─────┼───────────────┼───────────────────────────┼──────────────┼─────────────────┼────────────────┼────────────┤
│ 1   │ "simple_star" │ "X center (world coords)" │ 0.005335     │ 0.005335        │ 0.005335       │ NA         │
│ 2   │ "simple_star" │ "Y center (world coords)" │ 0.005335     │ 0.005335        │ 0.005335       │ NA         │
│ 3   │ "simple_star" │ "Brightness (nMgy)"       │ 40           │ 39.9827         │ 39.9827        │ 0.0433057  │
│ 4   │ "simple_star" │ "Color band 1-2 ratio"    │ 3.99098      │ 4.00119         │ 4.00119        │ 0.149164   │
│ 5   │ "simple_star" │ "Color band 2-3 ratio"    │ 1.88395      │ 1.88477         │ 1.88477        │ 0.0434862  │
│ 6   │ "simple_star" │ "Color band 3-4 ratio"    │ 1.3179       │ 1.31725         │ 1.31725        │ 0.0492514  │
│ 7   │ "simple_star" │ "Color band 4-5 ratio"    │ 1.16982      │ 1.16992         │ 1.16992        │ 0.00864011 │
│ 8   │ "simple_star" │ "Probability of galaxy"   │ 0            │ 0.00593759      │ 0.00528062     │ NA         │

@gostevehoward
Collaborator Author

Well, the simple_star and star_with_noise benchmarks seem to be passing for me on current master (after Jeff merged #507). galaxy_with_noise fails, but, as I now recall and as is noted in a code comment, that's a result of the bias from load_active_pixels! (issue #482). The solution is supposed to be the shiny new component-detection/initialization system in #157, which is a larger project.

My feeling is we ought to allow an override right now for the GalSim benchmarks, setting noise_fraction very low (like -0.5 or even -1, effectively disabling the filter). It seems better than commenting out test cases.
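To make the "effectively disabling" point concrete: the filter in question keeps a pixel only when it sits far enough above the sky background relative to the noise level, so a negative threshold admits essentially every pixel. Toy illustration only, not Celeste's actual load_active_pixels! code:

# A pixel is kept when its flux exceeds the sky background by at least
# noise_fraction times the expected noise level.
function active_mask(pixels::AbstractMatrix, sky::Real, noise_sd::Real;
                     noise_fraction::Real = 0.1)
    [p - sky > noise_fraction * noise_sd for p in pixels]
end

pixels = [10.0 10.5; 9.8 12.0]  # toy image
sky, noise_sd = 10.0, 1.0

active_mask(pixels, sky, noise_sd)                         # keeps only the bright pixels
active_mask(pixels, sky, noise_sd; noise_fraction = -1.0)  # keeps all of them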

@rgiordan
Contributor

Definitely better than commenting out test cases, but not as good as having test cases that actually work. Sorry if this is a question already answered elsewhere, but I've been out of the loop: if noise biases GalSim results so badly, why do we believe our results on actual catalogs?

#157 is a good idea, but it seems like it might take a while. Is there any particular reason not to go back to the way active pixels used to be selected before Jeff re-did it to use the image? (See my Dec 9th comment on #482.) Or am I misunderstanding the problem?

@rgiordan
Contributor

Yes, after rebasing I now see the correct galaxy probability estimates. That's good! I don't quite see how #507 could have fixed it, though. Do you, @gostevehoward?

@gostevehoward
Collaborator Author

gostevehoward commented Jan 18, 2017 via email

@rgiordan
Contributor

@jeff-regier, any reason these arguments about Stripe 82 don't apply equally to GalSim? If not, why do you think the GalSim results are biased? (Disclaimer: I haven't run these tests myself, so I'm just taking everyone's word that they're failing because of load_active_pixels!.)

@gostevehoward
Collaborator Author

gostevehoward commented Jan 18, 2017 via email

@jeff-regier
Owner

It's possible that load_active_pixels! introduces some bias. But I'd expect that to affect the mog version of Celeste too, not just the FFT version.

To prevent any bias, when you call get_sky_patches, pass it the argument min_radius_pix=50, if 50 is large enough to contain all the pixels that could possibly be relevant. (By default min_radius_pix=8.)
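For reference, a minimal sketch of what that call might look like from the benchmark script, assuming images and catalog are already built by the harness and get_sky_patches is in scope; only the function name and the min_radius_pix keyword come from the suggestion above, the rest is illustrative:

# Widen every source's sky patch so the active-pixel selection can't clip
# relevant pixels; the default min_radius_pix is 8, the suggestion here is 50.
function patches_with_wide_radius(images, catalog)
    get_sky_patches(images, catalog; min_radius_pix = 50)
end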

gostevehoward added a commit to gostevehoward/Celeste.jl that referenced this issue Jan 25, 2017
It's currently necessary to override this parameter to make galsim unit tests
pass (see jeff-regier#534).

This is the simplest way to code it but it's ugly in my opinion. We should
discuss alternatives in the PR thread or an issue thread.
jeff-regier pushed a commit that referenced this issue Jan 26, 2017
* permit overriding min_radius_px from top-level callers

It's currently necessary to override this parameter to make galsim unit tests
pass (see #534).

This is the simplest way to code it but it's ugly in my opinion. We should
discuss alternatives in the PR thread or an issue thread.

* enable all galsim "unit test" cases now that active radius is expanded

* updates calls to load_active_pixels! in unit tests
@gostevehoward
Collaborator Author

Fixed by #538
