
galaxy bias/clustering test #10

Closed
3 of 4 tasks
yymao opened this issue Nov 6, 2017 · 65 comments

@yymao (Member) commented Nov 6, 2017

See more details in LSSTDESC/DC2-production#20.

Note that the wp(rp) code in v1 does not work on light cones. When we have proto-dc2 snapshots we can use the old code. In the meantime we should find new correlation code for light cones.

  • code to reduce mock data
  • code that works within DESCQA framework
  • validation data
  • validation criteria
@j-dr (Contributor) commented Nov 6, 2017

If we're okay with including another dependency, I would highly recommend that we use Corrfunc. It is very fast, has a nice Python API, and already implements many different pair-counting functions.
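As an aside on what such a library does under the hood: counting pairs in separation bins is the core operation, and a brute-force O(N^2) version is easy to sketch in plain numpy. This is only an illustrative sketch of the operation that Corrfunc's cell/tree algorithms accelerate, not Corrfunc's API; the function name is hypothetical.

```python
import numpy as np

def count_pairs(pos, bins):
    """Brute-force count of unique pairs whose 3D separation falls in each bin.
    This O(N^2) loop is the operation that optimized pair counters speed up."""
    diff = pos[:, None, :] - pos[None, :, :]   # (N, N, 3) pairwise separations
    r = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(len(pos), k=1)        # keep each pair only once
    counts, _ = np.histogram(r[iu], bins=bins)
    return counts

# toy example: 3 points on a line at x = 0, 1, 2
pos = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0]])
bins = np.array([0.5, 1.5, 2.5])
print(count_pairs(pos, bins))  # separations 1, 1, 2 -> counts [2 1]
```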

@rmandelb commented Nov 6, 2017

That code is quite convenient for scalar correlation functions. If there is any chance we're going to also want validation tests with spin-2 quantities like shear, then I think we should instead use a code like treecorr, which is very fast and does include correlations of spin-2 quantities (also with a nice python API).

@j-dr (Contributor) commented Nov 6, 2017

Good point. My only problem with treecorr is that, as far as I know, it doesn't support periodic boundary conditions, but that may not matter if we're just focusing on lightcone-based statistics for DESCQA2. Corrfunc only has GSL and numpy as dependencies, so it might be worth considering using both.
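For readers unfamiliar with the periodic-boundary point: a periodic pair counter measures separations under the minimum-image convention, wrapping each coordinate difference to the nearer periodic image. A minimal numpy sketch (illustrative only; not the API of either library):

```python
import numpy as np

def periodic_separation(p1, p2, boxsize):
    """Minimum-image 3D separation in a periodic box of side `boxsize`:
    each axis difference is wrapped to the nearer of the two images."""
    d = np.abs(p1 - p2)
    d = np.minimum(d, boxsize - d)   # wrap across the box face if shorter
    return np.sqrt((d ** 2).sum())

# two points near opposite faces of a 100 Mpc/h box
a = np.array([1.0, 50.0, 50.0])
b = np.array([99.0, 50.0, 50.0])
print(periodic_separation(a, b, 100.0))  # 2.0, not 98.0
```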

@evevkovacs (Contributor):

Another possibility is HaloTools

@yymao (Member, Author) commented Nov 6, 2017

An additional benefit of using treecorr or halotools is that their lead developers are very active DESC members.

That being said, I'll be happy to let whoever volunteers to work on this test choose the most suitable package.

@yymao (Member, Author) commented Dec 5, 2017

Just a quick update --- we've decided on using treecorr but had some issues installing it on the NERSC DESC Python environment. Will continue to work on resolving this issue...

@vvinuv (Contributor) commented Dec 5, 2017 via email

@yymao (Member, Author) commented Dec 14, 2017

We (mostly @vvinuv) are still working on fixing some potential bugs (see progress in #38), but otherwise this test is close to done. We still need validation criteria though!

Here's an example plot for the Buzzard catalog, taken from this DESCQA run:
[image]

@slosar (Member) commented Dec 19, 2017

Do we still have just lightcones and correlation functions? Am I correct to see that your galaxies are a factor of 2-3 overclustered? Having the power spectrum of a periodic box would be my ideal test, but if all you have is lightcones then I guess we'll have to live with projected correlation functions.

@yymao (Member, Author) commented Dec 19, 2017

Updated test: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2017-12-18&test=tpcf_Wang2013_rSDSS (results are similar)

@slosar The catalogs are probably over-clustered because they are not deep enough?

@evevkovacs We do have snapshots, don't we? This question seems to appear many times...

@evevkovacs (Contributor):

@vvinuv @yymao @slosar I think it is important to debug the test first and make sure it is correct.

@slosar (Member) commented Dec 19, 2017

@yymao Which catalog is not deep enough? If you select galaxies in a mag range then they are what they are; by definition you are deep enough? (in the limit of narrow enough bins that bias ~ const across the mag bin)

@yymao (Member, Author) commented Dec 19, 2017

@slosar The mock catalogs have redshift cutoffs: the protoDC2 light cone goes only to z=1, and the Buzzard light cone goes to z=2.1.

@vvinuv (Contributor) commented Dec 19, 2017

@yymao We are using only galaxies below redshift 0.3, with some magnitude limit. Therefore, I think the catalog will have a depth similar to the data.

@rmandelb:

Can you clarify what is being plotted? It says xi(theta) which confuses me a bit; that is, usually xi is used for a 3D correlation function as a function of r. Is this w(theta), the projected 2D angular cross-correlation? The issue with that one is that it includes information both about the 3D clustering and the N(z). I mean, if you have a sample with the very same 3D clustering but a different N(z), then w(theta) will look different. If we have a separate N(z) test already, then I think we want this to be a test of 3D clustering rather than w(theta), so as to ask a single specific question.

So if this is w(theta), then in fact my first question would simply be whether the mock galaxies have the same dN/dz as the real ones that were used as the validation data?

Also, apologies, but what Wang et al (2013) paper is it? If I look that up on ADS there are lots of possibilities but then none that I glanced at seem to have the data shown here.

@slosar (Member) commented Dec 20, 2017

I would also prefer to see 3D clustering test as it is just more accurate representation of the exact test that you are doing. The validation should be over the mag limit and redshift limit of the test sample -- if it is fine at low-z, I'm ok to believe it must be reasonable at higher z.

@yymao (Member, Author) commented Dec 20, 2017

@rmandelb I think what is plotted is projected correlation (and the label should be changed) but @vvinuv can correct me if I'm wrong.

@slosar when you say 3D clustering, is the galaxy sample you have in mind within a thin redshift range with an absolute magnitude cut?

@vvinuv (Contributor) commented Dec 20, 2017

@rmandelb the label on the y-axis is wrong. I meant w(theta), the angular correlation function, NOT the projected correlation. I use the Wang et al. paper (http://adsabs.harvard.edu/abs/2013MNRAS.432.1961W) as it is one of the latest measurements from SDSS galaxies. However, the Wang et al. work does not use redshift information; they use only an r-band magnitude limit of 21 (@slosar also, I was wrong to say that Wang et al. use redshift information). The recent results mostly agree with the validation data for both catalogs: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2017-12-20_1&test=tpcf_Wang2013_rSDSS .

I haven't tested the code for 3D correlation function. I am not quite sure whether people use the 3D correlation function from observations.

If you are interested in the status of the projected correlation function, keep reading.

We did validation work on the projected correlation function, which is a function of comoving distance as defined in Zehavi et al. 2011. They use SDSS galaxies in different brightness bins and in narrow redshift bins to find the projected correlation. In our earlier validation work the projected correlation function mostly agreed with Zehavi et al. 2011. However, the results of tests during and after the LSST sprint meeting do not agree with Zehavi et al. 2011. I haven't checked whether there are any bugs in the test.
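For reference, the w(theta) measurement under discussion is typically built from binned pair counts via the Landy-Szalay estimator. A minimal sketch with hypothetical pair-count inputs (the function name and toy numbers are illustrative, not the DESCQA implementation):

```python
def landy_szalay(dd, dr, rr, n_d, n_r):
    """Landy-Szalay estimator w = (DD - 2*DR + RR) / RR, where each raw
    pair count is first normalized by its number of possible pairs:
    data-data, data-random, and random-random respectively."""
    ddn = dd / (n_d * (n_d - 1) / 2.0)
    drn = dr / (n_d * n_r)
    rrn = rr / (n_r * (n_r - 1) / 2.0)
    return (ddn - 2.0 * drn + rrn) / rrn

# unclustered limit: normalized DD, DR, RR all equal -> w(theta) = 0
print(landy_szalay(dd=45.0, dr=100.0, rr=45.0, n_d=10, n_r=10))  # 0.0
```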

@rmandelb:

Hi @vvinuv - thanks for clarifying. Indeed in my message I accidentally wrote "projected 2D angular cross-correlation" when I meant "angular correlation function"... i.e., my statements about this quantity including information both about the intrinsic clustering and the N(z) apply to the angular correlation functions that you are showing. So my point still stands: comparing this quantity with the data conflates two things, the intrinsic clustering of the sample and its N(z). I thought the point of the clustering-related tests was to check whether the bias as a function of luminosity and scale makes sense (since the LSS, WL, and PZ groups all care about this), so you should be doing a calculation that gets at the 3D clustering. @slosar seems to agree.

> I am not quite sure whether people use the 3D correlation function from observations.

Depends what you mean. To calculate it, you require spectroscopy. So it can only be done with certain samples (and even then, typically the projected version is used, i.e., wp(rp), which involves projecting xi over line-of-sight separation out to some value like 60 Mpc/h). Of course we won't have spectra for LSST and won't measure the 3D correlation function, but that is not the issue here. The purpose of the test was to check that the galaxy bias is reasonable, and to do that we should measure something related to the intrinsic clustering, like xi(r) or wp(rp), and not w(theta), which includes the N(z) information as well.

If there were a plan to include validation tests on all three of the intrinsic clustering, the N(z), and w(theta), that would be redundant. We only need the first two.
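The wp(rp) projection described above can be sketched numerically: given xi tabulated on a (rp, pi) grid, the projected correlation is twice the trapezoidal line-of-sight integral out to pi_max. The grid and function name here are illustrative assumptions, not the code under discussion:

```python
import numpy as np

def project_xi(xi, pi, pimax=60.0):
    """wp(rp) = 2 * integral_0^pimax xi(rp, pi) d(pi), trapezoidal rule.
    `xi` has shape (n_rp, n_pi); `pi` is the line-of-sight separation grid."""
    mask = pi <= pimax
    xi_m, pi_m = xi[:, mask], pi[mask]
    mid = 0.5 * (xi_m[:, 1:] + xi_m[:, :-1])      # trapezoid midpoints
    return 2.0 * (mid * np.diff(pi_m)).sum(axis=1)

# sanity check with a toy model xi(rp, pi) = 1 everywhere:
pi = np.linspace(0.0, 60.0, 61)
xi = np.ones((4, 61))
print(project_xi(xi, pi))  # each wp(rp) = 2 * 60 = 120
```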

@slosar (Member) commented Dec 21, 2017

@vvinuv Rachel said everything: it is ok to "cheat" in validation to make sure your catalog is correct and to use the actual distances (even with a spectro survey you have RSD). In particular, all projected correlation functions (in either theta or rp) are integrals over the 3D one, which is the fundamental quantity; so even though it is not observable, if you get it right you'll get everything else right. Also, if you get it wrong, knowing where it is wrong in 3D will make it easier to debug the problem.

@vvinuv (Contributor) commented Dec 21, 2017

Thanks @rmandelb and @slosar. As @rmandelb pointed out, the angular correlation function combines the redshift distribution and the intrinsic clustering. I think @evevkovacs showed that the N(z) of the catalogs mostly agrees with the DEEP2 data, and I assume this is also true for SDSS galaxies. In that case the recent angular correlation test mostly agrees with the observations (if you would like to see it, please go to this link: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2017-12-21_2&test=tpcf_Wang2013_rSDSS). On the other hand, several assumptions go into the validation test for the angular correlation function. Therefore, I totally agree with @rmandelb and @slosar that the validation test should separate the intrinsic clustering and the N(z).

Now that we are on the same page about intrinsic clustering: I was testing the projected correlation function (wp(rp)), not RSD. The validation test of wp(rp) has some bugs. I am trying to identify whether the bugs are in the code or in the catalogs.

@rmandelb:

I'm confused; what changed between the plot you showed above (which looks like a bad match) and the plots that are linked in your latest comment (which look significantly better)?

@yymao (Member, Author) commented Jan 9, 2018

@morriscb protoDC2 has only z < 1 galaxies (a cut is made at z=1). Buzzard has galaxies up to z=2.1 (not sure if a hard cut is made but I don't think so).

@morriscb (Contributor) commented Jan 9, 2018

Okay, so I assume that means galaxies up to those redshifts are being used in this test which is what I wanted to know. I wasn't sure if a cut was being made in redshift below the limits of the simulations or not. Thanks.

@yymao (Member, Author) commented Jan 9, 2018

@morriscb for the correlation functions with apparent magnitude cuts, yes. For correlation functions with absolute magnitude cuts there are additional redshift cuts applied before the correlation functions are calculated.

@vvinuv (Contributor) commented Jan 9, 2018 via email

@rmandelb commented Jan 9, 2018

@vvinuv - what about https://arxiv.org/abs/1210.6694 ? (see e.g. figures 10-12)

I agree in general with Chris that we do want to have some sanity check of the galaxy biases at higher redshifts. This will be important for LSS, LSS+WL combined analysis, and PZ.

@vvinuv (Contributor) commented Jan 9, 2018 via email

@rmandelb commented Jan 9, 2018

I would add that Chris can probably suggest a specific validation criterion. One more thing to think about:

We don't necessarily need a validation test on the clustering signal. We can do a validation test on the galaxy bias as a function of redshift / luminosity, using the fact that we know the matter clustering (so bias = sqrt((xi gg) / (xi mm))).
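That bias definition is straightforward to apply once the two correlation functions are in hand. A minimal sketch with toy numbers (the function name is hypothetical; real inputs would be measured xi_gg and the known xi_mm):

```python
import numpy as np

def bias_estimate(xi_gg, xi_mm):
    """Scale-dependent galaxy bias b(r) = sqrt(xi_gg(r) / xi_mm(r)),
    valid where both correlation functions are positive."""
    xi_gg = np.asarray(xi_gg, dtype=float)
    xi_mm = np.asarray(xi_mm, dtype=float)
    return np.sqrt(xi_gg / xi_mm)

# toy numbers: galaxies clustered 4x more strongly than matter -> b = 2
print(bias_estimate([4.0, 1.0], [1.0, 0.25]))  # [2. 2.]
```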

@evevkovacs (Contributor):

@yymao @vvinuv @rmandelb In future, to avoid confusion, we should make the cuts on the catalog data as close as possible to the data to which they are being compared, and these cuts should be stated clearly on the plots and in the summary text files. If additional redshift etc. cuts are being made, they should also be on the plots and in the files. I think this should be our standard practice for all validation tests.

@yymao (Member, Author) commented Jan 9, 2018

@evevkovacs hmm I think we are already doing that right now, no?

@evevkovacs (Contributor):

Well, there were some questions above about the redshift ranges. I checked the plots and couldn't see any labels pertaining to redshift, so those should be included. However, I did notice that a file called config.yaml is printed out in the summary, which has some of this information. Not all our tests have this file in the summary section. Is it a copy of the yaml file for the test? This would be a useful thing to have printed in the summary section by default.

@morriscb (Contributor) commented Jan 9, 2018

@vvinuv For our use case we don't need to be assured that the clustering amplitude is entirely physical, just that there is a trend in correlation amplitude with absolute magnitude and that it isn't flat, as was observed in a previously created plot.

A specific test for our use case could be selecting one of the absolute magnitude bins and measuring, as @rmandelb suggested, bias = sqrt(xi_gg / xi_mm) at several different redshifts, and then asserting that d bias/dz != 0, i.e., that the derivative of the bias with respect to redshift is non-zero. Having this will at least allow us to test bias mitigation techniques in the context of clustering redshifts.
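The suggested criterion could be implemented roughly as follows. The threshold, function name, and sample values are placeholders; a real test would use bias values measured per redshift bin from the catalog:

```python
import numpy as np

def bias_evolves(z, bias, min_slope=0.1):
    """Check the bias-evolution criterion discussed above: fit a linear
    trend to b(z) and require |db/dz| above a threshold (0.1-0.2 was
    suggested). Returns (pass_flag, fitted slope)."""
    slope = np.polyfit(z, bias, 1)[0]   # leading coefficient = db/dz
    return abs(slope) >= min_slope, slope

# toy measurement: bias rising from 1.0 at z=0.2 to 1.6 at z=0.8
ok, slope = bias_evolves([0.2, 0.5, 0.8], [1.0, 1.3, 1.6])
print(ok, slope)  # passes with db/dz = 1.0
```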

@sschmidt23:

As Chris said, one of the main things we want to check in DC2 is our ability to correct for galaxy bias evolution, so I think we need d bias/dz >~ 0.1-0.2 at least (though it may be larger) for several absolute magnitude ranges out to z=1 as a good quantitative criterion for now. That is, we want to be sure that the bias evolution is significantly non-zero and measurable beyond statistical errors. I don't have an intuition for what the bias will do at higher redshifts; we'll have to think about that a bit more for when the 1<z<3 DC2 catalog is available.

@aphearin commented Jan 9, 2018

@sschmidt23 @morriscb - In the ideal case, an actual test function is written that is incorporated into DESCQA - this is the preferred workflow for working group members to make a specific request of mock catalogs.

At minimum, could you be more precise in specifying how d bias/dz is defined? Galaxy bias is only defined for a specific galaxy sample. For which specific galaxy sample selection function(s) would you like to see bias evolution?

@morriscb (Contributor) commented Jan 9, 2018

Hi @aphearin, @rmandelb asked for a suggestion in the thread, so I spitballed an idea. I also mentioned a sample in my post: one of the absolute magnitude bins already being used in the clustering amplitude vs. abs mag plots shown in this issue. The exact sample isn't really that important for us; the test need only show that the simulations can produce bias evolution with redshift.

@j-dr (Contributor) commented Jan 19, 2018

Something that isn't being covered in this issue, but will be important for clusters, is small-scale, color-dependent clustering (particularly for red galaxies...). Not sure if I should open a new issue related to this. Happy to do so if we don't want to cram too many things into one test.

This will definitely be important for cluster miscentering though, so thought I would bring it to people's attention. @erykoff

@rmandelb:

@j-dr - does it make sense to do this as a test of small-scale clustering split by color? Or could we do a more specific test that focuses on the populations of cluster-mass halos?

My gut feeling is that this deserves a separate issue, because the goal of the validation test (what science it will enable) is quite different from the goal of the large-scale clustering validation test.

@aphearin:

I agree entirely with @rmandelb - this warrants a separate issue. The science targets driving these two validations are pretty distinct, as is the labor required of catalog producers.

@j-dr (Contributor) commented Jan 22, 2018

I'm happy to split this out into a separate test. I'll open a new issue and we can discuss there.

@yymao (Member, Author) commented Feb 14, 2018

@rmandelb @vvinuv @j-dr @morriscb @sschmidt23 @slosar @aphearin @patricialarsen

We had lots of discussion on this thread about galaxy clustering and galaxy bias but haven't reached a concrete plan. So let me see if I can capture the essentials and draft a plan here.

➡️ Current implementation can be found here, for reference.

  1. The galaxy-galaxy clustering signal should have absolute magnitude dependence. I think @vvinuv's test already covers this (comparing with Zehavi 11), but we need some criteria. @rmandelb @slosar, any suggestions?

  2. The galaxy bias should have redshift dependence. I think this sounds like a different test. It is related to this one, but the representation is pretty different. @morriscb @sschmidt23, would you agree? Should we open a new issue for the redshift-dependent bias test?

  3. I think we should open a new issue for color-dependent clustering. @j-dr @aphearin, would you agree? I think @aphearin has already implemented the color-dependent clustering test outside DESCQA. With some effort we should be able to port it in.

  4. @patricialarsen, are there other requests from TJP for this test?

@aphearin:

> 3. I think we should open a new issue for color-dependent clustering. @j-dr @aphearin, would you agree? I think @aphearin has already implemented the color-dependent clustering test outside DESCQA. With some effort we should be able to port it in.

Yes, I agree this should be a separate test. I am happy to share my (Halotools-based) code for this purpose, although it is based on snapshots with xyz coordinates.

@j-dr (Contributor) commented Feb 14, 2018

I also agree that color-dependent clustering should be a separate test. Can't we just repurpose what @vvinuv has done for magnitude-dependent clustering, since that is already implemented on lightcones, which is what we really want to test at the end of the day?

@vvinuv (Contributor) commented Feb 14, 2018 via email

@yymao (Member, Author) commented Jun 14, 2018

This has been implemented in #91.

@yymao yymao closed this as completed Jun 14, 2018
patricialarsen added a commit that referenced this issue Feb 21, 2023