galaxy bias/clustering test #10
Comments
If we're okay with including another dependency, I would highly recommend that we use Corrfunc. It is very fast, has a nice python API and has many different pair counting functions already implemented. |
That code is quite convenient for scalar correlation functions. If there is any chance we're going to also want validation tests with spin-2 quantities like shear, then I think we should instead use a code like treecorr, which is very fast and does include correlations of spin-2 quantities (also with a nice python API). |
Good point. My only problem with treecorr is that it doesn't support periodic boundary conditions as far as I know, but that may not matter if we're just focusing on lightcone based statistics for DESCQA2. Corrfunc only has GSL and numpy as dependencies, so it might be worth considering using both. |
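For context on the periodic-box versus lightcone point: without periodic boundaries the expected pair counts have to come from a random catalog that follows the survey footprint. A minimal sketch of one commonly used combination, the Landy-Szalay estimator, is below. It assumes the pair counts per bin are already available from whichever package is chosen; the function and argument names are hypothetical, and nothing in this thread says the DESCQA test uses this exact estimator.

```python
def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Landy-Szalay estimate (DD - 2 DR + RR) / RR per separation bin.

    Assumes dd and rr count each unique pair once, and dr counts all
    data-random cross pairs. All names here are illustrative.
    """
    dd_norm = n_data * (n_data - 1) / 2.0
    dr_norm = float(n_data * n_rand)
    rr_norm = n_rand * (n_rand - 1) / 2.0
    xi = []
    for dd_i, dr_i, rr_i in zip(dd, dr, rr):
        d = dd_i / dd_norm
        c = dr_i / dr_norm
        r = rr_i / rr_norm
        # Bins with no random pairs cannot be normalized.
        xi.append((d - 2.0 * c + r) / r if r > 0 else float("nan"))
    return xi
```

In a periodic box the RR term can be computed analytically from the bin volumes, which is one reason periodic support matters for snapshot-based tests.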
Another possibility is HaloTools |
An additional benefit of using treecorr or halotools is that their lead developers are very active DESC members. That being said, I'll be happy to let whoever volunteers to work on this test choose the most suitable package. |
Just a quick update --- we've decided on using treecorr but had some issues installing it on the NERSC DESC Python environment. Will continue to work on resolving this issue... |
Oops, still not fixed?
|
We (mostly @vvinuv) are still working on fixing some potential bugs (see progress in #38), but otherwise this test is close to done. We still need validation criteria though! Here's an example plot for the buzzard catalog, taken from this descqa run |
Do we still have just lightcones and correlation functions? Am I correct to see that your galaxies are a factor of 2-3 overclustered? Having the power spectrum of a periodic box would be my ideal test, but if all you have is lightcones then I guess we'll have to live with projected correlation functions. |
Updated test: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2017-12-18&test=tpcf_Wang2013_rSDSS (results are similar) @slosar The catalogs are over-clustered probably because the catalog is not deep enough? @evevkovacs We do have snapshots, don't we? This question seems to appear many times... |
@yymao Which catalog is not deep enough? If you select galaxies in a mag range then they are what they are; by definition you are deep enough? (in the limit of narrow enough bins that bias ~ const across the mag bin) |
@slosar mock catalogs have a cutoff in redshift: the protoDC2 light cone goes only to z=1, and the Buzzard light cone goes to z=2.1. |
@yymao We are using only galaxies at redshift less than 0.3, with a magnitude limit. Therefore, I think the catalog will have a similar depth to the data. |
Can you clarify what is being plotted? It says xi(theta) which confuses me a bit; that is, usually xi is used for a 3D correlation function as a function of r. Is this w(theta), the projected 2D angular cross-correlation? The issue with that one is that it includes information both about the 3D clustering and the N(z). I mean, if you have a sample with the very same 3D clustering but a different N(z), then w(theta) will look different. If we have a separate N(z) test already, then I think we want this to be a test of 3D clustering rather than w(theta), so as to ask a single specific question. So if this is w(theta), then in fact my first question would simply be whether the mock galaxies have the same dN/dz as the real ones that were used as the validation data? Also, apologies, but what Wang et al (2013) paper is it? If I look that up on ADS there are lots of possibilities but then none that I glanced at seem to have the data shown here. |
I would also prefer to see a 3D clustering test, as it is just a more accurate representation of the exact test that you are doing. The validation should be over the mag limit and redshift limit of the test sample -- if it is fine at low-z, I'm ok to believe it must be reasonable at higher z. |
@rmandelb the label on the y-axis is wrong. I meant w(theta), which is the angular correlation function, NOT the projected correlation function. I use the Wang et al paper (http://adsabs.harvard.edu/abs/2013MNRAS.432.1961W) as it is one of the latest measurements from SDSS galaxies. However, Wang et al do not use redshift information; they only use galaxies with r-band magnitude less than 21 (@slosar I was also wrong to say that Wang et al use redshift information). The recent results mostly agree with the validation data for both catalogs: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2017-12-20_1&test=tpcf_Wang2013_rSDSS . I haven't tested the code for the 3D correlation function. I am not quite sure whether people use the 3D correlation function from observations. If you are interested in the status of the projected correlation function, keep reading. We did validation work on the projected correlation function, which is a function of comoving distance as defined in Zehavi et al 2011. They use SDSS galaxies of different brightness in narrow redshift bins to find the projected correlation function. In our earlier validation work the projected correlation function mostly agreed with Zehavi et al 2011. However, the results of tests during and after the LSST sprint meeting do not agree with Zehavi et al 2011. I haven't checked if there are any bugs in the test. |
Hi @vvinuv - thanks for clarifying. Indeed in my message I accidentally wrote "projected 2D angular cross-correlation" when I meant "angular correlation function"... i.e., my statements about this quantity including information both about the intrinsic clustering and the N(z) apply to the angular correlation functions that you are showing. So my point still stands: comparing this quantity with the data conflates two things, the intrinsic clustering of the sample and its N(z). I thought the point of the clustering-related tests was to check whether the bias as a function of luminosity and scale makes sense (since the LSS, WL, and PZ groups all care about this), so you should be doing a calculation that gets at the 3D clustering. @slosar seems to agree.
Depends what you mean. To calculate it, you require spectroscopy. So it can only be done with certain samples (and even then, typically the projected version is used, i.e., wp(rp), which involves projecting xi(r) over projected line-of-sight distance out to some separation like 60 Mpc/h). Of course we won't have spectra for LSST and won't measure the 3D correlation function, but that is not the issue here. The purpose of the test was to check that the galaxy bias is reasonable and to do that we should measure something related to the intrinsic clustering, like xi(r) or wp(rp), and not w(theta) which includes the N(z) information as well. If there was a plan of including validation tests on the intrinsic clustering, the N(z), and w(theta), then that seems redundant. We only need the first two. |
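To make the projection step concrete, here is a minimal sketch of going from xi(r) to wp(rp) by integrating along the line of sight out to a maximum separation. It assumes real-space clustering (no RSD) and a power-law xi(r) with illustrative values r0 = 5 Mpc/h and gamma = 1.8; those values and the function names are assumptions, and only the 60 Mpc/h cutoff comes from the comment above.

```python
import math

def xi_power_law(r, r0=5.0, gamma=1.8):
    """Illustrative 3D correlation function xi(r) = (r / r0)^(-gamma)."""
    return (r / r0) ** (-gamma)

def wp_from_xi(rp, xi, pi_max=60.0, n_steps=6000):
    """wp(rp) = 2 * integral_0^pi_max xi(sqrt(rp^2 + pi^2)) dpi.

    Uses a simple trapezoid rule; rp and pi_max share the units of r.
    """
    d_pi = pi_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        pi_los = i * d_pi
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * xi(math.hypot(rp, pi_los))
    return 2.0 * total * d_pi

wp_1 = wp_from_xi(1.0, xi_power_law)
wp_10 = wp_from_xi(10.0, xi_power_law)
```

Because only the 3D xi(r) enters, wp(rp) depends on the intrinsic clustering of the sample and not on its N(z), which is the property being asked for here.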
@vvinuv Rachel said everything: it is ok to "cheat" in validation to make sure your catalog is correct and to use the actual distances (even with a spectroscopic survey you have RSD). In particular, all projected correlation functions (either to theta or to rp) are integrals over the 3D one, which is the fundamental quantity, so even if it is not observable, if you get that right you'll get everything else right. Also, if you get that wrong, knowing where it is wrong in 3D will make it easier to debug the problem. |
Thanks @rmandelb and @slosar. As @rmandelb pointed out, the angular correlation function is a combination of the redshift distribution and intrinsic clustering. I think @evevkovacs showed that the N(z) of the catalogs mostly agrees with the DEEP2 data and I assume this is also true for SDSS galaxies. In that case the recent angular correlation test mostly agrees with the observations (if you'd like to see it, please go to this link: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2017-12-21_2&test=tpcf_Wang2013_rSDSS). On the other hand, there are several assumptions going into the validation test for the angular correlation function. Therefore, I totally agree with @rmandelb and @slosar that the validation test should separate the intrinsic clustering and N(z). Since we are on the same page about intrinsic clustering: I was testing the projected correlation function (wp(rp)), not RSD. The validation test of wp(rp) has some bugs. I am trying to identify whether the bugs are in the code or in the catalogs. |
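For reference, the two projections discussed in this thread can be written as integrals over the 3D correlation function. This is the standard textbook form, stated under these assumptions: real-space clustering, the Limber (small-angle) approximation for w(theta), and dN/dz normalized to unit integral. The symbols chi(z) (comoving distance), H(z), c, and u (line-of-sight separation) are introduced here and do not appear in the thread.

```latex
w_p(r_p) = 2 \int_0^{\pi_{\max}} \xi\!\left(\sqrt{r_p^2 + \pi^2}\right) d\pi

w(\theta) = \int dz \left(\frac{dN}{dz}\right)^{2} \frac{H(z)}{c}
            \int_{-\infty}^{\infty} du \;
            \xi\!\left(\sqrt{u^2 + \chi(z)^2 \theta^2},\, z\right)
```

The second expression makes the earlier objection explicit: w(theta) carries a factor of (dN/dz)^2, so two samples with identical xi(r) but different redshift distributions give different w(theta), while w_p(r_p) has no such dependence.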
I'm confused; what changed between the plot you showed above (which looks like a bad match) and the plots that are linked in your latest comment (which look significantly better)? |
@morriscb protoDC2 has only z < 1 galaxies (a cut is made at z=1). Buzzard has galaxies up to z=2.1 (not sure if a hard cut is made but I don't think so). |
Okay, so I assume that means galaxies up to those redshifts are being used in this test which is what I wanted to know. I wasn't sure if a cut was being made in redshift below the limits of the simulations or not. Thanks. |
@morriscb for the correlation functions with apparent magnitude cuts, yes. For correlation functions with absolute magnitude cuts there are additional redshift cuts applied before the correlation functions are calculated. |
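A minimal sketch of the two kinds of selection described above: an apparent-magnitude cut applied to the whole light cone, and an absolute-magnitude bin with an additional redshift cut. The catalog layout, the key names (`mag_r`, `Mag_r`, `redshift`), and the numerical limits are hypothetical placeholders, not the quantities used by the actual test.

```python
def select_sample(galaxies, mag_key, mag_range, z_range=None):
    """Keep galaxies whose magnitude (and optionally redshift) is in range.

    galaxies is a list of dicts; ranges are half-open [low, high).
    """
    mag_lo, mag_hi = mag_range
    selected = []
    for gal in galaxies:
        if not (mag_lo <= gal[mag_key] < mag_hi):
            continue
        if z_range is not None and not (z_range[0] <= gal["redshift"] < z_range[1]):
            continue
        selected.append(gal)
    return selected

mock = [
    {"mag_r": 20.5, "Mag_r": -20.5, "redshift": 0.8},
    {"mag_r": 22.0, "Mag_r": -21.5, "redshift": 0.1},
    {"mag_r": 19.0, "Mag_r": -21.2, "redshift": 0.4},
]
# Apparent-magnitude sample: no redshift cut beyond the light cone itself.
apparent = select_sample(mock, "mag_r", (float("-inf"), 21.0))
# Absolute-magnitude sample: extra redshift cut applied first.
absolute = select_sample(mock, "Mag_r", (-22.0, -21.0), z_range=(0.0, 0.25))
```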
@morriscb the redshift of the sample with the most luminous galaxies is less than 0.25, and less luminous samples have even lower redshift. I think you need the bias evolution from these types of samples. I don't know if there are any such observed correlation functions from high redshift galaxies. As @yymao said, the angular correlation functions with apparent magnitude cuts include objects up to z=1 for protoDC2, and no redshift cut is applied for the buzzard catalog.
|
@vvinuv - what about https://arxiv.org/abs/1210.6694 ? (see e.g. figures 10-12) I agree in general with Chris that we do want to have some sanity check of the galaxy biases at higher redshifts. This will be important for LSS, LSS+WL combined analysis, and PZ. |
@rmandelb Thanks for the paper! I will check the results.
|
I would add that Chris can probably suggest a specific validation criterion. One more thing to think about: We don't necessarily need a validation test on the clustering signal. We can do a validation test on the galaxy bias as a function of redshift / luminosity, using the fact that we know the matter clustering (so bias = sqrt((xi gg) / (xi mm))). |
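The bias relation in the comment above can be sketched as follows, assuming xi_gg and xi_mm have already been measured in the same separation bins. The function name and the NaN handling for unusable bins are illustrative choices, not part of any existing DESCQA code.

```python
import math

def galaxy_bias(xi_gg, xi_mm):
    """Per-bin bias b(r) = sqrt(xi_gg(r) / xi_mm(r)).

    Returns NaN for bins where the ratio is undefined or negative.
    """
    if len(xi_gg) != len(xi_mm):
        raise ValueError("xi_gg and xi_mm must use the same binning")
    bias = []
    for gg, mm in zip(xi_gg, xi_mm):
        ratio = gg / mm if mm != 0 else float("nan")
        bias.append(math.sqrt(ratio) if ratio >= 0 else float("nan"))
    return bias
```

On large scales, where linear bias is a reasonable description, the per-bin values should be roughly constant and could be averaged into a single number per sample.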
@yymao @vvinuv @rmandelb In future, to avoid confusion, we should make the cuts on the catalog data as close as possible to the data to which they are being compared, and these cuts should be stated clearly on the plots and in the summary text files. If additional redshift etc. cuts are being made, they should also be on the plots and in the files. I think this should be our standard practice for all validation tests. |
@evevkovacs hmm I think we are already doing that right now, no? |
Well, there were some questions above about the redshift ranges. I checked the plots and couldn't see any labels pertaining to redshift, so those should be included. However I did notice that a file called config.yaml is being printed out in the summary, which has some information. Not all our tests have this file in the summary section. Is it a copy of the yaml file for the test? This would be a useful thing to have printed in the summary section by default. |
@vvinuv For our use case we don't need to be assured that the clustering amplitude is entirely physical, just that there is a trend in correlation amplitude with absolute magnitude and that it isn't flat like you observed in a plot previously created. A specific test for our use case could be selecting one of the absolute magnitude bins, measuring, as @rmandelb suggested, bias = sqrt((xi gg) / (xi mm)) at several different redshifts, and then asserting that d bias/dz != 0, i.e. that the derivative of the bias with respect to redshift is non-zero. Having this will at least allow us to test bias mitigation techniques in the context of clustering redshifts. |
As Chris said, one of the main things we want to check in DC2 is our ability to correct for galaxy bias evolution, so I think we need d bias/dz >~0.1-0.2 at least (though it may be larger) for several absolute magnitude ranges out to z=1 as a good quantitative criterion for now. That is, we want to be sure that the bias evolution is significantly non-zero and measurable beyond statistical errors. I don't have an intuition for what the bias will do at higher redshifts; we'll have to think about that a bit more for when the 1<z<3 DC2 catalog is available. |
@sschmidt23 @morriscb - In the ideal case, an actual test function is written that is incorporated into DESCQA - this is the preferred workflow for working group members to make a specific request of mock catalogs. At minimum, could you be more precise in specifying how d bias/dz is defined? Galaxy bias is only defined for a specific galaxy sample. For which specific galaxy sample selection function(s) would you like to see bias evolution? |
Hi @aphearin, @rmandelb asked for a suggestion in the thread so I spitballed an idea. Also, I mentioned a sample in my post: picking one of the absolute magnitude bins that are already being used in the clustering amplitude vs abs mag plots that were shown on this issue. The exact sample isn't really that important for us; the test need only show that the simulations can produce bias evolution with redshift. |
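One way the proposed criterion could be turned into a pass/fail check is sketched below: fit a straight line to bias versus redshift for one absolute magnitude bin, then require the slope to exceed a threshold and to be detected above its statistical error. The 0.1 threshold is taken from the comment above; the linear model, the 2-sigma requirement, and all function names are assumptions made for illustration.

```python
import math

def bias_slope(z, bias, bias_err):
    """Weighted least-squares fit of bias = a + s * z.

    Returns the slope s and its 1-sigma uncertainty.
    """
    w = [1.0 / e ** 2 for e in bias_err]
    s0 = sum(w)
    sz = sum(wi * zi for wi, zi in zip(w, z))
    szz = sum(wi * zi * zi for wi, zi in zip(w, z))
    sb = sum(wi * bi for wi, bi in zip(w, bias))
    szb = sum(wi * zi * bi for wi, zi, bi in zip(w, z, bias))
    delta = s0 * szz - sz ** 2
    slope = (s0 * szb - sz * sb) / delta
    slope_err = math.sqrt(s0 / delta)
    return slope, slope_err

def passes_bias_evolution(z, bias, bias_err, min_slope=0.1, n_sigma=2.0):
    """True if d bias/dz >= min_slope and is detected at n_sigma or better."""
    slope, slope_err = bias_slope(z, bias, bias_err)
    return slope >= min_slope and slope >= n_sigma * slope_err
```

This form only tests for bias increasing with redshift. If a decreasing bias should also count as evolution, the comparison would need to use the absolute value of the slope.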
Something that isn't being covered in this issue, but will be important for clusters is small scale, color dependent clustering (particularly for red galaxies...). Not sure if I should open a new issue related to this. Happy to do so if we don't want to cram too many things into one test. This will definitely be important for cluster miscentering though, so thought I would bring it to people's attention. @erykoff |
@j-dr - does it make sense to do this as a test of small-scale clustering split by color? Or could we do a more specific test that focuses on the populations of cluster-mass halos? My gut feeling is that this deserves a separate issue, because the goal of the validation test (what science it will enable) is quite different from the goal of the large-scale clustering validation test. |
I agree entirely with @rmandelb - this warrants a separate issue. The science targets driving these two validations are pretty distinct, as is the labor required of catalog producers. |
I'm happy to separate this into a separate test. I'll open a new issue and we can discuss there. |
@rmandelb @vvinuv @j-dr @morriscb @sschmidt23 @slosar @aphearin @patricialarsen We had lots of discussion on this thread about galaxy clustering and galaxy bias but haven't reached a concrete plan. So let me see if I can capture the essentials and draft a plan here. ➡️ Current implementation can be found here, for reference.
|
Yes, I agree this should be a separate test. I am happy to share my (Halotools-based) code for this purpose, although it is based on snapshots with xyz coordinates. |
I also agree that color dependent clustering should be a separate test. Can't we just repurpose what @vvinuv has done for magnitude dependent clustering since that is already implemented on lightcones which is what we really want to test at the end of the day? |
@j-dr you could repurpose the code since it is already implemented on light cones.
|
This has been implemented in #91. |
See more details in LSSTDESC/DC2-production#20.
Note that the wp(rp) code in v1 does not work on light cones. When we have proto-dc2 snapshots we can use the old code. In the meantime we should find new correlation code for light cones.