Add an option for a new flexible binary instance catalog #71
Comments
Following up in this issue on a discussion that @danielsf and I had: We have now modified imSim to read the PhoSim-style instance catalog again. This is attractive because it reduces overhead/"paperwork" when we are doing things like DC2: we don't have to keep track of two different kinds of instance catalogs. It is also straightforward because CatSim can do exactly what we need. PhoSim wants a description of exactly what the source looks like at that time above the atmosphere. This means the user is responsible for proper motion, nutation, etc. For CatSim use, this is no problem. But for people doing their own studies with hand-crafted instance catalogs, this requires extra work where it is easy to make mistakes. So, aside from things like binary formats, which is what this issue was originally about, we think it might be good in the future to have a flag that would allow us to choose either PhoSim-format instance catalogs or a native imSim format that uses ICRS + proper motion entries like we had before.
Rename this issue to make clearer what it involves.
With all of the work on various pipelines and the extremely large number of gzipped text-file instance catalogs we have had to deal with for DC2, now is a good time to reconsider the instance catalog format used by imSim. This issue is about undertaking a design period and then implementing a new binary instance catalog that could be used by imSim or other programs. This would hopefully be much more compact than the current instance catalogs. It would also be more flexible, in that we could easily pass more information and could more easily allow for multiple options or descriptions of the input information. Note: we would want to think carefully about how to do this, and either use formats and tools that allow us to convert to and from text formats or supply our own. We would still want people doing simple studies to be able to write text-format files (possibly just the current PhoSim format), but we would either have a new option for binary files in addition, or a way to convert text files to the new format for use either externally or on read-in. One useful first study might be a simple estimate of the size savings if we wrote the current instance catalogs in a direct binary representation.
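As a starting point for the size study suggested above, here is a minimal sketch that compares a text encoding against a fixed-width binary encoding of the same rows, with and without gzip. Everything in it is an assumption made for illustration: the four-field record (ID, RA, Dec, magnitude), the `<qddf` layout, the text formatting precision, and the randomly generated values. Real instance catalogs carry more columns, including SED file names and other strings, so the actual savings would have to be measured on real DC2 files.

```python
import gzip
import random
import struct

# Hypothetical fixed-width record: 64-bit integer ID, RA and Dec as
# doubles, magnitude as a 32-bit float (28 bytes, little-endian).
RECORD = struct.Struct("<qddf")


def make_rows(n, seed=0):
    """Generate n synthetic (id, ra, dec, mag) rows for the comparison."""
    rng = random.Random(seed)
    return [
        (
            rng.randrange(10**12),
            rng.uniform(0.0, 360.0),
            rng.uniform(-90.0, 90.0),
            rng.uniform(15.0, 30.0),
        )
        for _ in range(n)
    ]


def text_bytes(rows):
    """Encode rows as simplified 'object ...' text lines."""
    lines = [
        f"object {obj_id} {ra:.7f} {dec:.7f} {mag:.4f}\n"
        for obj_id, ra, dec, mag in rows
    ]
    return "".join(lines).encode("ascii")


def binary_bytes(rows):
    """Encode rows as back-to-back fixed-width binary records."""
    return b"".join(RECORD.pack(*row) for row in rows)


def size_report(rows):
    """Return encoded sizes in bytes, raw and gzip-compressed."""
    text, binary = text_bytes(rows), binary_bytes(rows)
    return {
        "text": len(text),
        "text_gz": len(gzip.compress(text)),
        "binary": len(binary),
        "binary_gz": len(gzip.compress(binary)),
    }


report = size_report(make_rows(10_000))
for name, size in report.items():
    print(f"{name:>10}: {size} bytes")
```

The gzipped numbers matter most for a fair comparison, since the current catalogs are already stored compressed.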
I'd recommend writing a simple standalone program that can convert the binary to PhoSim format. Then we can pass around only binary files even for 2.0p. Then part of the script for running PhoSim would be to convert the instance catalog to ASCII at the start and delete that file at the end.
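A rough sketch of what such a converter could look like, in both directions, is below. It is not the full PhoSim object line: the four fields (ID, RA, Dec, magnitude), the `<qddf` record layout, and the function names `text_to_binary` and `binary_to_text` are all hypothetical stand-ins chosen to keep the example short.

```python
import io
import struct

# Hypothetical record layout: id (int64), ra (double), dec (double),
# mag_norm (float32). A real format would need many more fields.
RECORD = struct.Struct("<qddf")


def text_to_binary(text_stream, binary_stream):
    """Pack simplified 'object' lines into fixed-width binary records."""
    count = 0
    for line in text_stream:
        parts = line.split()
        if not parts or parts[0] != "object":
            continue  # skip header lines and blanks
        record = RECORD.pack(
            int(parts[1]), float(parts[2]), float(parts[3]), float(parts[4])
        )
        binary_stream.write(record)
        count += 1
    return count


def binary_to_text(binary_stream, text_stream):
    """Unpack binary records and write them back out as text lines."""
    count = 0
    while True:
        chunk = binary_stream.read(RECORD.size)
        if not chunk:
            break
        if len(chunk) != RECORD.size:
            raise ValueError("truncated record at end of binary catalog")
        obj_id, ra, dec, mag = RECORD.unpack(chunk)
        text_stream.write(f"object {obj_id} {ra:.7f} {dec:.7f} {mag:.4f}\n")
        count += 1
    return count


# Round trip through an in-memory buffer.
source = "object 1 10.5000000 -20.2500000 22.5000\n"
packed = io.BytesIO()
text_to_binary(io.StringIO(source), packed)
packed.seek(0)
restored = io.StringIO()
binary_to_text(packed, restored)
```

Note that storing the magnitude as a 32-bit float loses precision for values that are not exactly representable, so a real converter would need to decide which columns require doubles for the text round trip to be stable.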
There are potentially two different jobs here:
We could potentially do the first before the second. We would like to target this for DC2, but maybe not before the end of the year. These changes also need a bidirectional text-to-binary converter.
Note that one possible performance enhancement with a more flexible format would be the ability to deal with all of the components of a galaxy (disk, bulge, knots, etc.) at once. This way we would only need to do some operations, like sizing, once.
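To illustrate the idea, here is a small sketch that groups component rows by their parent galaxy and computes a size once per galaxy rather than once per component. The `Component` fields and the sizing rule in `stamp_radius` (five times the largest half-light radius) are invented for the example and are not how imSim sizes objects.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    galaxy_id: int
    kind: str  # e.g. "disk", "bulge", "knots"
    half_light_radius: float  # arcsec


def stamp_radius(components):
    """Hypothetical sizing rule: cover the largest component with a margin."""
    return 5.0 * max(c.half_light_radius for c in components)


def size_galaxies(components):
    """Group components by galaxy, then size each galaxy exactly once."""
    groups = defaultdict(list)
    for comp in components:
        groups[comp.galaxy_id].append(comp)
    return {gid: stamp_radius(group) for gid, group in groups.items()}


catalog = [
    Component(1, "disk", 2.0),
    Component(1, "bulge", 0.5),
    Component(2, "disk", 1.0),
]
sizes = size_galaxies(catalog)
```

With the current one-line-per-component text format, the reader has to rediscover which lines belong together; a format that stores the galaxy ID explicitly would make this grouping cheap.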
We won't be doing a partial implementation of the binary instance format for DC2, so I am removing the DC2 label.
Consolidating discussions: closing #222, which also discusses this.
@jchiang87 is currently testing a pandas-based strawman for galaxies only, as a follow-up to discussions at the Tucson meeting.
One of the initial driving development philosophies for imSim was to make it a drop-in replacement for PhoSim that could read the same instance files. This was very successful in allowing us to use it quickly and effectively.
However, as we have used it, we have found that we wanted to do some things differently (such as the handling of proper motion and nutation), and now the formats are in fact not exactly equivalent. Now that we have more infrastructure built up around the imSim ecosystem, @jchiang87 and others have suggested we start from scratch and build a new parser that has all of the features we really want and need. Previous discussion of this topic has happened in email and can also be found in LSSTDESC/DC2-production#16.
This parser could use binary data input files with several related pandas dataframes that would be compact and would also serve to store all of the related truth information in a way we can't now.
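As a rough picture of what "several related dataframes" might mean, the sketch below keeps one table of galaxies and one table of components, linked by a shared key, and joins them on read. The table names, column names, and values are all made up for illustration, and the on-disk binary container is deliberately left out since that choice has not been made.

```python
import pandas as pd

# One row per galaxy: position and other per-object truth information.
galaxies = pd.DataFrame(
    {
        "galaxy_id": [1, 2],
        "ra": [53.10, 53.20],
        "dec": [-27.80, -27.90],
        "redshift": [0.5, 1.2],
    }
)

# One row per component, pointing back at its parent galaxy.
components = pd.DataFrame(
    {
        "galaxy_id": [1, 1, 2],
        "kind": ["disk", "bulge", "disk"],
        "mag_norm": [22.5, 23.75, 24.0],
    }
)

# Attach per-galaxy columns to each component. validate= makes pandas
# raise if a galaxy_id appears more than once in the galaxies table.
joined = components.merge(
    galaxies, on="galaxy_id", how="left", validate="many_to_one"
)
```

Shared quantities such as position and redshift are stored once per galaxy instead of being repeated on every component line, which is part of where the compactness would come from.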
This format needs to serve several roles:
Having the ability to also accept text-based input (especially for 2 and 3) is also very desirable.
In the short term we should discuss whether this is a DC2- or DC3-era job. This format should be clearly defined so that it could also be used by other tools (including PhoSim) if desired.