Add an option for a new flexible binary instance catalog #71

Open
cwwalter opened this issue Oct 22, 2017 · 9 comments

@cwwalter
Member

cwwalter commented Oct 22, 2017

One of the initial driving development philosophies for imSim was to make it a drop-in replacement for PhoSim that could read the same instance files. This was very successful in allowing us to use it quickly and effectively.

However, as we have used it, we have found that we wanted to do some things differently (such as issues related to proper motion and nutation), and the formats are now in fact not exactly equivalent. Now that we have more infrastructure built up around the imSim ecosystem, @jchiang87 and others have suggested that we start from scratch and build a new parser that has all of the features we really want and need. Previous discussion of this topic took place over email and can also be found in LSSTDESC/DC2-production#16.

This parser could use binary data input files with several related pandas dataframes that would be compact and would also serve to store all of the related truth information in a way we can't now.

This format needs to serve several roles:

  1. Production runs generated from CatSim for the data challenges
  2. R&D work on the small scale
  3. Sensor only or calibration simulations for validation.

Having the ability to also have text based input (especially for 2 and 3) is also very desirable.

In the short term we should discuss whether this is a DC2- or DC3-era job. This format should be clearly defined so that it could also be used by other tools (including PhoSim) if desired.

@cwwalter
Member Author

cwwalter commented Apr 6, 2018

Following up in this issue on a discussion that @danielsf and I had: we have now modified imSim to read PhoSim-style instance catalogs again. This is attractive since it reduces overhead/"paperwork" when we are doing things like DC2: we don't have to keep track of two different kinds of instance catalogs. It is also straightforward, because CatSim can do exactly what we need.

PhoSim wants a description of exactly what the source looks like at that time above the atmosphere. This means the user is responsible for proper motion, nutation, etc. For CatSim use, this is no problem. But for people doing their own studies with hand-crafted instance catalogs, this requires extra work in which it is easy to make mistakes.

So, aside from things like binary formats, which is what this issue was originally about, we think it might be good in the future to have a flag that would allow us to choose either PhoSim-format instance catalogs or a native imSim format that uses ICRS + proper motion entries, like we had before.
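
As an illustration of the kind of bookkeeping a native ICRS + proper motion format would take off the user's hands, here is a minimal sketch of a first-order proper-motion shift. The function name, argument names, and units are assumptions made for this example, not imSim or CatSim API. It leaves out parallax, radial velocity, and the precession, nutation, and aberration corrections needed for an observed position, and it is not valid close to the poles.

```python
import math

MAS_PER_DEG = 3.6e6  # milliarcseconds per degree


def apply_proper_motion(ra_deg, dec_deg, pm_ra_cosdec, pm_dec, dt_years):
    """First-order proper-motion shift of an ICRS position (hypothetical helper).

    pm_ra_cosdec and pm_dec are in mas/yr; pm_ra_cosdec is assumed to
    already include the cos(dec) factor. dt_years is the time since the
    catalog epoch.
    """
    d_dec = pm_dec * dt_years / MAS_PER_DEG
    # Undo the cos(dec) factor to get the change in the RA coordinate itself.
    d_ra = pm_ra_cosdec * dt_years / MAS_PER_DEG / math.cos(math.radians(dec_deg))
    return (ra_deg + d_ra) % 360.0, dec_deg + d_dec
```

With a native format, a step like this would run inside the reader rather than in each user's catalog-writing script.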

@cwwalter changed the title from "Make a new more flexible parser" to "Add an option for a new flexible binary instance catalog" on Sep 24, 2018
@cwwalter
Member Author

Renaming this issue to make it clearer what it involves.

@cwwalter
Member Author

With all of the work on various pipelines and the extremely large number of gzipped text-file instance catalogs we have had to deal with for DC2, now is a good time to reconsider the instance catalog format used by imSim.

This issue is about undertaking a design period and then implementing a new binary instance catalog that could be used by imSim or other programs.

This would hopefully be much more compact than the current instance catalogs and would also be more flexible in that we could easily pass more information and could also more easily allow for multiple options or descriptions of the input information.

Note: we would want to think carefully about how to do this, and either use formats and tools that allow us to convert to and from text formats or supply our own. We would still want people doing simple studies to be able to write text-format files (possibly just the current PhoSim format), but we would have either a new option for binary files in addition, or a way to convert text files to the new format for use either externally or on read-in.

One useful first study might be to do a simple estimate of the size savings if we wrote the current instance catalogs in a direct binary representation.
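
A rough way to start such an estimate is sketched below. It assumes a simplified, hypothetical record of one integer ID plus four floats (RA, Dec, magnitude, redshift) rather than the full PhoSim object line, and it uses synthetic values, so the resulting numbers are only indicative. All function names are made up for this example.

```python
import gzip
import random
import struct

# Hypothetical fixed-width record: id (int64) + ra, dec, mag, redshift (float64).
RECORD = struct.Struct("<q4d")


def make_rows(n, seed=0):
    """Generate synthetic catalog rows; real catalogs will differ."""
    rng = random.Random(seed)
    return [
        (i, rng.uniform(0, 360), rng.uniform(-90, 90),
         rng.uniform(15, 30), rng.uniform(0, 3))
        for i in range(n)
    ]


def text_bytes(rows):
    """Encode rows as space-separated text, one 'object' line per row."""
    lines = ["object %d %.7f %.7f %.7f %.7f" % row for row in rows]
    return ("\n".join(lines) + "\n").encode("ascii")


def binary_bytes(rows):
    """Encode rows as packed fixed-width binary records."""
    return b"".join(RECORD.pack(*row) for row in rows)


def size_report(n=10000):
    """Return byte counts for text and binary encodings, raw and gzipped."""
    rows = make_rows(n)
    text, binary = text_bytes(rows), binary_bytes(rows)
    return {
        "text": len(text),
        "text_gz": len(gzip.compress(text)),
        "binary": len(binary),
        "binary_gz": len(gzip.compress(binary)),
    }


if __name__ == "__main__":
    for name, size in size_report().items():
        print(f"{name:10s} {size:10d} bytes")
```

A real study would run the same comparison on actual DC2 instance catalogs, where string columns such as SED file names and the choice of float width would likely dominate the result.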

@rmjarvis
Contributor

I'd recommend writing a simple standalone program that can convert the binary to PhoSim format. Then we can pass around only binary files even for 2.0p. Then part of the script for running PhoSim would be to convert the instance catalog to ASCII at the start and delete that file at the end.
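
The convert-then-delete step could be wrapped in a context manager, as in the minimal sketch below. The `convert` callable is a hypothetical stand-in for the standalone converter, and nothing here is existing imSim or PhoSim code.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_text_catalog(binary_path, convert):
    """Yield the path of a text catalog converted from binary_path.

    `convert(binary_path, text_path)` is a hypothetical converter supplied
    by the caller. The text file is removed on exit, even after an error.
    """
    fd, text_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        convert(binary_path, text_path)
        yield text_path
    finally:
        if os.path.exists(text_path):
            os.remove(text_path)


# Usage, with hypothetical names for the converter and the run step:
#
#     with temporary_text_catalog("catalog.bin", my_converter) as text_path:
#         run_simulation(text_path)
```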

@cwwalter cwwalter added the DC2 label Nov 19, 2018
@cwwalter
Member Author

There are potentially two different jobs here:

  • Use a binary format just to save space and speed.
  • Make the format itself more flexible

We could potentially do the 1st before the 2nd. We would like to target this for DC2 but maybe not before the end of the year.

These changes also need a bidirectional text-to-binary converter.
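
A minimal sketch of what a two-way converter could look like, assuming a hypothetical four-field record (ID, RA, Dec, magnitude) instead of the full PhoSim object line. The layout and function names are illustrative only.

```python
import struct

# Hypothetical record layout: id (int64), ra, dec, magnitude (float64).
RECORD = struct.Struct("<q3d")


def text_to_binary(text):
    """Pack lines of the form 'object <id> <ra> <dec> <mag>' into bytes."""
    out = bytearray()
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "object":
            continue  # skip blank lines and anything that is not an object
        out += RECORD.pack(int(fields[1]), *(float(f) for f in fields[2:5]))
    return bytes(out)


def binary_to_text(data):
    """Unpack bytes written by text_to_binary back into text lines."""
    lines = []
    for obj_id, ra, dec, mag in RECORD.iter_unpack(data):
        # repr() gives the shortest string that round-trips a float64.
        lines.append(f"object {obj_id} {ra!r} {dec!r} {mag!r}")
    return "\n".join(lines) + "\n"
```

In this sketch the values round-trip exactly, but the text formatting is normalized, so an input such as `22.50` comes back as `22.5`.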

@cwwalter
Member Author

Note that one possible performance enhancement with a more flexible format would be the ability to deal with all of the components of a galaxy (disk, bulge, knots, etc.) at once. This way we would only need to do some operations, like sizing, once.
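
As a sketch of the idea, component rows could be grouped by a shared galaxy ID so that a per-galaxy step runs once per galaxy instead of once per component. The row layout and the `galaxy_extent` helper are assumptions for illustration; the real sizing step in imSim is not shown here.

```python
from collections import defaultdict

# Hypothetical component rows: (galaxy_id, component, half_light_radius_arcsec).
EXAMPLE_ROWS = [
    (1, "disk", 1.2),
    (1, "bulge", 0.4),
    (1, "knots", 1.0),
    (2, "disk", 0.8),
    (2, "bulge", 0.3),
]


def group_components(rows):
    """Collect all components that belong to the same galaxy."""
    grouped = defaultdict(list)
    for galaxy_id, component, radius in rows:
        grouped[galaxy_id].append((component, radius))
    return dict(grouped)


def galaxy_extent(components):
    """Largest radius across components; stands in for a per-galaxy sizing step."""
    return max(radius for _, radius in components)


if __name__ == "__main__":
    for galaxy_id, components in group_components(EXAMPLE_ROWS).items():
        # One sizing call per galaxy, shared by all of its components.
        print(galaxy_id, galaxy_extent(components), [c for c, _ in components])
```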

@cwwalter cwwalter removed the DC2 label Nov 27, 2018
@cwwalter
Member Author

We won't be doing a partial implementation for DC2 of the binary instance format, so I am removing the DC2 label.

@cwwalter
Member Author

cwwalter commented Mar 9, 2020

Consolidating discussions: closing #222 which also discusses this.

@cwwalter
Member Author

cwwalter commented Mar 9, 2020

@jchiang87 is currently testing a pandas-based strawman, for galaxies only, as a follow-up to discussions at the Tucson meeting.

@cwwalter cwwalter mentioned this issue Apr 2, 2020
6 tasks
@cwwalter cwwalter added this to the imSim 2.0 milestone Aug 17, 2022
@cwwalter cwwalter removed this from the imSim 2.0 milestone Jun 26, 2023

2 participants