speed up object trimming and skip ingest of drawn objects for restarts #174

jchiang87 · 2018-09-25T00:04:16Z

No description provided.

coveralls · 2018-09-25T00:19:12Z

Coverage decreased (-0.3%) to 73.629% when pulling 449fd8d on u/jchiang/trim_baseline_version into 4aecf75 on dc2_run2.0_rc.

jchiang87

Hi @cwwalter,
Could you have a look at this please? We'd like to make these changes available for the restarts since they should speed things up as well as reduce the memory usage.

There are two major changes which I will summarize here and comment on below.

A change in sources_from_list, in the code that finds the objects per sensor. The new version performs that calculation only for the sensor being simulated instead of looping over all 189, thereby saving a lot of time.
Reading in the drawn objects from the checkpoint files, so that most of the overhead associated with ingesting them from the instance catalogs can be skipped.

jchiang87 · 2018-09-25T00:28:50Z

python/desc/imsim/imSim.py

    out_obj_dict = {}
    for det in lsst_camera():
        chip_name = det.getName()
+        if target_chip is not None and chip_name != target_chip:


If there is a specific sensor being considered (target_chip), then this if statement will skip the calculations below for the other 188 sensors.
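
A minimal sketch of the skip pattern described above, not the actual imSim code. The diff is truncated after the if statement, so the `continue` is an assumption, and `find_objects`, `chip_names`, and `objects_on_chip` are hypothetical stand-ins for the loop over lsst_camera() and the per-sensor calculation.

```python
def find_objects(chip_names, objects_on_chip, target_chip=None):
    """Return ({chip_name: objects}, number of per-chip calculations).

    chip_names and objects_on_chip are hypothetical stand-ins for the
    camera iteration and the per-sensor calculation in sources_from_list.
    """
    out_obj_dict = {}
    calls = 0
    for chip_name in chip_names:
        if target_chip is not None and chip_name != target_chip:
            # Assumed body of the truncated diff: skip every other sensor.
            continue
        out_obj_dict[chip_name] = objects_on_chip(chip_name)
        calls += 1
    return out_obj_dict, calls


# 189 made-up sensor names, matching the sensor count quoted above.
chip_names = ['chip_{:03d}'.format(i) for i in range(189)]


def per_chip(name):
    # Placeholder for the expensive "which objects land here" calculation.
    return [name + '_obj']


all_chips, n_all = find_objects(chip_names, per_chip)
one_chip, n_one = find_objects(chip_names, per_chip, target_chip='chip_042')
```

With target_chip left at None the loop still covers every sensor, so existing callers would see no change in behavior.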

jchiang87 · 2018-09-25T00:32:04Z

python/desc/imsim/trim.py

        ra0, dec0 = self.compute_chip_center(chip_name)
        seps = degrees_separation(ra0, dec0, self._ra, self._dec)
-        index = np.where(seps < radius)
+        if chip_name in self.trimmer.drawn_objects:
+            not_drawn = [not x in self.trimmer.drawn_objects[chip_name]


If this run starts from a checkpoint, this code omits objects that have already been drawn from the objects that are passed to the downstream object processing code. In addition to speeding up that processing, the per sensor memory usage can be substantially reduced.
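
To illustrate the idea, here is a rough sketch of combining the separation cut with a "not already drawn" mask. It is not the code in trim.py: `select_undrawn` and its arguments are hypothetical, and it uses np.isin rather than the list comprehension shown in the diff.

```python
import numpy as np


def select_undrawn(unique_ids, seps, radius, drawn_ids):
    """Return indices of objects within radius that are not yet drawn.

    All names here are hypothetical; drawn_ids plays the role of the
    ids recovered from a checkpoint file for one sensor.
    """
    unique_ids = np.asarray(unique_ids)
    seps = np.asarray(seps)
    # invert=True marks the ids that are NOT in drawn_ids.
    not_drawn = np.isin(unique_ids, list(drawn_ids), invert=True)
    return np.where((seps < radius) & not_drawn)[0]


unique_ids = [101, 102, 103, 104]
seps = [0.05, 0.30, 0.10, 0.02]   # made-up separations in degrees
drawn = {101}                     # made-up ids from a checkpoint
index = select_undrawn(unique_ids, seps, radius=0.18, drawn_ids=drawn)
```

Object 101 is inside the radius but already drawn and 102 is outside the radius, so only the last two objects are passed downstream.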

jchiang87 · 2018-10-04T22:02:18Z

I'm going to merge this tomorrow at 9am PT unless I hear otherwise.

cwwalter

OK, finally finished. Sorry for the delay; this looks good. A few comments and suggestions below.

cwwalter · 2018-10-05T15:07:29Z

bin.src/make_flats.py

-    my_flat.write('flat_{}_{}_{}.fits'.format(visit, ccd_id, obs_md.bandpass))
+    prefix = config['persistence']['eimage_prefix']
+    my_flat.write('{}_{}_{}_{}.fits'.format(prefix, visit, ccd_id,
+                                            obs_md.bandpass))


It looks like this means flats are no longer labeled differently from normal e-image files. Is that the intention?

That's right. The script that converts the eimage files to raw files expects the lsst_e prefix, and since flats are ingested just like any other file, it made sense just to use the same naming convention. The 'IMGTYPE' keyword in the phdu should identify them as flats.

OK, got it. I see. I guess at least the visit number will be different. Maybe we could have some convention for flat visit number sequences in order to be able to quickly guess from the file name? Also, I suppose they typically won't show up in the same directories.

I proposed a convention in the ImageProcessingPipeline wiki in the Generating Calibration Products entry: 3xxxxxx, 4xxxxxx, 5xxxxxx for biases, darks, and flats, respectively.
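
As a sketch of how that proposed convention could be used to guess the image type from a file name: `guess_image_type` and `eimage_name` are hypothetical helpers, the returned labels are illustrative rather than actual IMGTYPE values, and the ccd_id value is made up. Only the format string and the lsst_e prefix come from the discussion above.

```python
# Leading digit of a 7-digit visit number, per the proposed convention.
IMAGE_TYPE_BY_LEADING_DIGIT = {'3': 'bias', '4': 'dark', '5': 'flat'}


def guess_image_type(visit):
    """Guess the calibration type from the visit number, or return None."""
    visit_str = str(visit)
    if len(visit_str) != 7:
        # Assumption: only 7-digit visits follow the convention.
        return None
    return IMAGE_TYPE_BY_LEADING_DIGIT.get(visit_str[0])


def eimage_name(prefix, visit, ccd_id, bandpass):
    """Build a file name with the same format string as the diff above."""
    return '{}_{}_{}_{}.fits'.format(prefix, visit, ccd_id, bandpass)


name = eimage_name('lsst_e', 5000001, 'R22_S11', 'r')
kind = guess_image_type(5000001)
```

This is only a quick guess from the file name; the 'IMGTYPE' keyword in the primary header remains the reliable identifier.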

cwwalter · 2018-10-05T15:17:00Z

python/desc/imsim/imSim.py

@@ -152,12 +152,16 @@ def metadata_from_file(file_name):
    return commands


-def sources_from_list(object_lines, obs_md, phot_params, file_name):
+def sources_from_list(object_lines, obs_md, phot_params, file_name,
+                      target_chip=None, log_level='INFO'):


This function should have a docstring.

cwwalter · 2018-10-05T15:26:52Z

python/desc/imsim/trim.py

        ra0, dec0 = self.compute_chip_center(chip_name)
        seps = degrees_separation(ra0, dec0, self._ra, self._dec)
-        index = np.where(seps < radius)
+        if chip_name in self.trimmer.drawn_objects:
+            not_drawn = [not x in self.trimmer.drawn_objects[chip_name]


Old comment I originally typed:

I'm having a hard time understanding the 'chip_name in self.trimmer.drawn_objects' since in the gs_interpreter the list is made out of uniqueIds here: self.drawn_objects.add(gsObject.uniqueId). How does the chip_name get in? I think it is related to the fact that the sensor name is in the checkpoint file?

OK, I think I understand this now. The trimmer drawn_objects is different:

self.drawn_objects[detname] = ckpt['drawn_objects']

I think maybe the names of the two variables should be different for clarity. Perhaps one could be drawn_objects_on_sensor or something?
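
A small sketch of the two containers being discussed, to show why distinct names help. The ids, the detector name, and the `already_drawn` helper are made up for illustration; drawn_objects_dict is the name the trimmer attribute was later given in this PR.

```python
# Interpreter side: a flat set of uniqueIds for the sensor being drawn.
drawn_objects = set()
drawn_objects.add(1001)
drawn_objects.add(1002)

# Trimmer side: a dict keyed by detector name, filled from checkpoints.
# The ckpt dict and the detector name are made-up illustration values.
ckpt = {'drawn_objects': drawn_objects}
drawn_objects_dict = {}
drawn_objects_dict['R:2,2 S:1,1'] = ckpt['drawn_objects']


def already_drawn(drawn_objects_dict, chip_name, unique_id):
    """Hypothetical helper: was unique_id drawn on chip_name?"""
    # Chips without a checkpoint simply have no drawn objects.
    return unique_id in drawn_objects_dict.get(chip_name, ())
```

So `chip_name in drawn_objects_dict` tests for a checkpointed sensor, while `x in drawn_objects_dict[chip_name]` tests for a drawn object on that sensor.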

cwwalter · 2018-10-05T15:32:59Z

python/desc/imsim/trim.py


        # Collect the selected objects.
        selected = [self.object_lines[i] for i in index[0]]
        if sort_magnorm:
            # Sort by magnorm.
+            self.trimmer.logger.debug('sorting by magnorm')


Is this going to happen for all runs, or only for those where multiprocessing is used?

It looks like InstCatTrimmer is always being called from parsePhoSimInstanceFile.

For simple debugging by individuals (rather than big production), having the processing order differ from what is in the file will be confusing and will make debugging difficult. If it is always going to happen, maybe we could make the default non-sorted and then allow a flag?

Is this a real use-case you've encountered? The sorting has been done for quite a while now. If we want to make it configurable, it would be better to do this via the config file instead of a command line flag. And if it doesn't come up often, I think it is better to sort by default. In any case, the effect of this will probably change in the near future if we break up bright objects into chunks and randomize the ordering of the object drawing.

Yes, but maybe the last time I did it and really cared about the order was before the sorting started? If you have a catalog and you want to go through and track what it is doing line by line it would be confusing if the order wasn't what you input.

Also, when doing testing I often will do (say) the 1st 2000 objects for speed. What happens in that case? Do you get the 2000 brightest objects? This might actually explain some confusing things I have seen recently when I predicted times based on running a fraction of a file.

True about things changing if we break things up. Maybe we should set up three modes if we do that: "as read", "magnorm sorted", and "interleaved"?

I added the config parameter. We can revisit this when we look into breaking up the bright objects.
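
For illustration only, a sketch of how a config-file switch like this could interact with taking the first N objects. The [objects] section name, the `select_lines` helper, and the catalog lines are assumptions, as is applying the row limit after the sort with the brightest (smallest magnorm) first. The sort_magnorm name and its default of True follow the commits in this PR.

```python
import configparser

# Hypothetical config fragment; the section name is an assumption.
CONFIG_TEXT = """
[objects]
sort_magnorm = True
"""


def magnorm(line):
    # Assumes the PhoSim object-line layout: 'object id ra dec magnorm ...'.
    return float(line.split()[4])


def select_lines(object_lines, config, numrows=None):
    """Optionally sort by magnorm, then keep the first numrows lines."""
    sort = config.getboolean('objects', 'sort_magnorm', fallback=True)
    lines = list(object_lines)
    if sort:
        # Smaller magnorm is brighter, so the brightest objects come first.
        lines = sorted(lines, key=magnorm)
    return lines if numrows is None else lines[:numrows]


lines = [
    'object 1 53.0 -27.0 22.5 sed_a',
    'object 2 53.1 -27.1 15.0 sed_b',
    'object 3 53.2 -27.2 18.3 sed_c',
]
config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)
first_two_sorted = select_lines(lines, config, numrows=2)

config.set('objects', 'sort_magnorm', 'False')
first_two_as_read = select_lines(lines, config, numrows=2)
```

Under these assumptions, the sorted run keeps the two brightest objects (ids 2 and 3) while the unsorted run keeps the first two lines as read (ids 1 and 2), which is the kind of difference raised in the question about running a fraction of a file.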

jchiang87 · 2018-10-05T21:27:16Z

I'll merge this at 6pm PT today.

cwwalter · 2018-10-05T22:16:23Z

OK these all look good. Thanks!

jchiang87 added 6 commits September 18, 2018 23:39

only calculate which objects land on a chip for the requested chip

a0cde95

exclude drawn objects from checkpoint files when compiling per sensor object lists

74a1288

gather and pass checkpoint files for excluding from object lists; add debug statements and suppress ErfaWarnings

5dd2e46

use eimage filename for flats

5e95e2b

enable debug level logging in InstCatTrimmer

5378ef2

add debug statements; replace str.split with more efficient substring test

19d7950

jchiang87 changed the title ~~U/jchiang/trim baseline version~~ speed up object trimming and skip ingest of drawn objects for restarts Sep 25, 2018

jchiang87 commented Sep 25, 2018

jchiang87 requested a review from cwwalter September 25, 2018 00:34

restore default read_config() call

ff32c95

cwwalter reviewed Oct 5, 2018

jchiang87 added 5 commits October 5, 2018 11:30

add docstring to sources_from_list

f756e7f

rename InstCatTrimmer.drawn_objects to InstCatTrimmer.drawn_objects_dict

fad0b0e

control sorting by magnorm via config file entry

52ba4e5

set sort_magnorm default to True

73710f2

workaround for travis-ci builds until dust settles for obs_lsstCam renaming and integration into lsst_distrib

449fd8d

jchiang87 merged commit ab550a9 into dc2_run2.0_rc Oct 6, 2018

cwwalter mentioned this pull request Nov 1, 2018

Track issues and PRs that were moved to the run 2.0i release branch #165

Closed

15 tasks

cwwalter mentioned this pull request Nov 14, 2018

U/jchiang/simple faint interface #184

Merged

jchiang87 deleted the u/jchiang/trim_baseline_version branch November 15, 2018 05:16
