Treatment of (Dynamic) Shadows in Eiger Data Sets

Introduction

One question that has come up is how to handle shadows on Eiger data sets, e.g. from back stops (static) or goniometers (dynamic). At the moment we encode in the data analysis software a model of e.g. the goniometer, which can be used to mask regions of the image array where the data are known to be compromised; however, in principle this does not scale well. There is an existing mask definition, used to define the bad pixels and tile joins, which could be extended for this purpose, but that mask typically comes from the detector, which should ideally have no knowledge of the wider beamline environment.

Proposed Solution

Add a new data set, e.g. /entry/shadow (which could be external), containing a small-bit-depth data array either the same size as a single image (for static shadows) or the same size as the data set in its entirety (for dynamic ones). This may sound insane but there are some practical benefits:

the data compress very well - especially useful if chunked the same way as the image data
the calculations are done once and saved henceforth rather than every time the data are processed
the acquisition system is responsible for creating the shadow mask rather than the data analysis developers guessing
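
As a rough illustration of the compression point, the sketch below writes a made-up dynamic shadow mask as a uint8 array chunked one image per chunk and compares the file size with the uncompressed size. The array dimensions, the moving-band shadow, the file name and the dataset name are all assumptions for the example, and gzip is used only because it is built in to h5py (LZ4 needs an external filter plugin); this is not how the dials tool does it.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical small dimensions so the sketch runs quickly; a real
# Eiger2 16M data set is far larger.
n_images, ny, nx = 20, 128, 128

# Made-up dynamic shadow: a band which moves down the image during the scan.
dynamic = np.zeros((n_images, ny, nx), dtype=np.uint8)
for i in range(n_images):
    dynamic[i, i : i + 10, :] = 1  # 1 = shadowed, 0 = good

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example_shadow.h5")
with h5py.File(path, "w") as f:
    f.create_dataset(
        "shadow",
        data=dynamic,
        chunks=(1, ny, nx),  # one chunk per image, like the image data
        compression="gzip",
    )

raw_bytes = dynamic.nbytes
file_bytes = os.path.getsize(path)

# Read the mask back to confirm it round-trips unchanged.
with h5py.File(path, "r") as f:
    round_trip = f["shadow"][()]

print(f"raw: {raw_bytes} bytes, on disk: {file_bytes} bytes")
```

Because the mask is mostly constant regions of 0 and 1, each chunk compresses to a small fraction of its raw size.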

First Implementation

dials.import /path/to/eiger_master.h5
dev.dials.generate_shadow_mask compression=lz4 shadowed.json output=Thau_5_2_shadow.h5

will give

Grey-Area i04-eiger-shadows :) $ dev.dials.generate_shadow_mask compression=lz4 shadowed.json output=Thau_5_2_shadow.h5
The following parameters have been modified:

output = "Thau_5_2_shadow.h5"
compression = gzip lzf *lz4
input {
  datablock = shadowed.json
}

..........................................................................22.92s
Grey-Area i04-eiger-shadows :) $ du -hs Thau_5_2_shadow.h5 
 37M	Thau_5_2_shadow.h5

for 1,800 Eiger2 16M images, which is not too bad. The shadow file can then be linked into the master file:

import h5py

# Open the master file read-write and add an external link to the shadow mask
with h5py.File('Thau_5_2_master.h5', 'r+') as f:
    f['/entry/shadow'] = h5py.ExternalLink('Thau_5_2_shadow.h5', '/shadow')

This seems straightforward. Using the mask is also straightforward: in e.g. XDS, where values of -2 are known to be untrusted, the calling code can use vector calculations such as

image = -2 * mask + (~mask) * image

to apply the mask to the original image data, where mask is a boolean array which is true for shadowed pixels.
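
A minimal numpy sketch of this step follows, with toy arrays standing in for a real image and shadow mask (the values and the 0/1 uint8 encoding of the stored mask are assumptions for illustration). Note that the expression above relies on the mask being boolean: on a uint8 array `~` is a bitwise NOT, giving 255 and 254 rather than 1 and 0, so the stored mask is converted to bool first. `np.where` gives the same result and may read more clearly.

```python
import numpy as np

UNTRUSTED = -2  # value treated as untrusted, as described above

# Toy image and shadow mask (1 = shadowed); real arrays would be read
# from the HDF5 files.
image = np.array([[5, 7, 0], [3, 9, 1]], dtype=np.int32)
shadow = np.array([[0, 1, 0], [1, 0, 0]], dtype=np.uint8)

# Convert to boolean so that ~ means logical NOT rather than bitwise NOT.
mask = shadow.astype(bool)

# Vector expression from the text.
masked_expr = -2 * mask + (~mask) * image

# Equivalent form using np.where.
masked_where = np.where(mask, UNTRUSTED, image)

print(masked_where)
```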

Further discussion from mailing list

One could think of this also as

 /entry/shadow/static_mask   External Link {identifier_shadow.h5//static_mask}
 /entry/shadow/dynamic_mask  External Link {identifier_shadow.h5//dynamic_mask}

/static_mask is a single 2D array with the same size as the image (similar to /pixel_mask)
/dynamic_mask is an (optional) stack of N 2D arrays (with each 2D array again the size of an image).

One would need to decide whether N is always the same as nimages or whether it is the number of affected images (in which case each 2D array would need an attribute to let the application know which image it applies to).

GW's response to the latter is that this can be handled with HDF5 fill values, as done above in the dials tool. Keeping the 2D and 3D masks separate does seem like a good idea, as does being highly prescriptive about the shape of the arrays (i.e. the 2D mask must be the same shape as an image, the 3D mask must be the same shape as the raw data array).
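
The fill value idea can be sketched as follows: the dynamic mask is declared with the full (nimages, ny, nx) shape and a fill value of 0, but only the frames which are actually shadowed are written. With a chunked layout, chunks which are never written are not allocated in the file and read back as the fill value, so N can always equal nimages at little cost. The sizes, frame numbers and file name here are assumptions for illustration, not taken from the dials tool.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical sizes; only frames 3 and 4 are shadowed in this example.
n_images, ny, nx = 10, 32, 32

path = os.path.join(tempfile.mkdtemp(), "example_fill.h5")
with h5py.File(path, "w") as f:
    dyn = f.create_dataset(
        "dynamic_mask",
        shape=(n_images, ny, nx),
        dtype=np.uint8,
        chunks=(1, ny, nx),  # one chunk per image
        compression="gzip",
        fillvalue=0,         # unwritten frames read back as "not shadowed"
    )
    frame = np.zeros((ny, nx), dtype=np.uint8)
    frame[:8, :] = 1  # top 8 rows shadowed
    dyn[3] = frame
    dyn[4] = frame

with h5py.File(path, "r") as f:
    dyn = f["dynamic_mask"]
    shape = dyn.shape
    untouched_sum = int(dyn[0].sum())  # frame never written
    shadowed_sum = int(dyn[3].sum())   # frame written above

print(shape, untouched_sum, shadowed_sum)
```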

Further offline discussion

The static and dynamic masks may later be updated, e.g. by user operation, so a further option is to have HDF5 hard links within the file from the agreed locations (i.e. /entry/shadow etc.) to wherever the masks are actually stored, e.g. /entry/shadow-20190206085241. Since the masks are likely to be relatively small this should be harmless.
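
A small h5py sketch of this scheme is given below. In h5py, assigning an existing dataset object to a new name creates a hard link, and deleting a name only removes that link. The second mask name (/entry/shadow-20190301120000), the tiny 4 x 4 arrays and the file name are hypothetical, chosen only to show an update.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "example_links.h5")
with h5py.File(path, "w") as f:
    # Original mask, stored under the dated name from the text.
    f.create_dataset(
        "/entry/shadow-20190206085241", data=np.zeros((4, 4), dtype=np.uint8)
    )
    # Hard link from the agreed location to the stored mask.
    f["/entry/shadow"] = f["/entry/shadow-20190206085241"]

    # Later: an updated mask is added under a new (hypothetical) name...
    f.create_dataset(
        "/entry/shadow-20190301120000", data=np.ones((4, 4), dtype=np.uint8)
    )
    # ...and the agreed location is re-pointed. Deleting the name removes
    # only the link; the old mask stays available under its dated name.
    del f["/entry/shadow"]
    f["/entry/shadow"] = f["/entry/shadow-20190301120000"]

with h5py.File(path, "r") as f:
    current_sum = int(f["/entry/shadow"][()].sum())
    old_still_present = "shadow-20190206085241" in f["/entry"]
    old_sum = int(f["/entry/shadow-20190206085241"][()].sum())

print(current_sum, old_still_present, old_sum)
```

Readers only ever open /entry/shadow, so they do not need to know which dated mask is current.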

Updates

As of commit [master 6ba461279] the dials tool mentioned above writes the shadow dataset to /entry/shadow/dynamic_mask. An example data set with this dynamic shadow mask can be found at https://zenodo.org/record/2559150
