
[ROIs] Use FOV ROIs for illumination correction (at highest-resolution level) #21

Closed
tcompa opened this issue Jul 20, 2022 · 8 comments
Labels
Tables AnnData and ROI/feature tables

Comments

tcompa (Collaborator) commented Jul 20, 2022

Starting from discussion in

we are now splitting the work on ROIs into several steps/issues:

  1. Parsing pixel sizes and FOV ROIs.
  2. Transform physical-unit ROI into pixels, at arbitrary pyramid level.
  3. Use FOV ROIs for illumination correction (at highest-resolution level). (THIS ISSUE)
  4. Use FOV ROIs for labeling (at arbitrary pyramid level).

Useful resources:


This is the simplest of the tasks that read/write ROIs, because it works only at pyramid level 0 and only on FOVs. For this reason (and provided that the array is chunked by images), the ROIs do not overlap (in physical space or w.r.t. chunks) and there is no issue with parallel writing.

Tentative plan:

  1. Read ROIs from some path, and transform them to level-0 indices (see the sketch after this list).
  2. Switch from map_blocks to an explicit loop over ROIs, which builds the output array from delayed correction functions.
  3. Write to disk, triggering parallel execution of the correction function for all FOVs.
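
For concreteness, a minimal sketch of the physical-to-pixel conversion in step 1, assuming a per-FOV ROI record with micrometer columns; the helper name and column names are illustrative assumptions, not the actual Fractal API:

    # Minimal sketch of step 1 (helper name and ROI columns are illustrative
    # assumptions, not the actual Fractal API).
    def roi_to_level0_indices(roi: dict, pixel_sizes_zyx) -> list:
        """Convert one physical-unit ROI (micrometers) to level-0 pixel indices."""
        pz, py, px = pixel_sizes_zyx  # micrometers per pixel, at level 0
        s_z = int(round(roi["z_micrometer"] / pz))
        s_y = int(round(roi["y_micrometer"] / py))
        s_x = int(round(roi["x_micrometer"] / px))
        e_z = s_z + int(round(roi["len_z_micrometer"] / pz))
        e_y = s_y + int(round(roi["len_y_micrometer"] / py))
        e_x = s_x + int(round(roi["len_x_micrometer"] / px))
        return [s_z, e_z, s_y, e_y, s_x, e_x]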
tcompa (Collaborator, Author) commented Jul 20, 2022

Context for why the ROI logic may be useful here: If we have very few sparse FOVs in a well, we only want to correct the corresponding ROIs, and not a large number of (probably empty) chunks.

jluethi (Collaborator) commented Jul 20, 2022

Does it make sense to have a library function that returns the dask array for a given ROI based on its coordinates?
And a second one that writes to the Zarr file given an output image & coordinates?

We could then wrap the whole thing in a generic library function that iterates over ROIs by calling those two functions plus some computation function in the middle (provided as an input by the user). The generic wrapper can then worry about parallel write access and implementing improvements, while the tasks just specify how they want to be run.

For example, a task may specify that it wants all ROIs to be run in parallel. As a first implementation, we could still do sequential processing. Then we can make that function more complex over time, i.e. implement the parallelization, then add specific checks on whether the parallelization is safe or whether we need to use advanced dask/zarr writing functionality. But none of that needs to influence the task implementation anymore.

We don't need to start this general, but let's think about whether we can generalize to this level to keep the task logic simple :)
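
A rough sketch of what that split could look like (all names and signatures below are hypothetical, just to illustrate the idea; not an existing Fractal API):

    # Hypothetical library-level helpers (names/signatures are illustrative only).
    import dask.array as da

    def load_roi(data_zyx, indices):
        """Return the dask sub-array for one ROI, given its pixel indices."""
        s_z, e_z, s_y, e_y, s_x, e_x = indices
        return data_zyx[s_z:e_z, s_y:e_y, s_x:e_x]

    def write_roi(output, new_img, indices):
        """Assign a processed ROI back into the output dask array."""
        s_z, e_z, s_y, e_y, s_x, e_x = indices
        output[s_z:e_z, s_y:e_y, s_x:e_x] = new_img
        return output

    def apply_over_rois(data_zyx, list_indices, func, **kwargs):
        """Generic wrapper: apply `func` to each ROI and assemble the output.

        A first version can process ROIs sequentially; parallelization and
        checks on safe parallel writing can be added here later, without
        touching the individual tasks.
        """
        output = da.empty_like(data_zyx)
        for indices in list_indices:
            output = write_roi(output, func(load_roi(data_zyx, indices), **kwargs), indices)
        return output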

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 21, 2022
tcompa (Collaborator, Author) commented Jul 21, 2022

For the prototype in fractal-analytics-platform/fractal-client@21ce80d, we replaced map_blocks with a loop over FOV ROIs. This produces output that is at least reasonable, but more testing is needed (both with and without overwrite).

Note that it obviously breaks tests, because ROI logic is not yet implemented there. To do.

tcompa referenced this issue in fractal-analytics-platform/fractal-client Jul 22, 2022
tcompa (Collaborator, Author) commented Jul 22, 2022

At the moment, the illumination-correction task still keeps track of the image sizes (which are parsed, and not hardcoded) and verifies that the array is chunked by images.

The relevant parts of the new functions are

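    # (Excerpt from the task; it assumes the usual imports, e.g. `import anndata as ad`,
    #  `import dask`, `import dask.array as da`, and task-level variables such as
    #  zarrurl, coarsening_xy, chl_list, corrections, background, dtype, data_czyx.)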
    # Read FOV ROIs
    FOV_ROI_table = ad.read_zarr(f"{zarrurl}tables/FOV_ROI_table")

    # Read pixel sizes from zattrs file
    pixel_sizes_zyx = extract_zyx_pixel_sizes_from_zattrs(zarrurl + ".zattrs")

    # Create list of indices for 3D FOVs spanning the entire Z direction
    list_indices = convert_ROI_table_to_indices(
        FOV_ROI_table,
        level=0,
        coarsening_xy=coarsening_xy,
        pixel_sizes_zyx=pixel_sizes_zyx,
    )

    # Extract image size from FOV-ROI indices
    # Note: this works at level=0, where FOVs should all be of the exact same
    #       size (in pixels)
    ref_img_size = None
    for indices in list_indices:
        img_size = (indices[3] - indices[2], indices[5] - indices[4])
        if ref_img_size is None:
            ref_img_size = img_size
        else:
            if img_size != ref_img_size:
                raise Exception(
                    "ERROR: inconsistent image sizes in list_indices"
                )
    img_size_y, img_size_x = img_size[:]

    [...]


    # Create the final list of single-Z-layer FOVs
    list_indices = split_3D_indices_into_z_layers(list_indices)

    # Prepare delayed function
    delayed_correct = dask.delayed(correct)

    # Loop over channels
    data_czyx_new = []
    for ind_ch, ch in enumerate(chl_list):

        data_zyx = data_czyx[ind_ch]
        illum_img = corrections[ch]

        data_zyx_new = da.empty_like(data_zyx)

        for indices in list_indices:

            s_z, e_z, s_y, e_y, s_x, e_x = indices[:]
            shape = [e_z - s_z, e_y - s_y, e_x - s_x]
            if min(shape) == 0:
                raise Exception(f"ERROR: ROI indices lead to shape {shape}")
            new_img = delayed_correct(
                data_zyx[s_z:e_z, s_y:e_y, s_x:e_x],
                illum_img,
                background=background,
            )
            data_zyx_new[s_z:e_z, s_y:e_y, s_x:e_x] = da.from_delayed(
                new_img, shape, dtype
            )
        data_czyx_new.append(data_zyx_new)
    accumulated_data = da.stack(data_czyx_new, axis=0)
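
For completeness, the write-to-disk step (point 3 of the plan) could then look roughly like this; the output path, chunking and overwrite handling are assumptions and still need to be checked:

    # Sketch of the final write, which triggers parallel execution of the
    # delayed corrections (output path and overwrite handling are assumptions).
    # A rechunk may be needed so that chunks match the on-disk level-0 layout.
    accumulated_data.to_zarr(
        f"{zarrurl}0",
        overwrite=True,
    )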

jluethi (Collaborator) commented Jul 22, 2022

Love to see how this task code is evolving! And, if I understand this correctly, the only illum_corr specific part is the function & params that are called in the delayed_correct, no?

Could we abstract this by having a lib function that does all of this and just takes a few inputs, like the path of the relevant ROI table, the channel(s) it should be applied to (not always all channels, for future functions), the function that should be applied, and some kwargs?

And then we might have two lib functions: One for plane-wise execution, one for 3D execution. Tasks then just call the lib function with the input args, the actual processing function and some kwargs as optional parameters. That would make it fairly easy for people to write new tasks without having to worry about the whole dask part :)

(not something we need to generalize right now, more to think about tasks moving in that direction)
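
As a purely hypothetical illustration of what such a task-facing signature could look like (none of these names exist yet):

    # Purely hypothetical signature, only to illustrate the idea; not an
    # existing Fractal function.
    def apply_function_over_ROIs_per_plane(
        data_czyx,           # 4D (c, z, y, x) dask array
        roi_table_path,      # e.g. f"{zarrurl}tables/FOV_ROI_table"
        channels,            # subset of channels to process (not always all)
        function,            # the actual processing function, e.g. correct
        **function_kwargs,   # forwarded to `function` for every ROI
    ):
        ...  # handled by the library: ROI loop, dask/zarr details, parallelism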

tcompa (Collaborator, Author) commented Jul 22, 2022

Maybe let's think about an external function once we have at least two use cases ;)

That is, let's compare the code for two "similar" tasks (illumination correction and per-FOV labeling) and then we'll see whether the overlapping part can be turned into an external function.

In general I think it's a good idea, but perhaps a couple of points need to be clarified before abstracting this into a function.

  1. If we reach a point where we understand that it's safe to simultaneously write the same chunk twice (i.e. if we understand that dask already takes this issue into account), then we can make this function quite general. If not, illumination correction is quite special in that it applies a function in parallel over ROIs which correspond to chunks (and this needs to be checked explicitly).
  2. If we want to support (in the future) complex (non-orthogonal) ROIs, things may differ within this external function. But we can also decide to stick with the current status (not supporting complex ROIs), and then we are OK.
  3. We should make sure that the function works at an arbitrary pyramid level (generally doable, but to be checked carefully).

tcompa (Collaborator, Author) commented Jul 22, 2022

Meanwhile, illumination correction is currently working on a per-ROI level (tested on the usual small UZH dataset, with one well and 2x2 sites).
I'm closing this, and let's reopen it if tests show some unexpected behavior.

jluethi (Collaborator) commented Jul 22, 2022

Yes, fully agree. Just started to see how it could come together nicely based on the above code :) Let's keep this in mind for a task refactor.

Your 3 points are really good things to think about once we go towards more abstraction here.

@tcompa tcompa closed this as completed Jul 22, 2022
@jluethi jluethi transferred this issue from fractal-analytics-platform/fractal-client Sep 2, 2022
@jluethi jluethi moved this from Done to Done Archive in Fractal Project Management Oct 5, 2022
@tcompa tcompa added the Tables AnnData and ROI/feature tables label Sep 19, 2023