Method to fill holes in segmentations

**Is your feature request related to a problem? Please describe.**

We have 3D segmentation masks. Every segmented shape is not supposed to have holes within its borders.
Any wholes might be considered potential artifacts.
Especially for our training data we would like to close such holes.

For an example, the following matrix:

```python
[
  [
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 0, 3, 0, 0, 4, 0 ],
    [ 3, 3, 3, 4, 0, 4 ],
    [ 0, 3, 0, 0, 4, 0 ],
  ],
  [
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 0, 1, 2, 0, 0 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 0, 3, 0, 0, 4, 0 ],
    [ 3, 0, 3, 4, 0, 4 ],
    [ 0, 3, 0, 0, 4, 0 ],
  ],
  [
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 0, 3, 0, 0, 4, 0 ],
    [ 3, 3, 3, 4, 4, 4 ],
    [ 0, 3, 0, 0, 4, 0 ],
  ],
]
```
Should result in

```python
[
  [
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 0, 3, 0, 0, 4, 0 ],
    [ 3, 3, 3, 4, 0, 4 ],
    [ 0, 3, 0, 0, 4, 0 ],
  ],
  [
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 0, 0 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 0, 3, 0, 0, 4, 0 ],
    [ 3, 3, 3, 4, 0, 4 ],
    [ 0, 3, 0, 0, 4, 0 ],
  ],
  [
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 1, 1, 1, 2, 2, 2 ],
    [ 0, 3, 0, 0, 4, 0 ],
    [ 3, 3, 3, 4, 4, 4 ],
    [ 0, 3, 0, 0, 4, 0 ],
  ],
]
```
The only filled holes are the 1 and 3 in the middle slice.
The 2 shape is open to the side and the 4 is open to the back.
The 0 between the classes should stay untouched.


**Describe the solution you'd like**

MONAI is missing a "fill_holes" function / transformation.
scipy has an implementation which works for 3D data but only for binary images ([scipy.ndimage.morphology.binary_fill_holes](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.binary_fill_holes.html?highlight=fill_holes#scipy.ndimage.binary_fill_holes)). This requires an iteration over all labels of the image which makes this already slow method unacceptably slow.

I wish that MONAI would implement such a function.

**Describe alternatives you've considered**

I opened this feature request on the scipy project: https://github.com/scipy/scipy/issues/14504
And this stackoverflow question: https://stackoverflow.com/questions/68608749/performant-way-to-fill-holes-for-categorical-data

I implemented 7 versions using the existing [scipy.ndimage.morphology.binary_fill_holes](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.binary_fill_holes.html?highlight=fill_holes#scipy.ndimage.binary_fill_holes) function (or its implementation) and `numpy`. Here the two best versions so far:

```python
import numpy as np
from scipy.ndimage.morphology import binary_fill_holes

def fill_holes6(img: np.ndarray, applied_labels: np.ndarray) -> np.ndarray:
    output = np.zeros_like(img)
    for i in applied_labels:
        output[binary_fill_holes(img == i)] = i

    return output

def fill_holes7(img: np.ndarray, applied_labels: np.ndarray) -> np.ndarray:
    output = np.zeros(img.shape, dtype=int)
    for i in applied_labels:
        tmp = np.zeros(img.shape, dtype=bool)
        binary_dilation(tmp, structure=None, iterations=-1, mask=img != i, origin=0, border_value=1, output=tmp)
        output[np.logical_not(tmp)] = i
        
    return output
```

In MONAI this could be implemented something like this:

```python
class FillHoles():

    @classmethod
    def _binary_dilation(cls, img: torch.Tensor) -> torch.Tensor:
        img_arr = img.detach().cpu().numpy()
        img_arr = binary_fill_holes(img_arr)
        return torch.as_tensor(img_arr, device=img.device).type(torch.uint8)

    def __init__(self, applied_labels: Union[Sequence[int]]) -> None:
        self.applied_labels = applied_labels

    def __call__(self, img: torch.Tensor,
                 meta_data: Dict[str, Any]) -> Tuple[torch.Tensor, Dict[str, Any]]:
       output = torch.zeros_like(img)
       for i in applied_labels:
          output[self._binary_fill_holes(img == i)] = i

       return output, meta_data
```

I measured the performance the following way (matching my real world data distribution):

```python
import time
import pandas as pd

def measure(funs, t):
    res = []
    for _ in range(t):
        ra = np.random.randint(10, 40)
        sh = np.random.randint(200, 400, 3)
        img = np.random.randint(0, ra, sh)

        applied_labels = np.unique(img)[1:]

        fun_res = []
        for fun in funs:
            start = time.time()
            fun(img, applied_labels)
            end = time.time()
            fun_res.append(end - start)
        res.append(fun_res)
    return np.min(res, axis=0), np.max(res, axis=0), np.mean(res, axis=0), np.std(res, axis=0)

print(measure([fill_holes6, fill_holes7], t=10))
```

For my first implementations I got the following execution times (t=100):

|  | fill_holes1 | fill_holes2 | fill_holes3 |
| -- | ------------- | ------------- | ------------- |
| min  | 6.4s | 6.9s | 6.2s |
| max  | 83.7s | 96.0s | 80.4s  |
| mean  | 32.9s | 37.3s | 31.6s  |
| std  | 17.3s | 20.1s | 16.5 |

This is very slow.
The last implementation fill_holes7 is only 1.27 times faster than fill_holes3.
I really hope there is a more performant way of doing this.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Method to fill holes in segmentations #2678

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

	fill_holes1	fill_holes2	fill_holes3
min	6.4s	6.9s	6.2s
max	83.7s	96.0s	80.4s
mean	32.9s	37.3s	31.6s
std	17.3s	20.1s	16.5

Method to fill holes in segmentations #2678

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions