🚀 Add PreProcessor to AnomalyModule #2358

Open · wants to merge 60 commits into base: feature/v2

Conversation

@samet-akcay (Contributor) commented Oct 9, 2024

📝 Description

The PreProcessor class serves as both a PyTorch module and a Lightning callback, handling transforms during different stages of training, validation, testing and prediction. This PR demonstrates how to create and use custom pre-processors.

Key Components

The pre-processor functionality is implemented in:

class PreProcessor(nn.Module, Callback):
    """Anomalib pre-processor.

    This class serves as both a PyTorch module and a Lightning callback, handling
    the application of transforms to data batches during different stages of
    training, validation, testing, and prediction.

    Args:
        train_transform (Transform | None): Transform to apply during training.
        val_transform (Transform | None): Transform to apply during validation.
        test_transform (Transform | None): Transform to apply during testing.
        transform (Transform | None): General transform to apply if stage-specific
            transforms are not provided.

    Raises:
        ValueError: If both `transform` and any of the stage-specific transforms
            are provided simultaneously.

    Notes:
        If only `transform` is provided, it will be used for all stages (train, val, test).

        Priority of transforms:
        1. Explicitly set PreProcessor transforms (highest priority)
        2. Datamodule transforms (if PreProcessor has no transforms)
        3. Dataloader transforms (if neither PreProcessor nor datamodule have transforms)
        4. Default transforms (lowest priority)

    Examples:
        >>> from lightning.pytorch import LightningModule
        >>> from torchvision.transforms.v2 import CenterCrop, Compose, Resize, ToTensor
        >>> from anomalib.pre_processing import PreProcessor

        >>> # Define transforms
        >>> train_transform = Compose([Resize((224, 224)), ToTensor()])
        >>> val_transform = Compose([Resize((256, 256)), CenterCrop((224, 224)), ToTensor()])

        >>> # Create PreProcessor with stage-specific transforms
        >>> pre_processor = PreProcessor(
        ...     train_transform=train_transform,
        ...     val_transform=val_transform
        ... )

        >>> # Create PreProcessor with a single transform for all stages
        >>> common_transform = Compose([Resize((224, 224)), ToTensor()])
        >>> pre_processor_common = PreProcessor(transform=common_transform)

        >>> # Use in a Lightning module
        >>> class MyModel(LightningModule):
        ...     def __init__(self):
        ...         super().__init__()
        ...         self.pre_processor = PreProcessor(...)
        ...
        ...     def configure_callbacks(self):
        ...         return [self.pre_processor]
        ...
        ...     def training_step(self, batch, batch_idx):
        ...         # The pre_processor will automatically apply the correct transform
        ...         processed_batch = self.pre_processor(batch)
        ...         # Rest of the training step
    """

It is used by the base AnomalyModule in:

    def _resolve_pre_processor(self, pre_processor: PreProcessor | bool) -> PreProcessor:
        """Resolve and validate which pre-processor to use..

        Args:
            pre_processor: Pre-processor configuration
                - True -> use default pre-processor
                - False -> no pre-processor
                - PreProcessor -> use the provided pre-processor

        Returns:
            Configured pre-processor
        """
        if isinstance(pre_processor, PreProcessor):
            return pre_processor
        if isinstance(pre_processor, bool):
            return self.configure_pre_processor()
        msg = f"Invalid pre-processor type: {type(pre_processor)}"
        raise TypeError(msg)

Usage Examples

1. Using Default Pre-Processor

The simplest way is to use the default pre-processor, which resizes images to 256x256 and normalizes them using ImageNet statistics:

from anomalib.models import PatchCore

# Uses default pre-processor
model = PatchCore()
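
For reference, an explicit pre-processor roughly equivalent to that default could look like the sketch below, built from the 256x256 resize and ImageNet normalization described above (the actual default transform may differ in detail):

from anomalib.models import PatchCore
from anomalib.pre_processing import PreProcessor
from torchvision.transforms.v2 import Compose, Normalize, Resize

# Approximation of the default behaviour: resize to 256x256 and normalize
# with ImageNet statistics.
default_like = PreProcessor(
    transform=Compose([
        Resize((256, 256), antialias=True),
        Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
)
model = PatchCore(pre_processor=default_like)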

2. Custom Pre-Processor with Different Transforms

Create a pre-processor with custom transforms for different stages:

from torchvision.transforms.v2 import Compose, Resize, CenterCrop, RandomHorizontalFlip, Normalize
from anomalib.pre_processing import PreProcessor

# Define stage-specific transforms
train_transform = Compose([
    Resize((256, 256), antialias=True),
    RandomHorizontalFlip(p=0.5),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

val_transform = Compose([
    Resize((256, 256), antialias=True), 
    CenterCrop((224, 224)),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Create pre-processor with different transforms per stage
pre_processor = PreProcessor(
    train_transform=train_transform,
    val_transform=val_transform,
    test_transform=val_transform  # Use same transform as validation for testing
)

# Use custom pre-processor in model
model = PatchCore(pre_processor=pre_processor)

3. Disable Pre-Processing

To disable pre-processing entirely:

model = PatchCore(pre_processor=False)
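
With pre-processing disabled, the model no longer resizes or normalizes its inputs, so the transforms have to come from elsewhere (datamodule, dataloader, or your own code), following the priority order described above. A minimal sketch of applying a transform manually to a dummy batch; the shapes and values here are illustrative only:

import torch
from torchvision.transforms.v2 import Compose, Normalize, Resize

from anomalib.models import PatchCore

model = PatchCore(pre_processor=False)

# Apply the transform explicitly, since the model will not do it.
transform = Compose([
    Resize((256, 256), antialias=True),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
images = torch.rand(8, 3, 900, 900)  # dummy batch of images
images = transform(images)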

4. Override Default Pre-Processor in Custom Model

Custom models can override the default pre-processor configuration:

from torchvision.transforms.v2 import Compose, Normalize, Resize

from anomalib.models.components.base import AnomalyModule
from anomalib.pre_processing import PreProcessor

class CustomModel(AnomalyModule):
    @classmethod
    def configure_pre_processor(cls, image_size=(224, 224)) -> PreProcessor:
        transform = Compose([
            Resize(image_size, antialias=True),
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ])
        return PreProcessor(transform=transform)
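
The override then takes effect whenever the default pre-processor is requested, for example (assuming CustomModel implements the rest of the AnomalyModule interface and that pre_processor defaults to True):

model = CustomModel()  # configure_pre_processor() supplies the 224x224 transform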

Notes

  • Pre-processor transforms are applied in order of priority:
    • Explicitly set PreProcessor transforms (highest)
    • Datamodule transforms
    • Dataloader transforms
    • Default transforms (lowest)
  • The pre-processor automatically handles both image and mask transforms during training
  • Custom transforms should maintain compatibility with both image and segmentation mask inputs (see the sketch below)
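
A minimal sketch of such a compatible transform, using torchvision's tv_tensors so that geometric operations are applied consistently to the image and the mask (shapes are illustrative):

import torch
from torchvision import tv_tensors
from torchvision.transforms.v2 import Compose, RandomHorizontalFlip, Resize

# v2 transforms dispatch on tv_tensor types, so the same pipeline resizes and
# flips the image and its segmentation mask together.
transform = Compose([
    Resize((256, 256), antialias=True),
    RandomHorizontalFlip(p=0.5),
])

image = tv_tensors.Image(torch.rand(3, 900, 900))
mask = tv_tensors.Mask(torch.zeros(900, 900, dtype=torch.uint8))
image, mask = transform(image, mask)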

Testing

  • Added unit tests to verify:
    • Default pre-processor behavior
    • Custom transform application
    • Transform priority order
    • Mask transformation handling
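
As an illustration, a test for the mutual-exclusivity check described in the docstring could look roughly like this (pytest assumed; not necessarily one of the tests added in this PR):

import pytest
from torchvision.transforms.v2 import Resize

from anomalib.pre_processing import PreProcessor


def test_transform_and_stage_specific_transform_are_mutually_exclusive():
    """PreProcessor should reject `transform` combined with a stage-specific transform."""
    with pytest.raises(ValueError):
        PreProcessor(transform=Resize((256, 256)), train_transform=Resize((224, 224)))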

✨ Changes

Select what type of change your PR is:

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • 🔨 Refactor (non-breaking change which refactors the code base)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔒 Security update

✅ Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • 📋 I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
  • 📚 I have made the necessary updates to the documentation (if applicable).
  • 🧪 I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

For more information about code review checklists, see the Code Review Checklist.

@jpcbertoldo (Contributor):

A sub-feature request that would fit here: (optionally?) keep both the transformed and original image/mask in the batch.

So instead of

            image, gt_mask = self.XXX_transform(batch.image, batch.gt_mask)
            batch.update(image=image, gt_mask=gt_mask)

something like

            batch.update(image_original=batch.image, gt_mask_original=batch.gt_mask)
            image, gt_mask = self.XXX_transform(batch.image, batch.gt_mask)
            batch.update(image=image, gt_mask=gt_mask)

It's quite practical to have these when using the API (I've re-implemented this in my local copy 100 times, haha).

@samet-akcay (Contributor, Author):

> A sub-feature request that would fit here: (optionally?) keep both the transformed and original image/mask in the batch. […]

yeah, the idea is to keep batch.image and batch.gt_mask original outside the model. It is not working that way though :)

@jpcbertoldo (Contributor):

> yeah, the idea is to keep batch.image and batch.gt_mask original outside the model

exactly, makes sense : )

but it's also useful to be able to access the transformed one (e.g. when using augmentations)

> it is not working that way though :)

I didn't get this. Is it because it's not backwards-compatible?

@samet-akcay (Contributor, Author):

> I didn't get this. Is it because it's not backwards-compatible?

oh I meant, it is currently not working, I need to fix it :)


@ashwinvaidya17 (Collaborator) left a comment:

Thanks. I have a few minor comments.

Comment on lines 58 to 69
# Handle pre-processor
# True -> use default pre-processor
# False -> no pre-processor
# PreProcessor -> use the provided pre-processor
if isinstance(pre_processor, PreProcessor):
    self.pre_processor = pre_processor
elif isinstance(pre_processor, bool):
    self.pre_processor = self.configure_pre_processor()
else:
    msg = f"Invalid pre-processor type: {type(pre_processor)}"
    raise TypeError(msg)

Collaborator:

Minor comment, but can we move this to a separate method?

@samet-akcay (Contributor, Author) commented Oct 30, 2024:

Which one would you prefer: _init_pre_processor, _resolve_pre_processor, _handle_pre_processor, or _setup_pre_processor?

@@ -220,30 +250,12 @@ def input_size(self) -> tuple[int, int] | None:
The effective input size is the size of the input tensor after the transform has been applied. If the transform
is not set, or if the transform does not change the shape of the input tensor, this method will return None.
"""
-transform = self.transform or self.configure_transforms()
+transform = self.pre_processor.train_transform
Collaborator:

Should we add a check to ascertain whether train_transform is present? Models like VlmAD might not have train_transforms passed to them. I feel it should pick up the val or predict transform if the train transform is not available.
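
One possible shape for such a fallback, sketched as a small helper; the attribute names assume the current PreProcessor fields, and the helper name is hypothetical:

from torchvision.transforms.v2 import Transform

from anomalib.pre_processing import PreProcessor


def _resolve_input_transform(pre_processor: PreProcessor) -> Transform | None:
    """Pick the transform used to infer the effective input size.

    Prefers the train transform, falling back to the val/test transform for
    models (e.g. VLM-based ones) that are never trained.
    """
    return (
        pre_processor.train_transform
        or pre_processor.val_transform
        or pre_processor.test_transform
    )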

@@ -79,6 +93,10 @@ def _setup(self) -> None:
initialization.
"""

def configure_callbacks(self) -> Sequence[Callback] | Callback:
"""Configure default callbacks for AnomalyModule."""
return [self.pre_processor]
Collaborator:

How can we ensure that the pre_processor callback is called before the other callbacks? For example, is the metrics callback dependent on pre-processing running first?

@samet-akcay (Contributor, Author):

In the base model, we will need to ensure the order of the callbacks, I guess. For child classes that inherit from this one, we could have something like this:

def configure_callbacks(self) -> Sequence[Callback]:
    """Configure callbacks with parent callbacks preserved."""
    # Get parent callbacks first
    parent_callbacks = super().configure_callbacks()
    
    # Add child-specific callbacks
    callbacks = [
        *parent_callbacks,      # Parent callbacks first
        MyCustomCallback(),     # Then child callbacks
        AnotherCallback(),
    ]
    return callbacks

if dataset_attr and hasattr(datamodule, dataset_attr):
    dataset = getattr(datamodule, dataset_attr)
    if hasattr(dataset, "transform"):
        dataset.transform = transform
Contributor:

Would there be a way to assign the transforms to the datamodule before the datasets are instantiated, instead of overwriting them here? That way the datasets would always have the right transform, which would be less prone to bugs.

I guess it is done this way because the setup callback hook gets called after the datamodule's setup hook, so by the time we set up the pre-processor the datasets have already been created. I just wonder if there is a different way to do it that does not involve overwriting the transforms.

@samet-akcay (Contributor, Author):

The setup logic changes completely in PR #2239. It might be better to revisit this when working on that PR instead of addressing it here.

Contributor:

Can we target the other PR to the feature branch then?

@samet-akcay (Contributor, Author) commented Nov 5, 2024:

We'll have to, but it requires quite a few changes, as it is out of date now. It falls within the 2.0 requirements (#2364).

Development

Successfully merging this pull request may close these issues:

  • 📋 [TASK] Integrate Pre-processing as AnomalibModule Attribute