Add transforms and presets for optical flow models #5026
Conversation
CI summary (Dr. CI, as of commit f6fe16d): 1 new failure recognized by patterns, not due to upstream breakages: cmake_windows_gpu (1/1).
Review context:

    return img1, img2, flow, valid_flow_mask
    ...

    class RandomErase(torch.nn.Module):
This is similar to our existing RandomErase, but replaces the region with the mean of the pixels instead of a given (known-in-advance) value.
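The mean-fill idea described above can be sketched as follows. This is a hypothetical illustration of the technique, not the PR's actual code; the class name `RandomEraseWithMean` and the `max_frac` parameter are my own inventions for the sketch.

```python
import torch


class RandomEraseWithMean(torch.nn.Module):
    """Sketch: erase a random rectangle, filling it with the per-channel
    mean of the image rather than a fixed, known-in-advance value."""

    def __init__(self, p=0.5, max_frac=0.5):
        super().__init__()
        self.p = p
        self.max_frac = max_frac  # max box side length as a fraction of the image

    def forward(self, img):  # img: (C, H, W) tensor
        if torch.rand(1).item() >= self.p:
            return img
        c, h, w = img.shape
        # sample box size and top-left corner
        eh = int(torch.randint(1, max(2, int(h * self.max_frac)), (1,)))
        ew = int(torch.randint(1, max(2, int(w * self.max_frac)), (1,)))
        y = int(torch.randint(0, h - eh + 1, (1,)))
        x = int(torch.randint(0, w - ew + 1, (1,)))
        # per-channel mean, broadcast into the erased region
        mean = img.float().mean(dim=(1, 2), keepdim=True)
        out = img.clone()
        out[:, y:y + eh, x:x + ew] = mean.to(img.dtype)
        return out
```

Because the fill value is computed from the image itself, this works on uint8 inputs before any normalization, which is the ordering difference discussed below.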
---
I understand that this class is supposed to be used BEFORE converting the image from int to float. This is unlike TorchVision's RandomErasing class, which is applied AFTER converting the image to float, scaling it to 0-1, and normalizing it. I would definitely prefer having TorchVision follow this approach rather than the current one, and that's worth changing in the future, but I think offering another transform right now that operates slightly differently could be confusing. Thoughts?

Concerning mean vs a fixed value: note that TorchVision's RandomErasing implementation uses a fixed zero value because the images are expected to be de-meaned. I wonder if it would be possible to reuse TorchVision's transform for now if you passed a normalized image. Perhaps that's worth doing even if the outcome is not 100% the same, just to maintain parity between the transforms defined in the references and those in the legacy transforms (this will simplify porting to the new API).
---
Thanks for the PR @NicolasHug, I've added a few comments for discussion. It might be worth discussing this offline and then summarizing our decision, to speed this up.
Review context:

        img1 = super().forward(img1)
        img2 = super().forward(img2)
    else:
        # symmetric: same transform for img1 and img2
@NicolasHug: So does p determine the probability of doing symmetric vs asymmetric? If yes, I would add a comment to clarify.

@pmeier: Could you please check this strange transform to confirm it's supported by the new Transforms API?
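The symmetric/asymmetric pattern being discussed can be sketched generically. This is an illustrative reconstruction, not the PR's code: `sample_params` and `apply_fn` are hypothetical stand-ins for a jitter transform's parameter sampling and application, and `p` is the probability of the asymmetric branch.

```python
import torch


class AsymmetricApply(torch.nn.Module):
    """Sketch: with probability p, transform img1 and img2 with
    independently sampled parameters (asymmetric); otherwise sample
    parameters once and apply the identical transform to both
    (symmetric)."""

    def __init__(self, sample_params, apply_fn, p=0.2):
        super().__init__()
        self.sample_params = sample_params  # () -> params
        self.apply_fn = apply_fn            # (img, params) -> img
        self.p = p

    def forward(self, img1, img2):
        if torch.rand(1).item() < self.p:
            # asymmetric: independent parameters per image
            img1 = self.apply_fn(img1, self.sample_params())
            img2 = self.apply_fn(img2, self.sample_params())
        else:
            # symmetric: one set of parameters shared by both images
            params = self.sample_params()
            img1 = self.apply_fn(img1, params)
            img2 = self.apply_fn(img2, params)
        return img1, img2
```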
---
As it stands, this would not be supported. A transform always treats a sample as atomic unit and so multiple images in the same sample would be transformed with the same parameters.
---
OK, I'll clarify.

Ultimately this is a special case of RandomApply(t1, t2, [p, 1 - p]), so there's nothing too fancy here. t2 can be a Sequential(take_care_of_img1_only, take_care_of_img2_only).
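Reading [p, 1 - p] as the probabilities of the two branches, the special case can be sketched as a tiny helper. This is a hypothetical illustration (`random_apply` is my own name, not a torchvision API):

```python
import torch


def random_apply(t1, t2, p):
    """Sketch: apply t1 with probability p, otherwise t2."""
    def _transform(x):
        if torch.rand(1).item() < p:
            return t1(x)
        return t2(x)
    return _transform
```

With p = 1.0 the returned transform always applies t1; with p = 0.0 it always applies t2.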
---
@NicolasHug Sounds good, just add comments. No need to use RandomApply here.
---
@pmeier No worries, this is why we give the option for someone to write custom transforms without the magic of the new API, for weird cases like this. Could you now confirm that this is indeed a workaround we can apply?
---
> Sequential(take_care_of_img1_only, take_care_of_img2_only)

I'm guessing take_care_of_img1_only and take_care_of_img2_only are transforms here, correct? If yes, how would you tell the transform to only handle one or the other image if both receive the full sample?

I think this is one of the cases @datumbox mentioned where we need to circumvent the automatic dispatch a little. In case we want to transform both images separately, we could split the sample and perform the transformation once for the sample minus image2 and once for image2. The problem I see with this is that it can't be automated without assumptions about how the sample is structured. So we either need to use the same structure for every dataset (for example, a flat dictionary with image1 and image2 keys) or provide a way to parametrize the transform.
---
> I'm guessing take_care_of_img1_only and take_care_of_img2_only are transforms here, correct? If yes, how would you tell the transform to only handle one or the other image if both receive the full sample?
Each transform would receive the entire input (which IIRC is a dict) and operate on a subset of that dict.
Are you suggesting that img1 and img2 would be concatenated?
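The "operate on a subset of the dict" idea can be sketched in plain Python. This is a hypothetical illustration of the pattern, not an API from either codebase; `on_key` and the key names are invented for the sketch.

```python
def on_key(key, fn):
    """Sketch: wrap a per-value transform so it only touches
    sample[key] and passes every other entry through unchanged."""
    def _transform(sample):
        out = dict(sample)
        out[key] = fn(out[key])
        return out
    return _transform


# each wrapped transform receives the entire sample dict but
# operates on a single entry of it
pipeline = [
    on_key("image1", lambda x: x + 1),
    on_key("image2", lambda x: x * 2),
]

sample = {"image1": 1, "image2": 3, "flow": "f"}
for t in pipeline:
    sample = t(sample)
```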
---
LGTM thanks @NicolasHug.
---
Thanks for the PR!
I've only a couple of minor comments, otherwise LGTM!
Reviewed By: NicolasHug
Differential Revision: D32950925
fbshipit-source-id: 77cce3f89a1110fa607b219ffa1cdf639a424a33
Towards #4644
This PR adds transforms and train/eval presets needed for the optical flow training reference. Ideally #5004 should be merged first.
A lot of these transforms are just wrappers around existing transforms, modified so that they accept all four parameters: img1, img2, flow, valid_flow_mask. For the rest of the transforms I added comments in the code.

CC @fmassa @datumbox @haooooooqi
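The wrapping pattern described above can be sketched generically. This is an illustrative reconstruction under my own naming (`PairwiseWrapper` is hypothetical), showing one way a single-image transform could be lifted to the four-parameter signature; the actual PR wraps each transform individually.

```python
import torch


class PairwiseWrapper(torch.nn.Module):
    """Sketch: lift a single-image transform to the
    (img1, img2, flow, valid_flow_mask) signature by applying it to
    both images and passing flow and mask through unchanged."""

    def __init__(self, transform):
        super().__init__()
        self.transform = transform

    def forward(self, img1, img2, flow, valid_flow_mask):
        return (
            self.transform(img1),
            self.transform(img2),
            flow,
            valid_flow_mask,
        )
```

Note this only fits transforms that need no shared random state between the two images; the symmetric/asymmetric jitter discussed earlier needs its own custom forward.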