Request: Include 2D DCT filter #545

Closed
quamt opened this issue Dec 24, 2023 · 14 comments

quamt commented Dec 24, 2023

Hello @rigaya

I want to check if including the 2D DCT filter would be possible.

Link to FFmpeg
https://ffmpeg.org/ffmpeg-filters.html#toc-dctdnoiz

info from FFmpeg

Denoise frames using 2D DCT (frequency domain filtering).

This filter is not designed for real time.

The filter accepts the following options:

sigma, s
Set the noise sigma constant.

This sigma defines a hard threshold of 3 * sigma; every DCT coefficient (absolute value) below this threshold will be dropped.

If you need a more advanced filtering, see expr.

Default is 0.

overlap
Set the number of overlapping pixels for each block. Since the filter can be slow, you may want to reduce this value, at the cost of a less effective filter and the risk of various artefacts.

If the overlapping value doesn’t permit processing the whole input width or height, a warning will be displayed and the corresponding borders won’t be denoised.

Default value is blocksize-1, which is the best possible setting.

expr, e
Set the coefficient factor expression.

For each coefficient of a DCT block, this expression will be evaluated as a multiplier value for the coefficient.

If this option is set, the sigma option will be ignored.

The absolute value of the coefficient can be accessed through the c variable.

n
Set the blocksize using the number of bits. 1<<n defines the blocksize, which is the width and height of the processed blocks.

The default value is 3 (8x8) and can be raised to 4 for a blocksize of 16x16. Note that changing this setting has huge consequences on the processing speed. Also, a larger block size does not necessarily mean better denoising.
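
To make the sigma and n options above concrete, here is a minimal sketch of hard-thresholding a single block in the DCT domain. It is an illustration only, not FFmpeg's code: the function names are hypothetical, the DCT is an orthonormal DCT-II (so coefficient magnitudes, and therefore useful sigma values, will not match the real filter's scaling), and the overlap handling and averaging of overlapping blocks are left out.

```python
import math

def dct_matrix(size):
    """Orthonormal DCT-II basis, one row per frequency."""
    rows = []
    for k in range(size):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / size)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * size))
                     for i in range(size)])
    return rows

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def denoise_block(block, sigma, n=3):
    """Hard-threshold one (1 << n) x (1 << n) block at 3 * sigma (hypothetical helper)."""
    size = 1 << n  # n=3 -> 8x8, n=4 -> 16x16
    assert len(block) == size and all(len(row) == size for row in block)
    d = dct_matrix(size)
    dt = transpose(d)
    coeffs = matmul(matmul(d, block), dt)  # forward 2D DCT: D * X * D^T
    threshold = 3.0 * sigma
    # Drop every coefficient whose absolute value is below the threshold.
    kept = [[c if abs(c) >= threshold else 0.0 for c in row] for row in coeffs]
    return matmul(matmul(dt, kept), d)     # inverse 2D DCT: D^T * C * D
```

With sigma=0 the threshold is 0 and the block comes back unchanged (up to rounding); with a larger sigma, small high-frequency coefficients are zeroed and the block is smoothed.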
Owner

rigaya commented Dec 24, 2023

I think it will be difficult to call ffmpeg filters from NVEnc, and I also think we have multiple denoise filters for NVEnc.

Author

quamt commented Dec 24, 2023

> I think it will be difficult to call ffmpeg filters from NVEnc, and I also think we have multiple denoise filters for NVEnc.

Yes, there are some filters.
I was looking for this specific filter because it is speedy and efficient with noise removal.
For specific scenarios, it is better than the existing ones.

Another benefit would be that if it were possible, the other two encoders QSV and VCE would benefit too.

I found some sources, but I am not sure how useful they are as they are "outside" FFmpeg:

  1. https://github.com/JeremieMelo/dct_cuda

  2. https://medium.com/@fanzongshaoxing/accelerate-opencv-dct-discrete-cosine-transform-in-multi-dimensional-array-2225acf89eb4

And one outdated NVIDIA 2D Image And Signal Performance Primitives (NPP):
https://docs.nvidia.com/cuda/archive/10.1/npp/group__image__quantization.html

Owner

rigaya commented Dec 24, 2023

For specific filters, I recommend using Avisynth/VapourSynth filters, as always. Implementing filters myself takes too much time, which is why NVEnc supports avs/vpy readers.

If you want DCT-based filters, vpp-smooth is also a DCT-based filter.

Owner

rigaya commented Dec 24, 2023

I think it might be possible to implement something close to vf_dctdnoiz, but it will take a long time (months, as it looks complicated and difficult to understand).
https://www.ffmpeg.org/doxygen/6.0/vf__dctdnoiz_8c_source.html

Author

quamt commented Dec 25, 2023

Hello @rigaya,

Firstly, I am grateful for your detailed response and the insights you've provided. It's clear that integrating the 2D DCT filter, akin to vf_dctdnoiz from FFmpeg, is a complex task, especially considering the need to potentially implement it across different platforms like NVENC, QSV, and VCE. I understand this might be a challenging process, and I appreciate the challenges in calling FFmpeg filters from NVEnc or adapting similar functionalities.

Implementing such features can be time-consuming and complex, especially when ensuring compatibility and efficiency across various encoders. Your suggestion about the potential long-term possibility of something akin to vf_dctdnoiz is encouraging, albeit understanding it's a complicated and time-intensive endeavour.

Please take all the time you need for this implementation. I understand the intricacies involved in such development, especially across NVENC, QSVENC, and VCEEnc platforms, and I presume OpenCL might be a pathway, given its support across these platforms. Your efforts to look into this matter are highly appreciated, and many users, myself included, look forward to any future developments, no matter the timeline.

Thank you once again for your dedication and for considering this feature request.
The community greatly appreciates your work, and I am excited about the potential enhancements it could bring to the quality and efficiency of video encoding.

rigaya added a commit that referenced this issue Jan 18, 2024
Got it to the point where it runs, for now.
Owner

rigaya commented Jan 22, 2024

--vpp-denoise-dct has been added, which implements the dctdnoiz algorithm on the GPU.

The implementation details differ, and so do the parameter names, so please refer to the documentation for details.

Also, expr was difficult to implement on the GPU, so it is not supported.

Actually, I had a very hard time implementing and debugging this to get it running. I really recommend using Avisynth/VapourSynth filters via the avs/vpy reader for complex filters.

Author

quamt commented Jan 23, 2024

Dear @rigaya,

I am extending my sincerest gratitude for the recent implementation of the --vpp-denoise-dct feature in response to my request. Your commitment to enhancing video encoding tools' capabilities is commendable, and I am genuinely impressed by the speed and quality of this new filter.

Having tested the filter, I can confidently say the results are remarkable. The balance of speed and efficiency in noise removal is impressive, and it's evident that a significant amount of effort and expertise went into this development. Implementing such features, especially on GPU, is challenging and time-consuming, and your dedication to this project is highly appreciated.

While Avisynth/VapourSynth filters are powerful, as you've mentioned, there are scenarios where they might not be the most efficient choice due to speed constraints. The addition of --vpp-denoise-dct provides a much-needed alternative for such cases.

On that note, I would like to know if there are plans to implement this feature in qsvenc. Many users, including myself, would find great value in having this capability across different platforms, particularly for those who rely on qsvenc for their encoding needs.

Once again, thank you for your incredible work and for considering the community's needs. Your efforts significantly impact the quality and efficiency of video encoding, and they do not go unnoticed.

I am looking forward to hearing from you about the potential Qsvenc implementation.

Owner

rigaya commented Jan 24, 2024

It’s very nice to hear that it works well with good speed and was worth working on.

As you say, a GPU implementation has to be complicated to run fast, and it took time, especially for debugging.

Of course I plan to add this to QSVEnc and VCEEnc as well, but that will take more time, as I need to port it to OpenCL. Also, please note that its performance might be poor, especially on Intel iGPUs. The implementation of this filter relies heavily on so-called shared memory (or local memory) for speed, and shared memory on Intel iGPUs is known to perform poorly.
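
As a rough illustration of why fast on-chip memory can matter for this kind of filter, the sketch below counts how many overlapping blocks touch a single interior pixel. This is plain arithmetic under stated assumptions, not a description of the actual NVEnc kernel: the function name is hypothetical, and it assumes square blocks placed every `step` pixels with `step` dividing `block_size`.

```python
def blocks_covering_pixel(block_size, step):
    """Number of overlapping blocks that include one interior pixel.

    Assumes blocks start at every multiple of `step` on both axes
    and that `step` divides `block_size` evenly.
    """
    if step <= 0 or block_size % step != 0:
        raise ValueError("step must be positive and divide block_size")
    per_axis = block_size // step  # block origins per axis that reach the pixel
    return per_axis * per_axis

for block_size, step in ((8, 1), (8, 2), (16, 1), (16, 8)):
    print(block_size, step, blocks_covering_pixel(block_size, step))
```

With a small step, the same pixel is read and written by many blocks, so keeping a tile of the frame in shared (local) memory instead of re-reading global memory each time is one plausible reason the filter depends on it.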

Author

quamt commented Jan 25, 2024

Again, thank you for your dedication and hard work in implementing the --vpp-denoise-dct feature.

I understand that the process of porting it to QSVENC and VCEENC will take some time, especially considering the need to use OpenCL. I appreciate your willingness to expand this feature to multiple platforms, and I'm eagerly anticipating its availability in the future.

Furthermore, I understand that performance may vary between integrated and dedicated graphics cards. Your knowledge of potential hardware limitations is valuable and helps establish realistic expectations for users.

Author

quamt commented Jan 27, 2024

Quick question,

I noticed there isn't a limitation to the Sigma.
Example: denoise-dct: sigma 88888.55, step 1, block_size 16
Is this on purpose? I won't use this value as everything would be blurry.

Owner

rigaya commented Jan 27, 2024

Currently the only limitation is that the value must be positive. There is no upper limit, since a large value won't cause any fatal trouble such as artifacts.

Getting too blurry is expected; it's simply an unsuitable value.
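
The behaviour described here can be sketched as follows. This is a hypothetical illustration, not NVEnc's code: the function names and the sample coefficient magnitudes are made up, "positive" is assumed to mean strictly greater than zero, and the 3 * sigma threshold is taken from the FFmpeg description quoted earlier in this thread, which the GPU implementation may or may not follow exactly.

```python
def validate_sigma(sigma):
    """Hypothetical check: reject non-positive values, impose no upper bound."""
    if not sigma > 0:
        raise ValueError("sigma must be positive")
    return float(sigma)

def surviving_fraction(coeff_magnitudes, sigma):
    """Fraction of coefficients that survive a hard threshold of 3 * sigma."""
    threshold = 3.0 * validate_sigma(sigma)
    kept = sum(1 for c in coeff_magnitudes if abs(c) >= threshold)
    return kept / len(coeff_magnitudes)

# Made-up coefficient magnitudes for one block, largest (DC-like) first.
mags = [812.0, 40.5, 12.0, 6.3, 2.1, 0.8, 0.4, 0.1]
for s in (0.5, 4.0, 88888.55):
    print(s, surviving_fraction(mags, s))
```

A huge sigma is accepted, but nothing survives the threshold, so the output simply loses its detail rather than breaking.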

Author

quamt commented Jan 27, 2024

I appreciate the confirmation.
I only used that high value for testing purposes.

Owner

rigaya commented Feb 11, 2024

QSVEnc 7.59 adds --vpp-denoise-dct, performance seems fine at least on Arc A380.

Author

quamt commented Feb 12, 2024

Thank you very much.
I'll conduct some tests on the ARC A770 and report back.

quamt closed this as completed Feb 26, 2024