Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

struct LossLess: Make Rav1dFrameHeader_segmentation::lossless a bit array #1231

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

kkysen
Copy link
Collaborator

@kkysen kkysen commented Jun 20, 2024

I'm working on trying to fix #1180; this is one of a bunch of steps.

@kkysen kkysen requested review from randomPoison and fbossen June 20, 2024 03:03
@kkysen kkysen force-pushed the kkysen/fn-loop_restoration_filter-Fn-call-safe branch from 42b9e02 to 8ada302 Compare June 20, 2024 05:40
@kkysen kkysen force-pushed the kkysen/struct-LossLess-bit-array branch from 9973a68 to f9b1f96 Compare June 20, 2024 05:40
Copy link
Collaborator

@randomPoison randomPoison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding this correctly, it seems like the main optimization here is using a bitmask operations (LossLess::get) over an array index operation. It's not clear to me that this would necessarily be more efficient though. Have you profiled this and observed a clear performance improvement?

@kkysen
Copy link
Collaborator Author

kkysen commented Jun 20, 2024

If I'm understanding this correctly, it seems like the main optimization here is using a bitmask operations (LossLess::get) over an array index operation. It's not clear to me that this would necessarily be more efficient though. Have you profiled this and observed a clear performance improvement?

No, I didn't profile this one individually yet. It does remove the bounds checks, though I guess that's also solved by SegmentId and InRange later on. It does also use less memory, which I think is useful as well.

@randomPoison
Copy link
Collaborator

It does remove the bounds checks, though I guess that's also solved by SegmentId and InRange later on. It does also use less memory, which I think is useful as well.

Given that you have other changes that target bounds checks for the lossless array, the only real advantage here is the memory reduction. That may still be a nice optimization to have, depending on how many of these are expected to be in memory at once, i.e. how many frame headers there are at a given time. I'm assuming not many, so this doesn't really seem like a worthwhile change to me, but I could be wrong about that. @fbossen might be the best person to weigh in on if this is a reasonable optimization.

Base automatically changed from kkysen/fn-loop_restoration_filter-Fn-call-safe to main June 21, 2024 20:54
@kkysen kkysen force-pushed the kkysen/struct-LossLess-bit-array branch from f9b1f96 to 2720efd Compare June 21, 2024 21:04
@kkysen
Copy link
Collaborator Author

kkysen commented Jun 21, 2024

Given that you have other changes that target bounds checks for the lossless array, the only real advantage here is the memory reduction. That may still be a nice optimization to have, depending on how many of these are expected to be in memory at once, i.e. how many frame headers there are at a given time. I'm assuming not many, so this doesn't really seem like a worthwhile change to me, but I could be wrong about that. @fbossen might be the best person to weigh in on if this is a reasonable optimization.

It's not just about less overall memory, for which memory reductions would be quite small, but about what could be cached.

@fbossen, what do you think?

@kkysen kkysen force-pushed the kkysen/struct-LossLess-bit-array branch from 2720efd to 8db6ed1 Compare June 24, 2024 05:29
@fbossen
Copy link
Collaborator

fbossen commented Jun 24, 2024

I've run a bunch of tests and it seems this PR leads to a small performance regression. The potential memory saving here is too small to have a positive impact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fn decode_coefs is slow
5 participants