Explore optimizing RFile LoadPlan computation #5272

Open
keith-turner opened this issue Jan 17, 2025 · 0 comments
Labels
enhancement This issue describes a new feature, improvement, or optimization.

keith-turner commented Jan 17, 2025

Is your feature request related to a problem? Please describe.

In #4898 a new mechanism was added to RFile to compute bulk import load plans as the RFile is written. This mechanism was implemented with completely new code that examines each key value written. There may be existing code in RFile that could be leveraged for this computation and that would reduce the amount of work done per key value written.

Describe the solution you'd like

Determine if this code could be modified to help compute the load plan by leveraging its tracking of first and last keys. Ideally this change would minimize the overall work done per key value when writing to an RFile.
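To make the trade-off concrete, here is a rough, language-neutral sketch in Python. It is not Accumulo code: the function names are hypothetical, and it assumes that tablets are identified by a sorted list of split points with inclusive end rows, that rows are written in sorted order, and that the writer can expose the first and last row of each block it writes. The first function looks at every row. The second looks only at block boundaries and conservatively includes every tablet between a block's first and last row, since it cannot tell which of those tablets actually received data.

```python
from bisect import bisect_left


def tablet_index(splits, row):
    """Index of the tablet holding row; tablet i covers (splits[i-1], splits[i]]."""
    return bisect_left(splits, row)


def plan_per_key(splits, rows):
    """Examine every row written (sorted) and record each tablet touched."""
    touched = []
    for row in rows:
        idx = tablet_index(splits, row)
        if not touched or touched[-1] != idx:
            touched.append(idx)
    return touched


def plan_per_block(splits, blocks):
    """Examine only each block's (first_row, last_row) pair.

    Assumption: every tablet between the two endpoints is included, which
    may over-approximate when a block spans more than two tablets.
    """
    touched = []
    for first, last in blocks:
        lo = tablet_index(splits, first)
        hi = tablet_index(splits, last)
        for idx in range(lo, hi + 1):
            if not touched or touched[-1] < idx:
                touched.append(idx)
    return touched


if __name__ == "__main__":
    splits = ["g", "m", "t"]
    rows = ["a", "b", "h", "i", "x", "z"]
    print(plan_per_key(splits, rows))                      # [0, 1, 3]
    print(plan_per_block(splits, [("a", "h"), ("i", "z")]))  # [0, 1, 2, 3]
```

In this toy example the block-based version does two lookups per block instead of one per row, but it reports tablet 2 even though no row landed there. Whether that kind of over-approximation is acceptable, or whether a per-key fallback would be needed for blocks that span tablets, is part of what would need investigation.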

Describe alternatives you've considered

It may be best to not make any changes at all for this issue; it needs investigation.

The following are some reasons why no changes might be warranted for this issue.

  1. The performance impact of the per key examination code added in #4898 (Offers new ways of computing bulk load plans) is negligible compared to other parts of the RFile write pipeline. Optimizing something that is not taking much time will not really speed up the overall write pipeline. The slowest parts need to be optimized to see a measurable improvement.
  2. The existing code is not well suited for the new task.
  3. There are too many existing layers of abstraction that would need to be broken to make the change.

Only want to make this change if it shows a measurable performance improvement and does not add tech debt to the code.

@keith-turner keith-turner added the enhancement This issue describes a new feature, improvement, or optimization. label Jan 17, 2025