Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add summary for txnBlock(1.2-dev only) #18691

Merged
merged 3 commits into from
Sep 11, 2024
Merged

Conversation

aptend
Copy link
Contributor

@aptend aptend commented Sep 10, 2024

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue https://github.com/matrixorigin/MO-Cloud/issues/3925

What this PR does / why we need it:

Add summary to speed up scanning txn during pull logtail


PR Type

enhancement


Description

  • Added a skipFn function in reader.go to filter blocks based on the presence of summaries, optimizing the scanning process during logtail operations.
  • Enhanced the summary struct in table.go by introducing a tids map to track table IDs, improving the efficiency of transaction block summaries.
  • Updated the DefaultLogtailTxnPageSize in types.go from 100 to 256 to accommodate larger transaction pages.

Changes walkthrough 📝

Relevant files
Enhancement
reader.go
Add block filtering logic in logtail reader                           

pkg/vm/engine/tae/logtail/reader.go

  • Added a skipFn function to filter blocks based on summary presence.
  • Modified ForeachRowInBetween to use skipFn.
  • +9/-1     
    table.go
    Enhance summary struct with table IDs tracking                     

    pkg/vm/engine/tae/logtail/table.go

  • Introduced tids map in summary struct.
  • Updated trySumary to populate tids with table IDs.
  • +7/-3     
    Configuration changes
    types.go
    Update default logtail transaction page size                         

    pkg/vm/engine/tae/options/types.go

    • Increased DefaultLogtailTxnPageSize from 100 to 256.
    +1/-1     

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Potential Performance Issue
    The new skipFn function may introduce unnecessary overhead if called frequently, as it involves atomic loading and map lookup operations.

    Memory Usage
    The tids map is initialized with a fixed capacity of 8, which may lead to unnecessary memory allocations if the actual number of table IDs frequently exceeds this initial capacity.

    @matrix-meow matrix-meow added the size/S Denotes a PR that changes [10,99] lines label Sep 10, 2024
    @mergify mergify bot requested a review from sukki37 September 10, 2024 09:59
    Copy link

    qodo-merge-pro bot commented Sep 10, 2024

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Performance
    Optimize the loop by moving a condition check outside

    Consider moving the check for catalog changes outside the loop to improve
    performance, as it only needs to be set once.

    pkg/vm/engine/tae/logtail/table.go [211-218]

    +summary.hasCatalogChanges = false
     for _, row := range blk.rows {
       memo := row.GetMemo()
    -  if !summary.hasCatalogChanges && memo.HasCatalogChanges() {
    -    summary.hasCatalogChanges = true
    +  if !summary.hasCatalogChanges {
    +    summary.hasCatalogChanges = memo.HasCatalogChanges()
       }
       for k := range memo.Tree.Tables {
         summary.tids[k] = struct{}{}
       }
     }
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion optimizes performance by reducing unnecessary checks within the loop, which can have a positive impact on efficiency, especially with large datasets.

    8
    Maintainability
    Use a constant for the initial map capacity

    Consider using a constant for the initial capacity of the tids map to improve code
    maintainability.

    pkg/vm/engine/tae/logtail/table.go [210]

    -summary.tids = make(map[uint64]struct{}, 8)
    +const initialTidsCapacity = 8
    +summary.tids = make(map[uint64]struct{}, initialTidsCapacity)
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: Using a constant for the initial capacity enhances maintainability by making it easier to adjust the capacity in the future, but it is not a significant change.

    7
    Readability
    Rename the function to better describe its purpose

    Consider using a more descriptive name for the skipFn function, such as
    shouldSkipBlock, to better convey its purpose.

    pkg/vm/engine/tae/logtail/reader.go [86-93]

    -skipFn := func(blk BlockT) bool {
    +shouldSkipBlock := func(blk BlockT) bool {
       summary := blk.summary.Load()
       if summary == nil {
         return false
       }
       _, exist := summary.tids[id]
       return !exist
     }
     
    • Apply this suggestion
    Suggestion importance[1-10]: 6

    Why: The suggestion improves code readability by using a more descriptive function name, which helps in understanding the code's purpose. However, it is not a critical change.

    6

    @mergify mergify bot merged commit 37ab251 into matrixorigin:1.2-dev Sep 11, 2024
    17 of 18 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    5 participants