Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

txn: refactor some iters in scanner.go #27854

Closed
wants to merge 1 commit into from

Conversation

lcwangchao
Copy link
Collaborator

@lcwangchao lcwangchao commented Sep 7, 2021

What problem does this PR solve?

There are some problems for the current wrapper iterator implementions.

  • oneByOneIter

In current implement, oneByOneIter keeps a list of iterators and iters them on by one. It may cause some problems when we have many iterators. For example, 1000+ iterators may keep 1000+ connections, but only one is used at one time. It's big waste of resources.

  • filterEmptyValueIter, lowerBoundReverseIter

We should think more carefully for error cases. For example, when we call lowerBoundReverseIter.Next() it actually calls its inner iter's Next function. If it's innert iter returns an error, what's the wrapper's behavior after it.

What is changed and how it works?

What's Changed:

  • For oneByOneIter , it keeps a list of function that creates a new iterator instead of a list of iterators. So the creation of iterator becomes lazy. We'll close an iter once it becomes invalid, so there is only one iter alive at one time.

  • For filterEmptyValueIter and lowerBoundReverseIter , when their inner iter.Next() return errors , they become invalid but not close their inner iter explicitly. When wrapper iter's Close() method is called its inner iter will be closed. Why not close inner iter when error occurs is to forbid multi close. User may bypass the wrapper iter and close the inner iter directly, so if we close the inner iter when error occurs, it will be closed twice.

  • Also added some tests for error behaviors.

Check List

Tests

  • Unit test

Release note

None

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 7, 2021
@lcwangchao lcwangchao added sig/transaction SIG:Transaction sig/sql-infra SIG: SQL Infra labels Sep 7, 2021
@lcwangchao
Copy link
Collaborator Author

/run-check_dev_2

1 similar comment
@lcwangchao
Copy link
Collaborator Author

/run-check_dev_2

@disksing
Copy link
Contributor

disksing commented Sep 7, 2021

Looks it is becoming more and more complex. I'm start to wonder if we really need oneByOneIterator. can we just nest and use multiple unionIter?

@lcwangchao
Copy link
Collaborator Author

Looks it is becoming more and more complex. I'm start to wonder if we really need oneByOneIterator. can we just nest and use multiple unionIter?

Can you explain it in detail? Sure we can archive the same function with unionIter by wrapping it for many times. But it still requires that all iterators are created initially which can cause a waste of resource. And it will have too many stack jumps in one Next() call, I think the performance is boring

@tiancaiamao
Copy link
Contributor

Looks it is becoming more and more complex. I'm start to wonder if we really need oneByOneIterator. can we just nest and use multiple unionIter?

This is a good question. Where oneByOneIterator is used? @lcwangchao

@xhebox
Copy link
Contributor

xhebox commented Sep 8, 2021

Looks it is becoming more and more complex. I'm start to wonder if we really need oneByOneIterator. can we just nest and use multiple unionIter?

Can you explain it in detail? Sure we can archive the same function with unionIter by wrapping it for many times. But it still requires that all iterators are created initially which can cause a waste of resource. And it will have too many stack jumps in one Next() call, I think the performance is boring

Agreed, nesting unionIter is neither intuitive nor easier to implement. I am thinking about renaming, oneByOneIterator to UnionIteratorLazyor UnionIteratorEx.. And unionIter can use the previous oneByOneIterator to accept unknown number of iters, like io.MultiReader.

As for the usage, it comes from #27722, looks like it will be used for temporary table #26952. It is like an interceptor API to bypass TiKV. A more elegant solution will be support multi storage drivers, which is, well, basically impossible for now. That is how @lcwangchao comes here, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. sig/sql-infra SIG: SQL Infra sig/transaction SIG:Transaction size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants