txn: refactor some iters in scanner.go #27854

lcwangchao · 2021-09-07T07:33:56Z

What problem does this PR solve?

There are some problems with the current wrapper iterator implementations.

oneByOneIter

In the current implementation, oneByOneIter keeps a list of iterators and iterates over them one by one. This can cause problems when there are many iterators. For example, 1000+ iterators may hold 1000+ connections while only one is in use at a time, which is a big waste of resources.

filterEmptyValueIter, lowerBoundReverseIter

We should think more carefully about error cases. For example, calling lowerBoundReverseIter.Next() actually calls the inner iter's Next function. If the inner iter returns an error, what should the wrapper's behavior be afterwards?
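One possible answer, sketched under assumptions: after an inner error the wrapper turns invalid, and the inner iter stays open until the wrapper's own Close() is called. The `Iterator` interface, `flakyIter`, `filterEmptyIter`, and `demoErrorBehavior` below are hypothetical simplifications written for illustration, not the code in scanner.go.

```go
package main

import (
	"errors"
	"fmt"
)

// Iterator is a simplified, assumed stand-in for the real kv iterator interface.
type Iterator interface {
	Valid() bool
	Value() string
	Next() error
	Close()
}

// flakyIter is a hypothetical inner iter whose failAt-th Next call fails.
type flakyIter struct {
	vals       []string
	pos, calls int
	failAt     int
	closeCalls int
}

func (f *flakyIter) Valid() bool   { return f.pos < len(f.vals) }
func (f *flakyIter) Value() string { return f.vals[f.pos] }
func (f *flakyIter) Close()        { f.closeCalls++ }
func (f *flakyIter) Next() error {
	f.calls++
	if f.calls == f.failAt {
		return errors.New("inner next failed")
	}
	f.pos++
	return nil
}

// filterEmptyIter skips empty values of its inner iter.
type filterEmptyIter struct {
	inner  Iterator
	failed bool // set once the inner iter has returned an error
}

func (w *filterEmptyIter) skipEmpty() error {
	for w.inner.Valid() && w.inner.Value() == "" {
		if err := w.inner.Next(); err != nil {
			return err
		}
	}
	return nil
}

func (w *filterEmptyIter) Valid() bool   { return !w.failed && w.inner.Valid() }
func (w *filterEmptyIter) Value() string { return w.inner.Value() }
func (w *filterEmptyIter) Close()        { w.inner.Close() }
func (w *filterEmptyIter) Next() error {
	if !w.Valid() {
		return errors.New("iterator is invalid")
	}
	err := w.inner.Next()
	if err == nil {
		err = w.skipEmpty()
	}
	if err != nil {
		// Become invalid, but leave closing the inner iter to Close().
		w.failed = true
		return err
	}
	return nil
}

// demoErrorBehavior reports the wrapper state after an inner error.
func demoErrorBehavior() (err error, valid bool, closesBefore, closesAfter int) {
	inner := &flakyIter{vals: []string{"x", "", "y"}, failAt: 2}
	w := &filterEmptyIter{inner: inner}
	err = w.Next() // the second inner Next call fails while skipping ""
	valid = w.Valid()
	closesBefore = inner.closeCalls
	w.Close()
	closesAfter = inner.closeCalls
	return
}

func main() {
	fmt.Println(demoErrorBehavior())
}
```

In this sketch the inner iter is closed exactly once, by the wrapper's Close(), even though the error happened earlier.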

What is changed and how it works?

What's Changed:

For oneByOneIter, it now keeps a list of functions that each create a new iterator, instead of a list of iterators, so iterator creation becomes lazy. An iter is closed as soon as it becomes invalid, so only one iter is alive at a time.
For filterEmptyValueIter and lowerBoundReverseIter, when the inner iter's Next() returns an error, the wrapper becomes invalid but does not close the inner iter explicitly. The inner iter is closed when the wrapper's Close() method is called. The inner iter is not closed at the time of the error in order to avoid a double close: a user may bypass the wrapper and close the inner iter directly, so closing it on error as well would close it twice.
Also added some tests for error behaviors.
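The lazy creation described for oneByOneIter can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's code: the `Iterator` interface is simplified, and `sliceIter`, `lazyOneByOneIter`, `newLazyOneByOneIter`, and `demoLazy` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Iterator is a simplified, assumed stand-in for the real kv iterator interface.
type Iterator interface {
	Valid() bool
	Key() string
	Next() error
	Close()
}

// sliceIter is a hypothetical in-memory iterator used only for this sketch.
type sliceIter struct {
	keys    []string
	pos     int
	closed  bool
	onClose func()
}

func (s *sliceIter) Valid() bool { return !s.closed && s.pos < len(s.keys) }
func (s *sliceIter) Key() string { return s.keys[s.pos] }
func (s *sliceIter) Next() error { s.pos++; return nil }
func (s *sliceIter) Close() {
	if !s.closed {
		s.closed = true
		s.onClose()
	}
}

// lazyOneByOneIter holds constructors rather than iterators, so at most one
// inner iterator is open at any time.
type lazyOneByOneIter struct {
	creators []func() (Iterator, error)
	idx      int      // index of the next creator to call
	cur      Iterator // currently open iterator, nil if none
}

func newLazyOneByOneIter(creators ...func() (Iterator, error)) (*lazyOneByOneIter, error) {
	it := &lazyOneByOneIter{creators: creators}
	if err := it.advance(); err != nil {
		return nil, err
	}
	return it, nil
}

// advance closes an exhausted iterator and opens the next non-empty one.
func (it *lazyOneByOneIter) advance() error {
	for it.cur == nil || !it.cur.Valid() {
		if it.cur != nil {
			it.cur.Close()
			it.cur = nil
		}
		if it.idx >= len(it.creators) {
			return nil
		}
		next, err := it.creators[it.idx]()
		it.idx++
		if err != nil {
			return err
		}
		it.cur = next
	}
	return nil
}

func (it *lazyOneByOneIter) Valid() bool { return it.cur != nil && it.cur.Valid() }
func (it *lazyOneByOneIter) Key() string { return it.cur.Key() }
func (it *lazyOneByOneIter) Next() error {
	if !it.Valid() {
		return errors.New("iterator is invalid")
	}
	if err := it.cur.Next(); err != nil {
		return err
	}
	return it.advance()
}
func (it *lazyOneByOneIter) Close() {
	if it.cur != nil {
		it.cur.Close()
		it.cur = nil
	}
	it.idx = len(it.creators)
}

// demoLazy collects all keys and records how many iterators were open at once.
func demoLazy() (keys []string, maxOpen int, err error) {
	open := 0
	mk := func(ks ...string) func() (Iterator, error) {
		return func() (Iterator, error) {
			open++
			if open > maxOpen {
				maxOpen = open
			}
			return &sliceIter{keys: ks, onClose: func() { open-- }}, nil
		}
	}
	it, err := newLazyOneByOneIter(mk("a", "b"), mk(), mk("c"))
	if err != nil {
		return nil, maxOpen, err
	}
	defer it.Close()
	for it.Valid() {
		keys = append(keys, it.Key())
		if err = it.Next(); err != nil {
			return keys, maxOpen, err
		}
	}
	return keys, maxOpen, nil
}

func main() {
	fmt.Println(demoLazy())
}
```

Because each creator runs only when the previous iterator is exhausted and closed, the number of simultaneously open iterators stays at one regardless of how many creators are supplied.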

Check List

Tests

Unit test

Release note

None

ti-chi-bot · 2021-09-07T07:33:57Z

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

lcwangchao · 2021-09-07T07:49:09Z

/run-check_dev_2

lcwangchao · 2021-09-07T08:11:22Z

/run-check_dev_2

disksing · 2021-09-07T09:27:13Z

Looks like it is becoming more and more complex. I'm starting to wonder if we really need oneByOneIterator. Can we just nest and use multiple unionIter?

lcwangchao · 2021-09-07T09:48:35Z

Looks like it is becoming more and more complex. I'm starting to wonder if we really need oneByOneIterator. Can we just nest and use multiple unionIter?

Can you explain it in detail? Sure, we can achieve the same functionality with unionIter by wrapping it many times. But it still requires all iterators to be created up front, which can waste resources. It would also involve too many stack jumps in one Next() call, so I think the performance would be poor.

tiancaiamao · 2021-09-07T13:57:30Z

Looks like it is becoming more and more complex. I'm starting to wonder if we really need oneByOneIterator. Can we just nest and use multiple unionIter?

This is a good question. Where is oneByOneIterator used? @lcwangchao

xhebox · 2021-09-08T05:23:40Z

Looks like it is becoming more and more complex. I'm starting to wonder if we really need oneByOneIterator. Can we just nest and use multiple unionIter?

Can you explain it in detail? Sure, we can achieve the same functionality with unionIter by wrapping it many times. But it still requires all iterators to be created up front, which can waste resources. It would also involve too many stack jumps in one Next() call, so I think the performance would be poor.

Agreed, nesting unionIter is neither intuitive nor easier to implement. I am thinking about renaming oneByOneIterator to UnionIteratorLazy or UnionIteratorEx. And unionIter could use the previous oneByOneIterator to accept an unknown number of iters, like io.MultiReader.

As for the usage, it comes from #27722, and it looks like it will be used for temporary tables (#26952). It is like an interceptor API to bypass TiKV. A more elegant solution would be to support multiple storage drivers, which is, well, basically impossible for now. That is how @lcwangchao got here, I think.

txn: refactor some iters in scanner.go

7b625ec

ti-chi-bot added the release-note-none and size/XL labels Sep 7, 2021

lcwangchao added the sig/transaction and sig/sql-infra labels Sep 7, 2021

lcwangchao requested review from disksing, tiancaiamao and djshow832 September 7, 2021 07:35

lcwangchao mentioned this pull request Sep 9, 2021

A simpler implement for snapshot custom retriever #27917

Closed

lcwangchao closed this Sep 9, 2021

lcwangchao deleted the refactor_iter branch September 9, 2021 14:38
