Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add keyonly support for seek #7419

Merged
merged 9 commits into from
Aug 30, 2018
Merged

Add keyonly support for seek #7419

merged 9 commits into from
Aug 30, 2018

Conversation

shafreeck
Copy link
Contributor

What problem does this PR solve?

Only return keys when do seek

What is changed and how it works?

Add a new Option: KeyOnly

// Before seek , set the KeyOnly option, it takes effects until you set it false in this transaction
txn.SetOption(kv.KeyOnly, true)

The TiKV RPC method has already supported this option, so there is nothing to change with the remote server

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
    txn.SetOption(kv.KeyOnly, true)
    iter, err := txn.Seek("your key")
    if err != nil {
        return err
    }
    for iter.Valid()  {
        fmt.Println(iter.Key(), iter.Value)
        if err := iter.Next(); err != nil {
          return  err
        }
    }
  • No code

@sre-bot
Copy link
Contributor

sre-bot commented Aug 16, 2018

Hi contributor, thanks for your PR.

This patch needs to be approved by someone of admins. They should reply with "/ok-to-test" to accept this PR for running test automatically.

@CLAassistant
Copy link

CLAassistant commented Aug 16, 2018

CLA assistant check
All committers have signed the CLA.

@@ -98,10 +100,6 @@ func (s *Scanner) Next() error {
s.Close()
return errors.Trace(err)
}
if len(s.Value()) == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seem we need to check keyOnly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not assume that the key is not exist if len(s.Value()) == 0, it is conflict with the concept key only. So the TiKV server would (and must )not return an empty value to indicate a non-exist key when doing seek.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, the current situation is that TiKV allows empty values, but TiDB does not because an empty value in the BufferStore indicates deletion.

Copy link
Contributor Author

@shafreeck shafreeck Aug 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, the current situation is that TiKV allows empty values

Yeah, that’s why I said an empty value does not mean that a key is non-exist. The code here is weird, I don’t know what is the original design purpose. Can any one give some information?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you cannot remove it directly here. Be aware that resolveCurrentLock() used snapshot.Get(), which returns nil when the key does not exist. You have to change the behaviour of Get if you really want to accept empty values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the keyOnly option and the CI passed. However, I think the problem is not resolved. The keyOnly option does not change the behavior of TiKV server, so is there any case that the server returns a key which does not exist?

Copy link
Contributor Author

@shafreeck shafreeck Aug 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got it that when deleting a key with prewrite succeeded and commit failed, the key will be returned by the TiKV server with an error. Now I am trying to fix it.

@shafreeck
Copy link
Contributor Author

The KeyOnly Option only takes effects on snapshot now, I think it is necessary to work on BufferStore as well. I did not find how to add this option for BufferStore gracefully, do you guys have any suggestions ?

@disksing
Copy link
Contributor

disksing commented Aug 17, 2018

em, I think BufferStore does not have to support KeyOnly. The option was invented for improving performance -- but it will not help BufferStore.

@shafreeck
Copy link
Contributor Author

@disksing Yeah, I agree, KeyOnly just makes the behavior consistent between a buffer store and a snapshot. It does not help for performance. It is acceptable for me to ignore KeyOnly Option for buffer store.

@disksing
Copy link
Contributor

@shafreeck Good point. As an alternative, I think we can make it 'undefined behavior' to call Value() when the KeyOnly option is set. Then it will be ok to return either nil or a value.

@shafreeck
Copy link
Contributor Author

@disksing Great idea !

@shafreeck
Copy link
Contributor Author

shafreeck commented Aug 21, 2018

Any further suggestions? @disksing @siddontang

@disksing
Copy link
Contributor

em, you need to get CI pass. You can use make dev to reproduce the test failure.

@@ -98,7 +100,8 @@ func (s *Scanner) Next() error {
s.Close()
return errors.Trace(err)
}
if len(s.Value()) == 0 {

if len(s.Value()) == 0 && !s.keyOnly {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said resolveCurrentLock uses snapshotGet to resolve lock. When len(s.Value()) == 0, it means the key does not exist. It has nothing to do with the keyOnly flag.

@disksing
Copy link
Contributor

It should be correct now. Could you add a test case that seeks keys with keyOnly option?

@shafreeck
Copy link
Contributor Author

@disksing Test cases added

@siddontang
Copy link
Member

PTAL @disksing

@disksing
Copy link
Contributor

LGTM.

@disksing disksing added the status/LGT1 Indicates that a PR has LGTM 1. label Aug 29, 2018
kv/kv.go Outdated
@@ -46,6 +46,8 @@ const (
// BypassLatch option tells 2PC commit to bypass latches, it would be true when the
// transaction is not conflict-retryable, for example: 'select for update', 'load data'.
BypassLatch
// KeyOnly retrieve only keys, it can be used in scan now
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add . at the end of a sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

s.Close()
return errors.Trace(err)
}
if len(s.Value()) == 0 {

if resolved && len(s.Value()) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If resolved is false from https://github.com/pingcap/tidb/pull/7419/files#diff-d1ae066017b8533eea6b56eb55c4382cR125, the logic is different from the old code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is different, and it is more appropriate and clear. The value returned from a seek cannot be nil if none errors happened.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, when several conditions are combined, it's not more appropriate and clear.

resolved == false is ambiguous: is there no lock, so no need to resolve lock?
or is there a lock, and resolved fail, so it's false ?

When resolved && len(s.Value()) was written down, I have to consider four combination:

  • resolve success && key not exist
  • resolve success && key exist
  • resolve fail && key not exists
  • resolve fail && key exist

I think it's more appropriate to split the conditions, check lock first, then decide whether to resolve lock, and check key results afterwards.

Copy link
Contributor Author

@shafreeck shafreeck Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tiancaiamao
I love your ideas. It is actually not human-friendly when combines several conditions. I did that for keeping the same style as the old code.

Let me explain what I thought more in details, and then talk about how to improve it.

I said more appropriate and clear according to the old code, it is not an argument about if conditions.

The source of evil here is the method resolveCurrentLock , it is completely side effect. It is invoked without any condition and checks GetError() inside, furthermore, it does not take any arguments and return any results, but it indeed changes the current item of the iterator.

There is nothing relation between len(s.Value()) and resolveCurrentLock we can tell from the original code, but it actually has. The resolved I added indicates the relation, that's why I said more appropriate and clear. I want to focus on the KeyOnly feature in this PR, so I did not change resolveCurrentLock too much, for achieving my goal and keeping the old code style.

Now I think I made it clear, let's talk about the solution. Maybe we have two options

  1. Focus on the KeyOnly feature here and using the way I did. Maybe we can refactor resolveCurrentLock in another PR.
  2. Refactor resolveCurrentLock and do the way as your purpose. All in this PR.

These two options are all OK for me, so I am looking for your suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is completely side effect

Good catch!

I want to focus on the KeyOnly feature in this PR

Or we can reset unrelated changes

  1. Refactor resolveCurrentLock and do the way as your purpose. All in this PR.

I think option 2 is acceptable, this refactor is just a tiny change. @shafreeck

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will take another commit.

@@ -27,13 +27,14 @@ type Scanner struct {
snapshot *tikvSnapshot
batchSize int
valid bool
keyOnly bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why keyOnly is add to Scanner? it's accessible from Scanner.snapshot.keyOnly already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be omitted, thanks.

@shafreeck shafreeck force-pushed the scankeyonly branch 2 times, most recently from ad4df95 to 7a9cd60 Compare August 29, 2018 13:22
@shenli
Copy link
Member

shenli commented Aug 29, 2018

@tiancaiamao PTAL

@@ -115,18 +117,18 @@ func (s *Scanner) startTS() uint64 {
return s.snapshot.version.Ver
}

func (s *Scanner) resolveCurrentLock(bo *Backoffer) error {
func (s *Scanner) resolveCurrentLock(bo *Backoffer) (bool, error) {
current := s.cache[s.idx]
if current.GetError() == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please move this if branch out the resolveCurrentLock function, to the caller?
Then the resolved return value can be removed.

current := s.cache[s.idx]
if current.GetError() == nil {
      ... // no lock, no need to resolve lock
} else {
      err := resolveCurrentLock()
}

s.Close()
return errors.Trace(err)
}
if len(s.Value()) == 0 {

if resolved && len(s.Value()) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, when several conditions are combined, it's not more appropriate and clear.

resolved == false is ambiguous: is there no lock, so no need to resolve lock?
or is there a lock, and resolved fail, so it's false ?

When resolved && len(s.Value()) was written down, I have to consider four combination:

  • resolve success && key not exist
  • resolve success && key exist
  • resolve fail && key not exists
  • resolve fail && key exist

I think it's more appropriate to split the conditions, check lock first, then decide whether to resolve lock, and check key results afterwards.

@shafreeck
Copy link
Contributor Author

@tiancaiamao PTAL?

@tiancaiamao
Copy link
Contributor

LGTM

@tiancaiamao
Copy link
Contributor

/run-all-tests

@tiancaiamao tiancaiamao added status/LGT2 Indicates that a PR has LGTM 2. component/tikv and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 30, 2018
@ngaut ngaut merged commit b4fdaf3 into pingcap:master Aug 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/tikv status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants