Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

semantics: Use a BitSet #11819

Merged
merged 7 commits into from
Nov 28, 2022
Merged

semantics: Use a BitSet #11819

merged 7 commits into from
Nov 28, 2022

Conversation

vmg
Copy link
Collaborator

@vmg vmg commented Nov 25, 2022

Following up on #11817 -- the bug was caused because our current implementation of TableSet is not strictly comparable with equality (i.e. ==).

This PR fixes the issue in depth by bringing in a new implementation of an immutable bitset that is safe to compare with equality and to be used as a key in a hashmap (something we also do quite often).

Fixes #11805

cc @systay

Signed-off-by: Andres Taylor <andres@planetscale.com>
@vitess-bot
Copy link
Contributor

vitess-bot bot commented Nov 25, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

Signed-off-by: Vicent Marti <vmg@strn.cat>
@vmg
Copy link
Collaborator Author

vmg commented Nov 25, 2022

There's a new commit that fixes some of the superluous renaming but it's not showing up in the web UI. I think GitHub is having issues again.

vmg and others added 3 commits November 25, 2022 16:10
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Andres Taylor <andres@planetscale.com>
@arthurschreiber

This comment was marked as resolved.

Signed-off-by: Andres Taylor <andres@planetscale.com>
@harshit-gangal
Copy link
Member

@vmg could you add more documentation to the bitset implementation about how it works? Also, any place where I can read more about it?

@arthurschreiber
Copy link
Contributor

Sorry for the nitpicking, but I think the filename is still incorrect - it's currently tablet_set.go instead of table_set.go. 🙈

if len(words) == 0 {
return ""
}
return *(*Bitset)(unsafe.Pointer(&words))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this get the memory address from go runtime?

Comment on lines +218 to +222
const singleton = "\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00\x08\x00\x00\x00\x10\x00\x00\x00\x20\x00\x00\x00\x40\x00\x00\x00\x80"

// Single returns a new Bitset where only the given bit is set.
// If the given bit is less than 32, Single does not allocate to create a new Bitset.
func Single(bit int) Bitset {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the limit to the number of tableSet that can exist now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no limit! Any bitset larger than 32 will allocate, but you can make it as large as you want.

Comment on lines 126 to 129
// TableSetFromIds returns TableSet for all the id passed in argument.
func TableSetFromIds(tids ...int) (ts TableSet) {
return TableSet(bitset.Build(tids...))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is not used in the codebase which will also imply that Build method is not required in bitset

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We intend to use this method in some PlanetScale-internal projects!

Copy link
Member

@harshit-gangal harshit-gangal Nov 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest having at least a test for it so that if anyone uses/modifies it here, does not break the usage.

@harshit-gangal
Copy link
Member

@vmg could you add more documentation to the bitset implementation about how it works? Also, any place where I can read more about it?

it was nice reading some math-intensive code.

Comment on lines 186 to 197
func (bs Bitset) ForEach(yield func(int)) {
for i := 0; i < len(bs); i++ {
bitset := bs[i]
for bitset != 0 {
t := bitset & -bitset
r := bits.TrailingZeros8(bitset)
yield(i*bitsetWidth + r)
bitset ^= t
}
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an example of how this works will help. A general example for overall implementation will also work.

Signed-off-by: Vicent Marti <vmg@strn.cat>
@vmg
Copy link
Collaborator Author

vmg commented Nov 28, 2022

Added comments on the tricky parts of the implementation 👌

@vmg vmg merged commit 9a9bd43 into vitessio:main Nov 28, 2022
@vmg vmg deleted the vmg/bitset branch November 28, 2022 10:37
@vitess-bot
Copy link
Contributor

vitess-bot bot commented Nov 28, 2022

I was unable to backport this Pull Request to the following branches: release-15.0.

GuptaManan100 pushed a commit to planetscale/vitess that referenced this pull request Nov 29, 2022
* make sure planner works well with more than 64 tables

Signed-off-by: Andres Taylor <andres@planetscale.com>

* semantics: use an immutable bitset

Signed-off-by: Vicent Marti <vmg@strn.cat>

* bitset: add documentation

Signed-off-by: Vicent Marti <vmg@strn.cat>

* bitset: add license

Signed-off-by: Vicent Marti <vmg@strn.cat>

* use IsEmpty and NonEmpty more

Signed-off-by: Andres Taylor <andres@planetscale.com>

* rename file

Signed-off-by: Andres Taylor <andres@planetscale.com>

* bitset: better documentation

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
GuptaManan100 added a commit that referenced this pull request Nov 29, 2022
* make sure planner works well with more than 64 tables

Signed-off-by: Andres Taylor <andres@planetscale.com>

* semantics: use an immutable bitset

Signed-off-by: Vicent Marti <vmg@strn.cat>

* bitset: add documentation

Signed-off-by: Vicent Marti <vmg@strn.cat>

* bitset: add license

Signed-off-by: Vicent Marti <vmg@strn.cat>

* use IsEmpty and NonEmpty more

Signed-off-by: Andres Taylor <andres@planetscale.com>

* rename file

Signed-off-by: Andres Taylor <andres@planetscale.com>

* bitset: better documentation

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Vicent Martí <42793+vmg@users.noreply.github.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>
dbussink added a commit that referenced this pull request Jan 30, 2023
* semantics: Use a BitSet (#11819) (#11837)

* make sure planner works well with more than 64 tables

Signed-off-by: Andres Taylor <andres@planetscale.com>

* semantics: use an immutable bitset

Signed-off-by: Vicent Marti <vmg@strn.cat>

* bitset: add documentation

Signed-off-by: Vicent Marti <vmg@strn.cat>

* bitset: add license

Signed-off-by: Vicent Marti <vmg@strn.cat>

* use IsEmpty and NonEmpty more

Signed-off-by: Andres Taylor <andres@planetscale.com>

* rename file

Signed-off-by: Andres Taylor <andres@planetscale.com>

* bitset: better documentation

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Vicent Martí <42793+vmg@users.noreply.github.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>

* boostplan: use the new TableSet API

Signed-off-by: Vicent Marti <vmg@strn.cat>

* [planner] Better AST equality (#11867) (#1387)

* [planner] Better AST equality (#11867)

* make it possible to do deep comparisons with custom logic

Signed-off-by: Andres Taylor <andres@planetscale.com>

* use the new EqualS function to do deep semantic equality

Signed-off-by: Andres Taylor <andres@planetscale.com>

* re gen the asthelpergen integration test data

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* update equal gen comment

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* test: add unit tests showing the new EqualsExpr functionality

Signed-off-by: Andres Taylor <andres@planetscale.com>

* [planner] use the semantic equality in more places

Signed-off-by: Andres Taylor <andres@planetscale.com>

* comments - update to reflect new code

Signed-off-by: Andres Taylor <andres@planetscale.com>

* small fixes around tests

Signed-off-by: Andres Taylor <andres@planetscale.com>

* update integration test build

Signed-off-by: Andres Taylor <andres@planetscale.com>

* use semantic equality for Anding together expressions

Signed-off-by: Andres Taylor <andres@planetscale.com>

* test: simplify test

Signed-off-by: Andres Taylor <andres@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* boostplan: use new AST equality helpers

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Vicent Marti <vmg@strn.cat>

* Fix deprecated ioutil usage in custom backups (#1388)

These are our internal implementations for backup so they won't get
updated upstream to fix this deprecation, so we have to do it here.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>

* Experiment boost queries on Gen4 (#1373)

* watcher: add support for experiments

Signed-off-by: Vicent Marti <vmg@strn.cat>

* more logic around comparing results

Co-authored-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* add engine primitive that can compare boost and gen4 results

Signed-off-by: Andres Taylor <andres@planetscale.com>

* link the gen4 vs boost comparison to the plan time in vtgate

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* refactor the utils comparison methods and add fail on error to boost experiments

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* add an enum to experiments

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* rename experiments to querycomparison

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* add topo rpc endpoint in vtboost and vtctld

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* add vtctldclient command for setquerycomparisons

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* change fakeControllerClient to implement DRPCControllerServiceClient

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* self-review

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* run TryExecute async when failure mode is set to LOG

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* run TryStreamExecute async when failure mode is set to LOG

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* Introduce a pflag file for boost

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* fix proto files

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* fix executor error

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* re gen proto

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* apply review suggestions

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* Rename query comparison to Science

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* Filter SetScience per cluster

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* Simplify the set science in tests and optimize the use of vcursor in BoostCompare

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* addition of context to BoostCompare

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* remove BackgroundVCursor

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* topo: simplify

Signed-off-by: Vicent Marti <vmg@strn.cat>

* use ctx.Background

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* gofmt

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* make proto

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* endtoend: fix test output

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Andres Taylor <andres@planetscale.com>

* boost: Go 1.19 cleanups (#1403)

* common: use Go 1.19 atomics

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Fix go.mod

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>

* add group by and original expressions (#1423)

* add group by and original expressions

Signed-off-by: Andres Taylor <andres@planetscale.com>

* incorporator: allow disabling upquery generation

Signed-off-by: Vicent Marti <vmg@strn.cat>

* integration: remove unused test

Signed-off-by: Vicent Marti <vmg@strn.cat>

* don't rewrite colnames in the original query while planning upqueries

Signed-off-by: Andres Taylor <andres@planetscale.com>

* boostplan: add missing upquery rewrite

Signed-off-by: Vicent Marti <vmg@strn.cat>

* incorporator: revert optional upqueries

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Vicent Marti <vmg@strn.cat>

* [planner] Schema information on the information_schema views (#11941)

* add info_schema information

Signed-off-by: Andres Taylor <andres@planetscale.com>

* add SchemaInformation handling for info_schema tables

Signed-off-by: Andres Taylor <andres@planetscale.com>

* fix bad test query

Signed-off-by: Andres Taylor <andres@planetscale.com>

* add support for information_schema on mysql 5.7

Signed-off-by: Andres Taylor <andres@planetscale.com>

* columns sorted just like mysql, and tests for 5.7

Signed-off-by: Andres Taylor <andres@planetscale.com>

* test: skip test that should not run

Signed-off-by: Andres Taylor <andres@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: FlorentP <35779988+frouioui@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug Report: bug: not supported pushWeightStringForDistinct on *planbuilder.limit
6 participants