Cached Size Implementation #7387

vmg · 2021-01-26T17:50:21Z

Description

Here's a hard problem in a garbage collected language like Go: we need to know the exact size that an object occupies in memory, so we can perform accurate memory tracking for our caches. This issue in the official Go repo provides many examples of why this is exceedingly complicated, and as evidenced by the fact that the issue has been closed by the maintainers, it doesn't seem like the authors of Go are looking to implement a way to reliably do this in the language runtime.

It is then our time to get dirty. After many discussions with @systay, we agreed on an approach that I've implemented in this PR: a code generator, very similar to the one we use to automatically generate the walker code for our AST, but that instead generates a helper method for all the structures that can possibly fit in cache.

This PR is composed of two commits: the first one checks in the code generator and the second one checks in all the generated code.

My goals, in order of importance, are as follows:

Getting feedback on the structure and correctness of the generated code
Getting feedback on the design of the codegenerator
Merging the (for now unused) codegen & generated code so I can resume my cache implementation on top of it

Notes on the codegen

As you can see, the generated code is rather comprehensive. The output that has been checked in has been generated from running the following command:

~/src/vitess
$ go run go/tools/sizegen/sizegen.go -in ./go/vt/... -gen vitess.io/vitess/go/vt/vtgate/engine.Plan -gen vitess.io/vitess/go/vt/vttablet/tabletserver.TabletPlan

The input to the sizegen tool are the base types that are going to be stored in the cache. In our case, this is engine.Plan and tabletserver.TabletPlan. From there, the generator builds a full AST of Vitess, typechecks all our packages and dependencies, and starts traversing the type information, starting from just the Plan and TabletPlan structs, to find every single field that can be reached from our root structs and generate a CachedSize helper for each one of them.

The generated code is written separately to every package in go/vt that contains cacheable data structures, in a file called cached_size.go. This is required so we can obtain fully accurate type information even for private fields in the structs.

The generator is smart and outputs optimal code: structs that are PODs do not get helper methods because their size is known a-priori. Pointers are always checked before being dereferenced. Slices take into account the capacity of their backing arrays, and maps use implementation internals to a get a down-to-the-byte count of how much is the map implementation using in backing data structures. For all the Interfaces that are stored in our cacheable structs, we use type inference to find all the structs in the codebase that could possibly implement such interface, and generate helper methods for all of them. Lastly, the generator takes special care to distinguish between fields that are embedded in structs and fields that are pointed at, to ensure that non-allocated data is not double counted.

The generator is also fully reproducible (particularly regarding to order), so that successive runs of the codegen always output the same files unless the underlying code has changed.

As far as I can tell, there are no outstanding issues or missing features on the codegen, but there may be correctness bugs, since there is a significant amount of complexity in the data structures we're storing on cache. I look forward to your feedback.

Related Issue(s)

#7304 is the WIP PR for the new cache implementation, where I explain in the detail the reason why accurate memory usage assessments for the cache are important.

Checklist

Should this PR be backported?
Tests were added or are not required
Documentation was added or is not required

Deployment Notes

Impacted Areas in Vitess

Components that this PR will affect:

cc @systay -- would you mind CCing whoever else you think would be interested on this?

go/tools/sizegen/sizegen.go

harshit-gangal · 2021-01-28T18:01:46Z

@vmg you would need to fix the DCO
https://github.com/vitessio/vitess/pull/7387/checks?check_run_id=1785914590

Signed-off-by: Vicent Marti <vmg@strn.cat>

These two features are now handled exceptionally: - Generated files that don't attempt to perform interface casts no longer have a local interface definition for `cachedObject` - Generated functions that use `unsafe.Pointer` for hashmap size calculations are now properly flagged with a pragma to prevent a false positive in the compiler `ptrcheck` pass. - Abstracted the out filesystem to make things easier to test Signed-off-by: Vicent Marti <vmg@strn.cat> Signed-off-by: Andres Taylor <andres@planetscale.com> Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Vicent Marti <vmg@strn.cat>

vmg requested review from harshit-gangal, shlomi-noach, sougou and systay as code owners January 26, 2021 17:50

systay reviewed Jan 26, 2021

View reviewed changes

go/tools/sizegen/sizegen.go Show resolved Hide resolved

vmg force-pushed the sizegen-tool branch 2 times, most recently from 993721d to 7ac2a45 Compare January 27, 2021 15:37

systay force-pushed the sizegen-tool branch from 7ac2a45 to 3872537 Compare January 28, 2021 14:12

vmg and others added 5 commits January 29, 2021 12:08

vt: implement 'sizegen' helper

3b793b1

Signed-off-by: Vicent Marti <vmg@strn.cat>

go/vt: check in autogenerated sources from sizegen

263d577

Signed-off-by: Vicent Marti <vmg@strn.cat>

sizegen: implement a 'verify' flag

f362f44

Signed-off-by: Vicent Marti <vmg@strn.cat>

sizegen: implement unit tests

39a6791

Signed-off-by: Vicent Marti <vmg@strn.cat>

vmg force-pushed the sizegen-tool branch from 2ebf215 to 39a6791 Compare January 29, 2021 11:08

sizegen: more interesting integration tests

048dc81

Signed-off-by: Vicent Marti <vmg@strn.cat>

systay approved these changes Feb 1, 2021

View reviewed changes

systay merged commit f44849d into vitessio:master Feb 1, 2021

vmg mentioned this pull request Feb 3, 2021

LFU Cache Implementation #7439

Merged

8 tasks

askdba added the Component: Query Serving label Feb 8, 2021

askdba added this to the v10.0 milestone Feb 8, 2021

deepthi mentioned this pull request Mar 12, 2021

Performance Improvements #7674

Open

ajm188 mentioned this pull request Jul 13, 2021

slack vitess v10.pre tinyspeck/vitess#228

Merged

brendar mentioned this pull request Jun 14, 2023

Generate CachedSize function for Placement vindex Shopify/vitess#102

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cached Size Implementation #7387

Cached Size Implementation #7387

vmg commented Jan 26, 2021

harshit-gangal commented Jan 28, 2021

Cached Size Implementation #7387

Cached Size Implementation #7387

Conversation

vmg commented Jan 26, 2021

Description

Notes on the codegen

Related Issue(s)

Checklist

Deployment Notes

Impacted Areas in Vitess

harshit-gangal commented Jan 28, 2021