refactor: cache layer #9672

Merged
merged 10 commits into from
Jan 21, 2023


dantengsky (Member) commented Jan 18, 2023

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

  • storages-common-cache

    • exposes trait CacheAccessor, which provides the essential cache operations put, get, and evict
    • defines trait StorageCache, which describes the minimum interface that all cache providers/backends should implement
      note that it is a "sync style" trait: the put and evict methods should likely be sync; not sure about get yet.
    • implements StorageCache for LruCache, and defines the LruCache instances (type level) we currently use
      also adds a dummy implementation of StorageCache for DiskCache, to test the API
    • provides commonly used in-memory cache types, instantiated from LruCache parameterized by the type of cached item, size meter, etc.
    • provides a generic CachedReader
      given an implementation of StorageCache, e.g. LruCache or DiskCache, and a proper implementation of Loader, it can be instantiated to a concrete CachedReader.
  • storages-common-table-meta::cache

    • meta_cache_manager.rs initializes all the in-memory cache objects that are currently needed
    • meta_cache.rs
      • declares the type aliases of the in-memory cache types that are currently used to cache table meta
      • provides the CachedMeta trait and impls, which
        • bind the type of meta to the type of cache
          e.g. SegmentInfo::cache() should return the SegmentInfoCache
        • require every returned Cache to implement CacheAccessor, with proper type parameters
        • also ensure the instance of Cache grabbed from CacheManager is bound to the meta type at compile time
  • common-storages-fuse::io

    • introduces ColumnDataLoader, which implements Loader to load a range of data from object storage, with the ability to customize cache keys
    • introduces io::read::BloomIndexColumnReader, which is basically CachedReader instantiated with ColumnDataLoader and LruCache; it caches index column data in memory
    • introduces the CachedMetaWriter trait, which writes data and populates the item caches
      implemented for all the types that implement CachedMeta
  • other related refactoring
    in the fuse table operation modules, replaces the implicit usage of CacheManager to populate/evict caches with usage of the CachedMeta trait.
    the way the cache is manipulated is kept as it is, e.g. a snapshot is not cached during writing, but only after it is committed successfully.
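
To illustrate how a StorageCache backend and a Loader might combine into a CachedReader, here is a minimal sketch. It is not the code from this PR: the trait signatures, the String keys, the `MapCache` backend (an unbounded map standing in for LruCache), and `CountingLoader` are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Assumed shape of the sync-style cache trait: put, get, evict.
/// `get` takes `&mut self` because an LRU backend would update recency on reads.
pub trait StorageCache<K, V> {
    fn put(&mut self, key: K, value: Arc<V>);
    fn get(&mut self, key: &K) -> Option<Arc<V>>;
    fn evict(&mut self, key: &K) -> bool;
}

/// Hypothetical toy backend: an unbounded map in place of a real LRU.
pub struct MapCache<V> {
    items: HashMap<String, Arc<V>>,
}

impl<V> MapCache<V> {
    pub fn new() -> Self {
        Self { items: HashMap::new() }
    }
}

impl<V> StorageCache<String, V> for MapCache<V> {
    fn put(&mut self, key: String, value: Arc<V>) {
        self.items.insert(key, value);
    }
    fn get(&mut self, key: &String) -> Option<Arc<V>> {
        self.items.get(key).cloned()
    }
    fn evict(&mut self, key: &String) -> bool {
        self.items.remove(key).is_some()
    }
}

/// Assumed shape of Loader: fetches the item when the cache misses.
pub trait Loader {
    type Item;
    fn load(&self, key: &str) -> Result<Self::Item, String>;
}

/// Generic reader: any Loader plus any StorageCache over the loader's item type.
pub struct CachedReader<L, C> {
    pub loader: L,
    pub cache: Arc<Mutex<C>>,
}

impl<L: Loader, C: StorageCache<String, L::Item>> CachedReader<L, C> {
    pub fn new(loader: L, cache: Arc<Mutex<C>>) -> Self {
        Self { loader, cache }
    }

    /// Returns the cached item if present; otherwise loads it and populates the cache.
    pub fn read(&self, key: &str) -> Result<Arc<L::Item>, String> {
        let owned_key = key.to_string();
        // The lock guard is released at the end of this statement.
        let hit = self.cache.lock().unwrap().get(&owned_key);
        if let Some(item) = hit {
            return Ok(item);
        }
        let item = Arc::new(self.loader.load(key)?);
        self.cache.lock().unwrap().put(owned_key, item.clone());
        Ok(item)
    }
}

/// Hypothetical loader that counts how often it is actually invoked.
#[derive(Default)]
pub struct CountingLoader {
    pub calls: AtomicUsize,
}

impl Loader for CountingLoader {
    type Item = Vec<u8>;
    fn load(&self, key: &str) -> Result<Vec<u8>, String> {
        self.calls.fetch_add(1, Ordering::SeqCst);
        Ok(key.as_bytes().to_vec())
    }
}
```

With this shape, swapping the backend (e.g. a disk-based cache) only requires another StorageCache impl; the reader itself is unchanged. Under this sketch, a second `read` of the same key is served from the cache without calling the loader again.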

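The compile-time binding between a meta type and its cache, and the write-then-populate behavior of CachedMetaWriter, could look roughly like the sketch below. Everything here is an assumption for illustration: the method signatures, the one-field `SegmentInfo`, the `encode` helper, the `OnceLock` static standing in for CacheManager, and the `HashMap` standing in for object storage.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Essential cache operations named in the summary (signatures are assumed).
pub trait CacheAccessor<K, V> {
    fn get(&self, key: &K) -> Option<Arc<V>>;
    fn put(&self, key: K, value: Arc<V>);
    fn evict(&self, key: &K) -> bool;
}

/// Hypothetical in-memory cache; an unbounded map stands in for the LRU-backed types.
pub struct InMemoryCache<V> {
    items: Mutex<HashMap<String, Arc<V>>>,
}

impl<V> InMemoryCache<V> {
    pub fn new() -> Self {
        Self { items: Mutex::new(HashMap::new()) }
    }
}

impl<V> CacheAccessor<String, V> for InMemoryCache<V> {
    fn get(&self, key: &String) -> Option<Arc<V>> {
        self.items.lock().unwrap().get(key).cloned()
    }
    fn put(&self, key: String, value: Arc<V>) {
        self.items.lock().unwrap().insert(key, value);
    }
    fn evict(&self, key: &String) -> bool {
        self.items.lock().unwrap().remove(key).is_some()
    }
}

/// Simplified stand-in; the real SegmentInfo carries far more than a row count.
pub struct SegmentInfo {
    pub row_count: u64,
}

/// Type alias in the spirit of meta_cache.rs.
pub type SegmentInfoCache = InMemoryCache<SegmentInfo>;

/// Binds a meta type to its cache type at compile time.
pub trait CachedMeta: Sized {
    type Cache: CacheAccessor<String, Self> + 'static;
    fn cache() -> &'static Self::Cache;
    /// Hypothetical serialization hook, only needed for this sketch.
    fn encode(&self) -> Vec<u8>;
}

impl CachedMeta for SegmentInfo {
    type Cache = SegmentInfoCache;
    fn cache() -> &'static SegmentInfoCache {
        // A process-wide static plays the role of CacheManager here.
        static CACHE: OnceLock<SegmentInfoCache> = OnceLock::new();
        CACHE.get_or_init(SegmentInfoCache::new)
    }
    fn encode(&self) -> Vec<u8> {
        self.row_count.to_le_bytes().to_vec()
    }
}

/// Writes the meta to storage, then populates the cache bound to its type.
pub trait CachedMetaWriter {
    fn write_meta(self, storage: &mut HashMap<String, Vec<u8>>, location: &str);
}

/// Blanket impl: every CachedMeta type gets the writer for free.
impl<T: CachedMeta> CachedMetaWriter for T {
    fn write_meta(self, storage: &mut HashMap<String, Vec<u8>>, location: &str) {
        storage.insert(location.to_string(), self.encode());
        T::cache().put(location.to_string(), Arc::new(self));
    }
}
```

Because `Cache` is an associated type constrained by `CacheAccessor<String, Self>`, putting a value of the wrong meta type into `SegmentInfo::cache()` would be rejected by the compiler rather than discovered at runtime.
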
Closes #issue


@mergify mergify bot added the pr-refactor this PR changes the code base without new features or bugfix label Jan 18, 2023
read path & column reader
@dantengsky dantengsky marked this pull request as ready for review January 19, 2023 16:00
BohuTANG (Member) commented

Cool, will review it today later.
