Configure JMX and Enable Hive connector storage caching #1321

Closed
rohank07 opened this issue Aug 24, 2022 · 5 comments

@rohank07
Contributor

rohank07 commented Aug 24, 2022

Configure storage caching following: https://trino.io/docs/current/connector/hive-caching.html

To verify that caching is configured correctly, set up the JMX connector so that the metrics of the caching system can be queried.
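
As a quick sanity check that the JMX connector is reachable before looking at cache metrics, something like the following could be run. This is a sketch that assumes the catalog is named `jmx` (that is, a catalog properties file containing `connector.name=jmx` exists on the cluster), which matches the `jmx.current` prefix used in the queries in this thread.

```sql
-- Assumes a catalog named "jmx" has been configured.
-- List the schemas exposed by the JMX connector.
SHOW SCHEMAS FROM jmx;

-- List the MBeans exposed as tables in the "current" schema.
SHOW TABLES FROM jmx.current;
```

If `current` shows up as a schema and the second statement returns tables, the connector itself is working and any missing metrics are more likely a caching configuration issue.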

@rohank07 rohank07 mentioned this issue Aug 24, 2022
54 tasks
@rohank07 rohank07 changed the title from "Enabled storage caching" to "Hive connector storage caching" Aug 24, 2022
@rohank07 rohank07 self-assigned this Aug 24, 2022
@rohank07 rohank07 changed the title from "Hive connector storage caching" to "Enable Hive connector storage caching" Aug 24, 2022
@rohank07
Contributor Author

rohank07 commented Jan 3, 2023

Configured caching on the Trino coordinator and got cache hits from running SELECT * queries on dummy tables. Still trying to figure out how to populate the cache on the worker pods.

@rohank07 rohank07 changed the title from "Enable Hive connector storage caching" to "Configure JMX and Enable Hive connector storage caching" Jan 4, 2023
@rohank07
Contributor Author

rohank07 commented Jan 4, 2023

SELECT * FROM jmx.current."metrics:name=rubix.bookkeeper.gauge.cache_hit_rate";
[image attachment]

@rohank07
Contributor Author

rohank07 commented Jan 4, 2023

Configuration for Hive cache:
hive.cache.enabled=true
hive.cache.location=/opt/hive-cache
hive.metastore-cache-ttl=1440s
hive.cache.disk-usage-percentage=80
hive.cache.start-server-on-coordinator=true

The hive.cache.start-server-on-coordinator property only takes effect when it is enabled on the coordinator, and it does not write anything to the cache directory on the worker pods.

@rohank07
Contributor Author

rohank07 commented Jan 4, 2023

Initial configuration is complete. However, when querying SELECT * FROM jmx.current."metrics:name=rubix.bookkeeper.gauge.cache_hit_rate";
only the cache hit rate for the unclassified catalog is updated, not the one for the lsddp catalog.

Also, when running SELECT avg(cache_hit) FROM jmx.current."rubix:catalog=unclassified,name=stats" WHERE NOT is_nan(cache_hit); the hit rate value returned is NaN.

Caching has been configured for the unclassified catalog. When caching is configured for the lsddp catalog, cache records are being written, but the cache hit rate cannot be queried through JMX.
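
One way to narrow this down might be to check which Rubix MBeans are actually registered and to look at the statistics per node instead of as a single average. The sketch below reuses the table and the `cache_hit` column from the query above; the `node` column is a standard column of JMX connector tables. Whether a matching `rubix:catalog=lsddp,...` table exists is an assumption made by analogy, not something confirmed in this thread.

```sql
-- List the Rubix-related MBeans exposed through the JMX connector.
-- If nothing appears for the lsddp catalog, its metrics are not registered.
SHOW TABLES FROM jmx.current LIKE '%rubix%';

-- Show the raw cache_hit value reported by each node (coordinator and workers)
-- for the unclassified catalog, without averaging or filtering out NaN.
SELECT node, cache_hit
FROM jmx.current."rubix:catalog=unclassified,name=stats";
```

Looking at the unaggregated rows should show whether every node reports NaN or whether only some nodes (for example, workers without a populated cache) do.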

@rohank07
Contributor Author

rohank07 commented Jan 5, 2023

Closing. Will create a separate issue for multiple catalogs.

@rohank07 rohank07 closed this as completed Jan 5, 2023