
More powerful caching mechanics #157

icook opened this issue Nov 3, 2014 · 3 comments

@icook
Member

icook commented Nov 3, 2014

It's been apparent to me for a while that we need more refined caching mechanisms. I believe we can boil down our caching needs into three main categories:

  1. Data that is expensive to compute, but rarely (if ever) needs to be recalculated. This could be (and in some cases is) stored in Postgres, but that almost always gives us headaches down the road. Ideally we would have a disk-backed cache like SSDB (a Redis-compatible datastore) that this type of data is stored in. An example is calculating the profitability of a block.
  2. Data that needs to be refreshed on an interval and always be present, and is too expensive to compute at request time. Examples include pool statistics. This is handled decently well by Flask-Cache and scheduled tasks (sketched below); some decorators could be built to reduce boilerplate, but that's minor.
  3. Data that can be computed at request time, but gets cached for a short period of time to improve repeat visits. Handled perfectly by Flask-Cache.

So really we're lacking a good solution for case 1. Is it worth implementing this for profitability?
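
For reference, roughly what categories 2 and 3 look like with Flask-Cache as it stands. This is just a sketch; the route and the stat functions are made-up names, not our actual code:

```python
from flask import Flask, render_template
from flask.ext.cache import Cache

app = Flask(__name__)
cache = Cache(app, config={'CACHE_TYPE': 'redis'})

def compute_pool_stats():
    """Placeholder for the expensive aggregation over recent credits."""
    return {}

# Category 3: computed at request time, cached briefly for repeat visits.
@app.route('/stats')
@cache.cached(timeout=60)
def stats_page():
    # the body runs at most once per minute; repeat visits hit the cache
    return render_template('stats.html', stats=compute_pool_stats())

# Category 2: recomputed by a scheduled task, always present. Flask-Cache
# wants an expiry, so use one comfortably longer than the run interval.
def refresh_pool_stats():
    cache.set('pool_stats', compute_pool_stats(), timeout=6 * 60 * 60)
```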

@ericecook
Member

Probably not worth trying to set this up for profit stats.

I'm not sure I see the benefit of using a separate DB as a cache, and I'd rather not add dependencies unnecessarily. The only real annoyance I can recall us having with data that falls into 1 is the requirement to clean it up later on (like with block credits), which is pretty minor.

@icook
Member Author

icook commented Nov 3, 2014

The benefit of a separate datastore to hold type #1 is that the data could get quite large but isn't fetched that often, so Redis is probably not the best choice, since it holds the whole dataset in memory. For now it would be trivial to use the Redis cache and switch later if it ever becomes an issue.

The main reason I'm discussing this is that you mentioned generating the data took a significant amount of CPU, and if it has an expiry set (required by Flask-Cache) then we will be recomputing it regularly.
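
Since SSDB speaks the Redis wire protocol, the switch really would just be repointing the client at a different host. Something like this sketch, where the key layout and the compute callback are illustrative only:

```python
import json
import redis

# Today: plain Redis. Later: the host/port of an SSDB instance, same code.
archive = redis.StrictRedis(host='localhost', port=6379)

def block_profitability(block_hash, compute):
    key = 'block_profit:' + block_hash
    cached = archive.get(key)
    if cached is not None:
        return json.loads(cached)
    value = compute(block_hash)          # expensive, but computed once
    archive.set(key, json.dumps(value))  # no expiry; effectively permanent
    return value
```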

@ericecook
Member

I guess I'm looking for an example of where something like SSDB might be better than just using Postgres. Our use cases for redis/flask-cache seem pretty straightforward imo.

In the branch I made, profit data is cached in Redis keys, and it's a pretty small number of keys, so no worries there. When profit is computed it loops through ALL the credits of the last X hours, which is expensive. This is probably OK in the short term if we recompute it every couple hours or w/e, but I'm not sure of the best way to handle it moving forward, as it's clearly sub-optimal.

One easy/straightforward way to minimize the performance hit is caching some of the data needed from the Credits. It's pretty trivial to add a column to ChainPayout in Postgres that tracks the total BTC amount received, and if this column is updated once the sell request for the ChainPayout's block has been completed, we no longer need to query that ChainPayout's Credits. Since the BTC received for the block/chain won't change after the request is closed, this cache would greatly reduce the number of Credits that need to be queried, and easily allow us to display profitability for the last week/month/whatever.

This optimization would probably be total trash if we want to move towards a more robust profit tracking system. I can't find the ticket for it, but at some point you suggested a table that functions similarly to the share_slice table, except for profits.
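
For concreteness, a rough sketch of the column idea above. The model, column, and relationship names here are guesses at the real schema, and it assumes the app's existing Flask-SQLAlchemy `db`:

```python
from decimal import Decimal

class ChainPayout(db.Model):          # extends the existing model
    # ... existing columns ...
    total_btc = db.Column(db.Numeric, nullable=True)  # NULL until frozen

def btc_for_payout(payout):
    if payout.total_btc is not None:  # frozen: no Credit scan needed
        return payout.total_btc
    # still open: fall back to summing the individual Credits
    return sum((c.btc_amount for c in payout.credits), Decimal('0'))

def on_sell_request_closed(payout):
    # the total can no longer change, so freeze it on the row
    payout.total_btc = btc_for_payout(payout)
    db.session.commit()
```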
