Cache size to avoid delays on big db files #125
I'm using this as a cache for meta-information from a much larger ES
database (the difference in access time is orders of magnitude). Once the
ES db is established, it's effectively read-only, but I need to generate a
number of smaller, derived databases (in redis) using different subsets of
data. These smaller dbs are generated as-needed over a longer period of
time.
When I go to use the cache, I need to ensure it's in sync with the ES
database/index (thus checking the size), but I also want to have the full
size to pass to tqdm for progress meters.
Given the size of the sqlite db, getting the length can take 4-5 minutes,
which is painful when I'm using more than one of these representing more
than one ES index.
On Wed, Dec 2, 2020 at 12:32 PM Radim Řehůřek wrote:
Interesting. Can you expand the motivation? In I use mostly read-only
dbs, so the time spent sizing them is wasted., what exactly is wasted,
how much, how does it affect you, how do you expect it will affect others?
Adding an extra meta table to help with caching is possible, but it's
additional complexity so needs to be well motivated.
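The "extra meta table" mentioned above could be as small as a single counter row kept current by triggers, which would make the length query O(1) instead of a full table scan. A minimal sketch of the idea, assuming a sqlitedict-like single key/value table (the table and column names here are illustrative, not sqlitedict's actual schema):

```python
import sqlite3

# Illustrative schema: a `kv` data table plus a `meta` table holding a
# trigger-maintained row count, so len() never has to scan `kv`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB);
CREATE TABLE meta (name TEXT PRIMARY KEY, value INTEGER);
INSERT INTO meta VALUES ('count', 0);
CREATE TRIGGER kv_ins AFTER INSERT ON kv
  BEGIN UPDATE meta SET value = value + 1 WHERE name = 'count'; END;
CREATE TRIGGER kv_del AFTER DELETE ON kv
  BEGIN UPDATE meta SET value = value - 1 WHERE name = 'count'; END;
""")

conn.execute("INSERT INTO kv VALUES ('a', x'00')")
conn.execute("INSERT INTO kv VALUES ('b', x'00')")
conn.execute("DELETE FROM kv WHERE key = 'a'")

# O(1) length lookup from the meta table instead of COUNT(*) over kv.
(count,) = conn.execute(
    "SELECT value FROM meta WHERE name = 'count'"
).fetchone()
```

Note that REPLACE-style upserts (which sqlitedict uses internally) interact with DELETE triggers in subtle ways, so a real implementation would need care there.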
ES => subset of data into sqlitedict => process data into a number of
different redis dbs
ES and sqlitedict are modified quite rarely in comparison to the redis dbs.
For the moment, I'm just using a subclass, which is fine (it also
implements a non-threaded model to improve error handling and general
performance).
I'm also using msgpack for serialization, which is much, much faster (and
smaller) than pickle.
I use mostly read-only dbs, so the time spent sizing them is wasted. Even in the case of writable dbs, caching the last known size and timestamp and comparing against the db file mtime can be of use.
Something like this: