You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Some work to measure the size of the database when storing history is in mozilla/application-services#191. For my places.sqlite (100k places, 150k visits), it gets around 200MB larger.
It's worth noting that 'disk usage' was one of the primary concerns reported by user research for Fenix (although it's not clear if this size increase (relative to places) is the kind of thing that would make a dent relative to stuff like caches and the like -- A very informal poll of some friends of mine found that Fennec typically uses around 500MB of space (app + data), another 200MB isn't a trivial increase, but doesn't substantially change where we are in terms of app size).
Support interned keywords mozilla/mentat#32: Interning keywords. Our test schema didn't use these and I'm not sure what the actual use case for them is over :db.type/ref to a { :db/ident :my/keyword }, so this doesn't seem like a high priority.
I think something like sqlite's zipvfs extension would likely help (as the databases compress well), but have not tried it. Implementing it ourselves is likely beyond the scope of this effort (I took a look at the effort required and it wasn't exactly trivial). Additionally, whatever we do would need to somehow integrate with sqlcipher (I also took a look at bolting compression into sqlcipher before the encryption, but the fact that this makes the block output a variable size seemed to make this problematic).
Other notes:
Storing strings as fulltext and using the compress/uncompress options of FTS4 did not help, since the strings in each column are relatively small. Additionally, the performance overhead here was substantial even for a very fast compressor (LZ4).
Most string data seems to be duplicated ~4 times, in datoms, timelined_transactions, and in the indices idx_datoms_eavt, idx_datoms_aevt.
During RustConf, @rnewman suggested that ultimately mentat will likely not want to use sqlite, and instead want to read datoms chunks directly out of something like RKV. These chunks could be compressed more easily. This seems out of scope, as it would be a massive change to mentat, but is worth writing down.
Additional concerns exist around the fact that this problem may be exacerbated by materialized views (perhaps #33 will help or prevent this?)
The text was updated successfully, but these errors were encountered:
This document highlights many of the issues and concerns https://docs.google.com/document/d/14ywV4PlBAdOsJrxd7QhMcxbo7pw8cdc1kLAb7I4QhFY/edit?ts=5b7b7f87.
Some work to measure the size of the database when storing history is in mozilla/application-services#191. For my places.sqlite (100k places, 150k visits), it gets around 200MB larger.
It's worth noting that 'disk usage' was one of the primary concerns reported by user research for Fenix (although it's not clear if this size increase (relative to places) is the kind of thing that would make a dent relative to stuff like caches and the like -- A very informal poll of some friends of mine found that Fennec typically uses around 500MB of space (app + data), another 200MB isn't a trivial increase, but doesn't substantially change where we are in terms of app size).
Some bugs which may help (suggested by @rnewman):
:db.type/ref
to a{ :db/ident :my/keyword }
, so this doesn't seem like a high priority.I think something like sqlite's zipvfs extension would likely help (as the databases compress well), but have not tried it. Implementing it ourselves is likely beyond the scope of this effort (I took a look at the effort required and it wasn't exactly trivial). Additionally, whatever we do would need to somehow integrate with sqlcipher (I also took a look at bolting compression into sqlcipher before the encryption, but the fact that this makes the block output a variable size seemed to make this problematic).
Other notes:
compress
/uncompress
options of FTS4 did not help, since the strings in each column are relatively small. Additionally, the performance overhead here was substantial even for a very fast compressor (LZ4).Additional concerns exist around the fact that this problem may be exacerbated by materialized views (perhaps #33 will help or prevent this?)
The text was updated successfully, but these errors were encountered: