Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch backend storage to use SQLite #10

Closed
axllent opened this issue Aug 29, 2022 · 0 comments
Closed

Switch backend storage to use SQLite #10

axllent opened this issue Aug 29, 2022 · 0 comments

Comments

@axllent
Copy link
Owner

axllent commented Aug 29, 2022

CloverDB (which uses BadgerDB) works really well for Mailpit, however it does come with some massive overheads (for just a small application), and with hindsight just isn't the right choice for Mailpit.

Two reported issues currently affected by this are #8 and #9, and is likely also related to #3.

Comparing CloverDB and SQLite (native Go port) with a persistent database (written to disk) containing 100,000 emails:

BadgerDB SQLite
Insert 100 messages 500ms 1,500ms
Memory usage (persistent data) 400-2,500MB 8-20MB
Idle CPU usage 2-3% CPU 0% CPU
Messages index load 9ms 21ms
Single word search 70ms 32ms
Three word search 3,756ms 1,264ms

Note: these are approximate & average values, tested on my laptop (SSD + Intel i7-7600U CPU @ 2.80GHz), and were using actual emails varying in size (ie: not just a simple plain text text email). The values are used simply for a comparison.

Findings

  • Data storage speeds reduce quite significantly with SQLite, although I am still able to process and store approximately 75 emails per second (versus 200 per second with CloverDB). SQLite must also use a single query at a time, to very heavy use can result in delayed responses. Both implementations use zstd to compress the raw email in the database to save space.
  • The runtime memory requirements are huge by comparison, with SQLite requiring 50-125 times less RAM than CloverDB/BadgerDB at all times.
  • BadgerDB appears to use 2-3% CPU when idle (unsure why), SQLite does not suffer from this.
  • UI response time is similar for message lists and message loading.
  • Single word searches are similar in response time, however SQLite performs much better with multi-word searches.
  • The implementation of SQLite also allows Mailpit to perform more defined searching within specific fields, currently limited to:<term>, from:<term>, subject:<term>, and also has:attachment.
  • BadgerDB does reclaim some space after deleting enough messages, however it isn't clear exactly how & when this works, and apparently requires the database to be closed and re-opened, plus some additional data written to to the database to activate the pruning. SQLite can run a periodic VACUUM to reclaim space (automatically via Mailpit when needed), which reverts the entire database to less than 40KB when all messages are deleted (Delete all).

Implementation & testing

  • Mailpit currently allows either persistent storage, or in-memory storage. Whilst SQLite also has a feature to run in-memory, the chosen approach will be to store "non-persistent" data to a temporary file instead of RAM, which is then automatically deleted on exit. This will mean that memory usage is not directly linked to the number of messages.
  • Testing on Windows / Mac needs to be done prior to merging in any code.

Breaking changes

Once implemented, the switch to SQLite will mean that any existing user data is useless, and should be manually deleted to free up space. The database formats are not interchangeable, and once I implement SQLite the original data will no longer be usable. Given the major change, the next release should likely jump a major version, ie: 1.0.0.

axllent added a commit that referenced this issue Aug 30, 2022
BREAKING CHANGE: This release includes a major backend storage change (SQLite) that will render any previously-saved messages useless. Please delete old data to free up space. For more information see #10
@axllent axllent closed this as completed Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant