Out of memory when dealing with large collections #529
joepio changed the title from "Collections stuck on loading" to "Out of memory when dealing with large collections" on Oct 29, 2022
Closed
The `.tpf` function currently stores all Atoms in memory. This leads to OOM (out of memory) crashes for large collections.
Some thoughts on this:
Rust OOM tools
- Currently, we don't get any useful errors in the log. We can't get a stack trace, because there's no unwind. This makes debugging OOM issues hard.
- This may also have something to do with Linux overcommitting memory.
- `try_reserve` may help prevent panics or the OS killing Atomic-Server.
- `oom=panic` might help give prettier error messages, but it's not implemented in stable Rust yet.
Index all the TPF queries
Let's go over the types of TPF queries we use, and how we can index these:
- `subject` queries are not relevant
- `property` and `value`
- `property` queries probably need a `property-value-subject` index. We don't have that as of now. It would also help us create really performant queries for new, unindexed query filters.
- `value` queries are indexed by the `reference_index`
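To make the `property-value-subject` idea a bit more concrete, here is a minimal in-memory sketch. It is only an illustration under assumptions: `PvsIndex`, its methods, and the use of a `BTreeSet` are hypothetical and not how Atomic-Server stores its indexes. The point is that with keys sorted as `(property, value, subject)`, all subjects for a given property and value sit next to each other, so a query becomes a lazy range scan instead of a full pass over every Atom.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory stand-in for a property-value-subject index.
/// Keys are ordered as (property, value, subject).
#[derive(Default)]
struct PvsIndex {
    keys: BTreeSet<(String, String, String)>,
}

impl PvsIndex {
    /// Registers that `subject` has `property` set to `value`.
    fn insert(&mut self, property: &str, value: &str, subject: &str) {
        self.keys
            .insert((property.to_string(), value.to_string(), subject.to_string()));
    }

    /// Lazily yields every subject that has `property` set to `value`.
    /// The scan starts at the smallest possible key for that prefix
    /// (empty subject) and stops as soon as the prefix no longer matches.
    fn subjects<'a>(
        &'a self,
        property: &'a str,
        value: &'a str,
    ) -> impl Iterator<Item = &'a str> + 'a {
        let start = (property.to_string(), value.to_string(), String::new());
        self.keys
            .range::<(String, String, String), _>(start..)
            .take_while(move |(p, v, _)| p.as_str() == property && v.as_str() == value)
            .map(|(_, _, s)| s.as_str())
    }
}

fn main() {
    let mut index = PvsIndex::default();
    index.insert("isA", "Commit", "/commits/1");
    index.insert("isA", "Commit", "/commits/2");
    index.insert("isA", "Agent", "/agents/1");

    // Only the matching slice of the index is visited.
    for subject in index.subjects("isA", "Commit") {
        println!("{subject}");
    }
}
```

An on-disk key-value store with ordered keys could follow the same pattern with a prefix scan, but that is an assumption about the storage layer rather than something established in this issue.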
How I found the issue
loading...
The problem is that the WebSocket requests get no response.
Sometimes (but not always) the WebSocket connection seems to fail:
On the server, I see this every time:
What killed our process?
dmesg -T | grep -E -i -B100 'killed process'
An out of memory issue...
Since we can correctly see most of the Collections, but not all, I think it's one of the collections that is actually causing this.
After checking them one by one, the culprit seems to be `/commits`. Makes sense, it is by far the largest collection!
I think the problem has to do with `.tpf` not being iterable.