Out of memory when dealing with large collections #529

Closed
joepio opened this issue Oct 27, 2022 · 0 comments

joepio commented Oct 27, 2022

The .tpf function currently stores all matching Atoms in memory. For large collections, this leads to out-of-memory (OOM) crashes.

Some thoughts on this:

Rust OOM tools

Currently, we don't get any useful errors in the log. We can't get a stack trace, because there's no unwind. This makes debugging OOM issues hard.

This may also have something to do with Linux overcommitting memory.

  • The RFC for try_reserve may help prevent panics and the OS killing Atomic-Server.
  • oom=panic might give prettier error messages, but it's not implemented in stable Rust yet.
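
To make the try_reserve idea concrete, here is a minimal sketch of collecting results with fallible allocation. The Atom struct and the collect_atoms function are hypothetical stand-ins, not the actual Atomic-Server code. Vec::try_reserve and TryReserveError are part of the standard library.

```rust
use std::collections::TryReserveError;

/// Hypothetical stand-in for an Atom; not the real Atomic-Server type.
#[derive(Debug, Clone, PartialEq)]
struct Atom {
    subject: String,
    property: String,
    value: String,
}

/// Collects atoms into a Vec, but reports a failed allocation as an error
/// instead of aborting the process.
fn collect_atoms<I>(atoms: I, expected: usize) -> Result<Vec<Atom>, TryReserveError>
where
    I: IntoIterator<Item = Atom>,
{
    let mut out: Vec<Atom> = Vec::new();
    // Ask for the expected capacity up front; Err if the allocator refuses.
    out.try_reserve(expected)?;
    for atom in atoms {
        // Make sure there is room for one more element before pushing,
        // so that `push` itself never has to grow the buffer.
        out.try_reserve(1)?;
        out.push(atom);
    }
    Ok(out)
}

fn main() {
    let atoms = (0..3).map(|i| Atom {
        subject: format!("https://example.com/commits/{}", i),
        property: "isA".to_string(),
        value: "Commit".to_string(),
    });
    match collect_atoms(atoms, 3) {
        Ok(found) => println!("collected {} atoms", found.len()),
        Err(e) => eprintln!("could not allocate memory for results: {}", e),
    }

    // A request that cannot possibly be satisfied fails with an error
    // that we can log, rather than killing the process.
    let mut huge: Vec<u8> = Vec::new();
    if let Err(e) = huge.try_reserve(usize::MAX) {
        eprintln!("allocation refused: {}", e);
    }
}
```

One caveat: when Linux overcommits memory, the allocator can report success and the kernel may still kill the process later, when the pages are actually written. So this is likely only a partial fix.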

Index all the TPF queries

Let's go over the types of TPF queries we use, and how we can index these:

  • Queries with a known subject are not relevant here.
  • By far the most common queries have a known property and value.
  • Queries with a known property probably need a property-value-subject index, which we don't have yet. Such an index would also help us build really performant queries for new, unindexed query filters.
  • Queries with only a known value are already covered by the reference_index.
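
A property-value-subject index could look roughly like the sketch below. It uses a BTreeSet from the standard library as a stand-in for an ordered key-value store. The PropValSubIndex type, its methods, and the NUL separator are assumptions made for illustration, not existing Atomic-Server code.

```rust
use std::collections::BTreeSet;

/// Separator between key parts; assumed never to occur in properties or values.
const SEP: char = '\u{0}';

/// Hypothetical index with keys ordered as property, value, subject.
#[derive(Default)]
struct PropValSubIndex {
    keys: BTreeSet<String>,
}

impl PropValSubIndex {
    fn insert(&mut self, property: &str, value: &str, subject: &str) {
        self.keys
            .insert(format!("{}{}{}{}{}", property, SEP, value, SEP, subject));
    }

    /// Lazily yields the subjects that have `property` set to `value`.
    /// Only the matching key range is visited, in sorted order.
    fn subjects<'a>(
        &'a self,
        property: &str,
        value: &str,
    ) -> impl Iterator<Item = &'a str> + 'a {
        let prefix = format!("{}{}{}{}", property, SEP, value, SEP);
        let prefix_len = prefix.len();
        self.keys
            // Start the scan at the first key that could have this prefix.
            .range(prefix.clone()..)
            // Stop as soon as a key no longer shares the prefix.
            .take_while(move |key| key.starts_with(&prefix))
            // The subject is whatever follows the prefix.
            .map(move |key| &key[prefix_len..])
    }
}

fn main() {
    let mut index = PropValSubIndex::default();
    index.insert("isA", "Commit", "https://example.com/commits/1");
    index.insert("isA", "Commit", "https://example.com/commits/2");
    index.insert("isA", "Agent", "https://example.com/agents/1");

    // Only the first page of subjects is ever read from the index.
    for subject in index.subjects("isA", "Commit").take(1) {
        println!("{}", subject);
    }
}
```

Because the keys are sorted, a query with a known property and value becomes a prefix scan, and the caller can stop after one page without touching the rest of the collection.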

How I found the issue


[Screenshot taken in Firefox, 2022-10-27]

The problem is that the WebSocket requests get no response.

Sometimes (but not always) the WebSocket connection seems to fail:

The connection to wss://atomicdata.dev/ws was interrupted while the page was loading. [websockets.js:23:19](https://atomicdata.dev/lib/dist/src/websockets.js)
websocket error: 
error { target: WebSocket, isTrusted: true, srcElement: WebSocket, currentTarget: WebSocket, eventPhase: 2, bubbles: false, cancelable: false, returnValue: true, defaultPrevented: false, composed: false, … }
[bugsnag.js:2579:15](https://atomicdata.dev/node_modules/.pnpm/@bugsnag+browser@7.16.5/node_modules/@bugsnag/browser/dist/bugsnag.js)

On the server, I see this every time:

Oct 29 10:50:49 vultr.guest atomic-server[2965299]: Visit https://atomicdata.dev
Oct 29 10:50:49 vultr.guest atomic-server[2965299]: 2022-10-29T10:50:49.596753Z  INFO actix_server::builder: Starting 1 workers
Oct 29 10:50:49 vultr.guest atomic-server[2965299]: 2022-10-29T10:50:49.596978Z  INFO actix_server::server: Actix runtime found; starting in Actix runtime
Oct 29 10:51:13 vultr.guest systemd[1]: atomic.service: Main process exited, code=killed, status=9/KILL
Oct 29 10:51:13 vultr.guest systemd[1]: atomic.service: Failed with result 'signal'.
Oct 29 10:51:14 vultr.guest systemd[1]: atomic.service: Scheduled restart job, restart counter is at 27.
Oct 29 10:51:14 vultr.guest systemd[1]: Stopped Atomic-Server.
Oct 29 10:51:14 vultr.guest systemd[1]: Started Atomic-Server.

What killed our process?

dmesg -T | grep -E -i -B100 'killed process'

oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/atomic.service,task=atomic-server,pid=2965353,uid=0
[Sat Oct 29 10:51:59 2022] Out of memory: Killed process 2965353 (atomic-server) total-vm:891908kB, anon-rss:278920kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:776kB oom_score_adj:0

An out of memory issue...

Since we can correctly see most of the Collections, but not all, I think one specific collection is causing this.

After checking them one by one, the culprit seems to be /commits. Makes sense, it is by far the largest collection!

I think the problem is that .tpf does not return an iterator, so it has to build the full list of results in memory.
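
To illustrate what an iterable query could look like, here is a minimal sketch. The Atom struct, tpf_iter, and page are hypothetical names, and a slice stands in for the real store. The point is only that the filter runs lazily, so a caller who wants one page never materializes the whole collection.

```rust
/// Hypothetical stand-in for an Atom; not the real Atomic-Server type.
#[derive(Debug, Clone, PartialEq)]
struct Atom {
    subject: String,
    property: String,
    value: String,
}

/// Hypothetical lazy variant of a TPF query. `None` means "match anything".
/// Nothing is filtered or allocated until the caller pulls items.
fn tpf_iter<'a>(
    store: &'a [Atom],
    property: Option<&'a str>,
    value: Option<&'a str>,
) -> impl Iterator<Item = &'a Atom> + 'a {
    store.iter().filter(move |atom| {
        property.map_or(true, |p| atom.property == p)
            && value.map_or(true, |v| atom.value == v)
    })
}

/// Takes a single page from any atom iterator; only `page_size` references
/// are ever held in memory at once.
fn page<'a>(
    items: impl Iterator<Item = &'a Atom>,
    offset: usize,
    page_size: usize,
) -> Vec<&'a Atom> {
    items.skip(offset).take(page_size).collect()
}

fn main() {
    // Stand-in data; a real store would read these from disk.
    let store: Vec<Atom> = (0..1000)
        .map(|i| Atom {
            subject: format!("https://example.com/commits/{}", i),
            property: "isA".to_string(),
            value: "Commit".to_string(),
        })
        .collect();

    let first_page = page(tpf_iter(&store, Some("isA"), Some("Commit")), 0, 30);
    println!("first page holds {} atoms", first_page.len());
}
```

With skip and take, a large collection such as /commits is still scanned up to the requested offset, so combining this with an index that supports range scans would likely be needed for deep pages.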

joepio transferred this issue from atomicdata-dev/atomic-data-browser Oct 29, 2022
joepio changed the title from "Collections stuck on loading" to "Out of memory when dealing with large collections" Oct 29, 2022
joepio added a commit that referenced this issue Oct 31, 2022
joepio added a commit that referenced this issue Nov 2, 2022
joepio closed this as completed Nov 2, 2022
joepio mentioned this issue Dec 8, 2022