Error needLargeMem: Out of memory - request size 65568 bytes, errno: 12
#351

When opening:

https://genome.ucsc.edu/cgi-bin/hgPhyloPlace?db=wuhCor1&phyloPlaceTree=hgPhyloPlaceData/wuhCor1/public.plusGisaid.latest.masked.pb&subtreeSize=5000&remoteFile=https%3A%2F%2Flapis.cov-spectrum.org%2Fgisaid%2Fv1%2Fsample%2Fgisaid-epi-isl%3ForderBy%3Drandom%26limit%3D1000%26dateFrom%3D2023-06-12%26dateTo%3D2023-09-11%26variantQuery%3D%255B4-of%253AS%253A455F%252CS%253A456L%252CS%253A478R%252CS%253A403K%252CS%253A486P%252CS%253A475V%252CS%253A470N%255D%26host%3DHuman%26accessKey%3D9Cb3CqmrFnVjO3XCxQLO6gUnKPd

I get

needLargeMem: Out of memory - request size 65568 bytes, errno: 12

It's around 500 sequences.
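For reference, the remoteFile parameter in that link is percent-encoded twice, so the numbers under discussion are hard to read directly. Here is a quick decoding sketch using only Python's standard library; it is just an illustration, not part of hgPhyloPlace or CoV-Spectrum:

```python
# Illustration only: decode the remoteFile parameter to see the LAPIS query
# that hgPhyloPlace is being asked to fetch.
from urllib.parse import parse_qs, urlsplit

url = ("https://genome.ucsc.edu/cgi-bin/hgPhyloPlace?db=wuhCor1"
       "&phyloPlaceTree=hgPhyloPlaceData/wuhCor1/public.plusGisaid.latest.masked.pb"
       "&subtreeSize=5000&remoteFile=https%3A%2F%2Flapis.cov-spectrum.org"
       "%2Fgisaid%2Fv1%2Fsample%2Fgisaid-epi-isl%3ForderBy%3Drandom%26limit%3D1000"
       "%26dateFrom%3D2023-06-12%26dateTo%3D2023-09-11"
       "%26variantQuery%3D%255B4-of%253AS%253A455F%252CS%253A456L%252CS%253A478R"
       "%252CS%253A403K%252CS%253A486P%252CS%253A475V%252CS%253A470N%255D"
       "%26host%3DHuman%26accessKey%3D9Cb3CqmrFnVjO3XCxQLO6gUnKPd")

# parse_qs percent-decodes once, recovering the LAPIS URL passed as remoteFile.
remote_file = parse_qs(urlsplit(url).query)["remoteFile"][0]
print(remote_file)

# Decoding the LAPIS URL's own query shows the knobs discussed in this issue.
lapis = parse_qs(urlsplit(remote_file).query)
print(lapis["limit"][0])         # '1000'
print(lapis["variantQuery"][0])  # '[4-of:S:455F,S:456L,S:478R,S:403K,S:486P,S:475V,S:470N]'
```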
Comments
Thanks for reporting, I'll take a look. It might take a while to fix (especially on the main site, since there is a one- to four-week release-cycle delay from the test site). In the meantime, if an option could be added to CoV-Spectrum to randomly downsample so that it sends UShER no more than 500 sequences (or 400 to be safe), that would help avoid the problem while still giving a good lineage overview (@chaoran-chen?).
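For concreteness, the downsampled request could look like the sketch below. The endpoint and parameter names are copied from the link in this issue rather than from any documented client, so treat them as illustrative:

```python
# Rough sketch: ask LAPIS (the API behind CoV-Spectrum) for a random subset of
# at most 400 matching sequences before handing them to UShER.
import requests  # third-party: pip install requests

params = {
    "variantQuery": "[4-of:S:455F,S:456L,S:478R,S:403K,S:486P,S:475V,S:470N]",
    "dateFrom": "2023-06-12",
    "dateTo": "2023-09-11",
    "host": "Human",
    "orderBy": "random",  # random downsample rather than the first N matches
    "limit": 400,         # stay under the ~500-sequence ceiling suggested above
    "accessKey": "<access key>",  # the GISAID instance in the report's link uses one
}
resp = requests.get(
    "https://lapis.cov-spectrum.org/gisaid/v1/sample/gisaid-epi-isl",
    params=params, timeout=60)
resp.raise_for_status()
print(resp.text[:500])  # the sequence identifiers that would be sent on to UShER
```

Compared with the link in the report, the only substantive change is limit=400 instead of 1000.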
Sure, I reduced it to 400. Please let me know if I should increase it again.
Wonderful, thanks @chaoran-chen, and as always, so fast! @corneliusroemer, does the query work better for you now?
This query from above still pulls in 1000 😜 I managed to get it to work occasionally with 1000, I think, but better not overload your server :)
It's possible that our server is getting a little overloaded. I'll look into it.
@corneliusroemer, but the cov-spectrum website now generates links with limit=400, right?
Yes it does @chaoran-chen, but I still get the error.
About a week ago we had to impose some stricter limits on the total amount of memory used by all threads of the Apache web server, because sometimes too many high-memory requests were hitting us at once and crashing the machine. That may be happening here. I just watched [Also, I could be a lot smarter about how I'm handling metadata; for SARS-CoV-2 it's enormous and I really don't need to be reading it all in. I should just read in an index, maybe try sqlite?]
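As a rough illustration of the bracketed aside (read an index instead of the whole metadata file), something along these lines would keep only the rows for the uploaded samples in memory. This is a sketch in Python rather than the CGI's own code, and the file and column names are made up for the example:

```python
# Sketch: build a one-time sqlite index of the metadata TSV, then look up only
# the handful of samples in a request instead of reading the whole file.
import csv
import sqlite3

def build_metadata_index(tsv_path: str, db_path: str) -> None:
    """One-time pass: load the metadata TSV into a sqlite table keyed by name."""
    with sqlite3.connect(db_path) as db, open(tsv_path, newline="") as fh:
        db.execute("CREATE TABLE IF NOT EXISTS metadata "
                   "(name TEXT PRIMARY KEY, date TEXT, lineage TEXT)")
        reader = csv.DictReader(fh, delimiter="\t")
        db.executemany("INSERT OR REPLACE INTO metadata VALUES (?, ?, ?)",
                       ((r["strain"], r["date"], r["pango_lineage"]) for r in reader))

def lookup(db_path: str, names: list[str]) -> dict[str, tuple[str, str]]:
    """Fetch metadata only for the ~400 requested sequences, not all of it."""
    with sqlite3.connect(db_path) as db:
        query = ("SELECT name, date, lineage FROM metadata WHERE name IN (%s)"
                 % ",".join("?" * len(names)))
        return {name: (date, lin) for name, date, lin in db.execute(query, names)}
```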
The failures keep happening stochastically even with CoV-Spectrum only exporting 400 sequences now. It seems like the overall memory is sometimes tight: when it happens, it happens to a lot of requests at once (I sometimes send 4 in parallel).
Sorry, but with our new restrictions on total memory use, sending four requests in parallel might be a bit much... maybe back off to 2? If you need to run lots of these, maybe I can set you up with equivalent
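Something like this on the client side would cap things at 2 requests in flight and retry with a pause when one fails. It is just a sketch, with a placeholder URL list and a naive check for the error text in the returned page:

```python
# Sketch: keep at most 2 hgPhyloPlace requests in flight, with simple backoff.
from concurrent.futures import ThreadPoolExecutor
import time

import requests  # third-party: pip install requests

def place_with_retry(url: str, tries: int = 3) -> bytes:
    """Fetch one hgPhyloPlace result, retrying with a growing pause on failure."""
    for attempt in range(1, tries + 1):
        resp = requests.get(url, timeout=900)
        if resp.ok and b"needLargeMem" not in resp.content:
            return resp.content
        time.sleep(30 * attempt)  # back off before hitting the server again
    raise RuntimeError(f"giving up on {url}")

urls: list[str] = []  # fill with the query URLs you would otherwise fire off at once
with ThreadPoolExecutor(max_workers=2) as pool:  # at most 2 concurrent requests
    results = list(pool.map(place_with_retry, urls))
```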
@AngieHinrichs If load is an issue then I can absolutely change my usage, though it comes at the cost of having to modify my established workflow. It appears that I am indeed single-handedly crashing (or rather thrashing) UShER when firing off some 5 requests in short succession. I definitely don't want to DDoS UShER, so yes, I shouldn't do that anymore now that I'm aware. If you're interested in working around this, here are some things that might be worth considering:
I'm absolutely willing to figure out how to use UShER locally. I should have already done so long ago - the reason I haven't is that until now the web server was good enough. As you know, I've used Taxonium with the trees, but I stopped using it when Taxonium's relative lack of features made it less effective, in my view, than using Nextstrain/Auspice trees via web UShER (cc @theosanderson in case you'd want some power-user feedback on what Taxonium is missing to make it as good as or better than Auspice, not only on very large trees but also on tree sizes that Auspice can handle). I would love to have a look at using
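For the local route, a placement run could look roughly like the sketch below. The flags are from my reading of the UShER documentation and should be checked against `usher --help`; the input file names are placeholders:

```python
# Sketch: run the placement locally instead of through the web CGI.
import subprocess

subprocess.run(
    [
        "usher",
        "-i", "public-latest.all.masked.pb",  # prebuilt mutation-annotated tree
        "-v", "my_samples.vcf",               # samples to place, as VCF vs. the reference
        "-k", "5000",                         # write subtrees around the placed samples
        "-d", "usher_output",                 # output directory
    ],
    check=True,
)
```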
@corneliusroemer yes, if you could remind me what the highest-priority Taxonium feature request(s) would be for you, that would be helpful. Mutation text without hovering? (Feel free to open an issue in the Taxonium repo - one issue that lists everything you want to mention would be fine.)