Improve sync performance #2129

bugzilla-to-github · 2018-05-09T02:52:50Z

This issue was created automatically by a script.

Bug 1460348

Bug Reporter: @mathjazz
CC: @lonnen, @flodolo, github@anandthakker.net

Note:

This is not a new bug. It's been hitting us since we migrated Pontoon to Heroku and was initially tracked under bug 1214411.

--

Details:

The Heroku dyno used as a worker for the sync process often runs out of memory. We see two types of errors in the logs:

R14 - Memory quota exceeded (degraded performance):
https://devcenter.heroku.com/articles/error-codes#r14-memory-quota-exceeded

R15 - Memory quota vastly exceeded (dyno is killed, sync breaks):
https://devcenter.heroku.com/articles/error-codes#r15-memory-quota-vastly-exceeded

--

Previous attempts at fixing the problem:

To address the problem, we made several optimizations to the sync process in the past, two of which stand out:

We fixed bug 1214411 (which tracked this problem initially) by detecting which files changed in VCS and only syncing those. That stopped the aforementioned error messages from appearing constantly; they now only show up when a bigger changeset is synced.
We fixed bug 1383252 by greatly reducing the number of costly hg clone operations. That reduced the average sync time from 20 to 2 minutes and allowed us to switch from using 3 Standard-2X dynos to 1 Standard-1X (also reducing the worker dyno cost by a factor of 6).

--

Current status:

We mostly see the error when bigger changesets are processed, e.g. when we run Fluent migrations or when projects that store translations in big bilingual files are synced (e.g. SUMO, AMO, MDN).

To avoid losing the worker (and damaging the sync process), we manually upgrade the sync worker to Performance-M before we run Fluent migrations, but that makes the process more manual than it could be and doesn't scale. We don't know, for example, when new SUMO strings will land.

--

Plan:

We should investigate what the root cause of the problem is and figure out if we can fix it programmatically. A possible suspect is that the increased memory consumption is caused by the reduced number of DB queries (which are now bigger) and the extensive use of prefetching, both of which are needed for performance reasons.

The other solution is to permanently upgrade the sync worker to a more expensive Performance-M (https://www.heroku.com/pricing), which works reliably. It's also dedicated.

bugzilla-to-github · 2018-07-25T21:11:04Z

Comment Author: @mathjazz

We're no longer hitting the problem since we started using the Performance-M worker.

bugzilla-to-github · 2019-09-18T23:52:39Z

Comment Author: Anand <github@anandthakker.net>

I was just trying out a Fluent migration on my Pontoon instance, and even using a Standard 2X dyno for the worker, I got some "R14 - Memory quota exceeded (degraded performance)" errors during the sync. More interestingly: I noticed that even after the sync was complete, I'm still seeing those errors every few seconds:

2019-09-19T13:48:36.067934+00:00 heroku[worker.1]: Process running mem=1088M(106.2%)
2019-09-19T13:48:36.068061+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2019-09-19T13:48:55.889321+00:00 heroku[worker.1]: Process running mem=1088M(106.2%)
2019-09-19T13:48:55.889443+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2019-09-19T13:49:15.906849+00:00 heroku[worker.1]: Process running mem=1088M(106.2%)
2019-09-19T13:49:15.906963+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2019-09-19T13:49:35.905572+00:00 heroku[worker.1]: Process running mem=1088M(106.2%)

Which makes me think that there may be a leak somewhere.
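
For reference, the quota the dyno is being measured against can be read back out of those log lines. This is only a minimal sketch: the regular expression and the function name `implied_quota_mb` are made up for illustration, and the two sample lines are copied from the excerpt above.

```python
import re

# Two lines copied from the log excerpt in this comment.
LOG_LINES = [
    "2019-09-19T13:48:36.067934+00:00 heroku[worker.1]: Process running mem=1088M(106.2%)",
    "2019-09-19T13:48:36.068061+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)",
]

# Matches the "mem=<used>M(<percent>%)" fragment of a "Process running" line.
MEM_RE = re.compile(r"mem=(?P<used>\d+)M\((?P<pct>[\d.]+)%\)")


def implied_quota_mb(line):
    """Return the memory quota (in MB) implied by a log line, or None."""
    match = MEM_RE.search(line)
    if match is None:
        return None
    used_mb = int(match.group("used"))
    percent_of_quota = float(match.group("pct"))
    return used_mb / (percent_of_quota / 100.0)


quotas = [q for q in map(implied_quota_mb, LOG_LINES) if q is not None]
print(round(quotas[0]))  # 1024
```

So the worker is sitting roughly 64 MB above a 1024 MB quota, which matches the Standard-2X dyno size, and the reported usage does not drop once the sync is done.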

mathjazz · 2024-04-04T08:41:24Z

When syncing large projects with project configuration files (e.g. Mozilla.org), we sometimes need to upgrade Heroku to Performance-L dynos for the task to complete.

mathjazz · 2024-09-24T10:56:58Z

See https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.iterator

For a QuerySet which returns a large number of objects that you only need to access once, this can result in better performance and a significant reduction in memory.
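
To illustrate the idea outside of Django, here is a rough sketch using only the standard library's sqlite3 module. The `entity` table and the function names are hypothetical and are not taken from Pontoon's code. The first function loads every row into a list before processing, the second walks the same result in fixed-size chunks, which is approximately what `QuerySet.iterator(chunk_size=...)` would do for a queryset.

```python
import sqlite3


def make_db(row_count):
    """Build an in-memory database with a hypothetical `entity` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, string TEXT)")
    conn.executemany(
        "INSERT INTO entity (string) VALUES (?)",
        ((f"string {i}",) for i in range(row_count)),
    )
    conn.commit()
    return conn


def total_length_materialized(conn):
    """Load all rows at once: peak memory grows with the size of the table."""
    rows = conn.execute("SELECT string FROM entity").fetchall()
    return sum(len(s) for (s,) in rows)


def total_length_streamed(conn, chunk_size=2000):
    """Process rows chunk by chunk: at most `chunk_size` rows are held at a time."""
    cursor = conn.execute("SELECT string FROM entity")
    total = 0
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        total += sum(len(s) for (s,) in chunk)
    return total


conn = make_db(10_000)
# Both approaches compute the same result; only the peak memory use differs.
assert total_length_materialized(conn) == total_length_streamed(conn)
```

Whether this helps in the sync code depends on whether the rows really are accessed only once, which would need to be checked per query.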

mathjazz added bug sync and removed enhancement labels Nov 5, 2021

mathjazz changed the title ~~[sync] Dyno running out of memory~~ Dyno running out of memory Nov 10, 2021

mathjazz mentioned this issue Nov 10, 2021

Error R12 in Update to Pontoon #2311

Closed

mathjazz added this to Pontoon Sync reliability Apr 4, 2024

mathjazz moved this to Done in Pontoon Sync reliability Apr 4, 2024

mathjazz moved this from Done to Performance in Pontoon Sync reliability Apr 4, 2024

mathjazz changed the title ~~Dyno running out of memory~~ Improve sync performance Apr 4, 2024

mathjazz added P2 (We want to ship it soon, possibly in the current quarter) and P3 (Default, possibly shipping in the following two quarters) and removed P4 (We expect it to be fixed someday) and P3 (Default, possibly shipping in the following two quarters) labels Apr 4, 2024

eemeli mentioned this issue Nov 27, 2024

Refactor sync #3312

Merged

eemeli closed this as completed in #3312 Dec 19, 2024
