Clean up Jobqueues, minor fixes for S3 Queue #451

neilkakkar · 2021-06-01T15:10:38Z

Changes

Ensures S3 queue consumer runs.

Makes GraphileQueue adhere to JobQueueBase.

Checklist

Updated Settings section in README.md, if settings are affected
Jest tests

neilkakkar · 2021-06-01T15:17:25Z

src/main/job-queues/job-queue-base.ts

@@ -62,8 +64,10 @@ export class JobQueueBase implements JobQueue {
    resumeConsumer(): void
    // eslint-disable-next-line @typescript-eslint/require-await
    async resumeConsumer(): Promise<void> {


resumeConsumer is called on every drain event emitted by Piscina, so syncState ends up being called multiple times. That would be fine if the dequeue code were idempotent, but it isn't, which leads to race conditions.

We didn't hit this with Graphile because the runner object stays the same across multiple syncState calls.
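
One way to make repeated resume calls harmless is to let overlapping calls share a single in-flight sync. The following is a minimal sketch under assumptions, not the plugin-server code: GuardedPoller, inFlight, and runs are hypothetical names, and readState stands in for whatever dequeue logic the queue uses.

```typescript
// Hypothetical sketch: overlapping syncState() calls reuse one in-flight read.
type ReadState = () => Promise<boolean>

class GuardedPoller {
    // Number of times readState was actually started (for illustration only).
    runs = 0
    private inFlight: Promise<void> | null = null
    private readState: ReadState

    constructor(readState: ReadState) {
        this.readState = readState
    }

    syncState(): Promise<void> {
        // A sync is already running: hand back the same promise instead of starting another.
        if (this.inFlight) {
            return this.inFlight
        }
        this.runs += 1
        const clear = (): void => {
            this.inFlight = null
        }
        // Clear the guard on success and on failure so later calls can sync again.
        this.inFlight = this.readState().then(clear, (err: unknown) => {
            clear()
            throw err
        })
        return this.inFlight
    }
}
```

With a guard like this, several back-to-back syncState() calls (for example, one per drain event) start readState only once; a new read can start after the previous one settles.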

neilkakkar · 2021-06-01T15:28:38Z

src/main/job-queues/job-queue-base.ts

@@ -77,8 +81,8 @@ export class JobQueueBase implements JobQueue {
                clearTimeout(this.timeout)
            }
            // eslint-disable-next-line @typescript-eslint/await-thenable
-            const hadSomething = await this.readState()
-            this.timeout = setTimeout(() => this.syncState(), hadSomething ? 0 : this.intervalSeconds * 1000)
+            await this.readState()


Given the existing semantics, readState is supposed to finish everything required for the jobs seen so far. So it doesn't make sense to poll again immediately just because it found some jobs to process.

The idea here is that if you have 10k jobs to run, you shouldn't wait 5 sec between them (or between every 100 of them) :). So if we got a job with the last run, immediately see if there's another one. Otherwise wait a few seconds and then try again.

Ah, that makes sense... assuming there's batching involved. I haven't seen it so far (am I missing something?) - so didn't assume such a state was possible.

Will revert.
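
The trade-off discussed in this thread can be sketched in isolation. This is an illustration under assumptions, not the code in job-queue-base.ts: nextPollDelayMs and totalWaitMs are hypothetical helpers, and the 5 second interval is borrowed from the example in the comment above.

```typescript
// Delay before the next poll, mirroring the removed line in the diff:
// poll again immediately if the last read found jobs, otherwise wait a full interval.
function nextPollDelayMs(hadSomething: boolean, intervalSeconds: number): number {
    return hadSomething ? 0 : intervalSeconds * 1000
}

// Hypothetical helper: total time spent waiting across a sequence of reads.
function totalWaitMs(readResults: boolean[], intervalSeconds: number): number {
    return readResults.reduce(
        (total, hadSomething) => total + nextPollDelayMs(hadSomething, intervalSeconds),
        0
    )
}

// Three reads that found jobs, followed by one empty read, with a 5 second interval.
const adaptiveWait = totalWaitMs([true, true, true, false], 5)
// The same four reads if every poll waited a full interval.
const fixedWait = 4 * 5 * 1000
console.log({ adaptiveWait, fixedWait })
```

If each read only handles a bounded batch, the adaptive delay keeps a large backlog draining without idle gaps, and the queue falls back to the regular interval once a read comes back empty.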

src/main/job-queues/redlocked/s3-queue.ts

neilkakkar · 2021-06-01T15:29:29Z

tests/jobs.test.ts

@@ -284,5 +308,44 @@ describe('job queues', () => {
                Key: `prefix/2020-01-01/20200101-123456.123Z-deadbeef.json.gz`,
            })
        })
+
+        test('polls for new jobs', async () => {


these tests would fail after 60s (jest timeout) if there's no polling.

mariusandra

some thoughts below, still haven't run it locally

mariusandra · 2021-06-01T16:06:13Z

src/main/job-queues/job-queue-base.ts

@@ -77,8 +81,8 @@ export class JobQueueBase implements JobQueue {
                clearTimeout(this.timeout)
            }
            // eslint-disable-next-line @typescript-eslint/await-thenable
-            const hadSomething = await this.readState()
-            this.timeout = setTimeout(() => this.syncState(), hadSomething ? 0 : this.intervalSeconds * 1000)
+            await this.readState()


The idea here is that if you have 10k jobs to run, you shouldn't wait 5 sec between them (or between every 100 of them) :). So if we got a job with the last run, immediately see if there's another one. Otherwise wait a few seconds and then try again.

tests/jobs.test.ts

mariusandra

These changes make sense

neilkakkar added 2 commits June 1, 2021 15:25

fix s3 consumer, cleanup graphile queue

66b7310

tests

a610f62

neilkakkar requested a review from mariusandra June 1, 2021 15:10

neilkakkar commented Jun 1, 2021

src/main/job-queues/redlocked/s3-queue.ts (outdated)

mariusandra reviewed Jun 1, 2021

address comments

dbec6f9

neilkakkar requested a review from mariusandra June 2, 2021 10:34

mariusandra approved these changes Jun 3, 2021

neilkakkar added 2 commits June 3, 2021 13:39

Merge branch 'master' into jobqueues

c0a1c54

Merge branch 'master' into jobqueues

f71cfb5

neilkakkar merged commit b499825 into master Jun 4, 2021

neilkakkar deleted the jobqueues branch June 4, 2021 09:49

posthog-bot mentioned this pull request Jun 4, 2021

Update plugin server to 0.21.15 PostHog/posthog#4599

Merged

fuziontech pushed a commit to PostHog/posthog that referenced this pull request Oct 12, 2021

[plugin-server] Clean up Jobqueues, minor fixes for S3 Queue (PostHog…

c6eb272

…/plugin-server#451) * fix s3 consumer, cleanup graphile queue * tests * address comments
