Use skip locked #303

Closed
wants to merge 2 commits into from


JasonHerr
Contributor

This was inspired by work done by @MSch and documented in #279. I realize you're not looking to move to PostgreSQL 9.5 (or later) quite yet, but I needed this speedup. With large numbers of very small jobs and eight or more workers, this was a massive improvement. I've tested performance in my environment, and your mileage may vary with your jobs' computational size, frequency, number of workers, and so on.

Thanks so much to @MSch for doing his comparisons and proof of concept work.

I would bump the version but am loath to do so until someone thinks this is likely to make it into master. I'll be referencing a git tag in my Gemfile until then.

@ukd1
Contributor

ukd1 commented Jul 3, 2019

👀

@ukd1 ukd1 self-assigned this Jul 3, 2019
@ukd1
Contributor

ukd1 commented Jul 3, 2019

@JasonHerr, thanks for this - it looks good, though I think I want to make it backwards compatible. If you're interested, let me know; if not, I may do it.

@JasonHerr
Contributor Author

JasonHerr commented Jul 4, 2019

@ukd1 I would not be able to do it any time soon. However, I could do it when I get some free time. My questions would be:

  • Are you suggesting something like modifying the stored proc to check server_version_num and then run the old method if it's too old?
  • Or, are you suggesting top_bound should be added back into the query but do nothing so no ruby changes are needed?

I guess I'm basically asking if you're suggesting it be backwards compatible with older PostgreSQL or just backwards compatible with QC?
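
For illustration, the first option (branching on the server version) could be sketched on the Ruby side roughly as follows. This is a minimal sketch, not code from the PR: `supports_skip_locked?` and `SKIP_LOCKED_MIN_VERSION_NUM` are hypothetical names, and the only facts relied on are that `SKIP LOCKED` first shipped in PostgreSQL 9.5 and that `SHOW server_version_num` reports 9.5.0 as `90500`.

```ruby
# Hypothetical helper, not part of queue_classic.
# SKIP LOCKED first shipped in PostgreSQL 9.5, and server_version_num
# encodes 9.5.0 as 90500 (10.0 becomes 100000, and so on).
SKIP_LOCKED_MIN_VERSION_NUM = 90_500

# Accepts the value of `SHOW server_version_num` as a String or an Integer
# and reports whether the server is new enough for SKIP LOCKED.
def supports_skip_locked?(server_version_num)
  Integer(server_version_num.to_s, 10) >= SKIP_LOCKED_MIN_VERSION_NUM
end

puts supports_skip_locked?("90603") # a 9.6.3 server
puts supports_skip_locked?(90_400)  # a 9.4.0 server
```

A caller would run the old locking path when the helper returns false and the new query otherwise.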

Thanks,
Jason

@ukd1
Contributor

ukd1 commented Jul 5, 2019

@JasonHerr - tldr, backwards compatible with QC.

I've decided that the next version of QC will support only Ruby >= 2.4 and PG > 9.6, as (a) the older versions are still around and (b) these are currently also pretty old. However, I'd like to not break installs if possible, so I think for now adding top_bound back in and then just ignoring it will make migration easier. We / I can then remove it in a later version.

@MSch

MSch commented Jul 5, 2019

@JasonHerr Thank you so much for taking this and implementing it! Really appreciate it.

One thing I noticed when looking through the PR is that since the new query is just a single SQL statement there's no need to have a lock_head function at all any more - you can just run the query directly. Previously there was an actual PL/pgSQL function, but now you're just wrapping the query.
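
To make the "single SQL statement" point concrete, here is a rough sketch of what such a dequeue query can look like when built from Ruby. It is not the query from this PR: `build_lock_query` is a hypothetical helper, and the table and column names (`queue_classic_jobs`, `q_name`, `locked_at`, `locked_by`) are assumptions about the schema. The `FOR UPDATE SKIP LOCKED` clause itself requires PostgreSQL 9.5 or later.

```ruby
# Hypothetical helper: builds one statement that picks an unlocked job,
# skips rows other workers already hold, and marks the job as locked.
# Table and column names are assumptions, not taken from the PR.
def build_lock_query(table = "queue_classic_jobs")
  # Only interpolate plain identifiers, since the name ends up inside the SQL.
  raise ArgumentError, "unsafe table name" unless table.match?(/\A[a-z_][a-z0-9_]*\z/)

  <<~SQL
    WITH selected_job AS (
      SELECT id
      FROM #{table}
      WHERE locked_at IS NULL AND q_name = $1
      ORDER BY id
      LIMIT 1
      FOR UPDATE SKIP LOCKED
    )
    UPDATE #{table}
    SET locked_at = now(), locked_by = pg_backend_pid()
    FROM selected_job
    WHERE #{table}.id = selected_job.id
    RETURNING #{table}.*
  SQL
end

puts build_lock_query
```

Because the whole operation is one statement with one bind parameter (`$1`, the queue name), it can be sent directly from the client without a wrapping database function.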

@ukd1
Contributor

ukd1 commented Jul 5, 2019

@MSch actually, great point - this would also make it backwards compatible.

@JasonHerr
Contributor Author

Yes, I missed that. I can just modify queue.rb and add the select statement there AND leave top_bound alone.
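
As a rough illustration of that idea (the statement lives in Ruby, and `top_bound` is still accepted but no longer affects locking), a sketch might look like the following. `SketchQueue` and `FakeConnection` are hypothetical stand-ins, not the real classes in queue.rb, and the table and column names are assumptions. The only real API relied on is the pg gem's `exec_params(sql, params)`, which the fake connection imitates so the example runs without a database.

```ruby
# Hypothetical stand-in for a queue class; not the real queue.rb code.
class SketchQueue
  # Single-statement dequeue; table and column names are assumptions.
  LOCK_SQL = <<~SQL
    UPDATE queue_classic_jobs
    SET locked_at = now(), locked_by = pg_backend_pid()
    WHERE id = (
      SELECT id FROM queue_classic_jobs
      WHERE locked_at IS NULL AND q_name = $1
      ORDER BY id
      LIMIT 1
      FOR UPDATE SKIP LOCKED
    )
    RETURNING *
  SQL

  attr_reader :name, :top_bound

  # top_bound stays in the signature so existing callers keep working,
  # but #lock never reads it.
  def initialize(name, top_bound = nil, conn:)
    @name = name
    @top_bound = top_bound
    @conn = conn
  end

  # Sends the statement directly and returns the locked row, or nil if none.
  def lock
    @conn.exec_params(LOCK_SQL, [@name]).to_a.first
  end
end

# Test double that records calls and returns canned rows instead of
# talking to PostgreSQL.
class FakeConnection
  attr_reader :calls

  def initialize(rows)
    @rows = rows
    @calls = []
  end

  def exec_params(sql, params)
    @calls << [sql, params]
    @rows
  end
end

conn = FakeConnection.new([{ "id" => "1", "q_name" => "default" }])
job = SketchQueue.new("default", 9, conn: conn).lock # old-style call with top_bound
puts job.inspect
```

With a real connection, the same `lock` method would receive a `PG::Result`, which also responds to `to_a`.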

@JasonHerr
Contributor Author

Got time to look at this today. This was much simpler. I haven't run it on one of my performance boxes yet, but the concept remains the same.

@ukd1
Contributor

ukd1 commented Jul 18, 2019

@JasonHerr nice, looking!

@ukd1
Contributor

ukd1 commented Jul 18, 2019

Seems good to me; benchmarked it using the queue-shootout project from #279 (also, forked for some updates I'll do --> qc org).

tldr, it's comparable to or faster than que with these changes. All the tests pass; I can't see anything wrong or not backwards compatible.

master:

queue_classic jobs per second: avg = 883.6, max = 1586.9, min = 400.1, stddev = 558.3
que jobs per second: avg = 1555.6, max = 1599.2, min = 1507.8, stddev = 32.8

b96ef4c:

queue_classic jobs per second: avg = 1879.1, max = 2072.1, min = 1779.4, stddev = 128.3
que jobs per second: avg = 1500.5, max = 1550.8, min = 1405.2, stddev = 58.1

full output --

master (59b3570) :

imaclols:queue-shootout russ$ DATABASE_URL=postgres://localhost:5432/russ bundle exec rake
/Users/russ/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/redis-3.0.6/lib/redis/client.rb:374: warning: constant ::Fixnum is deprecated
Benchmarking queue_classic, que
  QUIET = false
  ITERATIONS = 5
  DATABASE_URL = postgres://localhost:5432/russ
  JOB_COUNT = 1000
  TEST_PERIOD = 0.2
  WARMUP_PERIOD = 0.2
  SYNCHRONOUS_COMMIT = on

Iteration #1:
queue_classic: 1 => 312.9, 2 => 890.4, 3 => 1119.5, 4 => 1079.9, 5 => 1382.5, 6 => 1323.8, 7 => 1203.5, 8 => 933.6, 9 => 792.5, 10 => 784.2
queue_classic: Peaked at 5 workers with 1382.5 jobs/second
que: 1 => 323.2, 2 => 719.7, 3 => 934.1, 4 => 1157.3, 5 => 1247.0, 6 => 1420.4, 7 => 1552.0, 8 => 1401.9, 9 => 1555.2, 10 => 1599.2, 11 => 1535.5, 12 => 1166.7, 13 => 1361.1, 14 => 1516.2, 15 => 1472.9
que: Peaked at 10 workers with 1599.2 jobs/second

Iteration #2:
queue_classic: 1 => 249.3, 2 => 395.9, 3 => 594.7, 4 => 483.7, 5 => 599.7, 6 => 468.8, 7 => 449.2, 8 => 355.2, 9 => 426.1, 10 => 504.0
queue_classic: Peaked at 5 workers with 599.7 jobs/second
que: 1 => 228.5, 2 => 699.2, 3 => 855.1, 4 => 1139.7, 5 => 1123.4, 6 => 1401.7, 7 => 1310.3, 8 => 999.9, 9 => 1499.4, 10 => 1486.7, 11 => 1366.2, 12 => 1473.0, 13 => 1507.8, 14 => 1507.4, 15 => 1426.9, 16 => 1410.5, 17 => 1382.6, 18 => 1408.1
que: Peaked at 13 workers with 1507.8 jobs/second

Iteration #3:
queue_classic: 1 => 139.4, 2 => 294.6, 3 => 377.9, 4 => 449.0, 5 => 391.5, 6 => 346.3, 7 => 365.1, 8 => 339.4, 9 => 331.2
queue_classic: Peaked at 4 workers with 449.0 jobs/second
que: 1 => 304.6, 2 => 689.1, 3 => 960.1, 4 => 1121.1, 5 => 1261.4, 6 => 1346.0, 7 => 1314.4, 8 => 1074.2, 9 => 1564.0, 10 => 1515.2, 11 => 1493.8, 12 => 1452.1, 13 => 1206.1, 14 => 1347.8
que: Peaked at 9 workers with 1564.0 jobs/second

Iteration #4:
queue_classic: 1 => 146.7, 2 => 304.6, 3 => 356.2, 4 => 363.6, 5 => 400.1, 6 => 380.5, 7 => 359.0, 8 => 340.8, 9 => 358.9, 10 => 282.6
queue_classic: Peaked at 5 workers with 400.1 jobs/second
que: 1 => 336.2, 2 => 600.2, 3 => 922.0, 4 => 936.3, 5 => 1246.9, 6 => 1431.4, 7 => 1240.6, 8 => 1558.1, 9 => 1384.2, 10 => 1538.4, 11 => 1282.8, 12 => 1414.6, 13 => 1430.2
que: Peaked at 8 workers with 1558.1 jobs/second

Iteration #5:
queue_classic: 1 => 139.4, 2 => 293.2, 3 => 381.8, 4 => 720.3, 5 => 1483.5, 6 => 1586.9, 7 => 1454.9, 8 => 1025.8, 9 => 812.7, 10 => 772.3, 11 => 840.5
queue_classic: Peaked at 6 workers with 1586.9 jobs/second
que: 1 => 313.3, 2 => 730.6, 3 => 988.2, 4 => 1108.0, 5 => 1263.2, 6 => 1270.6, 7 => 1445.6, 8 => 1085.7, 9 => 1548.9, 10 => 1495.8, 11 => 1489.5, 12 => 1388.5, 13 => 1269.1, 14 => 1497.5
que: Peaked at 9 workers with 1548.9 jobs/second

queue_classic jobs per second: avg = 883.6, max = 1586.9, min = 400.1, stddev = 558.3
que jobs per second: avg = 1555.6, max = 1599.2, min = 1507.8, stddev = 32.8

Total runtime: 55.0 seconds

b96ef4c :

imaclols:queue-shootout russ$ DATABASE_URL=postgres://localhost:5432/russ bundle exec rake
/Users/russ/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/redis-3.0.6/lib/redis/client.rb:374: warning: constant ::Fixnum is deprecated
Benchmarking queue_classic, que
  QUIET = false
  ITERATIONS = 5
  DATABASE_URL = postgres://localhost:5432/russ
  JOB_COUNT = 1000
  TEST_PERIOD = 0.2
  WARMUP_PERIOD = 0.2
  SYNCHRONOUS_COMMIT = on

Iteration #1:
queue_classic: 1 => 477.5, 2 => 999.1, 3 => 1362.6, 4 => 1263.1, 5 => 1744.2, 6 => 1850.5, 7 => 1776.0, 8 => 1686.2, 9 => 2072.1, 10 => 1914.0, 11 => 1682.2, 12 => 1015.6, 13 => 1739.4, 14 => 2020.5
queue_classic: Peaked at 9 workers with 2072.1 jobs/second
que: 1 => 278.4, 2 => 560.6, 3 => 764.3, 4 => 755.8, 5 => 1239.2, 6 => 1169.6, 7 => 1186.6, 8 => 1207.4, 9 => 1399.8, 10 => 1158.0, 11 => 1301.5, 12 => 1324.1, 13 => 940.1, 14 => 1405.2, 15 => 1089.7, 16 => 1301.0, 17 => 1319.9, 18 => 1392.9, 19 => 1098.2
que: Peaked at 14 workers with 1405.2 jobs/second

Iteration #2:
queue_classic: 1 => 368.4, 2 => 769.1, 3 => 1156.7, 4 => 673.8, 5 => 1495.9, 6 => 1353.1, 7 => 1113.0, 8 => 677.5, 9 => 1604.8, 10 => 1950.6, 11 => 1445.0, 12 => 1797.7, 13 => 1832.9, 14 => 1641.1, 15 => 1637.7
queue_classic: Peaked at 10 workers with 1950.6 jobs/second
que: 1 => 308.0, 2 => 638.4, 3 => 738.3, 4 => 1053.0, 5 => 1096.8, 6 => 1318.3, 7 => 1478.8, 8 => 1515.5, 9 => 1499.2, 10 => 1236.8, 11 => 1406.6, 12 => 1297.9, 13 => 1286.9
que: Peaked at 8 workers with 1515.5 jobs/second

Iteration #3:
queue_classic: 1 => 400.7, 2 => 656.0, 3 => 1050.6, 4 => 1156.9, 5 => 1521.2, 6 => 1400.8, 7 => 1764.3, 8 => 1798.8, 9 => 1652.0, 10 => 1541.8, 11 => 1648.5, 12 => 1707.3, 13 => 1160.4
queue_classic: Peaked at 8 workers with 1798.8 jobs/second
que: 1 => 284.5, 2 => 597.7, 3 => 767.1, 4 => 1042.7, 5 => 1180.3, 6 => 1208.3, 7 => 1252.5, 8 => 1502.4, 9 => 1540.0, 10 => 1323.7, 11 => 278.9, 12 => 1239.5, 13 => 1344.5, 14 => 1331.5
que: Peaked at 9 workers with 1540.0 jobs/second

Iteration #4:
queue_classic: 1 => 377.7, 2 => 666.3, 3 => 898.2, 4 => 999.4, 5 => 1456.9, 6 => 1580.7, 7 => 1727.6, 8 => 1496.1, 9 => 1527.6, 10 => 1794.7, 11 => 1137.1, 12 => 1655.5, 13 => 1703.9, 14 => 1351.0, 15 => 1458.0
queue_classic: Peaked at 10 workers with 1794.7 jobs/second
que: 1 => 338.9, 2 => 615.1, 3 => 700.8, 4 => 1112.9, 5 => 1122.7, 6 => 1444.1, 7 => 1334.3, 8 => 1249.2, 9 => 1490.9, 10 => 1262.3, 11 => 1455.0, 12 => 1361.0, 13 => 1476.1, 14 => 1268.7
que: Peaked at 9 workers with 1490.9 jobs/second

Iteration #5:
queue_classic: 1 => 379.2, 2 => 732.4, 3 => 1004.1, 4 => 1240.8, 5 => 1253.1, 6 => 1317.5, 7 => 1626.0, 8 => 1749.4, 9 => 1779.4, 10 => 1655.5, 11 => 1755.2, 12 => 1517.2, 13 => 1553.4, 14 => 1225.5
queue_classic: Peaked at 9 workers with 1779.4 jobs/second
que: 1 => 320.7, 2 => 614.6, 3 => 924.1, 4 => 1076.2, 5 => 1024.6, 6 => 1395.4, 7 => 1306.6, 8 => 1550.8, 9 => 1335.7, 10 => 1432.2, 11 => 1060.6, 12 => 1380.3, 13 => 1365.4
que: Peaked at 8 workers with 1550.8 jobs/second

queue_classic jobs per second: avg = 1879.1, max = 2072.1, min = 1779.4, stddev = 128.3
que jobs per second: avg = 1500.5, max = 1550.8, min = 1405.2, stddev = 58.1

Total runtime: 64.2 seconds
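
For readers wondering how the summary lines relate to the per-iteration output, the reported figures are consistent with statistics taken over the five "Peaked at" values of each run, using the sample (n - 1) standard deviation. That is an inference from the numbers in the logs above, not something confirmed from the queue-shootout source; `summarize` is a hypothetical helper written for this check.

```ruby
# Per-iteration queue_classic peaks (jobs/second), copied from the
# "Peaked at" lines in the two logs above.
MASTER_PEAKS      = [1382.5, 599.7, 449.0, 400.1, 1586.9].freeze
SKIP_LOCKED_PEAKS = [2072.1, 1950.6, 1798.8, 1794.7, 1779.4].freeze

# Hypothetical helper: mean, max, min and sample standard deviation,
# rounded to one decimal place like the benchmark output.
def summarize(peaks)
  avg = peaks.sum / peaks.size
  variance = peaks.sum { |p| (p - avg)**2 } / (peaks.size - 1)
  {
    avg: avg.round(1),
    max: peaks.max,
    min: peaks.min,
    stddev: Math.sqrt(variance).round(1)
  }
end

puts summarize(MASTER_PEAKS).inspect
puts summarize(SKIP_LOCKED_PEAKS).inspect
```

Read this way, the lower stddev on b96ef4c reflects peaks that stay within a narrow band across iterations, whereas on master they range from about 400 to about 1587 jobs/second.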

@ukd1
Contributor

ukd1 commented Jul 18, 2019

@shosti could I get a once over on this just for sanity?

@ukd1 ukd1 self-requested a review July 18, 2019 20:16
@ukd1
Contributor

ukd1 commented Jul 18, 2019

I'm going to merge this - but via #311, as I've merged master, run the tests on circle and updated the changelog. Thx @JasonHerr!

@ukd1 ukd1 closed this Jul 18, 2019
@JasonHerr
Contributor Author

Glad to see it in! Thanks!

@JasonHerr JasonHerr deleted the Use_skip_locked branch July 18, 2019 20:55
@ryandotsmith
Contributor

This is so rad!

@coffenbacher

I just saw these benchmarks as well - sick!

5 participants