Fix for count being very slow on large Parse Classes' collections (Postgres) #5330

Merged

Conversation

@CoderickLamar (Contributor) commented Jan 29, 2019

Fetching the count for large Postgres tables (more than 5 million rows in our case) takes an extremely long time and bogs down the server. This is a known issue with Postgres: https://wiki.postgresql.org/wiki/Slow_Counting

If no complex query is supplied, the count now defaults to Postgres's estimated row count, which is accurate enough for Parse Dashboard.

This is equivalent to the fix in #5264, which solved the same problem for MongoDB.
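
For context, here is a minimal sketch of the estimated-count technique the Postgres wiki describes, written against pg-promise (which the Postgres adapter uses). The helper name and the `client` handle are illustrative, not the exact code in this PR:

// Illustrative helper; not the adapter's actual method name.
// `client` is assumed to be a pg-promise database/task object.
async function approximateCount(client, className) {
  // pg_class.reltuples is maintained by VACUUM / ANALYZE / autovacuum,
  // so this query returns quickly even for tables with millions of rows.
  const row = await client.oneOrNone(
    'SELECT reltuples AS approximate_row_count FROM pg_class WHERE relname = $1',
    [className]
  );
  return row ? Math.max(0, Number(row.approximate_row_count)) : 0;
}

A real SELECT count(*) is still issued when the request carries a non-trivial where clause, since planner statistics cannot answer filtered counts.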

@codecov (bot) commented Jan 29, 2019

Codecov Report

Merging #5330 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5330      +/-   ##
==========================================
- Coverage   93.96%   93.95%   -0.01%     
==========================================
  Files         123      124       +1     
  Lines        9027     9056      +29     
==========================================
+ Hits         8482     8509      +27     
- Misses        545      547       +2
Impacted Files Coverage Δ
src/Adapters/Storage/Mongo/MongoStorageAdapter.js 91.48% <100%> (-0.73%) ⬇️
...dapters/Storage/Postgres/PostgresStorageAdapter.js 97.09% <100%> (+0.01%) ⬆️
src/Controllers/DatabaseController.js 95.08% <100%> (ø) ⬆️
src/RestWrite.js 93.07% <0%> (-0.37%) ⬇️
src/Adapters/Cache/RedisCacheAdapter.js
src/Adapters/Cache/RedisCacheAdapter/index.js 100% <0%> (ø)
...dapters/Cache/RedisCacheAdapter/KeyPromiseQueue.js 95.45% <0%> (ø)

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a3c8629...8951580.

@flovilmart (Contributor)

That’s a great idea. Seems that a few PG tests broke along the way. Can you have a look?


@CoderickLamar (Contributor, Author)

Thoughts on the broken tests?

7740.3:

    ✗ query installations with count = 1 (0.05 sec)
      - Expected 0 to equal 2.
    ✗ query installations with limit = 0 and count = 1 (0.045 sec)
      - Expected 0 to equal 2.

I believe this fails because the Postgres database is freshly created and hasn't had a chance to build the pg_class statistics yet, since this is an estimate and not an exact value.

I'm unsure how 7740.5 relates to count.

@flovilmart (Contributor)

Thoughts on the broken tests? ...

@CoderickLamar we can disable this test on PG specifically if we need to (look for it_only_db), or we could give PG enough time.
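
For reference, a rough sketch of how that per-database guard is used in the specs, as I understand it (the exact helper lives in spec/helper.js and its signature may differ):

// it_only_db(db) is assumed to return `it` when the suite runs against that
// database and `xit` (skip) otherwise, so this spec only executes on MongoDB.
it_only_db('mongo')('query installations with count = 1', done => {
  // ...original assertions...
  done();
});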


@ben-sab commented Jan 30, 2019

@flovilmart, we've disabled the tests for PG as suggested but we're getting new errors which don't seem to be related to the previous ones. I was wondering if you could point us in the right direction by any chance? Thanks.

✗ refuses to delete non-empty collection (20 secs)
      - Failed: HTTPResponse({ status: 200, headers: Object({ x-powered-by: 'Express', access-control-allow-origin: '*', access-control-allow-methods: 'GET,PUT,POST,DELETE,OPTIONS', access-control-allow-headers: 'X-Parse-Master-Key, X-Parse-REST-API-Key, X-Parse-Javascript-Key, X-Parse-Application-Id, X-Parse-Client-Version, X-Parse-Session-Token, X-Requested-With, X-Parse-Revocable-Session, Content-Type', access-control-expose-headers: 'X-Parse-Job-Status-Id, X-Parse-Push-Status-Id', content-type: 'application/json; charset=utf-8', content-length: '2', etag: 'W/"2-vyGp6PvFo4RvsFtPoIWeCReyIC8"', date: 'Wed, 30 Jan 2019 22:57:11 GMT', connection: 'close' }), cookies: undefined, buffer: Buffer [ 123, 125 ], text: <getter>, data: <getter> })
      - Error: Timeout - Async callback was not invoked within timeout specified by jasmine.DEFAULT_TIMEOUT_INTERVAL.
✗ does not fail when deleting nonexistant collections (20 secs)
      - Error: Timeout - Async callback was not invoked within timeout specified by jasmine.DEFAULT_TIMEOUT_INTERVAL.

@flovilmart (Contributor)

Usually, when a test hits the timeout, it's because done is not called in time. This can be solved by handling all promise rejections with done.fail. Does this make sense?
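
Concretely, something like this generic Jasmine shape (not code from this PR; someAsyncCall is just a placeholder for whatever request the spec makes):

it('some async spec', done => {
  someAsyncCall()
    .then(result => {
      expect(result).toBeDefined();
      done();
    })
    // An unhandled rejection would leave `done` uncalled and the spec would
    // time out; routing it to done.fail reports the real error instead.
    .catch(done.fail);
});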


@dplewis (Member) commented Jan 30, 2019

@CoderickLamar I looked this over real quick.

I believe this fails because the Postgres database is freshly created and hasn't had a chance to build the pg_class statistics yet, since this is an estimate and not an exact value.

Based on the wiki you posted, the estimate value updates on certain events.

If you had run analyze _Installation, or waited for autovacuum to periodically update the estimate, the test would have passed. (Note: you must have autovacuum enabled on Postgres for the estimated count to work; it defaults to on.)

I think there is a possible fallback. Looking at the failing test:

it('query installations with count = 1', done => {
  ...
  const response = res.response;
  expect(response.results.length).toEqual(2);
  expect(response.count).toEqual(2); // returns 0; could fall back to response.results.length
  ...
});
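
In test terms, refreshing the planner statistics explicitly would look roughly like this (a sketch only; `db` stands in for the pg-promise connection the adapter holds internally, and the class name follows the failing spec):

it('sees a non-zero estimate once statistics are refreshed', async () => {
  // Force Postgres to update pg_class.reltuples instead of waiting for
  // autovacuum to get around to it.
  await db.none('ANALYZE "_Installation"');
  const row = await db.one(
    'SELECT reltuples AS estimate FROM pg_class WHERE relname = $1',
    ['_Installation']
  );
  expect(Number(row.estimate)).toBeGreaterThan(0);
});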

@dplewis (Member) left a review

Just a few nits from me

Two review threads on src/Adapters/Storage/Postgres/PostgresStorageAdapter.js (outdated, resolved)
@ben-sab commented Feb 1, 2019

Thanks for your reply, @dplewis. I have made the requested changes. As for the errors that still persist: on my local machine those tests pass, and they don't take long either:

[screenshot: local test run, Jan 31 2019, showing the specs passing quickly]

Do we need to increase the timeout?

@flovilmart (Contributor)

Perhaps a different Postgres version?


@dplewis (Member) commented Feb 1, 2019

What version are you running? I tried it on 9.6 and 10 and it works for me locally.

@flovilmart (Contributor)

Travis runs 9.5


@dplewis (Member) commented Feb 1, 2019

@flovilmart I'll try 9.5, but I don't know how this would affect the schema. I'll try tomorrow.

@ben-sab commented Feb 1, 2019

@flovilmart @dplewis I was running PG 11 on my local.

@CoderickLamar (Contributor, Author)

@dplewis @flovilmart How can we get Postgres updated on Travis?

@flovilmart (Contributor)

We should not; instead, the fix should be compatible with Postgres 9.5.


@CoderickLamar (Contributor, Author)

On that note, we should figure out the best route forward. We should get this test passing, but it sounds like that is a separate task?

@flovilmart (Contributor) commented Feb 5, 2019

The tests that are failing on this branch are passing on the master branch. So the changes introduce a possible regression.


@flovilmart (Contributor)

Also, excluding the tests is not something I am a big fan of, as we should maintain the same level of features on all databases


@dplewis (Member) commented Feb 5, 2019

@CoderickLamar As I mentioned above (#5330 (comment)), the estimated count will always be 0 until it is updated by an event.

I was able to get the failing test to work locally somehow.

The refuses to delete non-empty collection test fails because it now uses the estimated count, so the collection always reports as empty, and an empty collection can be deleted.

The same goes for does not fail when deleting nonexistant collections.

This means that operations like deleting a schema need a hard count (or a sufficiently close estimate).
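
One way to shape that, sketched here under the assumption that the adapter's count method grows a flag to bypass the estimate (the parameter and helper names below are illustrative, not necessarily what landed in this PR):

// Illustrative shape only; the real PostgresStorageAdapter signature may differ,
// and exactCount/approximateCount are placeholders for the adapter's SQL paths.
async function count(className, schema, query, readPreference, estimate = true) {
  const hasFilter = query && Object.keys(query).length > 0;
  if (!estimate || hasFilter) {
    // Exact but potentially slow: SELECT count(*) FROM "<className>" WHERE ...
    return exactCount(className, query);
  }
  // Fast planner estimate read from pg_class.reltuples.
  return approximateCount(className);
}

Callers that must be exact, such as the pre-delete "is this class empty?" check, would pass estimate = false.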

@CoderickLamar (Contributor, Author)

@ben-sab @dplewis
The solution here seems to be to let the estimate rebuild.

Another solution would be to implement a hard count specifically for deleting a schema... can someone point me in the right direction for that?

@dplewis (Member) commented Mar 27, 2019

@behrangs Thank you for the fix. Simple and beautiful.

Can you resolve the conflicts? There is a dependency bot that handles updating packages

@CoderickLamar (Contributor, Author)

What are the next steps here?

@dplewis (Member) commented Apr 3, 2019

@CoderickLamar @ben-sab Can you try this branch with your projects? I don't know how long autovacuum takes on your machines before you see an estimated count that's not zero.

@dplewis dplewis requested a review from acinader April 3, 2019 03:11
@acinader acinader removed their request for review April 3, 2019 03:42
@dplewis dplewis requested a review from vitaly-t April 3, 2019 20:48
@CoderickLamar (Contributor, Author)

Everything is looking good on my end. Speedy dashboard once again.

@CoderickLamar (Contributor, Author)

Can we get this merged into a release?

@dplewis dplewis merged commit c7eb7da into parse-community:master Apr 8, 2019
@CoderickLamar (Contributor, Author)

How often are releases usually put together on master?

UnderratedDev pushed a commit to UnderratedDev/parse-server that referenced this pull request Mar 21, 2020
…stgres) (parse-community#5330)

* Changed count to be approximate. Should help with postgres slowness

* refactored last commit to only fall back to estimate if no complex query

* handling variables correctly

* Trying again because it was casting to lowercase table names, which doesn't work for us

* syntax error

* Adding quotations to pg query

* hopefully final pg fix

* Postgres will now use an approximate count unless there is a more complex query specified

* handling edge case

* Fix for count being very slow on large Parse Classes' collections in Postgres. Replicating fix for Mongo in issue 5264

* Fixed silly spelling error resulting from copying over notes

* Lint fixes

* limiting results to 1 on approximation

* suppress test that we can no longer run for postgres

* removed tests from Postgres that no longer apply

* made changes requested by dplewis

* fixed count errors

* updated package.json

* removed test exclude for pg

* removed object types from method

* test disabled for postgres

* returned type

* add estimate count test

* fix mongo test