object_changes benchmarks with text, json and jsonb as column types #964
Interesting. You might try adding an index specifically on the name:
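For illustration, an expression index on just that key might look roughly like this (a sketch with assumed names, and assuming a json/jsonb `object_changes` column on the usual `versions` table):

```ruby
# Hypothetical migration (names assumed): index only the "name" key inside
# object_changes so lookups on that key can use a btree index.
class AddNameKeyIndexToVersions < ActiveRecord::Migration[5.1]
  def up
    # ->> extracts the value as text; Postgres requires the expression
    # to be wrapped in parentheses inside the column list.
    execute <<~SQL
      CREATE INDEX index_versions_on_object_changes_name
      ON versions ((object_changes ->> 'name'));
    SQL
  end

  def down
    execute "DROP INDEX index_versions_on_object_changes_name;"
  end
end
```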
I've updated the gist with the index and tried with 100,000 records:
Hmm, so the more specific index barely has any effect. I'm really surprised! I'll be curious to hear what other people make of these findings. Thanks Anton.
Whoops, I just noticed I made a big mistake while using activerecord-import. I'm currently creating 100,000 records with one version each instead of one record with 100,000 versions. Will update on Monday!
Hi Anton, are you still working on these benchmarks? I'd love to see a summary of your findings make it into the readme eventually (Section 6.b. Custom Serializer?).
Hey! Sorry for letting this go like this. I'll try to have some new results up by the start of next week, if that's ok?
I updated the benchmark, but it has become very inefficient since I hacked it together quickly. As you can see in the gist, I create a lot of hashes, resulting in immense memory use; without that, though, creating so many versions is unbearably slow. Using raw SQL would probably yield better results at this point. Anyway, here are the results. I can't seem to shake the feeling that I'm doing something very wrong. At least to me the results are very surprising, because I can reproduce the results mentioned in this article.
I'll give it another shot at some point, and probably rewrite the benchmark.
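Roughly, the corrected setup builds every version row in memory first and then hands them to activerecord-import in one bulk insert; a simplified sketch (not the actual gist — `Widget`, the count, and a json/jsonb `object_changes` column are assumptions):

```ruby
require "activerecord-import"

# One record, many versions (the shape of the corrected benchmark).
widget = Widget.create!(name: "benchmark")

# Build every version row up front -- this is the memory-hungry part --
# then hand them all to activerecord-import as a single bulk insert.
rows = 100_000.times.map do |i|
  PaperTrail::Version.new(
    item_type:      "Widget",
    item_id:        widget.id,
    event:          "update",
    object_changes: { "name" => ["name #{i}", "name #{i + 1}"] }
  )
end

PaperTrail::Version.import(rows, validate: false)
```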
Yeah, I'm also surprised. I'd expect json to be faster than text, but I guess the …

The key observation here is that the max. estimated cost is 16.50 for text and 20.40 for json. It's not a benchmark, just an estimate on an empty table, so take that with a grain of salt. However, our use of the …

So, I'd recommend using json/b anyway, even if it is slower. I don't see any problems with your benchmark, so I think we're ready to summarize our findings and put them in the readme. Can you take a stab at that, please?
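Those estimates can be pulled from ActiveRecord roughly like this (a sketch; `versions_text` / `versions_json` are placeholder table names):

```ruby
# Compare the planner's cost estimates for the same lookup against a text
# column (cast at query time) and a native json/jsonb column. EXPLAIN only
# reports estimates, so this says nothing about real execution time.
conn = ActiveRecord::Base.connection

text_plan = conn.execute(<<~SQL)
  EXPLAIN SELECT * FROM versions_text
  WHERE (object_changes::json ->> 'name') IS NOT NULL;
SQL

json_plan = conn.execute(<<~SQL)
  EXPLAIN SELECT * FROM versions_json
  WHERE (object_changes ->> 'name') IS NOT NULL;
SQL

puts text_plan.map { |row| row["QUERY PLAN"] }
puts json_plan.map { |row| row["QUERY PLAN"] }
```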
Trying to query a JSON string using … The move to JSON is only really valuable for correctness in your queries. JSONB only really shines when you want to do more advanced things, like:
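For illustration, one such case is a containment query against a `jsonb` `object_changes` column (a sketch; model and key names are assumptions):

```ruby
# jsonb containment: find versions whose object_changes includes "Bob" as one
# of the values recorded for the name attribute. The @> operator (and the GIN
# indexes that accelerate it) only exists for jsonb, not json or text.
PaperTrail::Version.where("object_changes @> ?::jsonb", { name: ["Bob"] }.to_json)
```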
When dealing with large amounts of data, another way to make queries fast is by creating an index for each …
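As a sketch of one indexing option for that (assumed names, jsonb column): a single GIN index over the whole column supports the containment (`@>`) and existence (`?`) operators for every key, rather than one btree expression index per key.

```ruby
# Hypothetical migration: a GIN index over the entire object_changes column
# lets Postgres use an index for @> containment queries on any key.
class AddGinIndexToVersionsObjectChanges < ActiveRecord::Migration[5.1]
  def change
    add_index :versions, :object_changes, using: :gin
  end
end
```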
It's surprising that the numbers are all so similar. You may want to add an index to aid the …

Since there's only one record for each table, with 100k versions for that single record, the indexes on … You may want to …

How much memory does your computer have?

Hmm, actually, 5k iterations a second? That's 5 database queries every millisecond, which seems unlikely (this is Postgres, not Redis). Are you sure you don't need to call …?
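One classic way to get implausibly high iteration counts, for what it's worth, is that ActiveRecord relations are lazy: the benchmark block can build a relation without ever hitting the database. A minimal sketch of the difference (model name is an assumption):

```ruby
require "benchmark/ips"

Benchmark.ips do |x|
  # Builds an ActiveRecord::Relation but never executes SQL, so it can report
  # absurdly high iterations per second.
  x.report("lazy")   { VersionJsonb.where("object_changes ->> 'name' IS NOT NULL") }

  # .to_a (or .load, .count, etc.) forces the query to actually hit Postgres.
  x.report("forced") { VersionJsonb.where("object_changes ->> 'name' IS NOT NULL").to_a }

  x.compare!
end
```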
Stellar analysis, thanks Sean! It sounds like we have some more experimentation to do.
Yes! I'm worried there are more bugs in the benchmark.
Yeah, that's why I would recommend json/b.
@seanlinsley, thanks for the explanation! Would you be up for adapting the gist? It seems you have more insight into the JSON/JSONB column types than I do :)
This has been a good discussion. Thanks Anton and Sean. I'll close this for now, and maybe we can use it as the basis for some more formal benchmarks in the future.
It's been quite some time, and Postgres has done a lot of work on jsonb columns and their performance. I wonder if there are any changes in the benchmarks now? 🤔
@ziaulrehman40, I think I did the benchmarks wrong. Like @seanlinsley mentioned, I should have …

Having said that, we've been using JSONB ever since with paper_trail and in other places, and have not faced any performance problems for our use cases, given the right indexes.
Following up on the discussion here, I tried to do some benchmarking.
Using this code I got the following results:
Which I think is strange. The differences are very slim, and the `text` column type provides the fastest result. I'm not sure if I'm doing anything wrong in the setup, or if the samples are too small (both in number and in the size of the JSON stored), or anything else. But I thought I'd go ahead and post this, and maybe someone can explain the results better.
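As a rough idea of the shape of such a comparison (a minimal sketch, not the actual gist; it assumes benchmark-ips and three hypothetical models `VersionText`, `VersionJson`, `VersionJsonb` backed by otherwise identical tables whose `object_changes` column is text, json and jsonb respectively):

```ruby
require "benchmark/ips"

# Compare the same lookup against each column type. object_changes stores
# {"name" => [old, new]}, so -> 'name' ->> 1 pulls the new value as text.
# .to_a forces the relation to actually run the query.
Benchmark.ips do |x|
  x.report("text")  { VersionText.where("object_changes::json -> 'name' ->> 1 = ?", "Bob").to_a }
  x.report("json")  { VersionJson.where("object_changes -> 'name' ->> 1 = ?", "Bob").to_a }
  x.report("jsonb") { VersionJsonb.where("object_changes -> 'name' ->> 1 = ?", "Bob").to_a }
  x.compare!
end
```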