-
I could not find any guidance for a job size limit. Would a range of 16KB to 32KB work or is that too large? At what point do I need to store my (rather large) JSON objects in a separate data store and just reference them from a River job? (I'd prefer to avoid that as long as possible.)
-
Hi there! I believe there are two main considerations.
The first is that each time a job row is updated, Postgres rewrites the entire row, regardless of how much of it was altered. My understanding is this includes jsonb values, even if they’re not modified in that update. Jobs get updated at least a couple of times in their lifecycle (potentially many times if retrying).
The second factor, and probably the more important one, is that jsonb values larger than about 2KB are written with TOAST, not colocated with the row. There’s a sort of performance cliff at this point, along with more overhead as you continue increasing row size. This pganalyze post covers it pretty well: https://pganalyze.com/blog/5mins-postgres-jsonb-toast
Whether or not these impacts are a problem for your use case depends on a lot of factors like job throughput, latency targets, database server sizes, etc. Hopefully that’s enough to help you decide whether you need to test it further! @brandur may have more to chime in with too.
Yeah, I would've said that job args size probably doesn't matter that much — large args will be stored out of band in TOAST, which will keep the job row tuples themselves pretty lean and fast.
You should think about total table size though because at some point tables/databases do just become unwieldy if there's too much in them (e.g. recovery on failover and that sort of thing is slower). So if you're storing 1000 jobs at 32 kB that's around 32 MB. No problem. If you're storing 100,000,000 jobs at 32 kB that's more like 3.2 TB. Even that might work, but I do start to worry about Postgres databases once they enter the TB scale.
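The arithmetic above is easy to rerun for your own numbers. A minimal sketch, assuming decimal units (1 kB = 1000 bytes) and ignoring per-row overhead, indexes, and TOAST compression; `estimateTableBytes` and `humanize` are hypothetical helper names:

```go
package main

import "fmt"

// estimateTableBytes returns a rough payload size for jobCount jobs
// that each carry avgJobBytes of args. Row headers, indexes, and
// compression are deliberately left out of this estimate.
func estimateTableBytes(jobCount, avgJobBytes int64) int64 {
	return jobCount * avgJobBytes
}

// humanize formats a byte count using decimal (power of 1000) units.
func humanize(n int64) string {
	units := []string{"B", "kB", "MB", "GB", "TB", "PB"}
	v := float64(n)
	i := 0
	for v >= 1000 && i < len(units)-1 {
		v /= 1000
		i++
	}
	return fmt.Sprintf("%.1f %s", v, units[i])
}

func main() {
	const avgJobBytes = 32_000 // 32 kB per job, as in the example above
	for _, jobs := range []int64{1_000, 100_000_000} {
		total := estimateTableBytes(jobs, avgJobBytes)
		fmt.Printf("%d jobs -> %s\n", jobs, humanize(total))
	}
}
```

Plugging in your expected retained job count (including completed jobs that haven't been cleaned up yet) gives a quick sense of whether you're headed toward the TB range.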
That link from Lukas on JSONB performance is good. It does se…