This repository has been archived by the owner on Sep 2, 2022. It is now read-only.

JSON values have a storage limit of 256 KB; the error 'Value for field <field name> is too long' is thrown if a value exceeds that limit. #2705

Closed
5hanth opened this issue Jun 28, 2018 · 13 comments

Comments

@5hanth

5hanth commented Jun 28, 2018

Json type fields have a storage limit of 256 KB. There are applications that need LONGTEXT-length support to store JSON values.

It would be great if we could remove this limitation at the application level and let the database handle the limits (not sure whether this is desirable for all scalar types, but in this case a Json value can grow beyond 16 MB).
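
For illustration, a minimal sketch of the failure mode described above. The Report type, its payload field, and the generated createReport call are hypothetical; the sketch assumes a Prisma 1 datamodel with a Json field and the generated JS client.

// repro-sketch.js -- hypothetical datamodel: type Report { id: ID! @id, payload: Json }
const { prisma } = require("./generated/prisma-client");

async function main() {
  // roughly 300 KB of JSON, comfortably over the 256 KB limit
  const payload = { blob: "#".repeat(300 * 1024) };
  // expected to fail with: Value for field payload is too long
  await prisma.createReport({ payload });
}

main().catch(console.error);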

@stale

stale bot commented Jan 9, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status/stale Marked as stale by the GitHub stale bot label Jan 9, 2019
@stale stale bot removed the status/stale Marked as stale by the GitHub stale bot label Jan 9, 2019
@nolandg

nolandg commented Mar 23, 2019

Has anyone found a workaround for this? This is a showstopper for my app, which stores JSON objects of about 350 KB. It seems this isn't a Postgres limit but something hard-coded by Prisma. Is the only way forward to fork Prisma and hack the line linked above? Or is it more complicated, with Prisma also limiting the database field size when it creates the database schema?

@gmickel

gmickel commented Mar 31, 2019

I would also be interested in having the limits for String and JSON fields made configurable. Our app has to save large documents and we're having to use ugly workarounds at the moment.

@gmickel

gmickel commented May 13, 2019

Any update regarding this? We're planning on launching later this month and are blocked by this. Thanks!

@nolandg

nolandg commented May 13, 2019 via email

@gmickel

gmickel commented May 13, 2019

I used gzipping as the workaround I mentioned earlier in this thread. Naturally, that only delays the inevitable: our system deals with documents that can be up to around 1-2 MB in size, and due to their complexity (programmable documents) and the real-time nature of our platform, I can't easily split them into smaller pieces.
I'm wondering whether the current limits on the String and Json fields are arbitrary or due to a specific constraint.
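
For reference, a rough sketch of that kind of gzip workaround (not gmickel's actual code): compress with Node's built-in zlib and store the result base64-encoded in a String field instead of Json.

// gzip-workaround.js -- sketch only; persist `compressed` in a String field
const zlib = require("zlib");

// JSON.stringify the object, gzip it, and base64-encode the result for storage
const compress = obj => zlib.gzipSync(JSON.stringify(obj)).toString("base64");

// reverse the steps when reading the value back from the database
const decompress = str =>
  JSON.parse(zlib.gunzipSync(Buffer.from(str, "base64")).toString());

const stored = compress({ big: "document" });
const restored = decompress(stored); // { big: 'document' }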

@nolandg

nolandg commented May 13, 2019 via email

@gmickel

gmickel commented May 13, 2019

That's definitely possible, yes. In that case, configuration flags (mentioned here in the docs, but which don't seem to exist) might be a viable solution.

@wangel13

Any suggestions on the Json type field length limit?

@entrptaher

entrptaher commented Oct 26, 2019

This is a high-priority problem for us. We have a production app that depends on data of around 300-500 KB per record, some of it reaching 500 KB-5 MB, which is still very small, and our server now throws errors whenever we work with that table.

It's not specified anywhere why, or by whom, it was decided that Json values are currently limited to 256 KB in size. Even MongoDB has a 16 MB limit per document, and MySQL/PostgreSQL allow up to 4 GB per value.

Why limit the data to 256 KB in this age of data? :( I don't understand.

The docs link to issues that appear to have been closed or stalled for many months or years. The following issues are related, yet no solution or workaround has been provided:

All of these issues concern the 256 KB limit on String as well, so simply switching to String is not an option either.


On that note, here is my current workaround until I find a better one 🗡️ With it I could handle around 50 MB of data, depending on the CPU/RAM of my computer/server. It seems the Akka stack used inside the Scala server imposes this limit by default.

The trick is to use a [String] array field alongside the Json type. When the client requests the record, we return the single JSON value. But when storing it, we split the data into chunks and save it as many small pieces. I'm pretty sure there will be some server limitations, because creating/updating such a big array takes time.

When the record is requested, we join the chunks and convert them back to JSON before returning.

Concept Code

Here is the rough code; use with caution:

// index.js
const { prisma } = require("./generated/prisma-client");
const { GraphQLServer } = require("graphql-yoga");
const fastChunkString = require("fast-chunk-string");

const dataToArray = (data) => {
  // serialize if needed, then split into chunks small enough for Prisma to store
  const newData = typeof data === 'string' ? data : JSON.stringify(data);
  // chunk size is in characters; keep each chunk well under the 256 KB per-value limit
  return fastChunkString(newData, { size: 2048 });
}
const dataToJson = (result) => {
  // join the stored chunks back together and parse them into a single JSON value
  result.single = JSON.parse(result.data.join(''))
  return result;
};

const resolvers = {
  Query: {
    async output(root, args, context) {
      const result = await context.prisma.output({ id: args.id });
      return dataToJson(result);
    }
  },
  Mutation: {
    async createOutput(root, args, context) {
      const result = await context.prisma.createOutput({
        data: { set: dataToArray(args.single) }
      });
      return dataToJson(result);
    },
    async updateOutput(root, args, context) {
      const result = await context.prisma.updateOutput({
        where: {id: args.id},
        data: { ...args.data, data: {set: dataToArray(args.single) }}
      });
      return dataToJson(result);
    }
  }
};

const server = new GraphQLServer({
  typeDefs: "./schema.graphql",
  resolvers,
  context: {
    prisma
  }
});

server.start(
  // raise the body parser limit so large JSON payloads reach the resolvers
  { bodyParserOptions: { limit: "100mb", type: "application/json" } },
  () => console.log("Server is running on http://localhost:4000")
);

# schema.graphql
# import * from './generated/prisma.graphql'

type Query {
  output(id: ID!): Output
}

type Mutation {
  createOutput(name: String! single: Json): Output
  updateOutput(id: ID! single: Json): Output
}

type Output {
  id: ID!
  single: Json
}

# datamodel.prisma
type Output {
  id: ID! @id
  data: [String] @scalarList(strategy: RELATION)
  single: Json
}

// testing code with fetch
var longData = '#'.repeat(50000000);
fetch("http://localhost:4000/", {
  credentials: "include",
  headers: {
    accept: "*/*",
    "accept-language": "en-US,en;q=0.9,bn;q=0.8",
    "content-type": "application/json",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-origin"
  },
  referrer: "http://localhost:4000/",
  referrerPolicy: "no-referrer-when-downgrade",
  body:
    `{"operationName":"createOutput","variables":{},"query":"mutation createOutput {\\n  createOutput(name: \\"test\\", single: \\"{\\\\\\"Hello\\\\\\":\\\\\\"${longData}\\\\\\"}\\"){\\n    id\\n    single\\n  }\\n}"}`,
  method: "POST",
  mode: "cors"
})
.then(data=>data.json())
.then(data=> {
  console.log({
  length: longData.length,
  same: data.data.createOutput.single.Hello === longData
  })
})

Result:

[Screenshot of the console output from the test above]

It used a freaking amount of RAM for a 50 MB request, though. The record was split into only around 200 chunks of 256 KB each, and just opening the admin panel used around 5-10 GB of RAM lol.

Are you saying Prisma cannot handle 200 records with 256 KB of data each?

On my machine:

  • If I split the string with size: 200, it produces about 250k array elements and takes 15 s; the CPU and RAM usage is much more tolerable.
  • If I split the string with size: 256000, it produces about 200 array elements and takes 3-4 s, but opening the admin panel leads to super high CPU usage (the arithmetic behind these counts is sketched below).
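
For reference, a quick sanity check of those chunk counts against the 50,000,000-character test string used above:

// chunk-count arithmetic for the 50,000,000-character test payload
const payloadLength = 50000000;
console.log(Math.ceil(payloadLength / 200));    // 250000 elements with size: 200
console.log(Math.ceil(payloadLength / 256000)); // 196 elements with size: 256000, i.e. roughly the 200 mentioned above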

This is such a weird workaround, but it works, and that tells me this issue has been on hold for two years for no reason. 😅

Any suggestions are welcome.

@entrptaher

entrptaher commented Oct 26, 2019

They put in a 256 KB limit, so I created a 50 MB workaround, because the limit isn't configurable yet without building the Prisma code myself. 😅

The only other reason I can see for that hardcoded value in the validation file is garbage collection. But it still feels arbitrary, as if it was chosen because someone thought 'if we don't put a limit here, people might abuse our service, or the server might not be able to handle it'.

@kunovsky

kunovsky commented Mar 20, 2020

I just ran into this issue myself while trying to store ~300 KB JSON objects and thought I'd share my quick workaround here in case folks find it helpful. As recommended above, I ended up changing the type of the attribute from Json to String and then compressing the data on storage and decompressing it on retrieval. To be safe I tested this with 15 MB of JSON and it works fine, so there's no need to split the data up. FYI: before you run deflateData on the object, make sure you JSON.stringify it so Brotli can accept it.

// compressionUtils.js

import * as zlib from 'zlib'

const base64 = 'base64'

export const deflateData = data =>
  zlib.brotliCompressSync(data).toString(base64)

export const inflateData = data =>
  zlib.brotliDecompressSync(Buffer.from(data, base64)).toString()

We're using graphql-yoga, so in my mutation and query I'm just running deflateData and inflateData, respectively, before storing and after retrieving the value.

Obviously you lose the ability to query the JSON data in the database if you do this, but if you need to store large documents with heavy compression, this seems like an acceptable workaround.
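
A short usage sketch of those helpers, per the notes above (the document object and variable names are made up for illustration):

// usage sketch for compressionUtils.js
import { deflateData, inflateData } from './compressionUtils'

const doc = { title: 'large document', sections: [] }

// JSON.stringify before compressing, as noted above; store the resulting
// base64 string in a String field instead of Json
const stored = deflateData(JSON.stringify(doc))

// after reading the String field back from the database
const restored = JSON.parse(inflateData(stored))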

@entrptaher

@kunovsky I did test with zlib before splitting; zlib worked fine as well, but it would not let me store binary or otherwise non-compressible data (e.g. random strings), so I had to split and join the data regardless.

I'd suggest others try zlib, just like your example, if their data is compressible. :)

@janpio janpio closed this as completed Sep 1, 2022