-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Map type #888
RFC: Map type #888
Conversation
|
✔️ Deploy Preview for graphql-spec-draft ready! 🔨 Explore the source changes: 96ce95d 🔍 Inspect the deploy log: https://app.netlify.com/sites/graphql-spec-draft/deploys/615418952c3b8600078131da 😎 Browse the preview: https://deploy-preview-888--graphql-spec-draft.netlify.app |
Just filled in CLA. |
This is interesting! |
I like this approach. BUT maybe this RFC opens a way to bring backward incompatible features and helps somehow to speed up & simplify InputUnions implementation. |
@dotansimha - The proposal recommends only supporting string keys. Reason being json objects are always string keys. Even in JavaScript doing a @nodkz why would this be backwards incompatible? IIUC this is an additive change, hence all existing schemas should work as is. As for resolving, I plan to make a PR on graphql.js to get a working implementation with examples. The core idea is that map field resolvers would work the same way as list field resolvers. The only exception being that no automatic coercion would take place. Also there would be no per-key resolvers. The field ought to return a map with key value pairs. Like lists, the sub fields of the values can be recursively resolved. |
@nojvek So yep, existing schemas would work, but existing tooling with new syntax - not. It's bad. BUT it's not a big problem. If will be introduced protocol/spec version to GraphQL. For example via addition |
In terms of semantic versioning that isn't considered a breaking change right? It's in the same realm as introducing directives syntax, or any new syntax for that matter. It's an additive change. Breaking change is when an upgrade would break existing graphql schemas and documents because a feature got removed. I do understand your point. I do feel pretty motivated by map types, that I'm willing to put the work to make PRs on graphql-js, the spec, language server, Vscode-graphql. I understand it's not a small amount of work, therefore I would love to keep the scope limited. -- I take it that this is a strawman level RFC. How does one find a champion to take this to the next level? |
I would strongly object this approach - map is a list with string keys. Yes, we need a map (dictionary) type. Not everything is strongly typed. There are cases when a column is a container for free-form data that does not fit any specific type. I work with logs; Imagine a log record - there's Time, Source, Message, Severity - strongly typed. But there's also need to save extra info, depends on the system/situation So the solution is Data column that contains key-value pairs. This situation is very common out there, and GraphQL needs to support a 'type' for this 'whatever' data - which would be Map As far as I know, some GraphQL implementations added Map scalar. And I just did it too in NGraphQL. To sum it up my suggestion: add Map type as dict with values of any type; no new syntax, clarify if subset is allowed or not. |
I think one of the big potential issues with this is it breaks GraphQL's query/response symmetry: For the query: { users { firstName } } With lists: { "users": [ { "firstName": "Alice" }, { "firstName": "Bob" } ] } With the proposed map: { "users": { "KLJSDF239sdfkjh9q": { "firstName": "Alice" }, "llw0gis80q3jisd": { "firstName": "Bob" } } } You may think of lists as having "index keys" but the critical thing here is that those keys are implicit - they are unstated - so they do not add additional data to the response body.
I strongly disagree with this sentiment; it's very common to have
Be that champion! Add yourself to the next GraphQL WG: https://github.com/graphql/graphql-wg/blob/main/agendas/2021-10-07.md
Tools that operate based on introspection (and correctly ignore types they do not understand) should not be broken by this additive change. GraphQL only has non-breaking guarantees with respect to existing operation documents remaining valid (including the introspection query), if you're using SDL then it's anticipated that your tool will not support future SDL versions that introduce new syntaxes and will need to be updated. I'm not sure that GraphiQL/Playground will be broken by this since they use introspection. If they are, they should be modified to handle/ignore the newer types they do not understand in the introspection response. |
Awesome. I'll make a PR and add to agenda. As for maps being very flexible and allowing anything, that would go against how request/response format works as @benjie pointed out. IMHO Graphql having strict value types is a feature. I'm not sure how resolution would even work for |
Yes, GraphQL is about strong types, so far. The problem is that the world is not all strongly typed. There are numerous cases when the type is not known in advance, and Dict<string, object> is the best solution. NoSql world is full of stuff like that, and they are quite loose on types. |
I think this proposal addresses the real issue, I just disagree that we need to add Map type for that.
Bellow is my strawman proposal, for query-side solution:
With following limitations:
That way schema designer always exposes collections as |
@rivantsov do keep in mind that the way nested lists are supported This doesn't fully get |
@IvanGoncharov thats an interesting idea. Although not a fan that it only works at the first level but not at the second level. For users to fall in pit of success I believe, they shouldn't have to learn about the special edge cases. I.e map of maps of maps work just like list of lists of lists are supported by GraphQL. I'm not sure I understand the key ordering argument. The key ordering shouldn't matter right? List like when you query for a list, it's upto the server to decide what to return at 1st index and thereon. I could be missing something. |
@nojvek Typically GraphQL APIs allow you to sort collections based on some criteria you can specify through input args. For example, GitHub's GraphQL API has an In case you support pagination you need to support server-side ordering since if the client wants to show only part but with particular ordering of data it can't do ordering client-side. Even if UI can show items in random order, they would jump around on the screen after each refresh. So my argument is that GraphQL is currently optimized to show all data from the returned collection and for that, you need ordering. From reading #101 most popular use case is client-specific pair of P.S. from reading #101 I found an interesting point on how gRPC solved that problem: #101 (comment) |
@nojvek If you are worried about that we can complicate the syntax a bit.
If you do:
You will get
It means you leave the first level of the array as-is and converting its items to maps so you get |
@IvanGoncharov I believe you missed the Array in your first TypeScript type -> |
This a good point. IIUC this means there would be no change to resolvers or existing schemas, they would keep on resolving as List fields, but the GraphQL service would mush the array into an object. I do like the developer simplicity it brings. Less is more. And yes I agree, the order of keys would be important and need to be be specified as part of spec. JS objects/dicts do this by default, but Python and other languages may not. In that sense the spec would be:
The transformation is equivalent of Python dict comprehension, but with keys ordered by insertion.
That brings up the question, should two level maps be allowed? type User {
id: ID!
teamId: ID
name: String
profilePicUrl: String
}
type Query {
users: [User!]
}
For performance reasons, I feel only one level should be supported. User can get map of maps if their types are like this. type Team {
id: ID!
users: [User!]
}
This is going to be an interesting UX challenge with GraphiQl. Right now it's a nested list with checkboxes, but it will need someway of indicating that a field is used as the map key. I'm in favor of your proposal @IvanGoncharov. Although I would argue against square bracket for array syntax. In current spec, to get an array you don't specify square brackets. It's implicit. Do you have real world examples when an array with single key, value pairs are useful?
I don't think GraphQL should support this ^ |
@nojvek , this is not Relay spec, there's no ID here. |
@benjie No it's intentional. Idea is that by default (without square brackets) you replace the any-dimensional list with a map of items.
So square brackets in my proposal mean leave array at these levels as is but convert all nested arrays to a map. |
Thanks for clarifying 👍 |
@nojvek The problem is we can't do that. JSON doesn't guarantee the order of fields in the spec and in practice, many JSON parsers don't guarantee it either. See here: https://www.json.org/json-en.html
It's a deal-breaker for me. No data loss should happen, especially unnoticed.
Also a deal-breaker for me, for the same reason.
Agree 👍
Agree 👍 If we restrict this transformation to only |
After sleeping over this, I think there's still a case for the original proposal with In the list to dict transformation proposal by @IvanGoncharov, there is no way to specify a return type of dict with scalar keys. We can't specify
They are not mutually exclusive proposals. One is syntax sugar to transform list into dict, while the other is allowing resolution of map fields which return map. The map of map of scalars is a valid usecase supported by which can return
The usecase for returning map types is when using document databases like mongo or KV stores like redis, there are many instances where the document is stored as json containing maps. There's also graphql being a layer on top of existing REST-ful apis that return values in similar shape. E.g an api returning feature flags.
|
@nojvek Yes, had the same idea. And I think I know how to allow this in the transformation approach. In many cases you can choose multiple keys for the same data, in your example, you can use location name as a key but in other cases, you may want to use My idea is for the schema author to point stuff that can be used as keys and allow clients to specify what key they want to use for indexing. That way data coming from the same type so we don't have a problem with client caches. Moreover, clients can stip transformation (AFAIK they already do this for aliases) and apply transformation after they get the result and store it in the cache.
This can also be done with the same transformation if we allow specifying fields in addition to selectionSet.
|
Another interesting by-product of this proposal is that we don't need to "single entity" field + alias combination anymore. {
organization(login: "graphql") {
GraphQLSpec: repository(name: "graphql-spec") { ...SomeInfo }
GraphQLJS: repository(name: "graphql-js") { ...SomeInfo }
}
} Aliases are non-dynamic (need to be explicitly written in query) and can't have arbitrary names.
And you will get a nice Map with names as keys. Another use case it allows query dynamic data more naturally, for example we can have
That allows you to query JSON and get result with very minimal wrapping:
With oneof sematic from #825 you will get:
@nojvek But I don't hijack this PR so I'm ok creating separate one if you think both proposals are worth exploring. What do you think? |
After much thinking, I'm convinced that the the query time transformation from lists to map is a better idea. It allows the same List type to be used in multiple ways.
Yes, this is a pain I've experienced before while using Githubs GraphQL.
Wonder why I didn't think about this. Yes, this solves the scalar value problem. -- To discuss in graphqlwg. For the list to map transformation:
I propose last value wins.
Python dict comprehension behavior is last value wins.
I propose we should throw error for null keys.
I propose both should be allowed since keys are string. But I'm open to interpretation. In Github's graphql example, repo names are defined as String! https://docs.github.com/en/graphql/overview/explorer
I propose only
JSON spec doesn't guarantee order as @IvanGoncharov has mentioned. I propose graphql spec should say |
IMO there is a real need for user-created generic types, a la
which you could use in a schema type via:
which you could use at execution time like:
Note we already have generic types (lists and nullable types), we just don't allow user-creation of generic types. This would also allow different schemas to expose different kinds of maps depending on what's needed:
I believe this is an alternative to adding a single map type: it's unclear what is actually wanted for a map type. Do you want to create some tree with O(logn) lookup? A bijection between two groups of data? Something else? |
Feedback from Oct WG for posterity: We should better define what problem we want to solve, which will guide us to the best path to invest proposals into. For example: If the problem is that we have "bags of data" on the server, such that the keys and values are generally unknown - then the past discussions around the JSON scalar might be most compelling. If the problem is true Map types on the server that need to be translated to true Map types on the client, then a full featured Map type (or pattern) will be the most compelling. If the problem is wanting to utilize JSON objects as Maps in the query response, then a transformation syntax will be the most compelling. It could be that multiple of these are worth exploring to a degree, but better understanding what problem we're solving will help us focus on making the right tradeoffs and explore the solution space most likely to have an impact. In terms of the transformation syntax. I personally think it is super interesting. It feels inline with other related ideas of aliased fragments and nested objects as well as field chaining. I almost want to say we should hold back on something like this until we could properly explore the use cases and surface area of all kinds of transformation to better understand the total design space. That said, I want to stay cautious about my own interest because if I apply our guiding principles then I think this path starts to seem misaligned. Most importantly, "Enable new capabilities motivated by real use cases" seems hard to argue since this is purely a transformation of something we could already do. We have to ask ourselves if having the query and service do this transform rather than a client-side post-process is worth the complexity. Which comes to "Simplicity and consistency over expressiveness and terseness". The syntactic complexity, the restrictions on keys, the concerns about lost-data footguns... these all seem like they're big hits to simplicity and consistency all for some fairly minor expressiveness wins. And then "Favor no change" means we need to set a high bar of added value before making significant changes. I think a broader transformation proposal could potentially leap this hurdle, but right now I'm not so sure. In terms of the Map type, looking to prior art I think our clearest source of inspiration could be protobuf map types An important distinction is that this protobuf map type still uses a list of (k, v) tuples in the serialized transport, rather than something unique. Doing the same ourselves would maybe be disappointing to those looking to leverage JSON objects as Maps, but would allow support for far more key types. This would require some client-side transformation to convert the entry pairs back into a native map type (since JSON doesn't have one). Presumably a code generator or something like that could do that for you. I can see two paths forward for this space.
I think what @mjmahone is getting at is that maybe there exists a higher level type we could introduce that would allow both Connection and Map to be defined in user space, but I'm slightly worried that side-steps the part where client tooling needs to generate code to handle these types. Certainly the community standard would be the easiest path since there's no change to spec necessary and we can immediately build tools that use it. But it would be more awkward to query than something built into the spec. |
Thanks @leebyron I just wanted to say this was my first time at GraphQL WG meeting. It was very well organized and witnessing agenda being followed on time was a rare sight. Even though folks had strong opinions, it was directed towards the technical details rather than personal. I felt quite welcome. Thank you. I did learn that GraphQL can be a superset of JSON responses. Previously my assumption was that GraphQL responses are always JSON. That changes the design constraints. I am okay if we close this PR / RFC. Happy for someone else to open another more concrete proposal. |
I also very much like the proposal of built in / user-extendable generics-ish types by @mjmahone e.g But having worked in Typescript's codebase, generics can get pretty wild fast. So there needs to be a more trimmed down proposal. As for the transformation proposal by @IvanGoncharov, I feel it has legs. We shouldn't discount developer user experience. GraphQL's alias syntax is powerful. One could say that is possible if frontend made multiple GraphQL queries and stitched it with a rename. Alias makes this much easy. Currently consuming Dictionary structures from GraphQL is painful. Doesn't have to be JSON objects. JS has I do agree that having keys restricted to Strings is bad design. GraphQL supporting a List -> Map transform that can work with a variety of key/ value types is more powerful. I feel there is a world where both @mjmahone and @IvanGoncharov's ideas can live together. Will have to do some deeper thinking. |
Totally agree. Our existing very simple type system already has a fair amount of complex ramifications, and I'd be very worried without some firm constraints. We also need to consider the ecosystem around the query & schema language itself. Introspection, code generators, and more need to know how to make the best use of these things. I think this is all solvable, but we should be confident that it is (and reasonably so, such that the ecosystem doesn't constrain too much) before we seriously consider it |
I'm closing this proposal PR as a no go. We can always open another proposal keeping some of the discussion points in this thread in consideration. |
For the record, please don’t avoid new language features on our behalf! Graphiql, the lsp server and playground use the same underlying parser and language service, we are set up to handle new language features. We added interfaces implements interfaces quite easily. Also remember that there is a ton of graphql tooling outside the js/ts space, often using a graphql-js mirror reference implementation in their language, so a reference PR to graphql-js for any language feature has ripple effects across the many languages that use graphql |
This is an RFC for a new "Map" type to be added to GraphQL.
I acknowledge issue #101, that has 79 comments and 150+ 👍 votes. @leebyron locked the issue with the comment
This is that proposal.
Problem statement
This proposal aims to keep in mind "The Guiding Principles" laid out in the CONTRIBUTING.md.
Currently, GraphQL doesn't offer a way to return a Map/Dictionary response.
A workaround is to return a key/value pair response as suggested in https://stackoverflow.com/questions/56705157/best-way-to-define-a-map-object-in-graphql-schema
response
The problem is searching for the key "foo3" in the list requires traversing through the list. The alternative is to process the response into a local object via
Object.fromEntries
and then use it for fast lookups.Maps/Dictionaries are core data types in most languages. The json spec supports objects with key: value pairs. By having support for Maps, GraphQL clients can make effient key:value lookups on responses.
This proposal introduces
field: { Type }
syntax to specify Maps. Similar to existingfield: [ List ]
syntax.The primary motivation in this proposal is the idea that Maps are Lists with ID! (non-null string) keys, and should behave similar to Lists.
Most relational databases have tables with schemas in the format:
Having the response with IDs as keys gives GraphQL consumers/clients the ability for O(1) map lookups instead of O(n) list lookups.
The other argument is that in many instances, GraphQL sits on top of an existing REST-ful api which returns responses with map responses. A real-world example is algolia.
Algolia indexes map fields for very fast facet lookups. e.g.
To implement a GraphQL api over algolia, it would require changing the shape of
stockByLocation
response. By having GraphQL as schema enforcer, Map type would open a lot more possibilities of GraphQL adoption.The schema for above response would be:
List type
Currently, the List type is the only unbounded type in GraphQL.
SDL
query:
response:
Notice how the query didn't specify
[]
to specify a list response. Based on the type declarationusers: [User!]!
, only the fields of the List's value type are specified.^ NOTE: this is an invalid gql query.
The response can return any number of items in the list. GraphQL doesn't control what will be returned at the 0th index of the list, or the 1st index. This is upto the GraphQL service to determine.
A list can be seen a map with incremental numeric keys. It supports fast lookups at an index.
Map type
Following the principle of "Maps are Lists with string keys, and should behave simiar to Lists."
Note: The value type will still need to be explicitly specified. This is not an escape hatch for
Any
type.SDL
query:
response:
Q: Why non-null string keys only?
A: Because grapqhl responses are json, and json only supports string keys.
Alternative syntax is
field: {ID!: Type}
, however that would indicate that GraphQL may support other key types like Ints. I'd love for users to fall into the pit of success, so feel the semantics should be simple. Only string key types. Less is more.field: { Type }
for Maps,field: [ Type ]
for Lists. The non-null versions being.field: { Type! }!
andfield: [ Type! ]!
.--
Q: What about nested Maps?
A: Nested lists work e.g.
field: [[ Type ]]
, therefore, nested maps should also work in a similar fashion i.e.field: {{ Type }}
. The difference is that there would be no automatic coercion. If the shape of response doesn't match then there is a type error.