RFC: Map type #888

Closed

@nojvek nojvek commented Sep 29, 2021

This is an RFC for a new "Map" type to be added to GraphQL.

I acknowledge issue #101, which has 79 comments and 150+ 👍 votes. @leebyron locked the issue with the comment:

If someone feels strongly that this concept deserves first-class support in GraphQL, I suggest following the RFC procedure to take this from a general suggestion to an actual proposal.

This is that proposal.

Problem statement

This proposal aims to keep in mind "The Guiding Principles" laid out in the CONTRIBUTING.md.

Currently, GraphQL doesn't offer a way to return a Map/Dictionary response.

A workaround is to return a key/value pair response as suggested in https://stackoverflow.com/questions/56705157/best-way-to-define-a-map-object-in-graphql-schema

type ArticleMapTuple {
  key: String!
  value: Article!
}

type Article {
  name: String!
}

response

[
  {
    "key": "foo1",
    "value": {"name": "Foo1"}
  },
  {
    "key": "foo2",
    "value": {"name": "Foo2"}
  },
  {
    "key": "foo3",
    "value": {"name": "Foo3"}
  }
]

The problem is searching for the key "foo3" in the list requires traversing through the list. The alternative is to process the response into a local object via Object.fromEntries and then use it for fast lookups.
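To make the workaround concrete, here is a minimal TypeScript sketch of both lookup styles. The interface names simply mirror the schema above, and the response array is the sample data from this comment; nothing here is part of any GraphQL client API.

```typescript
// Shapes mirroring the ArticleMapTuple workaround above.
interface Article { name: string }
interface ArticleMapTuple { key: string; value: Article }

const response: ArticleMapTuple[] = [
  { key: "foo1", value: { name: "Foo1" } },
  { key: "foo2", value: { name: "Foo2" } },
  { key: "foo3", value: { name: "Foo3" } },
];

// O(n): every lookup scans the list.
const viaScan = response.find((t) => t.key === "foo3")?.value;

// One O(n) pass up front, then constant-time (on average) lookups by key.
const byKey: Record<string, Article> = Object.fromEntries(
  response.map((t): [string, Article] => [t.key, t.value]),
);
const viaMap = byKey["foo3"];
```

The second form is what every client has to write by hand today; the proposal would let the server return that shape directly.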

Maps/Dictionaries are core data types in most languages, and the JSON spec supports objects with key: value pairs. With support for Maps, GraphQL clients can make efficient key: value lookups on responses.

This proposal introduces the field: { Type } syntax to specify Maps, similar to the existing field: [ Type ] syntax for Lists.

The primary motivation for this proposal is the idea that Maps are Lists with ID! (non-null string) keys, and should behave similarly to Lists.

Most relational databases have tables with schemas in the format:

type SomeEntity {
  id: ID!
  field1: String!
  field2: Int!
}

Having the response with IDs as keys gives GraphQL consumers/clients the ability for O(1) map lookups instead of O(n) list lookups.

{
  "idAbc": {"field1": "foo", "field2": 123}
}

The other argument is that in many instances GraphQL sits on top of an existing RESTful API that returns map-shaped responses. A real-world example is Algolia.

Algolia indexes map fields for very fast facet lookups, e.g.

{
  "id": "123",
  "name": "K95 Face Shield 24 PK",
  "stockByLocation": {
    "seattle": 30,
    "portland": 40,
    "miami": 30,
    "st_louis": 10,
    ...
  }
}

Implementing a GraphQL API over Algolia today would require changing the shape of the stockByLocation response. With GraphQL acting as the schema enforcer, a Map type would open up many more possibilities for GraphQL adoption.

The schema for above response would be:

type InventoryItem {
  id: ID!
  name: String!
  stockByLocation: { Int! }!
}

List type

Currently, the List type is the only unbounded type in GraphQL.

SDL

type Query {
  users: [User!]!
}

type User {
  id: ID!
  firstName: String!
  lastName: String!
}

query:

{
  users {
    id
    firstName
    lastName
  }
}

response:

{
  "users": [
    {"id": "foo", "firstName": "Foo", "lastName": "Bar"},
    {"id": "hello", "firstName": "Hello", "lastName": "World"}
  ]
}

Notice that the query doesn't use [] to request a list response. Because the type declaration is users: [User!]!, the query only specifies the fields of the List's value type.

{
  users [{
    id
    firstName
    lastName
  }]
}

^ NOTE: this is an invalid gql query.

The response can return any number of items in the list. GraphQL doesn't control what is returned at the 0th or 1st index of the list; that is up to the GraphQL service to determine.

A list can be seen as a map with incremental numeric keys. It supports fast lookups at an index.

[
    0: {"id": "foo", "firstName": "Foo", "lastName": "Bar"},
    1: {"id": "hello", "firstName": "Hello", "lastName": "World"},
    2: {"id": "jsmith", "firstName": "John", "lastName": "Smith"}
]

Map type

Following the principle of "Maps are Lists with string keys, and should behave similarly to Lists."

Note: The value type will still need to be explicitly specified. This is not an escape hatch for Any type.

SDL

type Query {
  users: {User!}!
}

type User {
  id: ID!
  firstName: String!
  lastName: String!
}

query:

{
  users {
    firstName
    lastName
  }
}

response:

{
  "users": {
    "foo": {"firstName": "Foo", "lastName": "Bar"},
    "hello": {"firstName": "Hello", "lastName": "World"}
  }
}

Q: Why non-null string keys only?

A: Because GraphQL responses are JSON, and JSON only supports string keys.

An alternative syntax is field: {ID!: Type}, but that would suggest that GraphQL may support other key types such as Int. I'd love for users to fall into the pit of success, so I feel the semantics should be simple: only string keys. Less is more.

field: { Type } for Maps, field: [ Type ] for Lists. The non-null versions are field: { Type! }! and field: [ Type! ]!.

--

Q: What about nested Maps?

A: Nested lists work, e.g. field: [[ Type ]], so nested maps should work in a similar fashion, i.e. field: {{ Type }}. The difference is that there would be no automatic coercion: if the shape of the response doesn't match, it is a type error.

@nojvek
Author

nojvek commented Sep 29, 2021

Just filled in CLA.

@dotansimha
Member

dotansimha commented Sep 29, 2021

This is interesting!
@nojvek in this example, the key is always a string, right? (Does it make sense to have other types of keys? ID or Int or maybe any scalar?)
Also, how will it affect resolvers and execution?

@nodkz

nodkz commented Sep 29, 2021

I like this approach.
Pity that it breaks backward compatibility with old servers and tools.

BUT maybe this RFC opens a way to bring backward incompatible features and helps somehow to speed up & simplify InputUnions implementation.

@nojvek
Author

nojvek commented Sep 29, 2021

@dotansimha - The proposal recommends supporting only string keys, because JSON object keys are always strings. Even in JavaScript, x = {}; x[1] = 1; results in the string key "1". If other key types are desired, the user should fall back to a list with key and value fields on a custom type. The JSON response from a GraphQL API shouldn't need any special semantics to handle maps; they should be returned as plain JSON objects with key/value pairs.
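A quick TypeScript sketch of the string-key point, using only standard JavaScript built-ins:

```typescript
// A numeric key written into a plain object...
const x: { [key: number]: number } = {};
x[1] = 1;

// ...is stored, enumerated, and serialized as the string "1".
const keys = Object.keys(x);
const firstKeyType = typeof keys[0];
const serialized = JSON.stringify(x);
```

Here keys holds the single string "1" and serialized is the text {"1":1}, so a non-string key type in the schema would not survive a JSON round trip anyway.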

@nodkz why would this be backwards incompatible? IIUC this is an additive change, hence all existing schemas should work as is.

As for resolving, I plan to make a PR on graphql-js to get a working implementation with examples. The core idea is that map field resolvers would work the same way as list field resolvers. The only exception being that no automatic coercion would take place.

Also there would be no per-key resolvers. The field ought to return a map with key value pairs. Like lists, the sub fields of the values can be recursively resolved.

@nodkz

nodkz commented Sep 30, 2021

@nojvek category: { Category } will break the parser, which affects already-running eslint-plugin-graphql, graphql-eslint, graphql-tools, graphiql, playground, graphql-inspector, graphql-compose and a bunch of other tools.

So yep, existing schemas would work, but existing tooling would not work with the new syntax.

That's bad, BUT it's not a big problem if a protocol/spec version is introduced to GraphQL, for example by adding graphql: 2.0 as the first line of the SDL or the JSON introspection, as was done in OpenAPI/Swagger, HTTP 1/2/3, gRPC's syntax=proto3 and other protocols.

@nojvek
Author

nojvek commented Sep 30, 2021

In terms of semantic versioning, that isn't considered a breaking change, right? It's in the same realm as introducing directives syntax, or any new syntax for that matter. It's an additive change.

A breaking change is when an upgrade would break existing GraphQL schemas and documents because a feature got removed.

I do understand your point. I feel motivated enough by map types that I'm willing to put in the work to make PRs on graphql-js, the spec, the language server, and vscode-graphql.

I understand it's not a small amount of work, therefore I would love to keep the scope limited.

--

I take it that this is a strawman level RFC. How does one find a champion to take this to the next level?

@rivantsov
Contributor

I would strongly object to this approach - map is a list with string keys.
So for a map all values are of the same type?! This is so limiting that it makes the whole map thing practically useless. Considering that it involves a serious syntax extension, I do not think it's worth the effort.

Yes, we need a map (dictionary) type. Not everything is strongly typed. There are cases when a column is a container for free-form data that does not fit any specific type. I work with logs; imagine a log record: there's Time, Source, Message, Severity, all strongly typed. But there's also a need to save extra info that depends on the system/situation. So the solution is a Data column that contains key-value pairs. This situation is very common out there, and GraphQL needs to support a 'type' for this 'whatever' data, which would be Map.
The keys are strings (like prop names in an object), but the values are of any type, and not necessarily the same for all keys. That MAP would be really useful.

As far as I know, some GraphQL implementations added a Map scalar, and I just did it too in NGraphQL.
The problem I see is that it would be logical to allow a selection subset for the Map field, to select specific keys if needed, especially if there are many keys. But since Map is implemented as a custom Scalar, this, I guess, is not allowed. I believe we need to figure this part out: yes or no on selection subset.

To sum up my suggestion: add a Map type as a dict with values of any type; no new syntax; clarify whether a subset is allowed or not.

@benjie
Member

benjie commented Oct 1, 2021

I think one of the big potential issues with this is it breaks GraphQL's query/response symmetry:

For the query:

{ users { firstName } }

With lists:

{ "users": [ { "firstName": "Alice" }, { "firstName": "Bob" } ] }

With the proposed map:

{ "users": { "KLJSDF239sdfkjh9q": { "firstName": "Alice" }, "llw0gis80q3jisd": { "firstName": "Bob" } } }

You may think of lists as having "index keys" but the critical thing here is that those keys are implicit - they are unstated - so they do not add additional data to the response body.


So for a map all values are of the same type?! this is so limiting that it makes the whole map thing practically useless.

I strongly disagree with this sentiment; it's very common to have { [key: string]: User } or similar map types in applications. And given that the value could be a union your options are even more flexible { [key: string]: User | Org | Post | Forum | ... }. If we were to add oneof to output types then you'd even be able to add scalars and lists to the mix - all without losing type safety.


How does one find a champion to take this to the next level?

Be that champion! Add yourself to the next GraphQL WG: https://github.com/graphql/graphql-wg/blob/main/agendas/2021-10-07.md


category: { Category } will break the parser, which affects already-running eslint-plugin-graphql, graphql-eslint, graphql-tools, graphiql, playground, graphql-inspector, graphql-compose and a bunch of other tools

Tools that operate based on introspection (and correctly ignore types they do not understand) should not be broken by this additive change. GraphQL only has non-breaking guarantees with respect to existing operation documents remaining valid (including the introspection query). If you're using SDL, it's anticipated that your tool will not support future SDL versions that introduce new syntaxes and will need to be updated.

I'm not sure that GraphiQL/Playground will be broken by this since they use introspection. If they are, they should be modified to handle/ignore the newer types they do not understand in the introspection response.

@nojvek
Author

nojvek commented Oct 2, 2021

Be that champion! Add yourself to the next GraphQL WG:

Awesome. I'll make a PR and add to agenda.

As for maps being very flexible and allowing anything, that would go against how the request/response format works, as @benjie pointed out. IMHO GraphQL having strict value types is a feature. I'm not sure how resolution would even work for an Any type. Union types are great; they will work with the existing proposal.

@rivantsov
Contributor

rivantsov commented Oct 2, 2021

Yes, GraphQL is about strong types, so far. The problem is that the world is not all strongly typed. There are numerous cases when the type is not known in advance, and Dict<string, object> is the best solution. The NoSQL world is full of stuff like that, and they are quite loose on types.
Don't get me wrong, when it comes to strong typing, I'm the biggest advocate. My dev language is C#, for g sake! But there are a lot of legit cases when you need an 'object' field among other strongly typed stuff.
Benjie - 'it is very common to have...' - yes, in a world of 30m devs and millions of projects you can have thousands of cases of almost anything. (I am NOT arguing AGAINST this fact.) But my problem with the suggested map is that it will occupy a conceptual space (map) that belongs to a broader concept (Dict<string, object>) that, I think, we will have to bring in sooner or later, and then there will be a hell of a confusion with two of these things in the standard. So there's not enough conceptual space for two maps, IMHO.
I think Dict<string, object> covers the cases described here, without any syntax extensions, and it actually is way more useful.

@IvanGoncharov
Member

I think this proposal addresses the real issue, I just disagree that we need to add Map type for that.
I see a few problems with the Map type:

  1. We can't guarantee the order of fields in maps, since we dropped support for ordered fields in #191 (Loosen must to should for serialization supporting ordered maps).
    So we can't guarantee the order of fields in JSON for a Map type. That means you can add sorting criteria only to fields returning a List.
  2. Any Map can be represented as a List with the key as one of the fields. That means we would have two ways to represent collections, and switching between them results in a breaking change.

Below is my strawman proposal for a query-side solution:
I think List is perfect for exposing data (since it maintains order) and we just need to allow the client to control the shape inside the query:

{
  users {
    [id]: {
      firstName
      lastName
    }
  }
}

With following limitations:

  1. The key can only be of type String! or ID!. But if id is nullable, you can use [id!] per #867 ([RFC] Client controlled nullability operator).
  2. This transformation is valid only on List fields and is applied only to the innermost list.
    Meaning if you have a field returning [[User]!]! you will get [Map<User>!]! after the transformation.
  3. If two items have the same key, it results in a runtime error.

That way the schema designer always exposes collections as a List and it's up to the client to decide whether it cares about ordering or not.
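As a rough illustration of the limitations listed above, the transformation could be sketched in TypeScript like this. The helper name listToMapStrict and its callback parameters are hypothetical and exist only for this example; this is not how graphql-js or any other implementation does it.

```typescript
interface User { id: string; firstName: string; lastName: string }

// Hypothetical helper: keys a list by a string field and fails loudly on duplicate keys.
function listToMapStrict<T, V>(
  items: readonly T[],
  keyOf: (item: T) => string,
  valueOf: (item: T) => V,
): Record<string, V> {
  // A null-prototype object avoids clashes with inherited names such as "constructor".
  const out: Record<string, V> = Object.create(null);
  for (const item of items) {
    const key = keyOf(item);
    if (key in out) throw new Error(`Duplicate map key "${key}"`);
    out[key] = valueOf(item);
  }
  return out;
}

// What the resolver returns today: a plain list.
const users: User[] = [
  { id: "foo", firstName: "Foo", lastName: "Bar" },
  { id: "hello", firstName: "Hello", lastName: "World" },
];

// Equivalent of: users { [id]: { firstName lastName } }
const usersById = listToMapStrict(
  users,
  (u) => u.id,
  ({ firstName, lastName }) => ({ firstName, lastName }),
);
```

The resolver is untouched: it still returns the list, and the reshaping happens afterwards, which is the point of doing this on the query side.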

@nojvek
Author

nojvek commented Oct 6, 2021

@rivantsov do keep in mind that, just as nested lists are supported ([[String!]]), nested maps will work too ({{String!}}).

This doesn't fully get Any, but with union and oneOf types you get very close to representing a myriad of types. As for full dynamic Any type, that is not the goal of this proposal. That goes against the principles of GraphQL which brings a typed schema and enforces that contract between services.

@nojvek
Author

nojvek commented Oct 6, 2021

@IvanGoncharov that's an interesting idea, although I'm not a fan that it only works at the first level but not at the second level.

For users to fall into the pit of success, I believe they shouldn't have to learn about special edge cases, i.e. a map of maps of maps should work just like a list of lists of lists is supported by GraphQL. Maps have string indices, as Lists have incremental numeric indices.

I'm not sure I understand the key ordering argument. The key ordering shouldn't matter, right? Just like when you query for a list, it's up to the server to decide what to return at the 1st index and thereon. I could be missing something.

@IvanGoncharov
Member

I'm not sure I understand the key ordering argument. The key ordering shouldn't matter, right? Just like when you query for a list, it's up to the server to decide what to return at the 1st index and thereon. I could be missing something.

@nojvek Typically GraphQL APIs allow you to sort collections based on some criteria you can specify through input args. For example, GitHub's GraphQL API has an orderBy argument on most of its connections. You can search for orderBy on this page: https://docs.github.com/en/graphql/reference/objects

If you support pagination, you need to support server-side ordering: a client that wants to show only part of the data in a particular order can't do the ordering client-side.

Even if UI can show items in random order, they would jump around on the screen after each refresh.
You can do the client-side ordering yourself but it's bad DX to sort deeply nested data.

So my argument is that GraphQL is currently optimized for showing all data from the returned collection, and for that you need ordering. The Map type you proposed is optimized for cases where you need to access some subset of the data by key.
My argument is that the schema designer needs to choose between Map and List.

From reading #101, the most popular use case is a client-specific pair of key + value, and it looks like you don't need ordering for this one since the client will access values by key.
But if you are building editing/viewing UI components for these values, you need to order them somehow.

P.S. from reading #101 I found an interesting point on how gRPC solved that problem: #101 (comment)

@IvanGoncharov
Member

IvanGoncharov commented Oct 6, 2021

Although not a fan that it only works at the first level but not at the second level.

@nojvek If you are worried about that we can complicate the syntax a bit.
For example, if we have:

type Query {
  chessBoard: [[CheessSquare!]!]!
}

type CheessSquare {
  coordinate: String! # E.g. E2
  piece: ChessPieceEnum
}

If you do:

{
  chessBoard: {
    [coordinate]: {
      piece
    }
  }
}

You will get Map<string, CheessSquare>.
But if you add square brackets like so:

{
  chessBoard: [{
    [coordinate]: {
      piece
    }
  }]
}

It means you leave the first level of the array as-is and convert its items to maps, so you get Array<Map<string, CheessSquare>>.

@benjie
Member

benjie commented Oct 6, 2021

@IvanGoncharov I believe you missed the Array in your first TypeScript type -> Map<string, Array<CheessSquare>>

@nojvek
Author

nojvek commented Oct 6, 2021

My argument is that schema designer needs to choose between Map and List.

This is a good point. IIUC this means there would be no change to resolvers or existing schemas; they would keep resolving as List fields, but the GraphQL service would mush the array into an object. I do like the developer simplicity it brings. Less is more.

And yes, I agree, the order of keys would be important and would need to be specified as part of the spec. JS objects/dicts do this by default, but Python and other languages may not.

In that sense the spec would be:

  • Only fields of type ID or String can be used as keys.
  • If there are duplicate keys, the last key overrides the value.
  • null keys are ignored.
  • If list resolves to a null, map will resolve to a null.

The transformation is equivalent to a Python dict comprehension, but with keys ordered by insertion.

{item.id: item for item in list}
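For comparison, the four rules listed in this comment could look roughly like this in TypeScript. The function name listToMapLenient and the Row shape are made up for the example; treat it as a sketch of the rules as worded here, not as a proposed implementation.

```typescript
// Sketch of the rules above: last duplicate wins, null keys are skipped,
// and a null list yields a null map.
function listToMapLenient<T>(
  items: readonly T[] | null,
  keyOf: (item: T) => string | null,
): Record<string, T> | null {
  if (items === null) return null; // list resolved to null -> map is null
  const out: Record<string, T> = Object.create(null);
  for (const item of items) {
    const key = keyOf(item);
    if (key === null) continue; // null keys are ignored
    out[key] = item; // a later item with the same key overrides the earlier one
  }
  return out;
}

interface Row { id: string | null; val: number }

const rows: Row[] = [
  { id: "a", val: 1 },
  { id: null, val: 2 },
  { id: "a", val: 3 },
  { id: "b", val: 4 },
];

const lenientResult = listToMapLenient(rows, (r) => r.id);
const nullResult = listToMapLenient<Row>(null, (r) => r.id);
```

With this data, the entry for "a" ends up holding val 3 and the row with the null id disappears from the result, so two of the four input rows are dropped without any error being reported.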

That brings up the question, should two level maps be allowed?

type User {
  id: ID!
  teamId: ID
  name: String
  profilePicUrl: String
}

type Query {
  users: [User!]
}

{
  users: {
    [teamId]: {
      [id]: {
        name
        profilePicUrl
      }
    }
  }
}

For performance reasons, I feel only one level should be supported.

User can get map of maps if their types are like this.

type Team {
  id: ID!
  users: [User!]
}

{
  teams: {
    [id]: {
      users: {
        [id]: {
          name
          profilePicUrl
        }
      }
    }
  }
}

This is going to be an interesting UX challenge for GraphiQL. Right now it shows a nested list with checkboxes, but it will need some way of indicating that a field is used as the map key.

I'm in favor of your proposal @IvanGoncharov.

Although I would argue against the square bracket syntax for arrays. In the current spec, you don't specify square brackets to get an array; it's implicit. Do you have real-world examples where an array with single key/value pairs is useful?


{
  chessBoard: [{
    [coordinate]: {
      piece
    }
  }]
}

I don't think GraphQL should support this ^

@rivantsov
Contributor

rivantsov commented Oct 6, 2021

@nojvek, this is not the Relay spec; there's no ID here.
Implicit ordering: although ordering matters for clients in most apps, returning results in ANY order when no order is specified is OK. For example, SQL (in any DB server) is free to return records in any order if ORDER BY is not specified. Dictionary entries or properties in objects in languages like C# are NOT ordered; the rules explicitly state that there is no order, and it works OK all the time.
If you try to force some order on the returned set all the time, you put an unnecessary burden on the server, forcing it to sort entries even when the set is small and the order does not matter to the client or server. It is much better to follow the SQL pattern: the server is free to return results in any order when no order is specified.
@IvanGoncharov, I like your proposal; at least it looks better to me than the original. But then what we get is an option for the client to ask the server to slightly reformat the returned list (group it by some key). Is this such a big deal for clients, making life so much easier that it's worth this syntax extension? I think it can be done from the list with one statement by the client itself, can't it?

@IvanGoncharov
Member

I believe you missed the Array in your first TypeScript type -> Map<string, Array<CheessSquare>>

@benjie No, it's intentional. The idea is that by default (without square brackets) you replace a list of any dimension with a map of items.
Here is an example list:

No square brackets:
Array<ItemType> => Map<string, ItemType>
Array<Array<ItemType>> => Map<string, ItemType>
Array<Array<Array<ItemType>>> => Map<string, ItemType>
...

Single square brackets:
Array<ItemType> => Validation Error
Array<Array<ItemType>> => Array<Map<string, ItemType>>
Array<Array<Array<ItemType>>> => Array<Map<string, ItemType>>
...

Double square brackets:
Array<ItemType> => Validation Error
Array<Array<ItemType>> => Validation Error
Array<Array<Array<ItemType>>> => Array<Array<Map<string, ItemType>>>

So square brackets in my proposal mean: leave the arrays at these levels as-is, but convert all nested arrays to a map.
I think it's the most discoverable behavior, and we can suggest adding square brackets in the error message if keys clash.
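The table above can be approximated with a small TypeScript sketch. The names Square, SquareList and transform are invented for this example, keepLevels stands for the number of square-bracket pairs written in the query, and the bracket check is done at runtime here although the table describes it as a validation error.

```typescript
interface Square { coordinate: string; piece: string | null }
// A list of squares nested to any depth, e.g. Square[] or Square[][].
type SquareList = Array<Square | SquareList>;
type SquareMap = Record<string, { piece: string | null }>;

// Collect every Square from a list of any dimension, preserving order.
function flatten(list: SquareList): Square[] {
  const out: Square[] = [];
  for (const el of list) {
    if (Array.isArray(el)) out.push(...flatten(el));
    else out.push(el);
  }
  return out;
}

// Keep the outer `keepLevels` array levels, collapse everything below into one map.
function transform(list: SquareList, keepLevels: number): unknown {
  if (keepLevels > 0) {
    return list.map((el) => {
      if (!Array.isArray(el)) {
        throw new Error("Validation error: more brackets than nested list levels");
      }
      return transform(el, keepLevels - 1);
    });
  }
  const out: SquareMap = Object.create(null);
  for (const sq of flatten(list)) {
    if (sq.coordinate in out) throw new Error(`Duplicate key "${sq.coordinate}"`);
    out[sq.coordinate] = { piece: sq.piece };
  }
  return out;
}

const board: SquareList = [
  [{ coordinate: "A1", piece: "ROOK" }, { coordinate: "B1", piece: null }],
  [{ coordinate: "A2", piece: "PAWN" }],
];

const asMap = transform(board, 0);        // Array<Array<Item>> => Map<string, Item>
const asListOfMaps = transform(board, 1); // Array<Array<Item>> => Array<Map<string, Item>>
```

Calling transform(board, 2) throws, matching the "Double square brackets" row for a two-dimensional list.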

@benjie
Member

benjie commented Oct 6, 2021

Thanks for clarifying 👍

@IvanGoncharov
Member

And yes, I agree, the order of keys would be important and would need to be specified as part of the spec.

@nojvek The problem is we can't do that. JSON doesn't guarantee the order of fields in the spec, and in practice many JSON parsers don't guarantee it either. See https://www.json.org/json-en.html:
"An object is an unordered set of name/value pairs."
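A small TypeScript sketch of how this shows up even in JavaScript, where plain objects list integer-like keys first in ascending order and only then the remaining string keys in insertion order. The payload is made up for the example.

```typescript
// The server wrote the keys in this order: "b", "a", "10", "2".
const payload = '{"b": 1, "a": 2, "10": 3, "2": 4}';
const parsed: Record<string, number> = JSON.parse(payload);

// Integer-like keys come first (ascending), then the others in insertion order.
const keyOrder = Object.keys(parsed); // ["2", "10", "b", "a"]
```

So a map keyed by numeric-looking IDs would lose the server's ordering on a JS client, whereas a list would not.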

If there are duplicate keys, the last key overrides the value.

It's a deal-breaker for me. No data loss should happen, especially unnoticed.
At the same time, I understand that runtime crashes can happen because no one tested a particular query on a big enough dataset.
How about allowing it only on ID fields?
ID is basically a String, but with the semantics of being unique, at least within this particular type.
Plus nothing prevents you from having multiple ID-typed fields in a type.
So it's even more flexible than the original Map proposal.

null keys are ignored.

Also a deal-breaker for me, for the same reason.
In your original proposal, every item of the map should have a key.
So I don't understand why we should support null in the transformation variant.

User can get map of maps if their types are like this.

Agree 👍
We don't want to implement the entire Lodash, just a simple transformation to solve a particular use case.

Although I would argue against square bracket for array syntax. In current spec, to get an array you don't specify square brackets. It's implicit. Do you have real world examples when an array with single key, value pairs are useful?

Agree 👍 If we restrict this transformation to only ID as keys, it will guarantee against name clashes even in multidimensional arrays.

@nojvek
Author

nojvek commented Oct 7, 2021

After sleeping on this, I think there's still a case for the original proposal with the field: {Value} semantics.

In the list to dict transformation proposal by @IvanGoncharov, there is no way to specify a return type of dict with scalar keys. We can't specify stockByLocation as Map<ID!, Int!>.

{
 id: "123"
 name: "K95 Face Shield 24 PK",
 stockByLocation: {
   "seattle": 30,
   "portland": 40,
   "miami": 30,
   "st_louis": 10,
   ...
 }
}

They are not mutually exclusive proposals. One is syntax sugar to transform a list into a dict, while the other allows resolution of map fields that return a map. The map of maps of scalars is a valid use case, supported by field: {{Bool!}}

which can return

{
  "field": {
    "foo": {
      "bar": true
    },
    "baz": {
      "qaz": false
    }
  }
}

The use case for returning map types arises with document databases like MongoDB or KV stores like Redis, where documents are often stored as JSON containing maps. There's also GraphQL as a layer on top of existing RESTful APIs that return values in a similar shape, e.g. an API returning feature flags:

{
  showPage1: true,
  showNewNav: true,
  showNewDesign: false
}

@IvanGoncharov added the "💭 Strawman (RFC 0)" label (RFC Stage 0, see CONTRIBUTING.md) on Oct 7, 2021
@IvanGoncharov
Member

After sleeping on this, I think there's still a case for the original proposal with the field: {Value} semantics.

@nojvek Yes, had the same idea. And I think I know how to allow this in the transformation approach.
But first I want to voice my concern with the stockByLocation example.
I think it's an excellent example of another concern with Map type I have.

In many cases you can choose multiple keys for the same data. In your example you use the location name as a key, but in other cases you may want to use the id or address as a key.
That means that, depending on client needs, you may need to create multiple maps for different use cases.
This promotes denormalization of data (more than one way to get the same data), which is bad for many things, e.g. caching (in Relay, Apollo Client, etc., a cache of one map is useless if you ask for another map), optimistic updates (on the client you need to update multiple caches), etc.

My idea is for the schema author to mark the fields that can be used as keys, and to allow clients to specify which key they want to use for indexing. That way the data comes from the same type, so we don't have a problem with client caches. Moreover, clients can strip the transformation from the query (AFAIK they already do this for aliases), then apply the transformation after they get the result and store it in the cache.

In the list to dict transformation proposal by @IvanGoncharov, there is no way to specify a return type of dict with scalar keys. We can't specify stockByLocation as Map<ID!, Int!>.

This can also be done with the same transformation if we allow specifying fields in addition to selectionSet.

{
  stockByLocation: locations {
    [name]: stock
  }
}
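On the client side, the effect of [name]: stock could be sketched in TypeScript as below. It assumes the schema exposes a locations list with name and stock fields, as the query above implies; the sample data reuses the numbers from the Algolia example.

```typescript
interface Location { name: string; stock: number }

// What the server would return for the plain query: locations { name stock }
const locations: Location[] = [
  { name: "seattle", stock: 30 },
  { name: "portland", stock: 40 },
  { name: "miami", stock: 30 },
  { name: "st_louis", stock: 10 },
];

// Client-side equivalent of: stockByLocation: locations { [name]: stock }
const stockByLocation: Record<string, number> = {};
for (const { name, stock } of locations) {
  stockByLocation[name] = stock;
}
```

A caching client could keep the normalized locations list in its store and derive stockByLocation only when handing data to the UI.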

@IvanGoncharov
Member

Another interesting by-product of this proposal is that we don't need the "single entity" field + alias combination anymore.
For example, with the GitHub API you may need to get a few selected repositories.
With the current API you can only do:

{
  organization(login: "graphql") {
    GraphQLSpec: repository(name: "graphql-spec") { ...SomeInfo }
    GraphQLJS: repository(name: "graphql-js") { ...SomeInfo }
  }
}

Aliases are non-dynamic (they need to be explicitly written in the query) and can't have arbitrary names.
With the proposed map transformation you can add a repositories field instead and do:

{
  organization(login: "graphql") {
    repositories(names: ["graphql-spec" "graphql-js"]) {
      [name]: {
        ...SomeInfo
      }
    }
  }
}

And you will get a nice Map with names as keys.

Another use case is that it allows querying dynamic data more naturally. For example, we can have a JSON type as follows:

type JSON { # making it oneof per #825 will make it even better
  number: Float!
  boolean: Boolean!
  string: String!
  array: [JSON]!
  object(keys: [ID!]): [JSONProperty!]!
}

type JSONProperty {
  key: ID!
  value: JSON
}

That allows you to query JSON and get a result with very minimal wrapping:

{
  jsonField {
    object(keys: ["foo", "bar"]) {
      [key]: value {
        number
        string
      }
    }
  }
}

With the oneof semantics from #825 you will get:

{
  "jsonField": {
    "foo": { "number": 3.14 },
    "bar": { "string": "baz" }
  }
}

@nojvek I don't want to hijack this PR, so I'm OK with creating a separate one if you think both proposals are worth exploring. What do you think?

@nojvek
Author

nojvek commented Oct 7, 2021

@nojvek I don't want to hijack this PR, so I'm OK with creating a separate one if you think both proposals are worth exploring. What do you think?

After much thinking, I'm convinced that the query-time transformation from lists to maps is a better idea. It allows the same List type to be used in multiple ways.

Another interesting by-product of this proposal is that we don't need the "single entity" field + alias combination anymore.

Yes, this is a pain I've experienced before while using GitHub's GraphQL API.

This can also be done with the same transformation if we allow specifying fields in addition to selectionSet.

{
  stockByLocation: locations {
    [name]: stock
  }
}

Wonder why I didn't think about this. Yes, this solves the scalar value problem.

--

To discuss in graphqlwg. For the list to map transformation:

  1. For duplicate keys, should last value win? or should it throw an error?

I propose last value wins.

[{id: "Foo", val: 1}, {id: "Foo", val: 2}]

{
  [id]: val
}

Python dict comprehension behavior is last value wins. Object.fromEntries([["Foo", 1], ["Foo", 2]]) in js has similar behavior.

  1. For null keys, do they get ignored or does graphql throw an error.
    If we say that keys can only be ID! or String!, then this can be detected at validation time and we can throw error.

I propose we throw an error for null keys.

  1. Should only ID! be allowed, or both ID! and String! ?

I propose both should be allowed since keys are strings. But I'm open to interpretation. In GitHub's GraphQL API, for example, repo names are defined as String!

https://docs.github.com/en/graphql/overview/explorer

  1. How about Lists of lists? field: [[User!]!]! ?

I propose that only a flat list such as field: [User!] can be transformed. Lists of lists aren't allowed.

  1. Should order of keys be preserved?

The JSON spec doesn't guarantee key order, as @IvanGoncharov has mentioned. I propose the GraphQL spec say "should" instead of "must". Even in JS this was only a "should" for a long time, but all modern JS engines now preserve insertion order of keys (integer-like keys aside), and it is part of the JS spec now.
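
As a rough illustration of the answers proposed above, the list-to-map transformation could behave like the Python sketch below. The function name `list_to_map` and its parameters are hypothetical and are not part of any GraphQL implementation; the code only mirrors the proposed rules (last value wins, null keys are an error, keys are strings, nested lists are rejected, insertion order is kept).

```python
from typing import Any


def list_to_map(items: list[dict[str, Any]], key_field: str, value_field: str) -> dict[str, Any]:
    """Hypothetical sketch of the `[key_field]: value_field` transform."""
    result: dict[str, Any] = {}
    for item in items:
        # Lists of lists: only flat lists are transformed, so nested lists are rejected.
        if isinstance(item, list):
            raise TypeError("lists of lists cannot be transformed")
        key = item.get(key_field)
        # Null keys: a null (or missing) key is an error rather than being skipped.
        if key is None:
            raise ValueError(f"null key in field {key_field!r}")
        # Key types: ID! and String! both serialize as strings in a response.
        if not isinstance(key, str):
            raise TypeError(f"key must be a string, got {type(key).__name__}")
        # Duplicate keys: assigning again overwrites, so the last value wins.
        result[key] = item[value_field]
    # Key order: Python dicts keep insertion order, matching the "should" wording.
    return result


rows = [{"id": "Foo", "val": 1}, {"id": "Bar", "val": 5}, {"id": "Foo", "val": 2}]
print(list_to_map(rows, "id", "val"))  # {'Foo': 2, 'Bar': 5}
```

Note that "Foo" keeps its original position while taking the later value, which is also how Object.fromEntries behaves in JS.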

@mjmahone
Contributor

mjmahone commented Oct 7, 2021

IMO there is a real need for user-created generic types, a la

abstract type KeyPair<K, V> {
  key: K
  value: V 
} 

which you could use in a schema type via:

type User {
  named_friends: [KeyPair<ID, User>]
}

which you could use at execution time like:

query Q {
  me {
    named_friends {
      key
      value {
        name
      }
    }
  }
}

Note we already have generic types (lists and nullable types), we just don't allow user-creation of generic types.

This would also allow different schemas to expose different kinds of maps depending on what's needed:

abstract type MapViaTwoLists<K, V> {
  keys: [K]
  values: [V]
}

abstract type MapViaPairs<K, V> {
  values: [KeyPair<K, V>]
}

abstract type TreeMapNode<K, V> {
  key: K
  value: V
  left: TreeMapNode<K, V>
  right: TreeMapNode<K, V>
}

abstract type FullTreeMap<K, V> {
  nodes: [TreeMapNode<K, V>]
}

I believe this is an alternative to adding a single map type: it's unclear what is actually wanted for a map type. Do you want to create some tree with O(log n) lookup? A bijection between two groups of data? Something else?
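
To make the comparison concrete, here is a minimal Python sketch of two of the shapes above written as user-defined generic types. The class names mirror KeyPair, MapViaTwoLists and MapViaPairs from this comment; the `to_dict` helpers are assumptions added for illustration and are not part of the proposal.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

K = TypeVar("K")
V = TypeVar("V")


@dataclass
class KeyPair(Generic[K, V]):
    key: K
    value: V


@dataclass
class MapViaTwoLists(Generic[K, V]):
    keys: list[K]
    values: list[V]

    def to_dict(self) -> dict[K, V]:
        # Keys and values are matched by position.
        return dict(zip(self.keys, self.values))


@dataclass
class MapViaPairs(Generic[K, V]):
    values: list[KeyPair[K, V]]

    def to_dict(self) -> dict[K, V]:
        # Each pair carries its own key, so no positional matching is needed.
        return {pair.key: pair.value for pair in self.values}


two_lists = MapViaTwoLists(keys=["a", "b"], values=[1, 2])
pairs = MapViaPairs(values=[KeyPair("a", 1), KeyPair("b", 2)])
print(two_lists.to_dict() == pairs.to_dict())  # True
```

Both shapes carry the same information here, which is the point: the schema author picks the representation, and the client decides how to turn it into a native map.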

@leebyron
Collaborator

leebyron commented Oct 7, 2021

Feedback from Oct WG for posterity:


We should better define what problem we want to solve, which will guide us to the best path to invest proposals into. For example:

If the problem is that we have "bags of data" on the server, such that the keys and values are generally unknown - then the past discussions around the JSON scalar might be most compelling.

If the problem is true Map types on the server that need to be translated to true Map types on the client, then a full featured Map type (or pattern) will be the most compelling.

If the problem is wanting to utilize JSON objects as Maps in the query response, then a transformation syntax will be the most compelling.

It could be that multiple of these are worth exploring to a degree, but better understanding what problem we're solving will help us focus on making the right tradeoffs and explore the solution space most likely to have an impact.


In terms of the transformation syntax, I personally think it is super interesting. It feels in line with other related ideas of aliased fragments and nested objects as well as field chaining. I almost want to say we should hold back on something like this until we could properly explore the use cases and surface area of all kinds of transformation to better understand the total design space.

That said, I want to stay cautious about my own interest because if I apply our guiding principles then I think this path starts to seem misaligned.

Most importantly, "Enable new capabilities motivated by real use cases" seems hard to argue since this is purely a transformation of something we could already do. We have to ask ourselves if having the query and service do this transform rather than a client-side post-process is worth the complexity. Which comes to "Simplicity and consistency over expressiveness and terseness". The syntactic complexity, the restrictions on keys, the concerns about lost-data footguns... these all seem like they're big hits to simplicity and consistency all for some fairly minor expressiveness wins. And then "Favor no change" means we need to set a high bar of added value before making significant changes. I think a broader transformation proposal could potentially leap this hurdle, but right now I'm not so sure.


In terms of the Map type, looking to prior art, I think our clearest source of inspiration could be protobuf map types.

An important distinction is that this protobuf map type still uses a list of (k, v) tuples in the serialized transport, rather than something unique. Doing the same ourselves would maybe be disappointing to those looking to leverage JSON objects as Maps, but would allow support for far more key types.

This would require some client-side transformation to convert the entry pairs back into a native map type (since JSON doesn't have one). Presumably a code generator or something like that could do that for you.
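
A minimal sketch of that client-side step in Python, assuming a hypothetical `usersById` field whose response is a list of `{"key", "value"}` entries. The field name, the payload and the `entries_to_dict` helper are all made up for illustration.

```python
import json

# Hypothetical response payload: the map is sent as a list of entries,
# so keys can be integers, which JSON object keys cannot be.
response_text = """
{"data": {"usersById": [
  {"key": 7, "value": {"name": "Ada"}},
  {"key": 42, "value": {"name": "Grace"}}
]}}
"""


def entries_to_dict(entries):
    """Convert a list of {"key": ..., "value": ...} entries into a dict."""
    return {entry["key"]: entry["value"] for entry in entries}


users_by_id = entries_to_dict(json.loads(response_text)["data"]["usersById"])
print(users_by_id[42]["name"])  # Grace
```

A code generator could emit a helper like this per map field, which keeps the transport format simple while still giving clients a native map.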

I can see two paths forward for this space.

  1. An explicit Map type, whatever the syntax. A rough idea could be { key: value }, as lists are [value], so as an example: { Int: MyType }.

  2. A community standard, similar to *Connection, roughly *Map so examples of each:

    # Example of existing Connection pattern:
    type UserConnection {
      edges: [UserEdge]
    }
    type UserEdge {
      node: User
    }
    
    # Potential example of a Map pattern:
    type UserMap {
      entries: [UserEntry]
    }
    type UserEntry {
      key: ID
      value: User
    }

I think what @mjmahone is getting at is that maybe there exists a higher level type we could introduce that would allow both Connection and Map to be defined in user space, but I'm slightly worried that side-steps the part where client tooling needs to generate code to handle these types.

Certainly the community standard would be the easiest path since there's no change to spec necessary and we can immediately build tools that use it. But it would be more awkward to query than something built into the spec.

@nojvek
Author

nojvek commented Nov 3, 2021

Thanks @leebyron

I just wanted to say this was my first time at a GraphQL WG meeting. It was very well organized, and seeing the agenda followed on time was a rare sight.

Even though folks had strong opinions, they were directed at the technical details rather than being personal. I felt quite welcome. Thank you.

I did learn that GraphQL responses are not limited to JSON; other serialization formats are allowed. Previously my assumption was that GraphQL responses are always JSON. That changes the design constraints.

I am okay if we close this PR / RFC. Happy for someone else to open another more concrete proposal.

@nojvek
Author

nojvek commented Nov 3, 2021

I also very much like the proposal of built in / user-extendable generics-ish types by @mjmahone

e.g. Pair<Val1T, Val2T> or Map<KeyT, ValT>.

But having worked in TypeScript's codebase, generics can get pretty wild fast. So there needs to be a more trimmed down proposal.


As for the transformation proposal by @IvanGoncharov, I feel it has legs. We shouldn't discount developer experience.

GraphQL's alias syntax is powerful. One could say the same result is possible if the frontend made multiple GraphQL queries and stitched the results together with a rename, but aliases make this much easier.

Currently, consuming dictionary structures from GraphQL is painful. It doesn't have to be JSON objects: JS has a Map type, Python has a dict type, Go has a map type. Most languages have built-in map types that can handle any key/value pair of types.

I do agree that having keys restricted to Strings is bad design.

GraphQL supporting a List -> Map transform that can work with a variety of key/value types is more powerful.

I feel there is a world where both @mjmahone and @IvanGoncharov's ideas can live together. Will have to do some deeper thinking.

@leebyron
Collaborator

leebyron commented Nov 4, 2021

But having worked in TypeScript's codebase, generics can get pretty wild fast. So there needs to be a more trimmed down proposal.

Totally agree. Our existing very simple type system already has a fair amount of complex ramifications, and I'd be very worried without some firm constraints.

We also need to consider the ecosystem around the query & schema language itself. Introspection, code generators, and more need to know how to make the best use of these things. I think this is all solvable, but we should be confident that it is (and reasonably so, such that the ecosystem doesn't constrain too much) before we seriously consider it.

@nojvek
Author

nojvek commented Nov 9, 2021

I'm closing this proposal PR as a no go. We can always open another proposal keeping some of the discussion points in this thread in consideration.

@nojvek nojvek closed this Nov 9, 2021
@acao
Member

acao commented Nov 9, 2021

For the record, please don’t avoid new language features on our behalf! GraphiQL, the LSP server, and Playground use the same underlying parser and language service, so we are set up to handle new language features. We added "interfaces implementing interfaces" quite easily. Also remember that there is a ton of GraphQL tooling outside the JS/TS space, often using a mirror of the graphql-js reference implementation in their language, so a reference PR to graphql-js for any language feature has ripple effects across the many languages that use GraphQL.

Labels
💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md)