Support for Apollo Federation and Microservice Architecture #24

Open
johnkm516 opened this issue Aug 29, 2022 · 40 comments

Comments

@johnkm516

This is more of a question than an issue. I couldn't find a discussion board or slack, sorry if this is the wrong place to ask.

This repo is exactly the starter-kit I was looking for as it minimizes writing repetitive code that is defined in the SDL anyways, and makes full use of all generated Prisma Inputs in the resolvers.

However I am wondering how scalable is the current architecture of this project? Right now all the generated schemas are input into one application so it looks like it's a monolith. Would it be possible to separate services into different app folders, and federate them with Apollo Federation using Apollo Router? Looking over the gulpfile and the relevant code-gen I definitely think it's possible, but I was wondering if it is worth the effort, as I might be missing something (the current architecture might not be a monolith at all).

@ZenSoftware
Owner

Hello @johnkm516. I was planning on setting up a Discord at some point, but currently here is the best place to discuss things. Yes, I was trying to reduce all the most redundant tasks possible by unifying as many systems as possible. I'm glad you see the value in it 😁.

In terms of architecture, it is a sort of monolith in that the API was intended to be a single endpoint to give access to Prisma via GraphQL. I have been looking into Apollo Federation lately and have been mulling over how it would best fit into this project. I'm trying to gain a bit more experience with federating multiple graphs to get a handle of the technology first. I am completely open to re-architecting things if it aids in making things more modular.

The original aim was to have a single javascript import endpoint in which you could access the API. This ended up as an Nx Angular library named @zen/graphql, which is where all the front-end services are being generated. It really simplifies access to the API when there is only one place to look. Though, I was worried about bloating the front-end with unused GraphQL documents that were being code generated. Fortunately there is a way to do tree shaking via webpack magic comments. So only the code that is being used is being included in the final bundle.

You can see the configuration that is generating PURE magic comments here:

pureMagicComment: true

Again, I am a bit inexperienced with GraphQL federation and not sure what would be a generalized solution that would work well. I'd be happy to work on this issue with you. Could you elaborate a bit more on how you were envisioning the services to be separated?

@johnkm516
Author

johnkm516 commented Sep 1, 2022

Hi, @ZenSoftware,

To be honest with you, I'm a complete newbie myself. I've spent the past few months trying to figure out all the bits and pieces for a new backend architecture that my organization can adopt (my org currently uses a monolith), and it has been really difficult trying to understand and fit pieces like Prisma, Nexus, and Apollo together. I want to use GraphQL and the power it brings with Apollo Studio and introspection, but there was so much boilerplate code compared to normal CRUD APIs that I had to find a solution that auto-generated much of it. After cloning literally dozens of repos trying to figure all this out, your repository was the closest one I came across: it takes a schema-first approach (where the Prisma schema is the single source of truth) and generates all the typedefs and resolvers (using Prisma was important to me).

While I'm not going to be much help on the development front (at least right now as I'm still working to find a solution while simultaneously learning), I did learn a lot in the architecture side of things and the best practices on this end. The architecture I'm envisioning looks like this :

  • A monorepo with a prisma folder in the root folder of the project, and subschema microservice app folders. The root prisma folder would contain one subfolder per microservice app, each named after its app and holding that app's schema.prisma file. A script run from the root would copy each schema.prisma file to its respective app folder and auto-generate code for each app as this repo does currently. Each of the subschema microservices would use an ApolloFederationDriver. I just tested this on your repo and it seems that buildSubgraphSchema accepts the GraphQL schema generated in app/graphql/resolvers/index.ts; all I did was switch out the driver to ApolloFederationDriver and the gql config to ApolloFederationDriverConfig, and comment out the subscriptions portion (I got an error here saying ApolloFederationDriverConfig currently does not support subscriptions). While the single GraphQL subschema currently works, I haven't tested whether the federation part with another subschema actually works; I'm sure the code generation needs some modification to fit Apollo's subschema conventions. There would of course be a federation gateway service that acts as the single point of API access and handles authentication and authorization, and the supergraph schema would be a managed federation. In terms of Angular, I didn't really consider how the code generation would produce the front-end services, as I'm solely focused on the backend right now and the frontend is out of scope for me. I did look into it a little bit however, and perhaps https://github.com/jamiepine/prismix could be the key here? Since prismix combines the multiple schemas into one schema, the build process would be something like :
  1. Copy all subschemas to their respective app prisma folders.
  2. Run typedefs and resolver code generation for each of the respective apps
  3. Run prismix to combine all the subschemas into a single schema
  4. Run any frontend code generation that currently relies on a single schema
  • The main problem with microservices and separating schemas is that there are many cases where a service has to get data from another service. My company is an apparel manufacturer, and pretty much everything revolves around purchase orders from our buyers. For example, a purchase order contains several orders with very large amounts of data, from style # (the specific apparel's code), purchase order #, all the different sizes, all the different colors, all the different quantities at different points of the manufacturing process (spreading, cutting, sewing, QA, packaging), ship dates, and we also have to keep track of TNA such as all the logistics and import of raw materials, the export of the product, and so much more (you can see why separation of concerns and modularity with a federated supergraph is extremely important for my company). So I looked into interprocess communication with Kafka, or gRPC, event driven architectures, and honestly all of this makes things too complicated with even more boilerplate code. So here's what I'm thinking :
  1. Maintain separation of concerns and model out schemas using best practices of what an Apollo subgraph should look like
  2. Code generate all the queries and mutations from the prisma schemas as this repo currently does. Do not put any extra logic in the resolvers or anything, the GraphQL API would only serve to represent and serve data (obviously there could be extra resolver logic here as long as it only refers to data within the bounds of the subschema itself with any basic logic within the bounds of the subschema).
  3. Use a microservice orchestration solution like Prefect for all the non-CRUD logic and have it serve as the "brain" of all the main business workflows. That way there is no circular dependencies, no extra latency with hops between microservices, and no hassle in generating protobuf for gRPC or trying to manage non-graphql transporters and pub/sub logic. The orchestration solution would call the gateway and therefore the supergraph, and all the important processes would be done in the orchestration solution (which handles workers as well in a serverless manner and highly scalable), which in turn would also call back to the gateway and mutate the data of all the subgraphs simultaneously through the gateway. Debugging would also be very intuitive and business workflows will be far easier to understand, as solutions like Prefect show a visual rundown of the workflow : https://www.prefect.io/guide/blog/introducing-radar/
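The four-step build process listed earlier in this comment (copy the subschemas, generate per app, merge with prismix, generate the frontend code) could be sketched as a plain list of steps. This is only a planning sketch under assumptions: the folder layout, the step names, and the planBuild function are hypothetical and do not exist in the repo, and nothing here actually invokes prismix or the code generators.

```typescript
// Hypothetical sketch: every path, step name, and function below is an
// assumption made for illustration, not part of the repo.
type BuildStep =
  | { kind: 'copy-schema'; from: string; to: string }
  | { kind: 'generate-api'; app: string }
  | { kind: 'merge-schemas'; inputs: string[]; output: string }
  | { kind: 'generate-frontend'; schema: string };

function planBuild(services: string[]): BuildStep[] {
  const steps: BuildStep[] = [];

  // 1. Copy each subschema from the root prisma folder into its app.
  for (const name of services) {
    steps.push({
      kind: 'copy-schema',
      from: `prisma/${name}/schema.prisma`,
      to: `apps/${name}/prisma/schema.prisma`,
    });
  }

  // 2. Run typedef and resolver generation once per app.
  for (const name of services) {
    steps.push({ kind: 'generate-api', app: name });
  }

  // 3. Merge all subschemas into one schema (the role prismix would play).
  const merged = 'prisma/schema.prisma';
  steps.push({
    kind: 'merge-schemas',
    inputs: services.map(name => `apps/${name}/prisma/schema.prisma`),
    output: merged,
  });

  // 4. Frontend generation still reads a single merged schema.
  steps.push({ kind: 'generate-frontend', schema: merged });

  return steps;
}
```

Keeping the plan as data makes the ordering explicit: the merge step only runs after every per-app step, so the per-app generators never see the combined schema.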

The microservice orchestration portion would belong to a different repo for main business processes and is outside the scope of this boilerplate project but just explaining what I'm thinking here.

Transitioning from a monolith to new technology like GraphQL and microservices and trying to decide what the "right" architecture is has been extremely difficult. There are so many different opinions, implementations, frameworks, and libraries / modules that, after learning and researching so many different stacks and services, I still haven't gotten to actually coding (It's always like, oh what's this? Is this better than the other thing? Can this fit into the stack I am currently thinking of?). While I've learned a lot the past months and formed my own opinions on what the backend architecture should look like, I don't know for certain if the architecture I describe above is "correct". I'd love to discuss this with you further either here or somewhere else like discord or slack if you'd like. Let me know what you think, and whether my ideas are even feasible in terms of actually implementing things.

@johnkm516
Author

johnkm516 commented Sep 1, 2022

Still researching on solutions in terms of implementation, and found this :
https://github.com/0xR/graphql-transform-federation

Looks like we can apply Middleware while leaving the current code generation as is to apply Apollo directives such as @key and make all the types extendable so that other subgraphs may extend the types, after the fact.

Edit :

Found a gateway solution that comes with a whole range of features including auth and subscription and can unify not just apollo federated graphql but basically every other API as well, and composes during compile time a "virtual graph" : https://docs.wundergraph.com/docs/use-cases/api-composition-and-integration#docker-but-for-ap-is

@ZenSoftware
Owner

ZenSoftware commented Sep 2, 2022

I am going to rework the dev ops so that the schema.prisma paths and output directories can be passed in as arguments. That will modularize things and turn the tool from a monolith into a more generalized tool.

I'd like to bring up some complexities in getting the auto-completion tooling to work with multiple GraphQL schemas within VSCode. There will need to be multiple apollo.config.js files. I actually submitted a feature request for this a while back. This used to be pretty buggy in the past, but hopefully the team working on the extension has ironed out the issues with monorepos and multiple GraphQL schemas.

I'm going to play with some of the tools that you have linked. Thanks for sharing your investigations with me. Give me a little bit to work on this and I will get back to you with a more comprehensive response. All of this should be do-able.

I will be working on this on the federation branch for now. I plan to merge it into the base branch and eventually get rid of the branch.

Here is the latest commit that contains the baseline changes that integrate ApolloFederationDriver.
e5fcdda

@johnkm516
Author

johnkm516 commented Sep 4, 2022

@ZenSoftware
I've done some further investigating.
https://github.com/0xR/graphql-transform-federation
currently is out of date and does not work. Furthermore, Wundergraph while it has a lot of features out of the box, currently does not support Apollo Federation 2.

I'm playing around with the graphql-transform-federation in an effort to update the libraries to the latest versions and adapt the code accordingly.
The reason why I think Federation 2 is a must is because of Federation 2's features; specifically, the difference in how subgraph entities are managed and how it simplifies code generation rules for our Prisma schemas. In Federation 1, there is an "originating" subgraph, and other graphs may extend the originating subgraph. In Federation 2, there is no originating subgraph; Apollo Gateway will automatically find entities of the same type and merge them.

image

This very much simplifies code generation: we can simply add the @key directive to every entity object that is code generated from the schema. Furthermore, any types containing only the key (stubs that are used only to refer to other subgraphs rather than extend the entity with additional data) can have a code generation rule that adds the "resolvable: false" argument to their @key directive :

image
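The code generation rule described above could be sketched as a small SDL renderer. This is a minimal sketch, assuming the generator already knows each model's fields and key fields; the FieldDef shape and the renderEntityType function are hypothetical names for illustration and are not part of PalJS.

```typescript
// Hypothetical helper: FieldDef and renderEntityType are illustrative names.
interface FieldDef {
  name: string;
  type: string; // GraphQL type as it should appear in the SDL, e.g. "Int!"
}

function renderEntityType(
  typeName: string,
  fields: FieldDef[],
  keyFields: string[],
  resolvable = true
): string {
  // Federation 2 takes the key as a space-separated field set.
  const fieldSet = keyFields.join(' ');
  // Stubs that only reference an entity owned elsewhere opt out of
  // reference resolution with `resolvable: false`.
  const keyArgs = resolvable
    ? `fields: "${fieldSet}"`
    : `fields: "${fieldSet}", resolvable: false`;
  const body = fields.map(f => `  ${f.name}: ${f.type}`).join('\n');
  return `type ${typeName} @key(${keyArgs}) {\n${body}\n}`;
}

// A full entity owned by this subgraph.
const userSdl = renderEntityType(
  'User',
  [
    { name: 'id', type: 'Int!' },
    { name: 'email', type: 'String!' },
  ],
  ['id']
);

// A key-only stub that refers to an entity in another subgraph.
const userStubSdl = renderEntityType('User', [{ name: 'id', type: 'Int!' }], ['id'], false);
```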

I'm currently working on a brand new project from scratch that adds pal.js code generation and transforms the graphql schema object using my updated graphql-transform-federation. I'm ironing out all the errors currently occurring from updating all the libraries to latest. I'll let you know if I get it working.

@ZenSoftware
Owner

ZenSoftware commented Sep 5, 2022

Hi John. Thanks for sharing your progress. I'm glad you are investigating solutions into getting Apollo Federation v2 working for us. I agree with your analysis that we would want to be v2 compatible. I have worked tirelessly to get all of the dependencies within the project to the latest versions possible myself. We are currently blocked on upgrading to Prisma v4 due to PalJS. I was thinking of directly pulling out the Apollo SDL generation of PalJS into something we would have more control over as well. In particular we seem to need the ability to inject the appropriate @key directives to the GraphQL SDL types of the subgraphs. I'm glad you are ahead of me on this.

As I mentioned previously, I am reworking the dev ops currently. I am ripping out Gulp and refactoring things to simply use ts-node to execute .ts scripts. In the process of redoing things, I am going to rework the code generation that I am producing to be more modular. I think it might help to give you a bit of an overview of the tools currently being used to spit out code for us.

Of course other front-end technologies like React will be able to use our web APIs, but because there is so much technological investment in @zen/auth to handle the authorization for Angular, the first step would be to get the current auth system working with Apollo supergraphs & subgraphs to ensure correctness.

Auth seems to touch everything, and I believe I will need to extract out the Nest modules under zen/apps/api/src/app into their own standalone node libraries that multiple Nest apps can import & configure.

Let me first get most of the dev ops reworked before I extract the Nest modules into their own standalone libraries. I will need to work out how to best model authorization in our sort of federated model; I haven't settled all the details just yet. I'll report back once I have made more progress on this. 🎐

@ZenSoftware
Owner

ZenSoftware commented Sep 5, 2022

@johnkm516
I overhauled the dev ops. I completely scrubbed out Gulp from the project and now the build scripts are pure Node scripts written in TS. Furthermore, instead of calling the PalJS CLI, I directly import from @paljs/generator and execute the generator from there. I refactored things under generate.ts at the root of the project. All the dead code has been scrubbed out of the codebase and now it should be much clearer what is being generated.

The package.json scripts have been renamed:

  • apollo:gen is now frontend:gen
  • gql:gen is now api:gen

I managed to reduce the total size of the docker image as well by going over all the packages with a fine tooth comb and removing all unused packages. You can pull the latest changes from the base branch. There is more work to be done, but now at least a single application's API can be generated using the generate.ts script. We can start modularizing things now that a single application is stable.

@johnkm516
Author

johnkm516 commented Sep 7, 2022

@ZenSoftware
I updated deps of https://github.com/0xR/graphql-transform-federation and forked into my own repo. You can find it at
https://github.com/johnkm516/graphql-transform-federation

I also created another repo https://github.com/johnkm516/maple off of another template to test if the SDL transform works, and it does (querying the SDL from the Apollo sandbox shows the directive and resolve reference correctly). This portion of the code transforms the already built schema to add the key directive to the User model and resolveReference for findUniqueUser :

const federationSchema = transformSchemaFederation(schemaWithoutFederation, {
  User: {
    keyFields: ['id'],

    resolveReference(reference, context: { [key: string]: any }, info) {
      console.log(info.returnType.toString());
      return delegateToSchema({
        schema: info.schema,
        operation: OperationTypeNode.QUERY,
        fieldName: 'findUniqueUser',
        args: {
          id: (reference as UserWhereUniqueInput).id,
        },
        context,
        info,
      });
    },
  },
});

However I question the usefulness of this as this part also has to be code-generated, for all the different queries and mutations. I think it would be far more useful to code-generate the federation subschema from the get-go from paljs by extending paljs with a federation : true config and generating the resolveReferences and directives accordingly.

I'll clone your federation branch and also look into changing paljs directly. I'll be honest with you, while I've been looking into NestJS for quite a while I never actually got around to coding in it (in fact I never touched typescript before), so as I mentioned before I'm a huge newbie so my code might look weird. I like to think I'm a fast learner so I'll just dive in anyway and hopefully be of some use. I'll again update on any progress I have if I have any.

@ZenSoftware
Owner

ZenSoftware commented Sep 7, 2022

@johnkm516
I'm always happy to see others put in the effort to grow. How you improve as a developer is to solve challenging problems. This is certainly one of them. You seem to be competent with systems integrations, which in itself is a highly valuable skillset and carries over well to programming. Fortunately Nest really couldn't make things easier to build and run a web server. It is actually the ideal platform for newcomers. You can achieve a tremendous amount with very little code.

I changed the output directory for PalJS generated assets from /generated to /paljs. There should be zero ambiguity as to who is generating what now. You can see my little TS script that will generate the varying assets for a single Nest API.

generate.ts

const generator = new ZenGenerator({
  palConfig: require('./pal.js'),
  apiOutPath: path.join(__dirname, 'apps/api/src/app/graphql'),
  caslOutFile: path.join(__dirname, 'apps/api/src/app/auth/casl/generated.ts'),
  frontend: {
    outPath: path.join(__dirname, 'libs/graphql/src/lib'),
  },
});

await generator.run();

We can reuse this script for any number of schema.prisma files & apps. It is currently configured to output to the api application for demonstration purposes.
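Reusing the script for several apps mostly comes down to deriving one set of output paths per app. A minimal sketch, assuming each app mirrors the directory layout of the current api app; the schemaPath option, the prisma/<app> folder, and the optionsForApps function are hypothetical, while apiOutPath and caslOutFile mirror the option names shown in generate.ts above.

```typescript
// Hypothetical sketch: schemaPath, the prisma/<app> layout, and
// optionsForApps are assumptions; only apiOutPath and caslOutFile
// mirror option names from generate.ts.
interface AppGeneratorOptions {
  schemaPath: string;
  apiOutPath: string;
  caslOutFile: string;
}

function optionsForApps(rootDir: string, apps: string[]): AppGeneratorOptions[] {
  return apps.map(app => ({
    // One schema.prisma per app, kept under the root prisma folder.
    schemaPath: `${rootDir}/prisma/${app}/schema.prisma`,
    // Same relative output locations the api app uses today.
    apiOutPath: `${rootDir}/apps/${app}/src/app/graphql`,
    caslOutFile: `${rootDir}/apps/${app}/src/app/auth/casl/generated.ts`,
  }));
}
```

Each resulting object could then be handed to its own generator instance, one run per app.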

On another note, for any given Prisma schema there is a lot that PalJS is generating for us. It generates both the TypeScript types under resolversTypes.ts and the GraphQL SDL types within ../typeDefs.ts files. There is so much that it is taking care of for us that it feels a bit daunting to extract out the relevant bits of code from a fork. I'm concerned about losing out on the benefits of getting future upgrades as well.

Instead, maybe it would be best to keep PalJS within the pipeline and do some sort of post-processing step after PalJS that would modify the typeDefs.ts files. You can see the GraphQL SDL type of the Prisma model being defined there. A simple experiment would be to manually add @key directives to the typeDefs.ts files PalJS generates and see if it does what we think it should do.

Before I start experimenting with Apollo Federation v2 myself, I need to extract out the authorization module for Nest and turn it into a standalone library. Multiple Nest modules will be able to import and configure their own @casl/prisma abilities for their own respective Prisma schema that way. That should move us much closer to making everything fully modular.

Let me know if you run into any issues or have any questions. Cheers!

@johnkm516
Author

johnkm516 commented Sep 8, 2022

@ZenSoftware

I read through the PalJS code generation yesterday and it's actually quite easy to understand. I created my own fork and added the

extend schema
    @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key", "@shareable"])

for federation 2 header in typedefs, as well as adding the "federation : boolean" in config of the generator class.
You can read over my changes in my fork. Given how relatively simple the code generation is (and my changes to it), I don't think my changes would be difficult to merge when PalJS eventually gets upgraded with Prisma V4, since all I'm doing is adding some extra string concatenation.
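The "extra string concatenation" could look roughly like the following. This is a sketch of the idea only, not the code in the fork: the federationHeader and buildTypeDefs functions are hypothetical names, and the real generator may structure this differently.

```typescript
// Hypothetical sketch of prepending the Federation 2 header to generated
// typeDefs; function names are illustrative, not taken from the fork.
const FEDERATION_SPEC_URL = 'https://specs.apollo.dev/federation/v2.0';

function federationHeader(imports: string[]): string {
  const importList = imports.map(name => `"${name}"`).join(', ');
  return `extend schema\n    @link(url: "${FEDERATION_SPEC_URL}", import: [${importList}])\n`;
}

function buildTypeDefs(modelSdl: string, federation: boolean): string {
  // With the flag off, the output is byte-for-byte what it was before.
  if (!federation) {
    return modelSdl;
  }
  return `${federationHeader(['@key', '@shareable'])}\n${modelSdl}`;
}
```

Because the flag only prepends text, turning federation off leaves the existing output untouched, which is part of why such a change should stay easy to re-apply on top of future PalJS releases.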

While I agree with you that our changes would ideally be a post-processing step that modifies the generated files after they're created, so that we don't modify PalJS directly, we would have to keep changing that post-processing as Apollo adds new directives and Prisma gets updated. Modifying the PalJS generator directly, on the other hand, lets us use @paljs/schema and all the data models it uses to generate. My fork of graphql-transform-federation works after updating the dependencies, but the way it's implemented makes it hard to add new directives like "@shareable" that are not currently in the implementation. Modifying the generation from the get-go in PalJS is simple string concatenation, so I don't have to worry about taking a GraphQLSchema object as input, reading and modelling the schema to modify it, then building the modified output GraphQLSchema. So I am leaning more toward manually merging my changes with any future PalJS upgrades, since my changes won't be too extensive.

Furthermore, the SDL generation actually only uses a few files and has very little cross-dependencies. It's very much self contained and the code looks extremely simple to extract. We could just extract the SDL generator, upgrade the dependencies to latest including Prisma v4, and modify it for our own use. All the files that are relevant to the SDL generation that we use is contained within :

image

with some utility libraries like Prettier, fs, and some custom type objects from @paljs/types.
How the SDL generation works is it takes the DMMF from @prisma/sdk (now @prisma/internals in Prisma v4) and uses it to concatenate strings and output to the files. Very simple, no other dependencies than the ones I listed. In fact I'm going to try and upgrade the deps to Prisma v4 now.

Let me know what you think.

@johnkm516
Author

@ZenSoftware

I upgraded all the deps to latest including to Prisma v4, and tested SDL generation. Works fine.

@johnkm516
Author

johnkm516 commented Sep 8, 2022

@ZenSoftware

I have completed modifications to the PalJS code to generate a federation subschema. I haven't tested through a gateway yet but testing the _entities(...): [Entity]! in Apollo Sandbox to emulate a federation gateway call to a subschema it seems to work. Please have a look over at my fork, npm pack it and replace the palJS with mine and try it out. Let me know what you think.

Edit : Ok never mind I'm having issues when I tried building everything from a clean proj. Need to fix deps and build issues. I'll let you know when I get everything working properly.

Edit 2 : I think it works now?
You have to npm pack the /types and /generator and add them to a local lib folder and change the package.json accordingly.

@ZenSoftware
Owner

ZenSoftware commented Sep 8, 2022

@johnkm516

Well done my friend. I have made substantial progress myself. Let me share with you my findings.

According to the Apollo Federation subgraph docs the signature for the __resolveReference function should take 2 parameters.

  1. Reference object
  2. Context

Therefore the code should look something like this.

paljs/User/resolvers.ts

const resolvers: Resolvers = {
  // ...
  User: {
    __resolveReference(reference, { prisma }) {
      const [field, value] = Object.entries(reference).find(e => e[0] !== '__typename');
      return prisma.user.findUnique({ where: { [field]: value } });
    },
  },
};

There might be a better way to do this, but notice how I extract the @key fields and do the lookup with Prisma directly. This is how we could resolve our Prisma models given a reference object. But there may be some limitations with this approach. Let me mull this one over for a bit and I will get back to you. But the code above should function for us for now.
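One limitation worth spelling out: if the reference carries no key field, the destructuring of the find(...) result throws an unhelpful error, and a reference with several key fields silently uses only the first. A slightly more defensive version of the key extraction could be sketched as below. It is only a sketch: referenceToWhereUnique is a hypothetical helper name, and compound keys are rejected rather than mapped, since how they map to a Prisma unique input depends on the schema.

```typescript
// Hypothetical helper: turns a federation reference into a `where` object
// for a single-field key. Compound keys are deliberately not handled.
type EntityReference = { __typename: string; [field: string]: unknown };

function referenceToWhereUnique(reference: EntityReference): Record<string, unknown> {
  // Everything except __typename is treated as a key field.
  const keyFields = Object.keys(reference).filter(name => name !== '__typename');
  if (keyFields.length !== 1) {
    throw new Error(
      `Expected exactly one key field on ${reference.__typename}, got ${keyFields.length}`
    );
  }
  const field = keyFields[0] as string;
  return { [field]: reference[field] };
}
```

Inside the generated resolver, the returned object would be passed as the where argument of prisma.user.findUnique, in place of the inline destructuring.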

Furthermore, the TS type can simply be set to any for now.

resolversTypes.ts

export interface User {
  [key: string]: Resolver<any, any, any>;
  id?: Resolver<Client.User, {}, number>;
  // ...

  __resolveReference?: any
}

Because there is an index signature type of [key: string]: Resolver<any, any, any>, it is making it difficult to explicitly type the __resolveReference function without breaking other types. It will be fine to set it to any.

The __resolveReference generated will be consumed by the Nest GraphQL resolver like so. Reference the Nest Federation docs on the @ResolveReference decorator for more info.
graphql/resolvers/User.ts

@ResolveReference()
resolveReference(@Parent() reference, @Context() ctx: IContext) {
  return resolvers.User.__resolveReference(reference, ctx);
}

My code gen templates will need to include the above code for us when generating the Nest resolvers. I'll be sure to do that once we figure out our situation with the PalJS fork.

As you suggested, I used npm pack to install a local package and then changed the imports within tools/zen-generator.ts to utilize your forked PalJS packages. Then I updated the pal.js config at the root to include the federation option you added. Testing things out, your modified packages seem to compile and execute just fine.

pal.js

module.exports = {
  schema: 'prisma/schema.prisma',
  backend: {
    generator: 'sdl',
    output: 'apps/api/src/app/graphql/paljs',
    doNotUseFieldUpdateOperationsInput: true,
    federation: true,
  },
};

Furthermore, I thought that you should also know that I added the GraphQL SDL definitions for the @key and @link directives. The global-schema.gql.ts has been updated within the federation branch to include the new federation directives that we are using, specifically @key and @link. I found the standard SDL signatures at Apollo Federation subgraph specification and copied them over so that Nest GraphQL schema validator is aware of them.

By the way, there is a small bug at johnkm516/prisma-tools/packages/generator/src/sdl/index.ts#L123
whereby it is outputting an uppercase model name. It should be producing the lowercase variant here. You may need to move the code out of createFiles(...) and into a spot where you will have access to the model.name.
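For context, Prisma Client exposes each model under its name with the first letter lowercased (model User becomes prisma.user), so the generated code needs that variant. A minimal sketch of the conversion, where toDelegateName is a hypothetical helper name and PurchaseOrder is just an example model:

```typescript
// Hypothetical helper: derives the Prisma Client property name for a model.
function toDelegateName(modelName: string): string {
  // Only the first character changes; the rest of the name is kept as-is.
  return modelName.charAt(0).toLowerCase() + modelName.slice(1);
}

const userDelegate = toDelegateName('User');
const purchaseOrderDelegate = toDelegateName('PurchaseOrder');
```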

You can pull the latest changes from the federation branch. It now contains the compatibility changes needed. Do let me know if you need clarification on anything here.

@johnkm516
Author

@ZenSoftware
I made all the modifications you listed.

  • Commented out the Fed2 directives header in typedefs
  • Changed the interface to __resolveReference?: any
  • Changed the code generation to
const resolvers: Resolvers = {
  // ...
  User: {
    __resolveReference(reference, { prisma }) {
      const [field, value] = Object.entries(reference).find(e => e[0] !== '__typename');
      return prisma.user.findUnique({ where: { [field]: value } });
    },
  },
};

I'm currently away on holiday (it's a Thanksgiving-like long weekend in my country) so I won't be able to work until the 13th.
Let me know if you want to change anything with the __resolveReference if you have any better ideas on how we should go about it. During some downtimes over the long weekend I'm going to do some research on auth and how the auth with the federated services and gateway should be handled. I am also thinking about some very granular permissions capabilities; here's what I'm thinking about the design :

At the very top level is the Tenant. Separating the databases even further from DB per service to DB per (tenant + service), it would be easier to also have something like a test tenant when the data gets more and more complicated.

The second level would be Role. Superadmin would have access to all the tenants and all permissions, and admin with all permissions to a specific tenant, and a variety of user Roles such as Finance, Sales, etc.

The third level would be Organizational Group, in a hierarchical tree structure as all organizations would have. For example, if I belong in the X Organization, in the IT department, and in team A of the IT department, I should have all the permissions that belong to X Organization, the IT department, and team A. Team B of the IT department will have all the permissions of X Organization and the IT department, but not necessarily permissions specific to team A, and vice versa. I plan on creating a separate service that syncs the organizational hierarchy with the company's Active Directory later down the road.
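The inheritance rule for organizational groups can be sketched as a walk from a group up to the root of the tree, collecting permissions along the way. This is a minimal sketch under assumptions: the OrgGroup shape, the permission strings, and the effectivePermissions function are all hypothetical, and tenants and roles are left out.

```typescript
// Hypothetical data model: a group points at its parent and carries the
// permissions granted at its own level.
interface OrgGroup {
  id: string;
  parentId: string | null;
  permissions: string[];
}

function effectivePermissions(groupId: string, groups: OrgGroup[]): string[] {
  const byId: Record<string, OrgGroup | undefined> = {};
  for (const group of groups) {
    byId[group.id] = group;
  }

  const result: string[] = [];
  const visited: Record<string, boolean> = {};
  let current = byId[groupId];

  // Walk up the tree; `visited` guards against accidental cycles.
  while (current && !visited[current.id]) {
    visited[current.id] = true;
    for (const permission of current.permissions) {
      if (result.indexOf(permission) === -1) {
        result.push(permission);
      }
    }
    current = current.parentId === null ? undefined : byId[current.parentId];
  }
  return result;
}

const exampleGroups: OrgGroup[] = [
  { id: 'org-x', parentId: null, permissions: ['org:read'] },
  { id: 'it', parentId: 'org-x', permissions: ['it:deploy'] },
  { id: 'team-a', parentId: 'it', permissions: ['a:secrets'] },
  { id: 'team-b', parentId: 'it', permissions: ['b:reports'] },
];
```

With these example groups, team-a ends up with its own permission plus those of the IT department and X Organization, while team-b inherits the same two ancestors but not a:secrets.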

Let me know what you think about this data model. If you agree with it I'll make a pull request to this repo later when I start work on it. Figuring out auth comes first though.

@ZenSoftware
Owner

ZenSoftware commented Sep 9, 2022

@johnkm516
The latest commit c010092 is set up to utilize your changes. I've tested it and everything seems to work as expected. Nice work.

I would suggest reading up on the Casl docs and familiarizing yourself with the ability builder. From there you can play with casl-ability.factory.ts to learn how to grant/deny access. The way I plan to refactor the Nest auth module would be such that other applications can provide their own CaslAbilityFactory. The auth solution I have created for Nest pretty much encapsulates everything that you could possibly need to describe granular permissions. Here are my docs on auth if you haven't seen it yet.

As for how this will work in our federated model, I was hoping we would be able to just forward the HTTP Authorization header that contains the JWT bearer token to the next Nest api (subgraph). That subgraph will itself create a Casl ability for the user that was extracted from the JWT, and then run their own Casl ability over the user. Architecturally, this is how I have set things up to work already. So I don't believe there is much work to be done. It should be possible to implement your security model that you have described with my auth directives. It's just a matter of testing all of this to iron out any of the kinks.
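The forwarding decision itself is small enough to sketch as a pure function. This is a minimal sketch, assuming incoming headers arrive as a plain object with lower-cased names (as Node's HTTP server provides them); authorizationToForward is a hypothetical helper, and no token verification happens here since each subgraph would verify the JWT itself. With @apollo/gateway, one place to call such a helper could be the willSendRequest hook of a RemoteGraphQLDataSource subclass; that wiring is not shown.

```typescript
// Hypothetical helper: decides which Authorization value, if any, the
// gateway should pass along to a subgraph.
type HeaderBag = Record<string, string | undefined>;

function authorizationToForward(incoming: HeaderBag): string | undefined {
  // Header names are assumed to be lower-cased already.
  const value = incoming['authorization'];
  if (!value) {
    return undefined;
  }
  // Forward only well-formed bearer credentials; anything else is dropped
  // so subgraphs never receive a scheme they do not expect.
  return /^Bearer\s+\S+$/i.test(value) ? value : undefined;
}
```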

I'll respond back with more details on auth once I work more on it myself. Have a great long weekend. Things are in a very good place for us thus far. We have the core of the functionality working! 🥂

Edit
Please read up on Nest Guards as a prerequisite to everything. You'll be able to understand how the directives actually are achieving their magic that way. I configured the repo to work with VSCode breakpoints, so I would suggest setting breakpoints in casl-ability.factory.ts and then running queries to see if your auth logic is doing what you expect it to. Here is how you pass the Authorization header via Apollo Studio.

image

image

@ZenSoftware
Owner

ZenSoftware commented Sep 11, 2022

@johnkm516
More progress! I've managed to rewrite the part of @paljs/plugins that was blocking us from upgrading to Prisma v4.
The repo is finally on the latest version of Prisma. I took the liberty to upgrade to the latest version of Node v18 as well and tested everything to ensure it works. The entire technology stack is now on the latest versions possible! Not a lot of projects are able to achieve this. I'm so happy. 🥂

Changes from Prisma v3 to v4 broke @paljs/plugins (see prisma/docs#3334). There is an open issue, paljs/prisma-tools#274, requesting a dependency update for Prisma v4. I managed to narrow down what the problems were. sdlInput is a function exported from @paljs/plugins that generates most of the GraphQL SDL types at runtime. I extracted @AhmedElywa's code into tools/sdl-inputs.ts and rewrote it so that the SDL types are generated as static assets at compile time instead of at runtime. Generating the entire GraphQL schema as static assets seems to be the superior solution, and it unblocks us from upgrading to Prisma v4. Cheers.

@johnkm516
Author

Hey, @ZenSoftware!
Great to see you've made progress on upgrading paljs to Prisma v4.
I'm back from the holidays now. I'll pull from your latest federation branch to get started on the identity schema that I talked about last week as well as go over all the info you've linked me (I didn't get a chance to read over anything over the long weekend as I was visiting extended family).

I'll update you on progress as I have any.

@johnkm516
Author

@ZenSoftware
So I'm reading over CASL and how auth works, and have a few questions :

  1. How does the auth flow work in the current implementation if we scale our modules to a set of federated subgraphs like we are planning? Where is the User / Roles / Organization, etc. data stored among the set of federated subgraphs?

  2. From what I can see it seems that the responsibility of auth is on each subgraph. So if we were to create a CLI command that generates an app folder with all the base scaffolding that we currently have, all the current auth related files in /app need to be duplicated to the new app folder. Is this correct? If so, this leads me back to Question 1: if I were to query the supergraph running /app (which currently contains our User schema) and a set of other apps /order, /products, how would the auth flow work? How do /order and /products verify the auth token, and where is the auth token stored?

@ZenSoftware
Owner

ZenSoftware commented Sep 13, 2022

@johnkm516
I have implemented everything the way the Nest docs recommend, so there is a rich resource for you to read up on how everything is being achieved. This project is simply a concrete implementation of what is typically a Nest app. Though I've actually spent the better part of a year just working on auth. I'm happy that it's finally in a more or less idealized state. Originally, the project just started with RBAC. But I ran into severe limitations doing things that way. ABAC was just recently added. There is a lot of power that @casl/prisma brings by exploiting the WhereInputs of Prisma models. It is such an ingenious solution! So many things that I was unable to describe before are now describable.

Though there is more work to be done. I've been working on trying to modularize the auth system into a standalone library that will be usable between multiple Nest apis / subgraphs. What you see under the auth folder will be pulled out into something under libs. That way multiple subgraphs (Nest apps) can import the guards & decorators exported from @zen/something, which will redirect to ./libs/something. I'll finalize what the name of the 'something' lib will be shortly. The ideal would be that the only work that new Nest apps will have to do would be to simply define their own CaslAbilityFactory to define their authorization scheme for their own respective subgraph.

Specifically, any Nest app within our project will look for the Authorization HTTP header on any given HTTP request. The header contains the JWT bearer token, which is then validated. Validation is done via the JwtStrategy configured via environment.ts. The RequestUser is extracted from the JWTPayload and provided to the Nest guard. So our GraphQL resolvers will apply the @casl/prisma ability for a RequestUser during any GraphQL request and will grant/deny the request given the logic defined within the CaslAbilityFactory. Once you play with @casl/prisma, you will begin to understand how you can model your own authorization scheme with near infinite malleability with this architecture.
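As a rough illustration of the first half of that flow, the sketch below pulls the bearer token out of the header and maps an already-verified payload to a request user. The interface names and the payload fields (`sub`, `roles`, `exp`) are assumptions made for this example and may not match the project's actual JWTPayload or RequestUser; signature verification is intentionally left out because that is the JwtStrategy's job.

```typescript
// Assumed shapes. The thread only states that a RequestUser carries the
// user's id and roles; the payload field names here are illustrative.
interface JwtPayloadSketch { sub: number; roles: string[]; exp: number }
interface RequestUserSketch { id: number; roles: string[] }

// Returns the raw token from an "Authorization: Bearer <token>" header value.
function extractBearerToken(authorization: string | undefined): string | null {
  const match = /^Bearer\s+(\S+)$/i.exec(authorization ?? '');
  return match?.[1] ?? null;
}

// Maps a payload that has ALREADY been signature-checked elsewhere to the
// minimal user object handed to guards. Expired payloads are rejected.
function toRequestUser(payload: JwtPayloadSketch, nowSeconds: number): RequestUserSketch | null {
  if (payload.exp <= nowSeconds) return null;
  return { id: payload.sub, roles: payload.roles.slice() };
}
```

A guard would then build the Casl ability from the returned user and grant or deny the request.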

I exposed project wide assets under the api-interfaces lib. You will find the Role enum there. Previously this was defined in the Prisma schema. But then I changed the roles property on the Prisma User to be defined simply as a string array instead of a Role enum array. This allows the auth module to be modularized into a standalone library. It also enables the project to introduce roles dynamically in the future. For example, a content management system would need to allow an admin to define custom user roles. You don't want the database to reject updating the roles field with database constraints. Though the static roles for the project will still be centralized via a project wide api-interfaces library. It took a while to figure out how to best do this. But this is the best solution that I have been able to muster thus far.
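A small sketch of why string-based roles help, assuming made-up role names: the enum values below are hypothetical stand-ins for whatever api-interfaces actually exports, and `hasAnyRole` is an illustrative helper rather than a project function.

```typescript
// Hypothetical static roles; the real values live in the api-interfaces lib.
enum Role {
  Super = 'Super',
  Editor = 'Editor',
}

// Roles are stored as plain strings, so a role created at runtime (for example
// by a CMS admin) is checked exactly like one of the static enum members.
function hasAnyRole(userRoles: readonly string[], required: readonly string[]): boolean {
  return required.some(role => userRoles.indexOf(role) !== -1);
}
```

For instance, `hasAnyRole(['Editor', 'ForumModerator'], [Role.Super, 'ForumModerator'])` mixes a static role with a dynamically defined one in a single check.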

Let me know if you need further clarification. Auth is always a rather complex subject when done right.

@ZenSoftware
Owner

ZenSoftware commented Sep 14, 2022

@johnkm516
In terms of a supergraph/subgraph auth scheme, I am still trying to wrap my head around what would be best practice myself. But I believe we already have all the pieces we need to create granular permission schemes that would work to architect an auth scheme for any project. From here, it would be up to any individual project to decide on how they want to implement their authorization scheme utilizing the auth decorators provided and the CaslAbilityFactory they implement. It is best practice to unify the way you handle auth into a centralized location for a project. We'll have a solution for that shortly once I finish modularizing the Nest auth module. Just give me a moment to work out the details on this. There are a lot of moving parts.

Otherwise, the federation branch is in a pretty good state. We can spin up at least one Nest api that acts as a GraphQL subgraph for a Prisma schema. That's no small feat. Now it is just a matter of creating a supergraph out of 2 Nest apps. I haven't tried my hand at that just yet. I'm not sure how Prisma works within monorepos. We should investigate that. Would you mind looking into that as well?

@johnkm516
Author

@ZenSoftware
For the auth, I was confused on where the user data is resolved and stored. In /app/auth/models/request-user.ts, the User model from Prisma client is imported, but we cannot do this for all the api services. The User model (and the User prisma schema) would be stored and resolved in a single Auth / organization service, and the other services would use the JWT generated from the auth service by decrypting with a shared key with no way to access the user data other than perhaps a shared key. I still have a long ways to go to really understand what's going on; so unless you require my help or anything I'll leave the auth service to you and try to understand it later when it's implemented for the federation.

For Prisma in a monorepo, I'm still trying to figure out a good solution. I have a few ideas but I need to work out the details and see if it's actually doable. I'll get back to you on this soon.

@ZenSoftware
Owner

ZenSoftware commented Sep 14, 2022

@johnkm516
Yes, you are correct. You are seeing the remnants of tightly bound code that should be decoupled somehow. It needs refactoring for things to be appropriately modular. So don't worry about that for now. I am in the process of taking care of everything that needs to be done for auth. It's fairly sophisticated code, so let me handle it.

To address your question about where the user is being stored: the actual persistence of the user will be the responsibility of the subgraph whose schema.prisma file contains the User Prisma model, currently defined at prisma/schema.prisma. Since each subgraph will have its own schema.prisma file, I suspect the User Prisma model would be located within the main apps/<APP_NAME>/schema.prisma file.

A RequestUser is simply a subset of the full User. Specifically, it contains the user's id and an array of their roles. You must retrieve the full user if you want more than that. But the user's id is usually all you need for most operations. It is done this way for performance reasons. There are several ways you can implement authentication & authorization. For the API to be stateless, we are bound by this mentality of passing around tokens as a method of maintaining user sessions. All the persistence stuff is just regular Prisma. Between subgraphs, the User will be resolved via the newly code-generated function __resolveReference(reference, { prisma }) that we implemented, which ultimately is just a Prisma lookup. 😁

User: {
  __resolveReference(reference, { prisma }) {
    // Use the first non-__typename entry of the reference as the unique lookup key.
    const [field, value] = Object.entries(reference).find(e => e[0] !== '__typename');
    return prisma.user.findUnique({ where: { [field]: value } });
  },
},

I hope that all made sense.

@ZenSoftware
Owner

ZenSoftware commented Sep 14, 2022

It would probably help you if you just saw some high level code of how tokens are being issued. Take a look at
apps/api/src/app/graphql/resolvers/Auth.ts and read through that code. These are the Nest GraphQL resolvers that handle all the authentication stuff for the app.

For the Angular app, you interact with the authorization service via the AuthService libs/auth/src/lib/auth.service.ts. The code itself within auth.service.ts is kind of complicated, but the methods and properties exposed on the service allow for high-level interaction with typical auth stuff.
It's really simple to use. This service simply calls the above GraphQL resolvers and does all the client side state management.

@johnkm516
Author

@ZenSoftware
Thank you for the explanation. I'll keep looking over the code and I think I'll have a much better idea when the auth service is extracted into its own service app.

For Prisma, here's what I have so far :

  1. We have to separate the /app and all to-be subsequent Nest API apps with their own package.json and .env, and make them self contained to be compiled and runnable in their own respective folders. Most importantly, each Nest API needs to have their own Prisma client as each API will have their own database (otherwise migrations from one API will overwrite migrations to another). The root folder will also keep most of its dependencies not to generate code for the API, but to schema merge all the prisma.schemas for each of the Nest APIs, purely to generate the frontend code for Angular modules that you have.

  2. As I explained in the third / fourth post of this thread, the root folder would contain, inside the root's Prisma folder, a set of subfolders, each named after one of the Nest API apps. We would then create a CLI script in root which copies each schema to its corresponding app folder. As a precaution, before copying we should also compare the existing schema in the Nest API folder with the root folder's copy of that schema, and warn / throw an error if there is a mismatch, in case the developer modified the Nest API schema directly rather than from the root folder. After copying the schemas to their respective Nest API Prisma folders, the code generation would execute for each folder, generating their own resolvers, their own Prisma client, etc. For all intents and purposes each Nest API would be self contained; we would simply have these scripts / tools to generate all the Nest APIs at once from the root folder, as well as declare the schemas in one place in the root folder.

  3. Now is the hard part : we need to be able to combine and merge all the Nest API Prisma schemas into one supergraph schema, which should correspond exactly to how Apollo Federation 2 would represent the data. As per 1, this is to purely generate the frontend Angular module, as the root folder's schema will not actually ever migrate any data into a database.
    I looked into Prismix, which almost has what we want. However Prismix is more like Apollo Federation 1 than Federation 2, in the way the separated schemas are declared. In Prismix :

image

image

As you can see in the examples in Prismix's README, posts.prisma extends the Accounts entity of base.prisma by declaring a stub ID. If we were to follow Federation 2's conventions Accounts in posts.prisma would be resolvable on its own with no originating entity.

I'm going to look into Prismix and see if I can salvage some of the code to create our own schema merger that fits with our own needs. I don't think it's going to take a huge amount of time since looking into Paljs has shown me Prisma's DMMF and just how easy it is to parse. The devil is in the details though, I'm sure there are a huge amount of cases that need to be considered when merging the schemas.

To summarize what we need :

  1. A scaffolding script which generates the boilerplate code for a new Nest API and sets up all the necessary files for all our features, including creating a subfolder for the generated API in the root's Prisma folder, etc. The idea would be to input something like npm run api:new, and the CLI would prompt for the API folder name and relevant environment variables before generating the new API boilerplate.

  2. Our own custom Prisma merger that merges into a main schema that would exactly match a Federation 2 supergraph (I'll get started on this)

  3. Separating dependencies into each Nest API's app with their own .env, each subgraph should have their own Prisma client and database.
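The precautionary comparison mentioned earlier in this post (checking each app's schema against the root copy before overwriting it) could look roughly like the sketch below. It works on in-memory strings so that it stays self-contained; `planSchemaSync`, `SchemaMap`, and the three action names are all hypothetical, and a real script would read and write the files on disk.

```typescript
// app name -> contents of that app's schema.prisma (hypothetical in-memory form)
type SchemaMap = Record<string, string>;
type SyncAction = 'create' | 'unchanged' | 'mismatch';

// Ignore line-ending and leading/trailing whitespace differences when comparing.
const normalizeSchema = (s: string): string => s.replace(/\r\n/g, '\n').trim();

// Decide, per app, what the copy step should do. 'mismatch' means the app's
// schema differs from the root copy, so the CLI should warn or abort instead
// of silently overwriting a schema that may have been edited in place.
function planSchemaSync(rootSchemas: SchemaMap, appSchemas: SchemaMap): Record<string, SyncAction> {
  const plan: Record<string, SyncAction> = {};
  for (const app of Object.keys(rootSchemas)) {
    const source = rootSchemas[app] ?? '';
    const existing = appSchemas[app];
    if (existing === undefined) plan[app] = 'create';
    else if (normalizeSchema(existing) === normalizeSchema(source)) plan[app] = 'unchanged';
    else plan[app] = 'mismatch';
  }
  return plan;
}
```

Whether a mismatch should be a warning or a hard error is a policy decision for the CLI; the sketch only reports it.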

Let me know what you think, or perhaps what problems you foresee.

@ZenSoftware
Owner

ZenSoftware commented Sep 18, 2022

@johnkm516
I'm trying my hand at a couple of things right now to work out the advantages and disadvantages of each approach. There are more or less 2 styles of monorepo. I like to think of them as an Nx style of repo and a Lerna style of repo. The Nx style of repo utilizes a single package.json at the root of the project, so all libraries of the project share a single node_modules folder. After some investigation, Prisma doesn't seem to have a good solution for an Nx style of monorepo. They made a decision early in the development of Prisma v2 to utilize the node_modules folder for the generation of the Prisma client. It isn't possible to generate multiple Prisma clients in an Nx style of repo. Prisma v2 would have needed to make the decision to generate the client outside the node_modules folder. This is not something that is configurable by Prisma and is just sort of hardcoded into how things are done. So this forces us into utilizing a Lerna style of monorepo so we can have multiple node_modules folders to house their respective Prisma clients, as you have suggested. It would require some major refactoring of the entire project to do things the Lerna way.

There are so many advantages to an Nx style of repo: guaranteed consistency of package versions and simplicity of project management. An Nx style of repo would be the preferred choice for this specific starter kit. This adventure we seem to be on is now deserving of its own repo. What do you think about opening up a repo where we co-share ownership and set up a fresh Lerna style project? We can begin to migrate over the technology from this starter kit into the federated GraphQL project. I think that would be the correct path forward.

@ZenSoftware
Owner

@johnkm516
Also, you can reference this comment prisma/docs#1288 (comment) for more specifics on why we need to set up a Lerna repo. We should begin to experiment with how to best implement the architecture you have described. I agree with your analysis and list of requirements. I believe it is still worth the effort to make the Nest authorization module a standalone library. That has value regardless of the style of monorepo. Sorry, I'm still working on it. I ran into more issues that need resolving. But I'm glad we know what needs to be done at least.

@johnkm516
Author

johnkm516 commented Sep 19, 2022

@johnkm516 I'm trying my hand at a couple of things right now to try to work out the advantages and disadvantages of things. There are more or less 2 styles of a monorepo. I like to think of it as an Nx style of repo and a Lerna style of repo. The Nx style of repo utilizes a single package.json at the root of the project in which all libraries of the project utilize a single node_modules folder. After some investigation, Prisma doesn't seem to have a good solution for an Nx style of monorepo. They made a decision early in the development of Prisma v2 to utilize the node_modules folder for the generation of the Prisma client. It isn't possible to generate multiple Prisma clients in an Nx style of repo. Prisma v2 would have needed to make the decision to generate the client outside the node_modules folder. This is not something that is configurable by Prisma and is just sort of hardcoded into how things are done. So this forces us into utilizing a Lerna style of monorepo so we can have multiple node_modules folders to house their respective Prisma client as you have suggested. There would require some major refactoring of the entire project to doing things the Lerna way of things.

My investigations led to the same conclusion; the Prisma client as it's currently implemented wasn't designed to be used in a monorepo. I looked into Nx vs Lerna further and it seems Nx has taken ownership of Lerna. A Lerna style monorepo is still possible, and integrating Nx's task scheduling with Lerna is possible as well. Just something to keep in mind. I definitely agree with you that for our purposes we should have an Nx style of monorepo. Let's hold off on reorganizing the entire project structure for now until a better solution is found for the Prisma client. I'll ask around and investigate further while working on what I mentioned in the previous post.

What do you think about opening up a repo where we co-share ownership and setup a fresh Lerna style project. We can begin to migrate over the technology from this starter kit into the federated GraphQL project. I think that would be the correct path forward.

I would be honored. To be honest I did not realize the amount of extensive modifications that modularizing and supporting federation would require, so I understand it's now running a bit out of scope of the original repo. The code is all yours, so far I have just contributed a little here and there as my skills allow. So it's entirely up to you what you want to do; I look forward to working with you on this project to realize all that we've discussed here regardless.

@johnkm516
Author

@ZenSoftware

For prismix and paljs/generator, here's where I'm at :

Prismix is as simple as paljs/generator in terms of modifying it. For our custom implementation, I'm thinking of the following when we eventually restructure the project to a monorepo :

  1. Prismix will act as a schema validator during the merge. It will detect models and fields with the same name, and if fields with the same name on the same model are detected, it will throw an error if the field's attributes are not exactly the same (relations, unique, key, etc.) If a field is declared on the same model in two separate schemas (with the same attributes and everything), it should also throw an error saying the field should be @shareable. Here is the issue though, Prisma doesn't support custom attributes, at least not yet. Prisma field extensions are in the works.

  2. The plan is to have our own custom implementation of Prismix validate and enforce all the necessary rules to match a Federation 2 subgraph schema. If the validation passes, it proceeds with the schema merge, and then with the code generation for each subgraph using Paljs/generator. I'm thinking for now we don't support @shareable or other non-Prisma attributes until Prisma supports custom attributes. If having the @shareable directive becomes a make or break thing, the only thing we could do is use a string regex to extract it from a commented @shareable string (which definitely is not ideal, but Prisma will throw errors if we have any custom attributes in the schema).

I'm currently working on the schema validation portion of Prismix now. It will validate :

  1. Whether shared models across two schemas have the same primary key (and thus keyField(s) which should match)
  2. Whether shared models across two schemas have any conflicting non-keyField fields. If we don't allow the @shareable directive until Prisma allows custom attributes, Prismix should throw an error here. If we do allow the @shareable directive, the Prisma schemas temporarily have to be decorated with a //@shareable comment instead, so that Paljs can read it and generate the @shareable directive accordingly. The validation portion here would change depending on whether we decide to support //@shareable or not (I'm leaning toward yes).
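The two checks above can be sketched as follows. This is only an illustration under simplifying assumptions: the `FieldDef`, `ModelDef`, and `Subschema` shapes are hypothetical stand-ins for what would really come from Prisma's DMMF, and the error messages are invented.

```typescript
// Hypothetical, simplified model descriptions (a real version would use the DMMF).
interface FieldDef { name: string; type: string; isId: boolean; shareable?: boolean }
interface ModelDef { name: string; fields: FieldDef[] }
interface Subschema { name: string; models: ModelDef[] }

// Canonical string for a model's primary key, e.g. "id:Int".
const keyOf = (m: ModelDef): string =>
  m.fields.filter(f => f.isId).map(f => `${f.name}:${f.type}`).sort().join(',');

function validateSharedModels(a: Subschema, b: Subschema): string[] {
  const errors: string[] = [];
  for (const modelA of a.models) {
    const modelB = b.models.find(m => m.name === modelA.name);
    if (!modelB) continue; // model is not shared, nothing to compare

    // Check 1: shared models must agree on their primary key.
    if (keyOf(modelA) !== keyOf(modelB)) {
      errors.push(`${modelA.name}: primary key differs between ${a.name} and ${b.name}`);
    }

    // Check 2: non-key fields declared in both schemas must not conflict.
    for (const fieldA of modelA.fields.filter(f => !f.isId)) {
      const fieldB = modelB.fields.find(f => f.name === fieldA.name && !f.isId);
      if (!fieldB) continue;
      if (fieldA.type !== fieldB.type) {
        errors.push(`${modelA.name}.${fieldA.name}: conflicting types ${fieldA.type} / ${fieldB.type}`);
      } else if (!(fieldA.shareable && fieldB.shareable)) {
        errors.push(`${modelA.name}.${fieldA.name}: declared in both schemas, should be marked //@shareable`);
      }
    }
  }
  return errors;
}
```

An empty result means the pair of subschemas passed both checks; validating more than two schemas would just repeat the call for every pair.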

Let me know what you think.

@johnkm516
Author

@ZenSoftware

I decided to just support every directive in Federation 2 using a commented out attribute in the prisma schema. I added support for shareable, external, and some others just as a test in code generation for paljs/generator and it works beautifully. Keep in mind there is no validation here at all, all it does is read the schema and add the @shareable, @external, etc if it exists as //@shareable, //@external in the prisma schema. Again, the commented out attributes aren't ideal, but it's what we have to live with until Prisma supports custom attributes.
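For readers wondering what reading those commented-out attributes might involve, here is a minimal line-based sketch. It is not the code from my fork: `parseCommentedDirectives` and the `KNOWN_DIRECTIVES` list are illustrative assumptions, and the parser deliberately ignores edge cases such as `//` appearing inside a string default.

```typescript
// Federation 2 directive names this sketch recognises (assumed subset).
const KNOWN_DIRECTIVES = ['shareable', 'external', 'inaccessible', 'override', 'provides', 'requires'];

// Returns a map of "Model.field" -> directives found in that field's trailing comment.
function parseCommentedDirectives(schema: string): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  let model: string | null = null;

  for (const line of schema.split(/\r?\n/)) {
    const open = /^\s*model\s+(\w+)\s*\{/.exec(line);
    if (open) { model = open[1] ?? null; continue; }
    if (/^\s*\}/.test(line)) { model = null; continue; }
    if (!model) continue;

    const commentAt = line.indexOf('//');
    if (commentAt < 0) continue;

    // The field name is the first word before the comment; comment-only lines are skipped.
    const field = /^\s*(\w+)\s/.exec(line.slice(0, commentAt));
    if (!field) continue;

    const found: string[] = [];
    const directivePattern = /@(\w+)/g;
    const comment = line.slice(commentAt);
    let m: RegExpExecArray | null;
    while ((m = directivePattern.exec(comment)) !== null) {
      const name = m[1] ?? '';
      if (KNOWN_DIRECTIVES.indexOf(name) !== -1) found.push(`@${name}`);
    }
    if (found.length > 0) result[`${model}.${field[1]}`] = found;
  }
  return result;
}
```

Because the directives live in comments, Prisma itself never sees them, which is why no Prisma validation error is raised.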

The validation of all these directives as the subgraphs are merged into the supergraph is meant to be done by managed federation. Since Prisma only deals with databases, the user can change the directives freely and code-generate again without any issues.

@ZenSoftware
Owner

@johnkm516
I heard about Nrwl taking over stewardship of Lerna as well. That is very beneficial for us. On the Nrwl YouTube channel they recently posted a short video on Lerna: Lerna Crash Course. I've updated all the dependencies across all the branches again. We're on the latest versions for virtually all of our dependencies. I've been playing with some of the new features, like Prisma metrics. Everything works perfectly.

I'm glad your investigations into Prismix were so fruitful. Very nice work. I agree, it would be ideal to have a processing step whereby we validate the multiple schema files of the project to ensure correctness within the supergraph. Since you worked out the details on this, it now seems we have the majority of the pieces we need for a fully functioning federated supergraph. It's just a matter of getting all the dev ops in place and working out all the conventions for the organization of the project assets. I'm currently investigating the integration between Nx and Lerna to work out some of the more nitty-gritty details. Let me play around with things for just a bit longer and I'll create the new repo for us and share access. I'll keep you informed. 🎐

@johnkm516
Author

johnkm516 commented Sep 23, 2022

@ZenSoftware
I'm making some pretty colossal changes to Prismix and Paljs/Generator. I've therefore made completely standalone repositories of these forks. I think I'll be maintaining these myself as my implementations will have to stray from what the original code owners intended.

I've finished the Federation 2 directive parsing, and am making huge changes to the logic for both Prismix and Paljs/Generator for code generating stubs and validating them. I spent the last week really struggling to wrap my head around how to implement this, trying to figure out which fields a subgraph should resolve (and therefore store in the DB). While our initial modifications to Paljs/Generator worked out great since we were able to update to Prisma 4, there are many many more cases we have to solve. The most difficult case that I'm currently working on :

image

A Reviews subgraph contains a Product entity, which is not resolvable in the Reviews subgraph. So in our Prisma schema, we need to be able to declare a stub of Product without Prisma creating a separate table for it, and without us having to declare a relation for it. Here is how I plan on having a subschema declare a non-resolvable entity :

My implementation of Prismix will first parse all the schema files in a raw string (Prismix already did this through the getCustomAttributes function anyways, and it is impossible to implement what we need without making two read passes unfortunately). It will detect //@externalModel or some other custom attribute (I haven't decided yet), and remove it and the associated external model from the in-memory schema string during runtime so these declarations don't cause Prisma validation errors. Simultaneously, Prismix will parse the stub model for its keyField, and generate a new schema that will pass Prisma validation along with helpful explicit comments that PalJS/Generator will use to code-generate. Therefore, unlike the original Prismix implementation that I forked from, my implementation will have output files for subschemas as well as the merged base schema. Because Prismix and the federation portion of PalJs/Generator will be so intertwined, I plan on adding my implementation of Prismix as a dependency in PalJs/Generator. I need your feedback / opinions on the different schema declarations :

Implementation 1A :

input reviews.prisma

model Review {
  id         Int     @id @default(autoincrement())
  product Product?   //@externalModel
  score Int
}

model Product {
  id  Int @id
  name  String
  manufacturer  String
  @@unique([name, manufacturer])
}

output reviews.prisma

//**THIS FILE IS AUTO-GENERATED BY PRISMIX. DO NOT MODIFY**//

model Review {
  id         Int     @id @default(autoincrement())
  product_id   Int?    //@externalModel("Product")
  product_name  String?  //@externalModel("Product")
  product_manufacturer  String?  //@externalModel("Product")
  score Int
  @@unique([product_name, product_manufacturer])
}

//@stub
//model Product {
//  id  Int @id
//  name  String
//  manufacturer  String
//  @@unique([name, manufacturer])
//}

Paljs/Generator would read stub and generate @key directives
@key(fields: "id") @key(fields: "product_name product_manufacturer")
according to all the candidate keys.
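A tiny sketch of that last step, assuming the candidate keys have already been extracted from the stub as lists of field names. `keyDirectives` is a hypothetical helper, not an existing Paljs/Generator function.

```typescript
// Each candidate key is a list of field names; compound keys have several.
function keyDirectives(candidateKeys: string[][]): string {
  return candidateKeys
    .filter(key => key.length > 0)                      // ignore empty keys
    .map(key => `@key(fields: "${key.join(' ')}")`)     // Federation 2 @key syntax
    .join(' ');
}

// One single-field key and one compound key, as in the example above.
const productKeys = keyDirectives([['id'], ['product_name', 'product_manufacturer']]);
console.log(productKeys);
```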

Implementation 1B :

input reviews.prisma

model Review {
  id         Int     @id @default(autoincrement())
  product Product?   //@externalModel
  score Int
}

model Product @key(fields: "id") @key(fields: "name manufacturer") {
  id  Int 
  name  String
  manufacturer  String
}

output reviews.prisma

//**THIS FILE IS AUTO-GENERATED BY PRISMIX. DO NOT MODIFY**//

model Review {
  id         Int     @id @default(autoincrement())
  product_id   Int?    //@externalModel("Product")
  product_name  String?  //@externalModel("Product")
  product_manufacturer  String?  //@externalModel("Product")
  @@unique([product_name, product_manufacturer])
  score Int
}

//@stub
//model Product @key(fields: "id") @key(fields: "product_name product_manufacturer") {
//  id  Int 
//  product_name  String
//  product_manufacturer  String
//}

For both 1A and 1B, if Product is required in Review, it would select the first keyField if multiple keyFields are declared.
Like so :

model Review {
  id         Int     @id @default(autoincrement())
  product_id   Int    //@externalModel("Product")
  score Int
}

Note how the output subschema generated by Prismix autogenerates fields that will actually be stored in a Prisma database, and therefore must be declared in the schema that the Prisma client will actually use for migration for the subgraph. Paljs/Generator will recognize @stub and the Product model that follows it, and code-generate the resolver and the stub model type accordingly, following the Federation 2 specification.

Pros : input schema follows very closely with Prisma model declarations (1A) or Apollo Federation model declarations (1B).
Cons : the product_id is autogenerated, which I think we might want to generally avoid as we want the user to explicitly declare fields.

Implementation 2A :

input reviews.prisma

model Review {
  id         Int     @id @default(autoincrement())
  product_id Int   //@externalModel(model: "Product", references: "id")
  score Int
}

output reviews.prisma

//**THIS FILE IS AUTO-GENERATED BY PRISMIX. DO NOT MODIFY**//

model Review {
  id         Int     @id @default(autoincrement())
  product_id   Int    //@externalModel(model: "Product", references: "id")
  score Int
}

//@stub
//model Product {
//  id  Int @id
//}

Pros : All fields are explicitly declared
Cons : None I can think of right now

Therefore I'm leaning towards implementation 2 right now, but I need a second opinion in case I might be missing a case here. The above examples are only in cases of stub models that won't be saving any data to a database other than a reference foreign key resolved in another subgraph, because in one-to-many or many-to-many cases Product would be extended and therefore resolvable in the same subgraph (and therefore a valid Prisma model must be declared to save the relationship accordingly)
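To show how little machinery Implementation 2A would need, here is a rough sketch that scans a schema string for the proposed `//@externalModel(model: "...", references: "...")` comment and renders the commented-out stub. The function names and the regular expression are my own illustrative assumptions, not code from the Prismix fork, and the sketch assumes one key field per external model.

```typescript
interface ExternalRef { field: string; type: string; model: string; references: string }

// Matches e.g.:  product_id Int   //@externalModel(model: "Product", references: "id")
const EXTERNAL_MODEL =
  /^\s*(\w+)\s+(\w+)\??[^\/]*\/\/\s*@externalModel\(\s*model:\s*"(\w+)"\s*,\s*references:\s*"(\w+)"\s*\)\s*$/;

function findExternalRefs(schema: string): ExternalRef[] {
  const refs: ExternalRef[] = [];
  for (const line of schema.split(/\r?\n/)) {
    const m = EXTERNAL_MODEL.exec(line);
    if (m) refs.push({ field: m[1] ?? '', type: m[2] ?? '', model: m[3] ?? '', references: m[4] ?? '' });
  }
  return refs;
}

// Renders one commented-out stub per external model (first reference wins),
// so Prisma ignores it while the code generator can still read it.
function renderStubs(refs: ExternalRef[]): string {
  const seen: string[] = [];
  const blocks: string[] = [];
  for (const ref of refs) {
    if (seen.indexOf(ref.model) !== -1) continue;
    seen.push(ref.model);
    blocks.push(['//@stub', `//model ${ref.model} {`, `//  ${ref.references}  ${ref.type} @id`, '//}'].join('\n'));
  }
  return blocks.join('\n\n');
}
```

Appending `renderStubs(findExternalRefs(input))` to the unchanged input would give an output file in the shape shown for 2A.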

There is a lot to do. I also want to be able to handle explicitly declared keyfields especially in cases where there are nested object keys like :

type User @key(fields: "id organization { id }") {
  id: ID!
  organization: Organization!
}

Since this thread is getting quite bloated, I'm going to start creating issues with the above design specs and more when I get around to it in my repositories for Prismix-Federation and Prisma-Tools-Federation so I can document and organize the information better. I'll link them here when I post them.

Also, if you happen to have a discord channel or a slack channel, please send me an invite. I'd love to talk to you in a more live setting; I promise I won't bother you too much 😂. Since you're in Canada there's going to be quite a time difference anyways (I'm in South Korea), but it would be good to share information and talk about design decisions that don't necessarily belong in this thread. I look forward to contributing and working with you further.

@ZenSoftware
Owner

ZenSoftware commented Sep 23, 2022

@johnkm516
I have found out that Prisma has been investing heavily in handling multiple schemas; see "How to use Prisma with multiple database schemas". It is now available as a preview feature, though it is currently extremely limited. Prisma Migrate doesn't work with it currently. I am pondering how we can utilize the @@schema attribute they plan to integrate as a possible solution to keeping the project as an Nx style monorepo. Now that I've experienced the advantages of an Nx style monorepo, it is so hard to let go of them.

In terms of project structure I imagine we would have an apps/<APP_NAME>/schema.prisma file for each subgraph.

  1. We could read the angular.json at the root of the project, which lists all the applications for the project.
  2. Scan each app directory for an existing schema.prisma at its relative root.
  3. Concatenate the individual schema.prisma files together into one.

As an example:
apps/base/schema.prisma

model User {
  id     Int     @id
  orders Order[]

  @@schema("base")
}

apps/transactional/schema.prisma

model Order {
  id      Int  @id
  user    User @relation(fields: [user_id], references: [id])
  user_id Int

  @@schema("transactional")
}

Finally, concatenation would result in this:

generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["multiSchema"]
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
  schemas  = ["base", "transactional"]
}

model User {
  id     Int     @id
  orders Order[]

  @@schema("base")
}

model Order {
  id      Int  @id
  user    User @relation(fields: [user_id], references: [id])
  user_id Int

  @@schema("transactional")
}
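
The three concatenation steps described earlier could be sketched roughly as follows. This is only a sketch under assumptions: `Workspace`, `listAppRoots`, and `buildCombinedSchema` are hypothetical names, the workspace shape is a minimal guess at the relevant part of angular.json (each project is either a path string or an object with a `root`), and file access is injected as a `readFile` callback so the logic stays independent of the file system.

```typescript
// Minimal subset of angular.json assumed here.
type Workspace = { projects?: { [name: string]: string | { root?: string } } };

// Step 1: list the app root directories declared in the workspace file.
function listAppRoots(workspace: Workspace): string[] {
  const projects = workspace.projects ?? {};
  return Object.keys(projects)
    .map((name) => {
      const project = projects[name];
      return typeof project === "string" ? project : (project.root ?? "");
    })
    .filter((root) => root.length > 0);
}

// Steps 2 and 3: read each app's schema.prisma (if present) and join the
// fragments under one shared generator/datasource header.
// `readFile` returns undefined when the file does not exist.
function buildCombinedSchema(
  workspace: Workspace,
  header: string,
  readFile: (filePath: string) => string | undefined,
): string {
  const fragments = listAppRoots(workspace)
    .map((root) => readFile(`${root}/schema.prisma`))
    .filter((text): text is string => text !== undefined);
  return [header, ...fragments].map((part) => part.trim()).join("\n\n") + "\n";
}
```

In a real gulp task, `readFile` would wrap the file system, and the `schemas` list in the header would need to be derived from the @@schema values found in the fragments.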

Ideally, we would leverage Prisma's multi-schema feature to do the heavy lifting for validation. We would then architect things so that a valid Prisma multi-schema also produces valid Apollo subgraphs and a valid supergraph. This is the architecture we might ultimately want, if Prisma multi-schema were actually feature complete. I haven't tried the preview feature just yet, but depending on how much of it is usable, we may want to rethink our efforts, as there may be a better way of doing things. We may not have to integrate the custom directives you have been working on if we can get things working with this preview feature. Let's investigate this further and see what we discover.

Yes, I totally agree that it would be really nice to have a more direct line of communication. Discord is what I typically use. Could you please add me as a friend? I'll delete the comment after I've confirmed you've read it.

@johnkm516
Author

@ZenSoftware
I've read it. I'll add you when I get home from work. I look forward to talking to you!

@johnkm516
Author

@ZenSoftware

About Prisma multiSchemas, I also read up on it and immediately gave up on it after reading this :

Currently it is not allowed to have models with the same name in your Prisma schema that use the @@schema attribute to reference different database schemas. All models in your Prisma schema must have unique names. Otherwise, your Prisma schema will fail to validate.

Another thing is that Prisma also doesn't support custom attributes, models, nor custom validation extensions. It's understandable considering Prisma schemas were meant to be used to model databases only, but since like us, many people want to directly code-generate from Prisma schemas, they are aware of this feature request and are working on it.

We should definitely keep track of this feature once it matures into something we can use in a monorepo. However for our immediate purposes I'll keep working on Prismix; Prisma type, model, and custom validation extensions won't arrive for a long while. I'll document in detail the exact cases we must consider to declare and translate a prisma schema into a federation 2 subschema with all its features. Stub models that aren't resolvable in the subschema is one such case that I outlined above, but more test cases will help here.

@ZenSoftware
Owner

Ah, ok! Yes, the requirement for unique names is very problematic. I think there is no choice but to continue with your efforts on Prismix, as we do need an immediate solution. We will keep multi-schema in the back of our minds and do our best to design a solution that stays compatible with it, on the expectation that the feature is eventually coming. I'll review your suggestions for the syntax and get back to you with a more thorough response. Ideally we would choose the option that causes the least redundancy.

@johnkm516
Author

@ZenSoftware
I made some modifications to the Prisma schema declaration to follow closely with how Prisma handles relationships between models and need your feedback on how the code-generation should work... I'm having trouble wrapping my head around what the typedefs and resolvers should look like since the external model will not have a Prisma-client model. The model itself should resolve externally, but then what should the inputargs look like?

Please read over my issue here : johnkm516/prismix-federation#1

@ZenSoftware
Owner

ZenSoftware commented Oct 1, 2022

@johnkm516
Hi John. Ok, so I've managed to wrap my head around more of the problem, and I think it will help if I lay out what I have come to understand thus far. We've been treating the Prisma DMMF (Data Model Meta Format) as our method of defining business models for the project. The DMMF was originally designed only to meet the needs of managing database migrations. However, since the DMMF is so powerful in its ability to describe relational data, it can go beyond handling migrations and serve as the source of truth for generating the data access layer. Our data layer consists of several layers and breaks down as:

  • Prisma DMMF - schema.prisma file to define the data models & relations
  • Prisma Migrate - Database migration tool
  • PostgreSQL - Relational database
  • Prisma Client - Object relational mapper
  • Apollo Server - Network access to the ORM via HTTP (GraphQL)
  • NestJS GraphQL - Hosts the API server and handles authorization
  • Apollo Client - Front-end client to make GraphQL calls

Historically, the reason you spin up a new database is to isolate that data from other schemas. You generally want most of your data models to exist within a single database, because you can then define foreign key constraints between tables, which gives the query language its expressiveness. When you move a data model into a different schema, the foreign keys between relations are simply stored as primitive values, without the relational constraints being defined. There is technically still a relation between the tables when you only store the ID and don't define the foreign key constraint, but you lose most of the robustness of the query language because the database has no information about the relation.

This is the problem that we are encountering. When you move a data model into a different schema.prisma file, we seem to be forced to remove the foreign key constraint. I see that you have encountered this while customizing Prismix to get Prisma Migrate to work. However, the root cause of why we are removing constraints goes a bit deeper than just getting the database migration tool to work. A database schema is intended to be an isolated set of data, but we are defining cross-schema relations by storing primary keys from other schemas. Without foreign key constraints there is no way to do things like a cascading delete, so deleting a record with child entries in a separate schema has to be handled explicitly.
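
To illustrate what "handled explicitly" means, here is a rough in-memory sketch. The two arrays stand in for tables living in separate schemas, and `UserRow`, `OrderRow`, and `deleteUserCascade` are hypothetical names. A real version would go through two Prisma clients and would not be atomic across databases, which is exactly the robustness that gets lost.

```typescript
interface UserRow { id: number }
interface OrderRow { id: number; user_id: number }

// Stand-ins for tables in two separate schemas: no constraint links them.
let userRows: UserRow[] = [{ id: 1 }, { id: 2 }];
let orderRows: OrderRow[] = [
  { id: 10, user_id: 1 },
  { id: 11, user_id: 1 },
  { id: 12, user_id: 2 },
];

// Application-level cascade: delete the child orders first, then the user.
// Returns the number of orders that were removed.
function deleteUserCascade(userId: number): number {
  const before = orderRows.length;
  orderRows = orderRows.filter((order) => order.user_id !== userId);
  userRows = userRows.filter((user) => user.id !== userId);
  return before - orderRows.length;
}
```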

Speaking more specifically about Apollo federation, you can see the limitations we are facing more clearly with the following code:

Nest Docs Apollo Federation example

// User resolver
@ResolveReference()
resolveReference(reference: { __typename: string; id: string }) {
  return this.usersService.findById(reference.id);
}

The only information we have is the User id whenever a related resource requires a User instance. Therefore, the sophisticated querying that Prisma client enables is hidden away from the GraphQL subgraph. I recall the way you originally wrote the __resolveReference when updating PalJS was that you were expecting the WhereInput of the data model as an argument, rather than the reference type { __typename: string; id: string } as shown in the example. I now understand why you were doing this, as you were trying to expose the capabilities of Prisma client between Apollo subgraphs somehow.

I heard the Prisma team describing multi-schema support as being a very difficult problem to solve. I am beginning to see why. We are probably encountering the same problems that they are trying to solve.

Despite this, I don't believe our efforts are in vain. There is much more clarity on how to modularize the repo now. We've also managed to integrate Apollo Federation v2 into the tech stack, which enables many new possibilities. I think it is still a good idea to refactor things so that we can specify custom locations for generated Prisma clients and have all the tooling work for multiple Nest apps within the monorepo. The stitching together of Apollo subgraphs (even with the limitations that we are now aware of) can be handled at the level of Nest.

Let me know what your thoughts are. Sorry it took so long for the response. It took up until now to really understand the nature of the problem.

@johnkm516
Author

@ZenSoftware
I've finished work on the generator, such that everything composes in the supergraph. I had a hell of a time doing this and touched basically every portion of the generator, but it works now. Here's what I did :

  1. I changed the paljs's federation config from boolean to string. This option now denotes the name of the subgraph.

  2. Initially, I imported wrapSchema from GraphQL Tools and renamed all the input types to have the subgraph name as a prefix. For example, if the federation option is set to Products, wrapSchema and the renamers would rename the input types to something like Products_ProductCreateInput. This worked, and when testing in Apollo Sandbox everything was renamed fine. However, when I tried to use the Rover CLI to update the supergraph, none of the inputs were renamed. I couldn't find a solution, but I think the cause is that wrapSchema runs in the transformer function of the GraphQL driver, and Rover seems to introspect the schema before the transformer is applied. This portion of the code is in model-names.ts.

  3. With the schema wrapper not working properly with the Rover CLI, I decided to edit the generator itself so that all the files it outputs rename non-entity / non-output types to be prefixed with the subgraph name. To do this I heavily modified the DMMF model to include input type maps, output type maps, and all possible keyfields. I use the maps to check whether any field / type is an input type, and prefix it accordingly if the federation option is set. I also modified a large number of other fields in the generator to be prefixed with the federation name, including the mutations and queries themselves. Mutations and queries cannot have the same name in two different subgraphs either: a same-named query or mutation would take Products_ProductCreateInput in subgraph Products but Reviews_ProductCreateInput in subgraph Reviews, which results in composition errors. Basically, if input types could be merged using the union strategy none of this would be needed, but due to the way Apollo Federation works I had to rename the input types and then rename every other type / field that uses them, so that each subgraph has its own queries / mutations. Whether this is good or bad is subjective. I think some people might like seeing exactly which input type, query, or mutation belongs to which subgraph, so they know where their data will resolve, while entities are still merged in the supergraph as you would expect with Apollo Federation.

  4. This is a bug in the original paljs/generator, but I fixed it. Paljs/generator creates the updateMany mutation even when a model has no non-unique & non-relation fields. In order for you to use updateMany you have to have fields that can be updated with the same value on that model, but if all the fields are unique there are no fields that you can update many with. I changed the generator such that instead of generating updateMany it creates a comment explaining why updateMany doesn't exist for this model.

  5. I moved SDL-inputs into paljs/generator, and all the "need to be renamed" types in this file are now renamed with the subgraph name if the federation option is set.

  6. I made some changes to the CASL templates in Zen on my local since they refer to the resolvers in the paljs generated files. All I changed is to check if federation is set in the paljs config, and rename the CASL resolver's types accordingly to prefix with the subgraph name.
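
The renaming rule from points 1 to 3 can be summarised in a small sketch. This is not the actual generator code: `prefixIfInput`, `prefixOperation`, and the sample `inputTypeNames` list are hypothetical, and the real implementation works off the input/output type maps added to the DMMF model.

```typescript
// Hypothetical sample of the input type names known for one subgraph.
const inputTypeNames = ["ProductCreateInput", "ProductWhereInput"];

// Prefixes a type name with the subgraph name only when it is an input type.
// Entity / output types keep their names so they still merge in the supergraph.
function prefixIfInput(
  subgraph: string | undefined,
  typeName: string,
  inputTypes: string[],
): string {
  if (!subgraph) return typeName; // federation option not set
  return inputTypes.indexOf(typeName) >= 0 ? `${subgraph}_${typeName}` : typeName;
}

// Queries and mutations are always prefixed, since their arguments
// reference the renamed input types.
function prefixOperation(subgraph: string | undefined, operation: string): string {
  return subgraph ? `${subgraph}_${operation}` : operation;
}
```

With the federation option set to "Products", `ProductCreateInput` becomes `Products_ProductCreateInput` while the `Product` entity is left untouched.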

With this work now done I'm waiting on you to modularize Auth. When you finish putting auth in its own module I can help you modify the type declarations such that it composes in the supergraph.

In terms of future plans for a feature, I am currently contemplating a design for the generator to generate idempotent mutations for the Saga pattern. For example, if I have the following prisma model :

model User {
  id      Int   @id @default(autoincrement())
  balance Float
}

I would want a reserveBalance(Float balanceToReserve, string uuid) mutation. For the mutation to be idempotent, a separate event model with uuid and query fields would be needed. If reserveBalance is called multiple times with the same uuid, the mutation would first check whether that uuid already exists in the database before applying the change, which makes it idempotent. Conversely, the generator should also generate a reverseBalanceReservation(string uuid) mutation to undo the change if the saga fails at any point. While this is a simple example, the design needs to take into account nested parameters and multiple parameters, and give the user a way to declare what type of saga event they want.
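
A minimal in-memory sketch of that idea, under stated assumptions: `BalanceSaga`, `Account`, and `ReservationEvent` are hypothetical names, plain objects stand in for the User table and the event model, and a `userId` parameter is added because the sketch has no request context to take the user from. The generated code would do the same uuid check inside a database transaction.

```typescript
interface Account { id: number; balance: number }
interface ReservationEvent { uuid: string; userId: number; amount: number; reversed: boolean }

class BalanceSaga {
  private accounts: { [id: number]: Account } = {};
  private events: { [uuid: string]: ReservationEvent } = {};

  constructor(initial: Account[]) {
    for (const account of initial) {
      this.accounts[account.id] = { id: account.id, balance: account.balance };
    }
  }

  balanceOf(userId: number): number {
    const account = this.accounts[userId];
    if (!account) throw new Error(`Unknown user ${userId}`);
    return account.balance;
  }

  // Idempotent: a repeated call with the same uuid changes nothing.
  // Returns true only when the reservation was applied by this call.
  reserveBalance(userId: number, balanceToReserve: number, uuid: string): boolean {
    if (this.events[uuid]) return false; // this event was already recorded
    const account = this.accounts[userId];
    if (!account || account.balance < balanceToReserve) {
      throw new Error("Cannot reserve balance");
    }
    account.balance -= balanceToReserve;
    this.events[uuid] = { uuid, userId, amount: balanceToReserve, reversed: false };
    return true;
  }

  // Compensating action for a failed saga; also safe to call repeatedly.
  reverseBalanceReservation(uuid: string): boolean {
    const event = this.events[uuid];
    if (!event || event.reversed) return false;
    this.accounts[event.userId].balance += event.amount;
    event.reversed = true;
    return true;
  }
}
```

Keeping the event record after a reversal means a late retry of the original reservation with the same uuid is still ignored.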

I haven't seen you on for a while. I hope you're doing ok. Leave me a message on Discord when you can.

@johnkm516 johnkm516 reopened this Oct 14, 2022
@ZenSoftware
Owner

@johnkm516
Hi John. Terribly sorry it has been so long. Life became rather chaotic for me. I am now back from my hiatus.

I just wanted to let you know that I finally refactored the project to allow for multiple schema.prisma files within the monorepo. There was a major refactoring of the auth code. Now there is a library named nest-auth that multiple apps can leverage. That should do it! I am a bit exhausted tonight and I am about to crawl into bed. I'll get back to you with a more thorough response with respect to your latest comment.

Thanks for your patience my friend. We'll catch up soon.🎐

@johnkm516
Author

johnkm516 commented Dec 2, 2022

@ZenSoftware
Hey, whatever's been up with you, hope you're doing ok.

You should check my profile for a standalone fork I made of zen, called zen-federation. During your hiatus I implemented an absolutely colossal number of features: everything I talked about with you on Discord a few months back, from the monorepo with multiple schema files and app generation using the Nx generator to updating the PalJS generator to the latest Prisma and its latest features.

Some of the things I did include but are not limited to :

  1. PalJS Generator and corresponding generator.ts in Zen :
  • Now supports 3 additional generated queries / methods: the new findUniqueOrThrow / findFirstOrThrow queries and the GroupBy query. GroupBy was particularly difficult because Prisma does some TypeScript voodoo to type the inputs / outputs. For both the extendedWhereUnique feature and GroupBy, I had to use type rather than interface in SDLInput.

  • Supports optimistic concurrency control, which was added as an experimental feature in Prisma 4.5.0 and is enabled with previewFeatures = ["extendedWhereUnique"] in the Prisma schema.

  • Renames all inputs, queries, and mutations to prefix by the service name. The default service that comes with the repo is called Auth, with the default user model in it. The option for federation in the PalJS Generator now accepts string rather than boolean. So when you generate resolvers using my version of the PalJS Generator, you would set the option to "Auth" and queries and mutations and inputs will be prefixed by "Auth_", for example, Auth_findUniqueUser. This is because combining inputs, queries and mutations is impossible in the current implementation of Apollo Federation. Inputs can't be an entity that is federated, and having two inputs with the same name but with different fields fails to compose the supergraph.

  2. The workspace generator :
  • Generating an entirely new NestJS API service is now fully implemented. The app-generator creates a new app folder with everything you need, including a new Prisma client. The generator actually does much more :

  • I implemented and fully modified docker-compose. It now launches PGAdmin, and the dockerfile along with the package.json and the app-generator automatically adds new databases to the pgpass file, and reloads the PGAdmin container so that when you log into PGAdmin all the service databases are automatically pre-connected without you having to input DB server info. So for example, in the base repo there is the Auth service, so PGAdmin will have the Auth PostgreSQL DB connected, and adding another service called Products using the app-generator will automatically connect you to the Products PostgreSQL DB without you having to do anything.

  • Implemented multi-stage docker-compose for production-ready images, reduced image size down to ~500mb despite still copying node_modules. This is in docker-compose-prod.yml.

  • Added apollo router in docker-compose.

  • The workspace generator modifies package.json to add new scripts like npm run start:###service_name###, and also modifies docker-compose.yml and docker-compose-prod.yml to launch the PostgreSQL DB and build the new app service container, all right out of the box. The app generator is designed to do literally everything for you, all you do is click generate and then add a model in its prisma-client and you have a new service up and ready to go within 1 minute.

I suggest you clone my repo and take a look. I could easily modify the app-generator to use the modularized auth lib rather than generating auth-related files. Needless to say, there is a huge list of changes I made that I cannot recite off the top of my head. I'll be on Discord on Monday KST. Let's discuss combining the code into a new repo.
