WIP(schema-wasm): support schema split into multiple files #4787

Merged: 15 commits, Apr 8, 2024

@SevInf SevInf commented Mar 22, 2024

  • getDMMF
  • validate
  • getConfig
  • format - partially done, multi-file input needs to be exposed to wasm
  • add mergeSchemas

Closes https://github.com/prisma/team-orm/issues/1038

/integration

tomhoule and others added 2 commits March 21, 2024 13:58
This commit implements multi-file schema handling in the Prisma Schema Language.

At a high level, `psl::validate_multi_file()` is an alternative to `psl::validate()` that, instead of a single string, accepts something morally equivalent to:

```json
{
  "./prisma/schema/a.prisma": "datasource db { ... }",
  "./prisma/schema/nested/b.prisma": "model Test { ... }"
}
```

There are tests for PSL validation with multiple schema files, but most of the rest of the engines still consume the single-file version, `psl::validate()`. The implementation and the return type are shared between `psl::validate_multi_file()` and `psl::validate()`, so the change is completely transparent, other than the expectation of passing in a list of (file_name, file_contents) instead of a single string. The `psl::validate()` entry point should behave exactly the same as `psl::validate_multi_file()` with a single file named `schema.prisma`. In particular, it has the exact same return type.
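
The relationship between the two entry points can be sketched as follows. This is a minimal illustration, not the `psl` API: `SchemaFile`, `describe_multi_file()` and `describe_single_file()` are hypothetical names, and the "validation" here only counts files and bytes.

```rust
// Hypothetical sketch: none of these names belong to the real `psl` crate.

/// One schema file: (file name, file contents).
type SchemaFile = (String, String);

/// Stand-in for a multi-file entry point. It only counts files and bytes,
/// where a real validator would parse and analyse every file.
fn describe_multi_file(files: &[SchemaFile]) -> String {
    let bytes: usize = files.iter().map(|(_, contents)| contents.len()).sum();
    format!("{} file(s), {} byte(s)", files.len(), bytes)
}

/// Stand-in for the single-file entry point, expressed as the multi-file
/// one applied to a single file named `schema.prisma`.
fn describe_single_file(contents: &str) -> String {
    describe_multi_file(&[("schema.prisma".to_owned(), contents.to_owned())])
}
```

Because the single-file function only delegates, both share one implementation and one return type, which is the property the paragraph above describes.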

Implementation
==============

This is achieved by extending `Span` to contain, in addition to a start and end offset, a `FileId`. The `FileId` is a unique identifier for a file and its parsed `SchemaAst` inside `ParserDatabase`. The identifier types for AST items in `ParserDatabase` are also extended to contain the `FileId`, so that they can be uniquely referred to in the context of the (multi-file) schema. After the analysis phase (the `parser_database` crate), consumers of the analyzed schema become multi-file aware completely transparently: no change is necessary in the other engines.
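
As a rough illustration of the idea, a span that carries a file identifier can be resolved back to the right file without any extra context. The types below are simplified stand-ins written for this sketch; the field names and the `Files` container are assumptions, not the actual definitions in `psl`.

```rust
// Illustrative only: simplified stand-ins, not the actual `psl` types.

/// Identifies one schema file among those validated together.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileId(u32);

/// A byte range inside one specific file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
    file_id: FileId,
}

/// All files of a schema; a `FileId` is an index into `entries`.
struct Files {
    /// (file name, file contents)
    entries: Vec<(String, String)>,
}

impl Files {
    /// Returns the file name and the text covered by `span`, or `None`
    /// when the file id or the byte range does not exist.
    fn resolve(&self, span: Span) -> Option<(&str, &str)> {
        let (name, contents) = self.entries.get(span.file_id.0 as usize)?;
        let text = contents.get(span.start..span.end)?;
        Some((name.as_str(), text))
    }
}
```

The same offsets mean different things in different files, so a diagnostic that only stored `start` and `end` could not be attributed to a file; carrying the `FileId` in the span removes that ambiguity.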

The only changes still required are scattered across the codebase: the `psl::validate()` call sites will need to receive a `Vec<(Box<Path>, SourceFile)>` instead of a single `SourceFile`. This PR does _not_ deal with that, but it makes those call sites easy to find by the entry points they use: `psl::validate()`, `psl::parse_schema()` and the various `*_assert_single()` methods on `ParserDatabase`.

The PR contains tests confirming that schema analysis, validation and displaying diagnostics across multiple files works as expected.

Status of this PR
=================

This is going to be directly mergeable after review, and it will not affect the current schema handling behaviour when dealing with a single schema file.

Next steps
==========

- Replace all calls to `psl::validate()` with calls to `psl::validate_multi_file()`.
- The `*_assert_single()` calls should be progressively replaced with their multi-file counterparts across engines.
- The language server should start sending multiple files to prisma-schema-wasm in all calls. This is not in the spirit of the language server spec, but that is the most immediate solution. We'll have to make `range_to_span()` in `prisma-fmt` multi-schema aware by taking a FileId param.
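
To make the last step more concrete, here is a sketch of what a file-aware range-to-span conversion could look like. It is not the `prisma-fmt` implementation: the signature, the `Position` type and the helper are assumptions, and columns are counted in bytes for simplicity, whereas the language server protocol counts UTF-16 code units by default.

```rust
// Hypothetical sketch; the real `range_to_span()` in `prisma-fmt` may differ.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileId(u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
    file_id: FileId,
}

/// Zero-based line and column. Simplification: the column is a byte
/// offset within the line, not a UTF-16 code unit count.
#[derive(Debug, Clone, Copy)]
struct Position {
    line: usize,
    column: usize,
}

/// Converts a position into a byte offset in `text`, or `None` if the
/// line does not exist or the column is past the end of the line.
fn position_to_offset(text: &str, pos: Position) -> Option<usize> {
    let mut line_start = 0;
    for (idx, line) in text.split('\n').enumerate() {
        if idx == pos.line {
            return (pos.column <= line.len()).then_some(line_start + pos.column);
        }
        line_start += line.len() + 1; // +1 for the '\n' separator
    }
    None
}

/// Builds a span for the file identified by `file_id`, whose contents
/// are `text`. The caller decides which file the range belongs to.
fn range_to_span(text: &str, start: Position, end: Position, file_id: FileId) -> Option<Span> {
    Some(Span {
        start: position_to_offset(text, start)?,
        end: position_to_offset(text, end)?,
        file_id,
    })
}
```

Passing the `FileId` explicitly keeps the conversion itself unaware of how files are stored; only the caller needs to know which document the editor range came from.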

Links
=====

Relevant issue: prisma/prisma#2377

Also see the [internal design doc](https://www.notion.so/prismaio/Multi-file-Schema-24d68fe8664048ad86252fe446caac24?d=68ef128f25974e619671a9855f65f44d#2889a038e68c4fe1ac9afe3cd34978bd).
@SevInf SevInf added this to the 5.12.0 milestone Mar 22, 2024
@SevInf SevInf force-pushed the psl-multi-file-schema branch from f34efb7 to 1e42d35 Compare March 22, 2024 15:12
@SevInf SevInf force-pushed the feat/multi-schema-fmt branch from 883146e to 0128219 Compare March 26, 2024 15:23
github-actions bot commented Mar 26, 2024

WASM Query Engine file Size

| Engine | This PR | Base branch | Diff |
|---|---|---|---|
| Postgres | 2.124MiB | 2.124MiB | 228.000B |
| Postgres (gzip) | 836.395KiB | 836.163KiB | 238.000B |
| Mysql | 2.092MiB | 2.092MiB | 228.000B |
| Mysql (gzip) | 823.310KiB | 823.329KiB | -19.000B |
| Sqlite | 1.988MiB | 1.987MiB | 306.000B |
| Sqlite (gzip) | 784.003KiB | 784.104KiB | -103.000B |

codspeed-hq bot commented Mar 26, 2024

CodSpeed Performance Report

Merging #4787 will improve performances by 5.67%

Comparing feat/multi-schema-fmt (e32d0dd) with main (dcdb692)

Summary

⚡ 1 improvements
✅ 10 untouched benchmarks

Benchmarks breakdown

| Benchmark | main | feat/multi-schema-fmt | Change |
|---|---|---|---|
| large_read | 8.1 ms | 7.7 ms | +5.67% |

github-actions bot commented Mar 26, 2024

✅ WASM query-engine performance won't change substantially (0.996x)

Full benchmark report
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/bench?schema=imdb_bench&sslmode=disable" \
node --experimental-wasm-modules query-engine/driver-adapters/executor/dist/bench.mjs
cpu: AMD EPYC 7763 64-Core Processor
runtime: node v18.20.0 (x64-linux)

benchmark                   time (avg)             (min … max)       p75       p99      p999
-------------------------------------------------------------- -----------------------------
• movies.findMany() (all - ~50K)
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline     288 ms/iter       (286 ms … 293 ms)    290 ms    293 ms    293 ms
Web Assembly: Latest       374 ms/iter       (371 ms … 378 ms)    376 ms    378 ms    378 ms
Web Assembly: Current      373 ms/iter       (373 ms … 376 ms)    374 ms    376 ms    376 ms
Node API: Current          195 ms/iter       (193 ms … 198 ms)    196 ms    198 ms    198 ms

summary for movies.findMany() (all - ~50K)
  Web Assembly: Current
   1.92x slower than Node API: Current
   1.3x slower than Web Assembly: Baseline
   1x faster than Web Assembly: Latest

• movies.findMany({ take: 2000 })
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline  11'664 µs/iter (11'508 µs … 12'446 µs) 11'673 µs 12'446 µs 12'446 µs
Web Assembly: Latest    15'564 µs/iter (14'964 µs … 21'262 µs) 15'262 µs 21'262 µs 21'262 µs
Web Assembly: Current   15'108 µs/iter (14'918 µs … 16'021 µs) 15'101 µs 16'021 µs 16'021 µs
Node API: Current        7'897 µs/iter   (7'742 µs … 8'220 µs)  7'934 µs  8'220 µs  8'220 µs

summary for movies.findMany({ take: 2000 })
  Web Assembly: Current
   1.91x slower than Node API: Current
   1.3x slower than Web Assembly: Baseline
   1.03x faster than Web Assembly: Latest

• movies.findMany({ where: {...}, take: 2000 })
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline   1'884 µs/iter   (1'738 µs … 3'619 µs)  1'848 µs  3'396 µs  3'619 µs
Web Assembly: Latest     2'422 µs/iter   (2'311 µs … 3'734 µs)  2'409 µs  3'033 µs  3'734 µs
Web Assembly: Current    2'368 µs/iter   (2'300 µs … 3'027 µs)  2'371 µs  2'741 µs  3'027 µs
Node API: Current        1'390 µs/iter   (1'302 µs … 1'630 µs)  1'398 µs  1'593 µs  1'630 µs

summary for movies.findMany({ where: {...}, take: 2000 })
  Web Assembly: Current
   1.7x slower than Node API: Current
   1.26x slower than Web Assembly: Baseline
   1.02x faster than Web Assembly: Latest

• movies.findMany({ include: { cast: true } take: 2000 }) (m2m)
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline     549 ms/iter       (540 ms … 570 ms)    560 ms    570 ms    570 ms
Web Assembly: Latest       757 ms/iter       (750 ms … 779 ms)    763 ms    779 ms    779 ms
Web Assembly: Current      757 ms/iter       (752 ms … 766 ms)    760 ms    766 ms    766 ms
Node API: Current          481 ms/iter       (469 ms … 494 ms)    488 ms    494 ms    494 ms

summary for movies.findMany({ include: { cast: true } take: 2000 }) (m2m)
  Web Assembly: Current
   1.57x slower than Node API: Current
   1.38x slower than Web Assembly: Baseline
   1x faster than Web Assembly: Latest

• movies.findMany({ where: {...}, include: { cast: true } take: 2000 }) (m2m)
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline  76'114 µs/iter (75'805 µs … 76'876 µs) 76'427 µs 76'876 µs 76'876 µs
Web Assembly: Latest       108 ms/iter       (107 ms … 110 ms)    109 ms    110 ms    110 ms
Web Assembly: Current      108 ms/iter       (107 ms … 110 ms)    108 ms    110 ms    110 ms
Node API: Current       64'024 µs/iter (62'923 µs … 65'224 µs) 65'215 µs 65'224 µs 65'224 µs

summary for movies.findMany({ where: {...}, include: { cast: true } take: 2000 }) (m2m)
  Web Assembly: Current
   1.68x slower than Node API: Current
   1.41x slower than Web Assembly: Baseline
   1x faster than Web Assembly: Latest

• movies.findMany({ take: 2000, include: { cast: { include: { person: true } } } })
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline     974 ms/iter       (971 ms … 980 ms)    979 ms    980 ms    980 ms
Web Assembly: Latest     1'262 ms/iter   (1'254 ms … 1'277 ms)  1'267 ms  1'277 ms  1'277 ms
Web Assembly: Current    1'265 ms/iter   (1'259 ms … 1'286 ms)  1'271 ms  1'286 ms  1'286 ms
Node API: Current          894 ms/iter       (868 ms … 934 ms)    920 ms    934 ms    934 ms

summary for movies.findMany({ take: 2000, include: { cast: { include: { person: true } } } })
  Web Assembly: Current
   1.42x slower than Node API: Current
   1.3x slower than Web Assembly: Baseline
   1x faster than Web Assembly: Latest

• movie.findMany({ where: { ... }, take: 2000, include: { cast: { include: { person: true } } } })
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline     138 ms/iter       (136 ms … 147 ms)    138 ms    147 ms    147 ms
Web Assembly: Latest       175 ms/iter       (174 ms … 176 ms)    175 ms    176 ms    176 ms
Web Assembly: Current      174 ms/iter       (174 ms … 175 ms)    175 ms    175 ms    175 ms
Node API: Current          106 ms/iter       (105 ms … 107 ms)    107 ms    107 ms    107 ms

summary for movie.findMany({ where: { ... }, take: 2000, include: { cast: { include: { person: true } } } })
  Web Assembly: Current
   1.65x slower than Node API: Current
   1.27x slower than Web Assembly: Baseline
   1x faster than Web Assembly: Latest

• movie.findMany({ where: { reviews: { author: { ... } }, take: 100 }) (to-many -> to-one)
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline     869 µs/iter     (811 µs … 1'501 µs)    866 µs  1'358 µs  1'501 µs
Web Assembly: Latest     1'215 µs/iter   (1'147 µs … 1'881 µs)  1'214 µs  1'697 µs  1'881 µs
Web Assembly: Current    1'212 µs/iter   (1'155 µs … 1'650 µs)  1'215 µs  1'531 µs  1'650 µs
Node API: Current          774 µs/iter     (694 µs … 1'443 µs)    789 µs  1'096 µs  1'443 µs

summary for movie.findMany({ where: { reviews: { author: { ... } }, take: 100 }) (to-many -> to-one)
  Web Assembly: Current
   1.57x slower than Node API: Current
   1.39x slower than Web Assembly: Baseline
   1x faster than Web Assembly: Latest

• movie.findMany({ where: { cast: { person: { ... } }, take: 100 }) (m2m -> to-one)
-------------------------------------------------------------- -----------------------------
Web Assembly: Baseline     860 µs/iter     (813 µs … 1'392 µs)    860 µs  1'305 µs  1'392 µs
Web Assembly: Latest     1'188 µs/iter   (1'131 µs … 1'652 µs)  1'195 µs  1'497 µs  1'652 µs
Web Assembly: Current    1'208 µs/iter   (1'140 µs … 1'873 µs)  1'205 µs  1'755 µs  1'873 µs
Node API: Current          775 µs/iter     (710 µs … 1'064 µs)    797 µs    933 µs  1'064 µs

summary for movie.findMany({ where: { cast: { person: { ... } }, take: 100 }) (m2m -> to-one)
  Web Assembly: Current
   1.56x slower than Node API: Current
   1.41x slower than Web Assembly: Baseline
   1.02x slower than Web Assembly: Latest

After changes in e32d0dd

@Jolg42 Jolg42 modified the milestones: 5.12.0, 5.13.0 Apr 2, 2024

jkomyno commented Apr 5, 2024

Using prisma-fmt-wasm, the following now works:

  • ./prisma/schema.prisma is a self-standing Prisma schema that is valid on its own

    datasource db {
      provider = "postgresql"
      url = env("DBURL")
    }
    
    model A {
      id String @id
      b_id String @unique
      b B @relation(fields: [b_id], references: [id])
    }
    
    model B {
      id String @id
      a  A?
    }
  • schema-1.prisma is only valid when paired with some other Prisma schema

    model B {
      id String @id
      a  A?
    }
  • schema-2.prisma is only valid when paired with some other Prisma schema

    datasource db {
      provider = "postgresql"
      url = env("DBURL")
    }
    
    model A {
      id String @id
      b_id String @unique
      b B @relation(fields: [b_id], references: [id])
    }
  • index.cjs shows how the validate function works:

    const fs = require('node:fs/promises')
    const path = require('node:path')
    const { validate } = require('./prisma_schema_build')
    
    async function main() {
      const schema = await fs.readFile(path.join(__dirname, 'prisma/schema.prisma'), 'utf-8')
      const schema1 = await fs.readFile(path.join(__dirname, 'prisma/schema-1.prisma'), 'utf-8')
      const schema2 = await fs.readFile(path.join(__dirname, 'prisma/schema-2.prisma'), 'utf-8')
    
      validate(JSON.stringify({ prismaSchema: schema }))
      console.log('[validate] schema.prisma is valid')
    
      try {
        validate(JSON.stringify({ prismaSchema: schema1 }))
      } catch (e) {
        console.log('[validate] schema1.prisma is not valid on its own')
        console.error(JSON.parse(e.message).message)
      }
    
      try {
        validate(JSON.stringify({ prismaSchema: schema2 }))
      } catch (e) {
        console.log('[validate] schema2.prisma is not valid on its own')
        console.error(JSON.parse(e.message).message)
      }
    
      validate(JSON.stringify({ prismaSchema: [['schema-1.prisma', schema1], ['schema-2.prisma', schema2]] }))
      console.log('[validate] schema1.prisma + schema2.prisma are valid')
    }
    
    main()
  • Output:
    Screenshot 2024-04-05 at 12 52 55

@jkomyno jkomyno marked this pull request as ready for review April 5, 2024 09:42
@jkomyno jkomyno requested a review from a team as a code owner April 5, 2024 09:42
@jkomyno jkomyno requested review from Druue and removed request for a team April 5, 2024 09:42

@SevInf SevInf left a comment

LGTM provided that requested changes are made.
Note: I also reviewed only your part, somebody probably should look through my changes too.

I also think that before we merge this, we should merge psl-multischema-branch into main and work off main from now on.

prisma-fmt/src/merge_schemas.rs (outdated, resolved)
prisma-fmt/src/merge_schemas.rs (outdated, resolved)
psl/psl-core/src/reformat.rs (resolved)
Base automatically changed from psl-multi-file-schema to main April 8, 2024 08:42

4 participants