feature: file exporters (docx and pdf) #1143

Merged on Nov 6, 2024 (70 commits)


YousefED (Collaborator) commented Oct 11, 2024

💖 This feature is sponsored by DINUM 🇫🇷 and ZenDiS 🇩🇪

This PR adds the concept of Exporters to BlockNote. An Exporter is a loosely coupled, type safe way to transform documents from BlockNote-style JSON to another format (for example, .pdf and .docx).

Demo for PDF: https://blocknote-git-feature-file-exporters-typecell.vercel.app/interoperability/converting-blocks-to-pdf
Demo for Word: https://blocknote-git-feature-file-exporters-typecell.vercel.app/interoperability/converting-blocks-to-docx


Requirements

  • Exporters can be defined in separate packages. For example, if you don't need docx export functionality, you don't need to include any docx-related dependencies.
  • Exporters are type-safe, even for custom schemas. For example, when you try to export a document with a custom schema using the default exporters, you'll get a type error, because the exporter might not support all content in your document.
  • Exporters are customizable: the consumer can customize how certain types of content (blocks, inline content, styles) are transformed.
  • It needs to be easy to make changes to the "output document", for example to add custom headers / footers.
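
The type-safety and customization requirements can be illustrated with a minimal sketch. All names here (`BlockMapping`, `exportBlocks`, the `"alert"` block) are hypothetical and simplified for illustration; they are not the actual BlockNote API, and the output is plain HTML-like strings rather than docx or PDF elements.

```typescript
// Hypothetical, simplified types: NOT the actual BlockNote API.
type DefaultBlockType = "paragraph" | "heading";

// A mapping must provide exactly one handler per block type in the schema.
type BlockMapping<BlockType extends string, Out> = {
  [K in BlockType]: (text: string) => Out;
};

const defaultMapping: BlockMapping<DefaultBlockType, string> = {
  paragraph: (text) => `<p>${text}</p>`,
  heading: (text) => `<h1>${text}</h1>`,
};

// A toy exporter, parameterised by the block types it has to support.
function exportBlocks<BlockType extends string>(
  blocks: { type: BlockType; text: string }[],
  mapping: BlockMapping<BlockType, string>
): string[] {
  return blocks.map((block) => mapping[block.type](block.text));
}

// A custom schema that adds an (assumed) "alert" block.
type CustomBlockType = DefaultBlockType | "alert";

const customBlocks: { type: CustomBlockType; text: string }[] = [
  { type: "heading", text: "Title" },
  { type: "alert", text: "Careful" },
];

// Using the default mapping with the custom schema does not compile,
// because the "alert" handler is missing:
// exportBlocks<CustomBlockType>(customBlocks, defaultMapping);

// Customization: reuse the defaults, override one handler, add the missing one.
const customMapping: BlockMapping<CustomBlockType, string> = {
  ...defaultMapping,
  heading: (text) => `<h2>${text}</h2>`,
  alert: (text) => `<div class="alert">${text}</div>`,
};

const exported = exportBlocks(customBlocks, customMapping);
console.log(exported);
```

Because the mapping type is derived from the schema's block types, forgetting a handler for a custom block surfaces at compile time rather than as a runtime failure during export.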

Architecture

As a reminder, the BlockNote schema consists of 3 main types: blocks, inline content, and styles (see docs). An exporter defines a mapping for each of them, then runs Blocks (BlockNote JSON) through those mappings to produce a new, "transformed" document.
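
As a rough sketch of this architecture, the example below runs a tiny document through three mappings (styles, inline content, blocks). The types and function names are assumptions made for illustration and are much simpler than the real BlockNote schema; the target format is Markdown-like strings, standing in for docx or PDF elements.

```typescript
// Hypothetical, simplified document model; the real BlockNote types are richer.
type Styles = { bold?: boolean; italic?: boolean };
type InlineContent = { type: "text"; text: string; styles: Styles };
type Block = { type: "paragraph" | "heading"; content: InlineContent[] };

// One mapping per schema category: styles, inline content, and blocks.
type Mappings = {
  styles: Record<keyof Styles, (inner: string) => string>;
  inlineContent: Record<InlineContent["type"], (text: string) => string>;
  blocks: Record<Block["type"], (children: string) => string>;
};

const markdownMappings: Mappings = {
  styles: {
    bold: (inner) => `**${inner}**`,
    italic: (inner) => `_${inner}_`,
  },
  inlineContent: { text: (text) => text },
  blocks: {
    paragraph: (children) => children,
    heading: (children) => `# ${children}`,
  },
};

// Transform one inline content node, then wrap it in each active style.
function transformInline(ic: InlineContent, m: Mappings): string {
  let out = m.inlineContent[ic.type](ic.text);
  for (const key of Object.keys(ic.styles) as (keyof Styles)[]) {
    if (ic.styles[key]) {
      out = m.styles[key](out);
    }
  }
  return out;
}

// Transform every block by first transforming its inline content.
function transformBlocks(blocks: Block[], m: Mappings): string[] {
  return blocks.map((block) => {
    const children = block.content.map((ic) => transformInline(ic, m)).join("");
    return m.blocks[block.type](children);
  });
}

const doc: Block[] = [
  { type: "heading", content: [{ type: "text", text: "Hello", styles: {} }] },
  {
    type: "paragraph",
    content: [
      { type: "text", text: "bold", styles: { bold: true } },
      { type: "text", text: " text", styles: {} },
    ],
  },
];

const lines = transformBlocks(doc, markdownMappings);
console.log(lines); // ["# Hello", "**bold** text"]
```

Swapping `markdownMappings` for a different set of mappings changes the output format without touching the traversal code, which is the loose coupling the exporter concept aims for.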

Note: Client-side vs. Server-side

In the current setup, we have chosen to make Exporters work completely client-side. This has the following benefits:

  • no server needs to be able to see / touch your data (i.e.: support for E2EE)
  • complete control over the exported files. Because we generate pdf / docx files "from the ground up", we should be able to control every single aspect of the exported files. With a JSON -> HTML -> docx / pdf approach using a third-party HTML -> docx / pdf converter, we would be limited to what that converter supports.

The main downside of this approach is that we'll need to write custom exporters for each file type, instead of creating 1 exporter to HTML and letting third party libraries handle the HTML -> Docx and HTML -> pdf exporting separately. However, it's likely that this approach would also run into limitations or a lot of custom HTML fiddling to make sure the "HTML -> xxx" converter works well.

Docx

The Docx exporter uses docxjs (the `docx` package) to create .docx files.

PDF

The PDF exporter uses https://react-pdf.org to create .pdf files. I've also looked at the following libraries:


vercel bot commented Oct 11, 2024

The latest updates on your projects.

Name | Status | Updated (UTC)
blocknote | ✅ Ready | Nov 6, 2024 11:58am
blocknote-website | ✅ Ready | Nov 6, 2024 11:58am

### Custom mappings / custom schemas

The `PDFExporter` constructor takes a `schema` and `mappings` parameter.
A _mapping_ defines how to convert a BlockNote schema element (a Block, Inline Content, or Style) to a React-PDF element.

Collaborator commented:

A link to React-PDF is needed here, to match the docx export docs.

[Business subscription](/pricing).
</Callout>

First, install the `@blocknote/xl-docx-exporter` and `docx` packages:

Collaborator commented:

I think the docs are nice and clear in general! But I would add a sentence at the start saying that the PDF and docx conversions are built on the React-PDF and docxjs packages, and that customizing the export functionality requires being familiar with their APIs.

Collaborator commented:

What's the purpose of this file?

import path from "path";
const NAME_WORKER_STATE = "__vitest_worker__";

// from https://github.com/vitest-dev/vitest/blob/main/packages/vitest/src/runtime/utils.ts#L8

Collaborator commented:

This could use a comment explaining why it is needed.

return option;
}

// async function createMD5FromBuffer(buffer: Buffer) {

Collaborator commented:

Can we remove this?

return new Paragraph({
...blockPropsToStyles(block.props, exporter.options.colors),
children: exporter.transformInlineContent(block.content),
style: "Normal",

Collaborator commented:

How come the others don't need style & font?

Collaborator commented:

TODO: Go over this file together

TableRow,
} from "docx";

export const Table = (

Collaborator commented:

Why is this in a separate file? It's not that complex and AFAIK only used in one place

const str = prettyDOM(view.container, undefined, { highlight: false });
expect(str).toMatchFileSnapshot("__snapshots__/example.jsx");

// would be nice to compare pdf images, but currently doesn't work on mac os (due to node canvas installation issue)

Collaborator commented:

Oh I was running into an issue with node canvas that might be related, and fixed it with this I think:
Automattic/node-canvas#913 (comment)

Collaborator commented:

TODO: Go over this file together
