html-to-pdf

A Node.js service that converts HTML content to PDF using Puppeteer. Designed to be deployed as a containerized service.

Features

  • Built-in HTML sanitization to prevent XSS attacks
  • Docker support for easy deployment
  • Header and footer support with automatic margin adjustment

Getting Started

Docker

  1. Pull the image:

    docker pull ghcr.io/causaly/html-to-pdf
  2. Run the service:

    docker run -p 8087:8087 ghcr.io/causaly/html-to-pdf

Requirements: Docker 20.x or higher

Development

Requirements: Node.js v22.x or higher

  1. Install dependencies:

    # Chrome will be downloaded automatically
    npm install
  2. Set up environment variables:

    cp .env.example .env
  3. Start development server:

    npm run dev

Production

  1. Install dependencies:

    # Chrome will be downloaded automatically
    npm install
  2. Build the project:

    npm run build
  3. Start production server:

    npm start

The service will be available at http://localhost:8087.

Usage

Basic Example

curl -X POST http://localhost:8087 \
  -H "Content-Type: application/json" \
  -d '{
    "body": "<html><body><h1>Hello World</h1></body></html>",
    "filename": "document.pdf"
  }' \
  --output document.pdf

With Headers and Footers

curl -X POST http://localhost:8087 \
  -H "Content-Type: application/json" \
  -d '{
    "body": "<html><body><h1>My Document</h1><p>Content here...</p></body></html>",
    "header": "<div style=\"font-size: 10px; text-align: center;\">Document Header</div>",
    "footer": "<div style=\"font-size: 10px; text-align: center;\">Page <span class=\"pageNumber\"></span> of <span class=\"totalPages\"></span></div>",
    "filename": "document.pdf"
  }' \
  --output document.pdf
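
The same request can be sent from code. The following is a minimal TypeScript sketch, not part of this repository: it assumes Node.js with a global `fetch` and a service listening on `http://localhost:8087`, and the names `PdfRequest`, `centeredLine`, `buildRequest`, and `renderPdf` are hypothetical helpers introduced here for illustration.

```typescript
// Shape of the JSON request body, following the fields documented in this README.
interface PdfRequest {
  body: string;
  filename?: string;
  header?: string;
  footer?: string;
}

// Hypothetical helper: wraps markup in the inline-styled <div> used by the curl example.
function centeredLine(innerHtml: string): string {
  return `<div style="font-size: 10px; text-align: center;">${innerHtml}</div>`;
}

// Builds a request with a header and a "Page X of Y" footer, mirroring the curl example.
function buildRequest(bodyHtml: string, headerHtml: string, filename: string): PdfRequest {
  return {
    body: bodyHtml,
    header: centeredLine(headerHtml),
    footer: centeredLine(
      'Page <span class="pageNumber"></span> of <span class="totalPages"></span>',
    ),
    filename,
  };
}

// Posts the request and returns the PDF bytes.
// The default URL is an assumption; adjust it to match your HOST and PORT.
async function renderPdf(
  request: PdfRequest,
  serviceUrl = "http://localhost:8087",
): Promise<Uint8Array> {
  const response = await fetch(serviceUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  if (!response.ok) {
    throw new Error(`PDF request failed with status ${response.status}: ${await response.text()}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

Keeping `buildRequest` separate from `renderPdf` makes the payload easy to inspect or test without a running service.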

API Reference

POST /
Content-Type: application/json

Request Body:

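| Field    | Type   | Required | Description                                              |
|----------|--------|----------|----------------------------------------------------------|
| body     | string | Yes      | Valid HTML string for the main content                   |
| filename | string | No       | Custom filename for the PDF (defaults to auto-generated) |
| header   | string | No       | HTML string for page header                              |
| footer   | string | No       | HTML string for page footer                              |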

Example Request:

{
  "body": "<html><body><h1>Hello World</h1></body></html>",
  "filename": "document.pdf",
  "header": "<div style='font-size: 10px; text-align: center;'>Document Header</div>",
  "footer": "<div style='font-size: 10px; text-align: center;'>Page <span class='pageNumber'></span> of <span class='totalPages'></span></div>"
}
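
Callers may want to check a payload against the field table before sending it. The sketch below is a hypothetical client-side check written for this README; it is not the service's own validation, and the name `validatePdfRequest` and its messages are assumptions.

```typescript
// Returns a list of problems; an empty list means the payload matches the documented fields.
function validatePdfRequest(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["request body must be a JSON object"];
  }
  const record = input as Record<string, unknown>;
  const problems: string[] = [];

  // "body" is the only required field.
  if (typeof record["body"] !== "string") {
    problems.push('"body" is required and must be a string');
  }

  // The remaining fields are optional, but must be strings when present.
  for (const field of ["filename", "header", "footer"]) {
    const value = record[field];
    if (value !== undefined && typeof value !== "string") {
      problems.push(`"${field}" must be a string when present`);
    }
  }
  return problems;
}
```

For example, `validatePdfRequest({ filename: "a.pdf" })` reports the missing `body` field before any request is made.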

Success Response (200):

  • Content-Type: application/pdf
  • Content-Disposition: attachment; filename="[filename]"
  • Body: PDF file buffer

Error Responses:

Validation Error (400):

{
  "message": "A validation error occurred while creating the PDF.",
  "reason": "Validation error: Expected string, received undefined at \"body\""
}

Internal Error (500):

{
  "message": "An internal error occurred while creating the PDF.",
  "reason": "Error message details"
}
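
A client can turn these error bodies into readable messages. The sketch below assumes only the `message` and `reason` fields shown above; `parseErrorBody` and `describeFailure` are hypothetical names, and any other status code or a non-JSON body is treated as unexpected.

```typescript
// Error body shape shown in the examples above.
interface PdfErrorBody {
  message: string;
  reason: string;
}

// Parses an error body, falling back to the raw text if it is not the expected JSON.
function parseErrorBody(bodyText: string): PdfErrorBody {
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (typeof parsed === "object" && parsed !== null) {
      const record = parsed as Record<string, unknown>;
      const message = record["message"];
      const reason = record["reason"];
      if (typeof message === "string" && typeof reason === "string") {
        return { message, reason };
      }
    }
  } catch {
    // Not JSON: fall through to the generic result below.
  }
  return { message: "Unexpected error response", reason: bodyText };
}

// Builds a one-line description from the status code and the response body text.
function describeFailure(status: number, bodyText: string): string {
  const { message, reason } = parseErrorBody(bodyText);
  const kind =
    status === 400 ? "validation error" : status === 500 ? "internal error" : "unexpected status";
  return `${status} (${kind}): ${message} Reason: ${reason}`;
}
```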

Configuration

The service requires basic configuration via environment variables:

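| Variable             | Required | Default                                             | Description                                                                   |
|----------------------|----------|-----------------------------------------------------|-------------------------------------------------------------------------------|
| HOST                 | Yes      | -                                                   | Server host address                                                           |
| PORT                 | Yes      | -                                                   | Server port                                                                   |
| NODE_ENV             | No       | development                                         | Node.js environment                                                           |
| DEPLOY_ENV           | No       | development                                         | Deployment environment                                                        |
| LOG_FORMAT           | No       | pretty                                              | Log format (pretty or gcp)                                                    |
| LOG_LEVEL            | No       | info                                                | Log level (debug, info, warn, error, none)                                    |
| ROLLBAR_ACCESS_TOKEN | No       | -                                                   | Rollbar access token for error tracking                                       |
| CSP_POLICY           | No       | Loose policy with 'unsafe-inline' for script/style  | Content Security Policy for the rendered HTML (see the security notes below) |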
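
To illustrate how the required fields and defaults in the table fit together, here is a sketch of a loader that reads them from an environment map. It is an illustration only and does not reproduce the service's actual configuration code: `ServiceConfig`, `loadConfig`, and `oneOf` are hypothetical names, and the port range check is an added assumption.

```typescript
const LOG_FORMATS = ["pretty", "gcp"] as const;
const LOG_LEVELS = ["debug", "info", "warn", "error", "none"] as const;

interface ServiceConfig {
  host: string;
  port: number;
  nodeEnv: string;
  deployEnv: string;
  logFormat: (typeof LOG_FORMATS)[number];
  logLevel: (typeof LOG_LEVELS)[number];
  rollbarAccessToken: string | undefined;
  cspPolicy: string | undefined; // undefined means "use the service default"
}

// Returns the value if it is one of the allowed options, otherwise throws.
function oneOf<T extends string>(name: string, value: string, allowed: readonly T[]): T {
  const match = allowed.find((candidate) => candidate === value);
  if (match === undefined) {
    throw new Error(`${name} must be one of: ${allowed.join(", ")}`);
  }
  return match;
}

// Reads the variables from the table, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): ServiceConfig {
  const host = env["HOST"];
  const rawPort = env["PORT"];
  if (!host) throw new Error("HOST is required");
  if (!rawPort) throw new Error("PORT is required");

  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  return {
    host,
    port,
    nodeEnv: env["NODE_ENV"] ?? "development",
    deployEnv: env["DEPLOY_ENV"] ?? "development",
    logFormat: oneOf("LOG_FORMAT", env["LOG_FORMAT"] ?? "pretty", LOG_FORMATS),
    logLevel: oneOf("LOG_LEVEL", env["LOG_LEVEL"] ?? "info", LOG_LEVELS),
    rollbarAccessToken: env["ROLLBAR_ACCESS_TOKEN"],
    cspPolicy: env["CSP_POLICY"],
  };
}
```

In a Node.js process this could be called as `loadConfig(process.env)`, so that a missing `HOST` or `PORT` fails at startup rather than on the first request.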

Environment File

Create a .env file from the example template:

cp .env.example .env

Then customize as needed. See .env.example for all available configuration options.

Development

Local Development

  1. Install dependencies:

    npm install
  2. Create environment file:

    cp .env.example .env
    # Edit .env to customize configuration if needed
  3. Start development server:

    npm run dev

Docker Development

  1. Build the image:

    docker build --platform linux/amd64 -t html-to-pdf .
  2. Run the container:

    docker run --platform linux/amd64 -p 8087:8087 html-to-pdf
  3. Run with environment variables:

    docker run --platform linux/amd64 -p 8087:8087 \
      -e LOG_LEVEL=debug \
      -e LOG_FORMAT=pretty \
      html-to-pdf

Security notes

This service renders HTML by design, which includes fetching and rendering any resources referenced by the document you submit. Consider the implications of this when configuring and deploying the service.

Isolation

Anyone who can call this service can make it fetch resources on their behalf, so it performs server-side requests by design. Isolate the service from other systems, and make sure it cannot reach internal endpoints.
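
Network-level isolation is the primary control here. As an extra layer, a deployment could also refuse to fetch from addresses in private or reserved ranges. The sketch below is a hypothetical check that is not part of this service: it covers IPv4 literals only, `parseIPv4` and `isPublicIPv4` are assumed names, and hostnames would need to be resolved before being checked.

```typescript
// Parses a dotted-quad IPv4 address into four octets; returns null if it is not valid.
function parseIPv4(address: string): number[] | null {
  const parts = address.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((part) => (/^\d{1,3}$/.test(part) ? Number(part) : NaN));
  return octets.every((octet) => octet >= 0 && octet <= 255) ? octets : null;
}

// Returns true only for valid IPv4 addresses outside the reserved ranges listed below.
// Anything that cannot be parsed is rejected (fail closed).
function isPublicIPv4(address: string): boolean {
  const octets = parseIPv4(address);
  if (octets === null) return false;
  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;

  if (a === 0) return false; // 0.0.0.0/8, "this network"
  if (a === 10) return false; // 10.0.0.0/8, private
  if (a === 127) return false; // 127.0.0.0/8, loopback
  if (a === 100 && b >= 64 && b <= 127) return false; // 100.64.0.0/10, shared address space
  if (a === 169 && b === 254) return false; // 169.254.0.0/16, link-local
  if (a === 172 && b >= 16 && b <= 31) return false; // 172.16.0.0/12, private
  if (a === 192 && b === 168) return false; // 192.168.0.0/16, private
  return true;
}
```

A check like this does not replace isolation: it does not cover IPv6, redirects, or a hostname that resolves differently between the check and the actual request.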

CSP

A Content Security Policy (CSP) is applied to the rendered document. The default policy is permissive, and you will almost certainly want to tighten it via CSP_POLICY. Consider removing 'unsafe-inline' for scripts and styles and using hash sources for any inline scripts or script attributes.
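
A hash source is the base64-encoded digest of the exact inline script text, prefixed with the algorithm name. The sketch below shows one way to compute such sources with Node.js `crypto`. The helper names `cspHashSource` and `scriptSrcDirective` are hypothetical, and it assumes `CSP_POLICY` accepts a standard policy string, which this README does not spell out.

```typescript
import { createHash } from "node:crypto";

// Computes a CSP hash source ('sha256-<base64>') for the exact text of an inline script.
// The hash must cover the script content byte for byte, including whitespace.
function cspHashSource(inlineScript: string): string {
  const digest = createHash("sha256").update(inlineScript, "utf8").digest("base64");
  return `'sha256-${digest}'`;
}

// Builds a script-src directive that allows only the given inline scripts.
function scriptSrcDirective(inlineScripts: string[]): string {
  return `script-src ${inlineScripts.map(cspHashSource).join(" ")}`;
}
```

The resulting directive can then be combined with whatever other directives your documents need before being set as the policy.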

Contributing

Source code contributions are most welcome. Please open a PR and make sure the linter and all tests pass.

License

This project is licensed under the MIT License. See the LICENSE file for details.

We are hiring

Causaly is building the world's largest biomedical knowledge platform, using technologies such as TypeScript, React and Node.js. Find out more about our openings at https://jobs.ashbyhq.com/causaly.
