The pdf-lib npm module is a great tool for creating and editting PDFs with Node.js. Puppeteer is a great tool for generating PDFs from HTML, but unfortunately browser support for print layouts in CSS is not very good in my experience. The pdf-lib module gives you very fine grained control over PDFs, and is great for tasks like merging PDFs, adding page numbers and watermarks, splitting PDFs, and basically anything else you might use the ILovePDF API for.

Getting Started

Let's use pdf-lib to create a simple PDF document. The PDF document will have 1 page with the Mastering JS logo in the middle.

const { PDFDocument } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  // Create a new document and add a new page
  const doc = await PDFDocument.create();
  const page = doc.addPage();

  // Load the image and store it as a Node.js buffer in memory
  let img = fs.readFileSync('./logo.png');
  img = await doc.embedPng(img);

  // Draw the image on the center of the page
  const { width, height } = img.scale(1);
  page.drawImage(img, {
    x: page.getWidth() / 2 - width / 2,
    y: page.getHeight() / 2 - height / 2
  });

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}

Running the above script generates the below PDF. Working with pdf-lib is pretty easy, there's just a few gotchas: note that PDFDocument#embedPng() and PDFDocument#save() return promises, so you need to use await.

Merging 2 PDFs

The killer feature for pdf-lib is that you can modify existing PDFs, not just create new ones. For example, suppose you have two PDFs: one containing the cover of an eBook, and one containing the eBook content. How can you merge the two? I used the ILovePDF API for my last eBook, but pdf-lib makes this task easy in Node.js.

Here's two PDF files: cover.pdf and page-30-31.pdf. The below script uses pdf-lib to combine the two into a single test.pdf file.

const { PDFDocument } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  // Load cover and content pdfs
  const cover = await PDFDocument.load(fs.readFileSync('./cover.pdf'));
  const content = await PDFDocument.load(fs.readFileSync('./page-30-31.pdf'));

  // Create a new document
  const doc = await PDFDocument.create();

  // Add the cover to the new doc
  const [coverPage] = await doc.copyPages(cover, [0]);
  doc.addPage(coverPage);

  // Add individual content pages
  const contentPages = await doc.copyPages(content, content.getPageIndices());
  for (const page of contentPages) {
    doc.addPage(page);
  }

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}

Below is what the merged PDF looks like.

Adding Page Numbers

One of the biggest pain points of generating PDFs from HTML with Puppeteer is how painful it is to add page numbers. Seems simple, but CSS print layouts still don't quite work for that case. Take a look at the time I wrote a for loop with hard-coded pixel offsets to get page numbers to show up correctly.

For example, here's a PDF containing first 4 pages of Mastering Async/Await without the page numbers: ./content.pdf. Below is a script that adds page numbers to each page in the PDF.

const { PDFDocument, StandardFonts, rgb } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  const content = await PDFDocument.load(fs.readFileSync('./content.pdf'));

  // Add a font to the doc
  const helveticaFont = await content.embedFont(StandardFonts.Helvetica);

  // Draw a number at the bottom of each page.
  // Note that the bottom of the page is `y = 0`, not the top
  const pages = await content.getPages();
  for (const [i, page] of Object.entries(pages)) {
    page.drawText(`${+i + 1}`, {
      x: page.getWidth() / 2,
      y: 10,
      size: 15,
      font: helveticaFont,
      color: rgb(0, 0, 0)
    });
  }

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await content.save());
}

Below is what the page numbers the script added look like.

Moving On

The Node.js ecosystem is filled with excellent libraries for solving almost any problem you can think of. The pdf-lib module lets you modify PDFs, sharp lets you handle almost anything with images, pkg bundles Node projects into standalone executables, and so many more. Before you start looking for an online API to solve an issue you're seeing, try searching npm, you might find a better solution.

Found a typo or error? Open up a pull request! This post is available as markdown on Github
comments powered by Disqus