The pdf-lib npm module is a great tool for creating and editing PDFs with Node.js. Puppeteer is a great tool for generating PDFs from HTML, but unfortunately browser support for print layouts in CSS is not very good in my experience. The pdf-lib module gives you very fine-grained control over PDFs, and is great for tasks like merging PDFs, adding page numbers and watermarks, splitting PDFs, and basically anything else you might use the ILovePDF API for.
Getting Started
Let's use pdf-lib to create a simple PDF document. The PDF document will have 1 page with the Mastering JS logo in the middle.
const { PDFDocument } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  // Create a new document and add a new page
  const doc = await PDFDocument.create();
  const page = doc.addPage();

  // Load the image and store it as a Node.js buffer in memory
  let img = fs.readFileSync('./logo.png');
  img = await doc.embedPng(img);

  // Draw the image on the center of the page
  const { width, height } = img.scale(1);
  page.drawImage(img, {
    x: page.getWidth() / 2 - width / 2,
    y: page.getHeight() / 2 - height / 2
  });

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}
Running the above script generates the below PDF. Working with pdf-lib is pretty easy, but there are a few gotchas: note that PDFDocument#embedPng() and PDFDocument#save() return promises, so you need to use await.
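One more thing to watch for: scale(1) draws the image at its native size, so a logo larger than the page would overflow. Below is a minimal sketch of shrinking and centering an image, assuming a hypothetical helper named fitAndCenter() that is not part of pdf-lib. It only does arithmetic on plain numbers, and it relies on the PDF convention that the origin is the bottom-left corner of the page.

```javascript
// Hypothetical helper (not part of pdf-lib): compute drawImage() options that
// shrink an image to fit inside the page, minus a margin, and center it.
// PDF coordinates start at the bottom-left corner of the page.
function fitAndCenter(pageWidth, pageHeight, imgWidth, imgHeight, margin = 0) {
  const maxWidth = pageWidth - 2 * margin;
  const maxHeight = pageHeight - 2 * margin;

  // Only ever scale down, never up
  const factor = Math.min(1, maxWidth / imgWidth, maxHeight / imgHeight);
  const width = imgWidth * factor;
  const height = imgHeight * factor;

  return {
    x: (pageWidth - width) / 2,
    y: (pageHeight - height) / 2,
    width,
    height
  };
}

// Example: a 1000x500 image on a 600x800 page with a 50 point margin
console.log(fitAndCenter(600, 800, 1000, 500, 50));
// { x: 50, y: 275, width: 500, height: 250 }
```

In the script above, you could then pass the result straight to drawImage(), for example page.drawImage(img, fitAndCenter(page.getWidth(), page.getHeight(), img.width, img.height, 50)).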
Merging 2 PDFs
The killer feature for pdf-lib is that you can modify existing PDFs, not just create new ones. For example, suppose you have two PDFs: one containing the cover of an eBook, and one containing the eBook content. How can you merge the two? I used the ILovePDF API for my last eBook, but pdf-lib makes this task easy in Node.js.
Here are two PDF files: cover.pdf and page-30-31.pdf. The below script uses pdf-lib to combine the two into a single test.pdf file.
const { PDFDocument } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  // Load cover and content pdfs
  const cover = await PDFDocument.load(fs.readFileSync('./cover.pdf'));
  const content = await PDFDocument.load(fs.readFileSync('./page-30-31.pdf'));

  // Create a new document
  const doc = await PDFDocument.create();

  // Add the cover to the new doc
  const [coverPage] = await doc.copyPages(cover, [0]);
  doc.addPage(coverPage);

  // Add individual content pages
  const contentPages = await doc.copyPages(content, content.getPageIndices());
  for (const page of contentPages) {
    doc.addPage(page);
  }

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}
Below is what the merged PDF looks like.
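The same copyPages() call also works in the other direction, for splitting a PDF. Below is a rough sketch, assuming two hypothetical helpers that are not part of pdf-lib: chunkIndices() groups page indices into chunks, and splitPdf() copies each chunk into its own new document.

```javascript
// Hypothetical helper: group page indices 0..pageCount-1 into arrays of at
// most `chunkSize` indices each.
function chunkIndices(pageCount, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error('chunkSize must be a positive integer');
  }
  const chunks = [];
  for (let start = 0; start < pageCount; start += chunkSize) {
    const end = Math.min(start + chunkSize, pageCount);
    chunks.push(Array.from({ length: end - start }, (_, i) => start + i));
  }
  return chunks;
}

// Hypothetical helper: split a PDF (a Buffer or Uint8Array) into several PDFs
// of at most `chunkSize` pages each. Resolves to an array of Uint8Arrays.
async function splitPdf(bytes, chunkSize) {
  // Required here so `chunkIndices()` can be used without pdf-lib installed
  const { PDFDocument } = require('pdf-lib');
  const source = await PDFDocument.load(bytes);

  const parts = [];
  for (const indices of chunkIndices(source.getPageCount(), chunkSize)) {
    const part = await PDFDocument.create();
    const pages = await part.copyPages(source, indices);
    for (const page of pages) {
      part.addPage(page);
    }
    parts.push(await part.save());
  }
  return parts;
}

console.log(chunkIndices(5, 2)); // [ [ 0, 1 ], [ 2, 3 ], [ 4 ] ]
```

Each entry that splitPdf() resolves to can be written out with fs.writeFileSync(), the same way the merge script writes test.pdf.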
Adding Page Numbers
One of the biggest pain points of generating PDFs from HTML with Puppeteer is adding page numbers. It seems simple, but CSS print layouts still don't quite work for that case. Take a look at the time I wrote a for loop with hard-coded pixel offsets to get page numbers to show up correctly.
For example, here's a PDF containing the first 4 pages of Mastering Async/Await without the page numbers: ./content.pdf. Below is a script that adds page numbers to each page in the PDF.
const { PDFDocument, StandardFonts, rgb } = require('pdf-lib');
const fs = require('fs');

run().catch(err => console.log(err));

async function run() {
  const content = await PDFDocument.load(fs.readFileSync('./content.pdf'));

  // Add a font to the doc
  const helveticaFont = await content.embedFont(StandardFonts.Helvetica);

  // Draw a number at the bottom of each page.
  // Note that the bottom of the page is `y = 0`, not the top
  // `getPages()` is synchronous, so no `await` is needed here
  const pages = content.getPages();
  for (const [i, page] of pages.entries()) {
    page.drawText(`${i + 1}`, {
      x: page.getWidth() / 2,
      y: 10,
      size: 15,
      font: helveticaFont,
      color: rgb(0, 0, 0)
    });
  }

  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await content.save());
}
Below is what the page numbers added by the script look like.
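One detail: the script above puts the left edge of the number at the horizontal midpoint of the page, so the number ends up slightly right of center. Below is a minimal sketch that measures the text first using PDFFont#widthOfTextAtSize(). The helper name drawCenteredFooter() is hypothetical and not part of pdf-lib; it expects a pdf-lib page and an embedded font as arguments.

```javascript
// Hypothetical helper (not part of pdf-lib): draw `label` horizontally
// centered near the bottom of `page`. `page` is a pdf-lib PDFPage and `font`
// is a font returned by `embedFont()`.
function drawCenteredFooter(page, font, label, size = 15) {
  // Measure the rendered width so the midpoint of the text, not its left
  // edge, lands on the midpoint of the page
  const textWidth = font.widthOfTextAtSize(label, size);

  const options = {
    x: (page.getWidth() - textWidth) / 2,
    y: 10, // `y = 0` is the bottom of the page
    size,
    font
  };
  page.drawText(label, options);

  // Return the options so callers can inspect where the text was drawn
  return options;
}
```

In the loop above, you could replace the page.drawText() call with drawCenteredFooter(page, helveticaFont, `${i + 1}`). The same helper would also work for labels like "3 / 10", since the width is measured per label.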
Moving On
The Node.js ecosystem is filled with excellent libraries for solving almost any problem you can think of. The pdf-lib module lets you modify PDFs, sharp lets you handle almost anything with images, pkg bundles Node projects into standalone executables, and there are many more. Before you start looking for an online API to solve an issue you're seeing, try searching npm. You might find a better solution.