Protobuf is a neat format for serializing and deserializing objects. Like JSON, but with several key tradeoffs. Here's how you can use protobufs in Node.js, including how to use them with Express.
Hello, Protobuf
The biggest difference between protobufs and unstructured formats like JSON and XML is that you must define data types with protobufs. The most common way to define the structure of your data is a .proto file. Below is a sample .proto file that defines a User as an object with two properties: a string name and an int32 age.
syntax = "proto3";
package userpackage;

message User {
  string name = 1;
  int32 age = 2;
}
There are several npm packages for working with protobufs. The most downloaded one is protobufjs, so that's the one we'll use for this tutorial. First, import protobufjs and use the load() function to tell protobufjs to load the user.proto file:
const protobuf = require('protobufjs');
run().catch(err => console.log(err));
async function run() {
  const root = await protobuf.load('user.proto');
  const User = root.lookupType('userpackage.User');
}
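As an aside, if you'd rather not keep the schema in a separate file on disk, protobufjs can also build the same type from an inline string using its parse() function. Here's a minimal sketch:
const protobuf = require('protobufjs');

// parse() reads the schema from a string instead of a .proto file on disk
const { root } = protobuf.parse(`
  syntax = "proto3";
  package userpackage;
  message User {
    string name = 1;
    int32 age = 2;
  }
`);
const User = root.lookupType('userpackage.User');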
Now that protobufjs knows about the User type, you can use User to validate and serialize objects. There's a User.verify() function that validates an arbitrary object to ensure that name and age are the correct types:
console.log(User.verify({ name: 'test', age: 2 })); // null
console.log(User.verify({ propertyDoesntExist: 'test' })); // null
console.log(User.verify({ age: 'not a number' })); // "age: integer expected"
Note that protobuf properties aren't required by default. In the above examples, verify() doesn't report an error if name isn't specified.
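For example, verify() treats a missing name as valid, and decode() falls back to the proto3 defaults (empty string for strings, 0 for numbers) when a field was never set:
console.log(User.verify({ age: 29 })); // null, even though `name` is missing

// Decoding a message with no fields set gives you the proto3 defaults
const empty = User.decode(User.encode({}).finish());
console.log(empty.name); // ''
console.log(empty.age); // 0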
Encoding and Decoding
While protobufs do perform basic data validation, they're mostly used to serialize and deserialize objects. To serialize an object, call User.encode(obj).finish(), which returns a buffer that contains the protobuf representation of the object.
const buf = User.encode({ name: 'Bill', age: 30 }).finish();
console.log(Buffer.isBuffer(buf)); // true
console.log(buf.toString('utf8')); // Gnarly string that contains "Bill"
console.log(buf.toString('hex')); // 0a0442696c6c101e
To deserialize the object, use the decode() function:
const buf = User.encode({ name: 'Bill', age: 30 }).finish();
const obj = User.decode(buf);
console.log(obj); // User { name: 'Bill', age: 30 }
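One thing to note: decode() returns an instance of a protobufjs message class, not a plain JavaScript object. If you want a plain object, for example to pass to a library that does strict equality checks, you can convert it with the type's toObject() function:
// `decode()` returns a message class instance; `toObject()` converts it
// into a plain JavaScript object
const plain = User.toObject(User.decode(buf));
console.log(plain); // { name: 'Bill', age: 30 }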
Protobufs are usually used for transferring data over the network as an alternative to JSON or XML. One reason is that a serialized protobuf payload is much smaller, because it doesn't need to include property names. For example, below is the difference in size between the protobuf representation and the JSON representation of a small object:
const asProtobuf = User.encode({ name: 'Joe', age: 27 }).finish();
const asJSON = JSON.stringify({ name: 'Joe', age: 27 });
asProtobuf.length; // 7
asJSON.length; // 23, 3x bigger!
Because protobuf knows about the keys in the object ahead of time from the .proto file, protobufs are much more space efficient than JSON. You can trim down your JSON key name lengths using patterns like Mongoose aliases, but you'll never get quite as small as protobufs, because protobufs don't serialize key names at all!
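You can see this in the raw bytes. The hex output below contains the values 'Joe' and 27, but no trace of the strings 'name' or 'age', just the one-byte field tags from the .proto file:
const buf = User.encode({ name: 'Joe', age: 27 }).finish();
// 0a = field 1 (name), 03 = length, 4a6f65 = 'Joe',
// 10 = field 2 (age), 1b = 27
console.log(buf.toString('hex')); // 0a034a6f65101b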
Sending Protobufs in HTTP
Serializing and deserializing data isn't very useful unless you're storing data in files or transferring data over the network. So let's take a look at how you can use protobufs to serialize HTTP request and response bodies with Express. Below is a standalone example of sending protobufs to an Express server using Axios:
const axios = require('axios');
const express = require('express');
const protobuf = require('protobufjs');
const app = express();
run().catch(err => console.log(err));
async function run() {
  const root = await protobuf.load('user.proto');

  const doc = { name: 'Bill', age: 30 };
  const User = root.lookupType('userpackage.User');

  app.get('/user', function(req, res) {
    res.send(User.encode(doc).finish());
  });
  app.post('/user', express.raw({ type: '*/*' }), function(req, res) {
    // `express.raw()` parses the request body into a Buffer, which is what
    // `decode()` expects. Parsing binary data as a utf8 string can corrupt
    // bytes outside the ASCII range.
    const user = User.decode(req.body);
    Object.assign(doc, user);
    res.end();
  });

  // Wait until the server is actually listening before making requests
  await new Promise(resolve => app.listen(3000, resolve));

  // Ask axios for raw bytes rather than a utf8 string
  const opts = { responseType: 'arraybuffer' };

  let data = await axios.get('http://localhost:3000/user', opts).
    then(res => res.data);
  // "Before POST User { name: 'Bill', age: 30 }"
  console.log('Before POST', User.decode(Buffer.from(data)));

  const postBody = User.encode({ name: 'Joe', age: 27 }).finish();
  // Send an explicit binary content type so `express.raw()` picks the body up
  await axios.post('http://localhost:3000/user', postBody, {
    headers: { 'Content-Type': 'application/octet-stream' }
  });

  data = await axios.get('http://localhost:3000/user', opts).
    then(res => res.data);
  // "After POST User { name: 'Joe', age: 27 }"
  console.log('After POST', User.decode(Buffer.from(data)));
}
Moving On
Protobufs are a neat alternative for transferring data over the wire, especially when combined with gRPC. Protobufs offer some minimal type checking, because decode() throws an error if a property has an unexpected type, and protobufs are smaller than their JSON equivalents. However, the downsides are that protobufs are not human readable (they're a binary format, not a text format like JSON or XML), and you can't decode the data without the .proto file. Protobufs are harder to debug and harder to work with, but can be useful if you're extremely concerned about minimizing the amount of data you send over the network.
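For example, below is a sketch of decode() throwing when given bytes that don't line up with the User schema. The exact error message varies between protobufjs versions:
const badBuf = Buffer.from('this is not a protobuf', 'utf8');
try {
  User.decode(badBuf);
} catch (err) {
  // protobufjs throws because the bytes don't match the schema, for
  // example an "invalid wire type" or "index out of range" error
  console.log(err.message);
}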