Getting data into and out of MongoDB is a pain. Although MongoDB 3.0 and 3.2
introduced a lot of great new features, changes in MongoDB's authentication
have introduced a lot of nasty quirks to the shell's copyDatabase() function.
Similarly, the mongodump/mongorestore and mongoimport/mongoexport binaries
that come with MongoDB are only useful for a small handful of use cases.
Mongoimport and mongoexport only let you import/export a single collection,
and mongodump/mongorestore produce binary data that isn't human readable.
Neither is an adequate solution to "I want to put some
sample data into my GitHub repo so I don't have to generate data by
pointing and clicking in my web app."
One common pattern I see for this task is "seed scripts," complex scripts that generate a data set and put it into MongoDB. This pattern seems to be particularly popular among Fullstack Academy grads. However, seed scripts are limited: they build up data imperatively rather than declaratively, and building multiple seed scripts for different data sets requires a lot of discipline. Seed scripts are only marginally better than writing shell script wrappers around mongoimport/export, which is why I wrote dookie, a tool that allows you to pre-process JSON or YAML data before inserting it into MongoDB.
How Dookie Works
Dookie has two fundamental operations, 'push' and 'pull'. In other words, you can push data into MongoDB and pull data out of MongoDB. Let's say you have the following JSON object:
{
  "people": [
    {
      "_id": {
        "$oid": "561d87b8b260cf35147998ca"
      },
      "name": "Axl Rose"
    },
    {
      "_id": {
        "$oid": "561d88f5b260cf35147998cb"
      },
      "name": "Slash"
    }
  ],
  "bands": [
    {
      "_id": "Guns N' Roses",
      "members": [
        "Axl Rose",
        "Slash"
      ]
    }
  ]
}
This object represents the state of a MongoDB database. The database has 2
collections: "people" and "bands". The "people" collection has two documents,
and the "bands" collection has one. In the MongoDB shell, this database
looks like what you see below. Note that the objects with $oid keys in the
JSON representation become MongoDB ObjectIds - dookie supports
MongoDB extended JSON syntax.
> show collections
bands
people
> db.bands.find().pretty()
{ "_id" : "Guns N' Roses", "members" : [ "Axl Rose", "Slash" ] }
> db.people.find().pretty()
{ "_id" : ObjectId("561d87b8b260cf35147998ca"), "name" : "Axl Rose" }
{ "_id" : ObjectId("561d88f5b260cf35147998cb"), "name" : "Slash" }
>
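The mapping from $oid keys to ObjectIds can be pictured with a small sketch. This is not dookie's own code: reviveOids and makeObjectId are hypothetical names used only for illustration, and the factory is passed in as an argument so the example runs without a MongoDB driver installed.

```javascript
// Hypothetical helper: walk a parsed data set and replace every
// { "$oid": "<24 hex chars>" } wrapper with whatever makeObjectId returns.
function reviveOids(value, makeObjectId) {
  if (Array.isArray(value)) {
    return value.map((item) => reviveOids(item, makeObjectId));
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value);
    if (keys.length === 1 && keys[0] === '$oid') {
      if (!/^[0-9a-fA-F]{24}$/.test(value.$oid)) {
        throw new Error('Invalid $oid: ' + value.$oid);
      }
      return makeObjectId(value.$oid);
    }
    const out = {};
    for (const key of keys) {
      out[key] = reviveOids(value[key], makeObjectId);
    }
    return out;
  }
  return value; // strings, numbers, booleans, and null pass through
}

// Stand-in factory so the sketch has no dependencies. With a real driver
// you would pass something like (hex) => new ObjectId(hex) instead.
const tagged = (hex) => 'ObjectId("' + hex + '")';

const data = {
  people: [{ _id: { $oid: '561d87b8b260cf35147998ca' }, name: 'Axl Rose' }],
  bands: [{ _id: "Guns N' Roses", members: ['Axl Rose', 'Slash'] }]
};

const revived = reviveOids(data, tagged);
console.log(revived.people[0]._id);
```

Plain string _id values, like the one in "bands", are left untouched; only the $oid wrapper objects are converted.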
Dookie's "pull" operation lets you export this data from MongoDB into
a JSON file. The below command exports the database named "test" into
the file export.json.
$ ./node_modules/.bin/dookie pull --db test --file ./export.json
Writing data from mongodb://localhost:27017/test to ./export.json
$ head export.json
{
  "people": [
    {
      "_id": {
        "$oid": "561d87b8b260cf35147998ca"
      },
      "name": "Axl Rose"
    },
    {
      "_id": {
You can also use dookie from Node.js. Dookie has a pull() function that
takes in a MongoDB connection string and returns a promise. The below
script writes the contents of the "test" database to export.json.
const dookie = require('dookie');
const fs = require('fs');

dookie.pull('mongodb://localhost:27017/test').then(function(res) {
  fs.writeFileSync('./export.json', res);
});
The "push" operation does the opposite. Once you have export.json, you
can then replace the test database with the contents of export.json.
$ mongo
MongoDB shell version: 3.2.0
connecting to: test
> db.dropDatabase();
{ "dropped" : "test", "ok" : 1 }
> ^C
bye
$ dookie push --db test --file ./export.json
Writing data from ./export.json to test
Success!
$ mongo
MongoDB shell version: 3.2.0
connecting to: test
Server has startup warnings:
> show collections
bands
people
> db.bands.find().pretty()
{ "_id" : "Guns N' Roses", "members" : [ "Axl Rose", "Slash" ] }
>
You can also do the same thing from Node.js. This is very handy for mocha
tests. These days my API integration tests almost always have dookie.push()
in a beforeEach() hook.
const dookie = require('dookie');
const fs = require('fs');

const data = JSON.parse(fs.readFileSync('./export.json', 'utf8'));
dookie.push('mongodb://localhost:27017/test', data).then(function() {
  console.log('done!');
});
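As a rough sketch of the beforeEach() pattern, the helper below builds a hook function around a push-style call. The name makeSeedHook is hypothetical, and the push function is passed in as an argument so the snippet runs on its own; in a real test suite you would hand it dookie.push, which takes a connection string and a data object and returns a promise, as in the script above.

```javascript
const fs = require('fs');

// Hypothetical helper: build a function suitable for mocha's beforeEach().
// `push` is expected to behave like dookie.push(uri, data) and return a promise.
function makeSeedHook(push, uri, fixturePath) {
  return function seed() {
    // Re-read the fixture on every run so each test starts from the same data.
    const data = JSON.parse(fs.readFileSync(fixturePath, 'utf8'));
    // Returning the promise lets mocha wait until the push has finished.
    return push(uri, data);
  };
}

// In a test file (assuming mocha and dookie are installed):
//
//   const dookie = require('dookie');
//
//   describe('band API', function() {
//     beforeEach(makeSeedHook(dookie.push,
//       'mongodb://localhost:27017/test', './export.json'));
//
//     it('lists bands', function() { /* assertions go here */ });
//   });
```

Because the hook returns the promise, mocha holds each test until the data set is in place, so every test sees the same starting state.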
The "push" operation also supports YAML, so if you have
a file named export.yml like you see below:
people:
  - _id:
      # MongoDB extended JSON syntax
      $oid: 561d87b8b260cf35147998ca
    name: Axl Rose
  - _id:
      $oid: 561d88f5b260cf35147998cb
    name: Slash
bands:
  - _id: Guns N' Roses
    members:
      - Axl Rose
      - Slash
You can import it using dookie push. In my experience, YAML is a better
language for dookie data sets than JSON, because it's easier to read and
supports comments.
Pre-processors For Dookie Push
So far you've used dookie to import and export data as-is. However, dookie's push operation has some powerful syntactic sugar inspired by the CSS pre-processor stylus. I occasionally describe dookie as "stylus for MongoDB data sets."
The first helper you'll learn about is $extend. Let's say your web app
allows you to search through Arnold Schwarzenegger movies, and you want to
create a data set to test your search API. Every movie has some things in
common, and you don't want to copy/paste these details everywhere. The
$extend syntax lets your documents inherit properties from "variables."
In dookie, a "variable" is a top-level field that starts with "$" (because
MongoDB collection names can't start with "$").
$movie:
  deleted: false
  leadActor: Arnold Schwarzenegger
movies:
  - $extend: $movie
    name: Jingle All The Way
    supportingActors:
      - Jake Lloyd
  - $extend: $movie
    name: "Terminator 2: Judgment Day"
    supportingActors:
      - Linda Hamilton
      - Robert Patrick
If you run dookie push on the above YAML file, you'll get a database with
1 collection that contains 2 documents, both with the correct deleted and
leadActor fields.
$ dookie push --db test --file ./test.yml
Writing data from ./test.yml to test
Success!
$ mongo
MongoDB shell version: 3.2.0
connecting to: test
> show collections
movies
> db.movies.find().pretty()
{
    "_id" : ObjectId("56bcbeb496339ea4340a71b3"),
    "name" : "Jingle All The Way",
    "supportingActors" : [
        "Jake Lloyd"
    ],
    "deleted" : false,
    "leadActor" : "Arnold Schwarzenegger"
}
{
    "_id" : ObjectId("56bcbeb496339ea4340a71b4"),
    "name" : "Terminator 2: Judgment Day",
    "supportingActors" : [
        "Linda Hamilton",
        "Robert Patrick"
    ],
    "deleted" : false,
    "leadActor" : "Arnold Schwarzenegger"
}
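The inheritance described above can be sketched in a few lines of plain JavaScript. This only illustrates the behavior shown in the output, under the assumption that a document's own fields take precedence over inherited ones; it is not dookie's actual implementation, and applyExtend is a hypothetical name.

```javascript
// Illustrative only: resolve $extend references in a parsed data set.
// Top-level keys starting with "$" are treated as variables; every other
// key is treated as a collection (an array of documents).
function applyExtend(dataSet) {
  const collections = {};
  for (const [key, docs] of Object.entries(dataSet)) {
    if (key.startsWith('$')) continue; // variables are not inserted
    collections[key] = docs.map((doc) => {
      const { $extend: variableName, ...own } = doc;
      if (variableName === undefined) return own;
      const base = dataSet[variableName];
      if (base === undefined) {
        throw new Error('Unknown variable: ' + variableName);
      }
      // Spread `own` twice: the first keeps the document's own keys in
      // front, the second makes its values win over inherited ones.
      return { ...own, ...base, ...own };
    });
  }
  return collections;
}

const dataSet = {
  $movie: { deleted: false, leadActor: 'Arnold Schwarzenegger' },
  movies: [
    { $extend: '$movie', name: 'Jingle All The Way', supportingActors: ['Jake Lloyd'] },
    {
      $extend: '$movie',
      name: 'Terminator 2: Judgment Day',
      supportingActors: ['Linda Hamilton', 'Robert Patrick']
    }
  ]
};

const resolved = applyExtend(dataSet);
console.log(JSON.stringify(resolved, null, 2));
```

The $movie variable itself never becomes a collection; it only contributes fields to the documents that extend it.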
The $extend syntax is even more powerful when you combine it with $require.
Let's say you want to write multiple test files that leverage the above
data set. For instance, suppose you wanted to test what happens when a movie
has deleted set to true, so you wrote the below file deleted.yml.
$require: "./test.yml"
movies:
  - $extend: $movie
    deleted: true
    name: Hercules in New York
The $require syntax above pulls in the test.yml file from the $extend
example, including all of its variables. The $require syntax also allows
you to attach an additional movie. Once you push the above file, you get
a 'movies' collection with 3 documents.
$ dookie push --db test --file ./deleted.yml
Writing data from ./deleted.yml to test
Success!
$ mongo
MongoDB shell version: 3.2.0
connecting to: test
> db.movies.find().pretty()
{
    "_id" : ObjectId("56bcc074b57de15135356a0f"),
    "name" : "Jingle All The Way",
    "supportingActors" : [
        "Jake Lloyd"
    ],
    "deleted" : false,
    "leadActor" : "Arnold Schwarzenegger"
}
{
    "_id" : ObjectId("56bcc074b57de15135356a10"),
    "name" : "Terminator 2: Judgment Day",
    "supportingActors" : [
        "Linda Hamilton",
        "Robert Patrick"
    ],
    "deleted" : false,
    "leadActor" : "Arnold Schwarzenegger"
}
{
    "_id" : ObjectId("56bcc074b57de15135356a11"),
    "deleted" : true,
    "name" : "Hercules in New York",
    "leadActor" : "Arnold Schwarzenegger"
}
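One way to picture what $require does is as a merge of two parsed data sets before any $extend references are resolved. The sketch below works on in-memory objects rather than files, and mergeRequired is a hypothetical name. The output above confirms that documents from both files end up in the collection; letting a later variable definition replace an earlier one is an assumption made for this illustration, not documented dookie behavior.

```javascript
// Illustrative only: combine a required data set with the file that requires it.
function mergeRequired(required, current) {
  const merged = {};
  for (const source of [required, current]) {
    for (const [key, value] of Object.entries(source)) {
      if (key === '$require') continue; // directive, not data
      if (key.startsWith('$')) {
        // Variables: a later definition replaces an earlier one (assumption).
        merged[key] = value;
      } else {
        // Collections: keep documents from both files, required ones first.
        merged[key] = (merged[key] || []).concat(value);
      }
    }
  }
  return merged;
}

// Parsed contents of test.yml
const testYml = {
  $movie: { deleted: false, leadActor: 'Arnold Schwarzenegger' },
  movies: [
    { $extend: '$movie', name: 'Jingle All The Way', supportingActors: ['Jake Lloyd'] },
    {
      $extend: '$movie',
      name: 'Terminator 2: Judgment Day',
      supportingActors: ['Linda Hamilton', 'Robert Patrick']
    }
  ]
};

// Parsed contents of deleted.yml
const deletedYml = {
  $require: './test.yml',
  movies: [{ $extend: '$movie', deleted: true, name: 'Hercules in New York' }]
};

const merged = mergeRequired(testYml, deletedYml);
console.log(merged.movies.map((movie) => movie.name));
```

After the merge, the $movie variable from test.yml is still available, which is why the extra movie in deleted.yml can extend it.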
Moving On
Stop using seed scripts and mongoimport/mongoexport scripts for your MongoDB and Node.js test data! Dookie lets you build and compose data sets in a declarative manner, and stand up the data sets from mocha tests or from the command line. Dookie has been the single biggest time-saver in my MongoDB REST API workflow over the last year. Check dookie out on npm and save yourself a lot of headache with MongoDB sample data.