MongoDB 3.2.0 was released earlier this week. It includes some exciting
features:
WiredTiger storage engine by default,
huge performance improvements to replica set failover,
and
config servers as replica sets for sharding.
These improvements get all the press, but I'd like to shine some light on
a couple other features that I can't wait to start using in my Node apps.
The first new feature is the new bitwise query operators: $bitsAllClear
,
$bitsAllSet
, $bitsAnyClear
, and $bitsAnySet
.
Why Bitwise Query Operators?
If you're like I was when I first heard of this feature, you might be thinking, "why on Earth would anybody query based on bits?" My favorite use case for bitwise query operators is bit maps for storing whether your user is available during a given time. For instance, suppose you have two collections, users and events, that look like the below mongoose schemas.
var userSchema = new Schema({
name: String
});
var eventSchema = new Schema({
from: Date,
to: Date,
attending: [{ type: ObjectId, ref: 'User' }]
});
How would you compute whether a given user has an event at a given time? That's not that bad, you'd just execute a separate query. How about computing all the users that are available at a given time? Given the current data structure, that query is going to be very expensive.
In MongoDB, you want to store what you query for. Thus, if you want to query for users based on whether they have an event at a given time, you should store that data in the user collection. You can store this data using 365 * 24 = 8760 bits: the x-th bit is on if the user is attending an event during the x-th hour of the year. The modified user schema would look like what you see below.
var userSchema = new Schema({
name: String,
booked: {
type: Buffer,
validate: function(v) {
return v.length === 365 * 24 / 8; // buffer length in bytes
}
}
});
This schema design has some nice properties. The booked
buffer is
fixed at 1095 bytes, so it doesn't add an exorbitant amount of network
overhead. Computing whether a given user is available on a given date
is easy: a user is available next January 2 from 2am to 3am if the 26th
bit in the buffer is 1. You can even compute whether a user is booked at
a given time in the browser, without
querying MongoDB. You can also query for all users that are available
next January 2 from 2am to 3am using the new $bitsAllClear
operator.
The only downside is that you're limited to querying a year in advance,
which should be sufficient for most use cases.
Let's take a look at how that works in the shell.
Bitwise Query Operators in the MongoDB Shell
Let's create a user with a fake booked
buffer. This user will be booked
for the first 8 hours of the year and free after that. To create
this user, run this command in the shell. The HexData
function is just
a handy way to create a buffer from a hex string in the MongoDB shell.
db.users.insert({ name: 'Val', booked: HexData(0, 'FF0000') });
You can now run a query to find all users that are available at a given time.
// Available at January 1, 12:00am. Returns nothing.
db.users.find({ booked: { $bitsAllClear: [0] } });
// Available at January 1, 3:00am. Returns nothing.
db.users.find({ booked: { $bitsAllClear: [3] } });
// Available at January 1, 8:00am. Returns the user
db.users.find({ booked: { $bitsAllClear: [8] } });
// Available at January 1, 7:00am **and** January 1, 8:00am. Returns nothing.
db.users.find({ booked: { $bitsAllClear: [7, 8] } });
// Available at January 1, 7:00am **or** January 1, 8:00am. Returns the user.
db.users.find({ booked: { $bitsAnyClear: [7, 8] } });
You could use the $where
operator to run a similar query in MongoDB 3.0.
However, $where
can't take advantage of MongoDB indexes, so its performance
is limited. The bitwise query operators, on the other hand, can use indexes.
// Create an index on `booked`
db.users.ensureIndex({ booked: 1 });
// And now `.explain()` returns an index scan rather than a
// collection scan.
db.users.find({ booked: { $bitsAllSet: [7] } }).hint({ booked: 1 }).explain()
/** Outputs something like below:
* {
* "queryPlanner" : {
* //...
* "winningPlan" : {
* // ...
* "inputStage" : {
* "stage" : "IXSCAN",
* // ...
* }
* }
* }
* }
*/
Bitwise Query Operators in Node.js
In order to use bitwise query operators, you should use
mongodb driver >= 2.1.0
or
mongoose >= 4.3.1
. Here's
how executing a basic query for users that are available on January 1
at 8am looks.
var userSchema = new mongoose.Schema({
name: String,
booked: {
type: Buffer,
validate: function(v) {
return v.length === 365 * 24 / 8; // buffer length in bytes
}
}
});
var User = mongoose.model('User', userSchema);
User.find({ booked: { $bitsAllClear: [8] } }, function(error, docs) {
/** use docs */
});
Moving On
That's it for bitwise query operators. These operators may seem like a
niche feature at first glance, but the ability to query based on bit maps
provides you another mechanism to avoid expensive joins. In the next
post, I'll discuss the new
$lookup
aggregation operator, AKA MongoDB joins.