MongoDB 3.2.0 was released earlier this week. It includes some exciting features: WiredTiger storage engine by default, huge performance improvements to replica set failover, and config servers as replica sets for sharding. These improvements get all the press, but I'd like to shine some light on a couple other features that I can't wait to start using in my Node apps. The first new feature is the new bitwise query operators: $bitsAllClear, $bitsAllSet, $bitsAnyClear, and $bitsAnySet.

Why Bitwise Query Operators?

If you're like I was when I first heard of this feature, you might be thinking, "why on Earth would anybody query based on bits?" My favorite use case for bitwise query operators is bit maps for storing whether your user is available during a given time. For instance, suppose you have two collections, users and events, that look like the below mongoose schemas.

var userSchema = new Schema({
  name: String
});

var eventSchema = new Schema({
  from: Date,
  to: Date,
  attending: [{ type: ObjectId, ref: 'User' }]
});

How would you compute whether a given user has an event at a given time? That's not that bad, you'd just execute a separate query. How about computing all the users that are available at a given time? Given the current data structure, that query is going to be very expensive.

In MongoDB, you want to store what you query for. Thus, if you want to query for users based on whether they have an event at a given time, you should store that data in the user collection. You can store this data using 365 * 24 = 8760 bits: the x-th bit is on if the user is attending an event during the x-th hour of the year. The modified user schema would look like what you see below.

var userSchema = new Schema({
  name: String,
  booked: {
    type: Buffer,
    validate: function(v) {
      return v.length === 365 * 24 / 8; // buffer length in bytes
    }
  }
});

This schema design has some nice properties. The booked buffer is fixed at 1095 bytes, so it doesn't add an exorbitant amount of network overhead. Computing whether a given user is available on a given date is easy: a user is available next January 2 from 2am to 3am if the 26th bit in the buffer is 1. You can even compute whether a user is booked at a given time in the browser, without querying MongoDB. You can also query for all users that are available next January 2 from 2am to 3am using the new $bitsAllClear operator. The only downside is that you're limited to querying a year in advance, which should be sufficient for most use cases. Let's take a look at how that works in the shell.

Bitwise Query Operators in the MongoDB Shell

Let's create a user with a fake booked buffer. This user will be booked for the first 8 hours of the year and free after that. To create this user, run this command in the shell. The HexData function is just a handy way to create a buffer from a hex string in the MongoDB shell.

db.users.insert({ name: 'Val', booked: HexData(0, 'FF0000') });

You can now run a query to find all users that are available at a given time.

// Available at January 1, 12:00am. Returns nothing.
db.users.find({ booked: { $bitsAllClear: [0] } });
// Available at January 1, 3:00am. Returns nothing.
db.users.find({ booked: { $bitsAllClear: [3] } });
// Available at January 1, 8:00am. Returns the user
db.users.find({ booked: { $bitsAllClear: [8] } });
// Available at January 1, 7:00am **and** January 1, 8:00am. Returns nothing.
db.users.find({ booked: { $bitsAllClear: [7, 8] } });
// Available at January 1, 7:00am **or** January 1, 8:00am. Returns the user.
db.users.find({ booked: { $bitsAnyClear: [7, 8] } });

You could use the $where operator to run a similar query in MongoDB 3.0. However, $where can't take advantage of MongoDB indexes, so its performance is limited. The bitwise query operators, on the other hand, can use indexes.

// Create an index on `booked`
db.users.ensureIndex({ booked: 1 });
// And now `.explain()` returns an index scan rather than a
// collection scan.
db.users.find({ booked: { $bitsAllSet: [7] } }).hint({ booked: 1 }).explain()
/** Outputs something like below:
 *  {
 *    "queryPlanner" : {
 *        //...
 *      "winningPlan" : {
 *            // ...
 *            "inputStage" : {
 *                "stage" : "IXSCAN",
 *                // ...
 *            }
 *        }
 *    }
 *  }
*/

Bitwise Query Operators in Node.js

In order to use bitwise query operators, you should use mongodb driver >= 2.1.0 or mongoose >= 4.3.1. Here's how executing a basic query for users that are available on January 1 at 8am looks.

var userSchema = new mongoose.Schema({
  name: String,
  booked: {
    type: Buffer,
    validate: function(v) {
      return v.length === 365 * 24 / 8; // buffer length in bytes
    }
  }
});

var User = mongoose.model('User', userSchema);

User.find({ booked: { $bitsAllClear: [8] } }, function(error, docs) {
  /** use docs */
});

Moving On

That's it for bitwise query operators. These operators may seem like a niche feature at first glance, but the ability to query based on bit maps provides you another mechanism to avoid expensive joins. In the next post, I'll discuss the new $lookup aggregation operator, AKA MongoDB joins.

Found a typo or error? Open up a pull request! This post is available as markdown on Github
comments powered by Disqus