Account Transactions

Metadata, courtesy of http://www.flickr.com/photos/68751915@N05/6629034769

MongoDB does not as of 2.6 support any notion of a transaction across a document or multiple documents. It does however guarantee atomic operations on single documents. This allows us to implement a Two-Phase commit strategy using double bookkeeping. However it’s important to note that due to only single document operations being atomic, MongoDB can only offer transaction-like semantics. It’s still possible for applications to return intermediate during the two-phase commit or rollback.

In this chapter we will use two collections to simulate a bank account system. The first collection accounts contains all the customer accounts and the second one transactions is our transaction book-keeping collection. The goal is to transfer 100 from Joe to Peter using a two-phase commit.

Two-phase Commit

First create two accounts

var col = db.getSisterDB("bank").accounts;
col.insert({name: "Joe", balance: 1000, pendingTransactions:[]});
col.insert({name: "Peter", balance: 1000, pendingTransactions:[]});

Let’s set up the initial transaction

var col = db.getSisterDB("bank").transactions;
col.insert({source: "Joe", destination: "Peter", amount: 100, state: "intital"});

Let’s first update the transaction to pending

var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "initial"});
col.update({_id: transaction._id}, {$set: {state: "pending"}});

Next apply the transaction to both accounts

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.source, pendingTransactions: {$ne: transaction._id}
  }, {
    $inc: {balance: -transaction.value}, $push: {pendingTransactions: transaction._id}
  });
col.update({
    name: transaction.source, pendingTransactions: {$ne: transaction._id}
  }, {
    $inc: {balance: transaction.value} , $push: {pendingTransactions: transaction._id}
  });

Set the transaction to committed state

var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "commited"}});

Remove the pending transactions from the accounts

var col = db.getSisterDB("bank").accounts;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});
col.update({name: transaction.destination}, {$pull: {pendingTransactions: transaction._id}});

Finally set the transaction state to done

var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "done"}});

Rollback

There are two types of errors during a two-phase commit that might force us to rollback the transaction.

  1. There is an error before applying the transaction to the accounts. To recover from this all transactions marked pending need to be retrieved and the application must retry applying the transaction to the accounts.
  2. There is an error after applying the transaction to the accounts but before marking the transaction as done. To recover from this all transactions marked committed need to be retrieved and start from removing the pending transactions.

Correcting for the two error cases will allow an application to resume a transaction a get the accounts to a consistent state.

Recovery Process

It’s helpful to have a process that will look for for any transactions left in pending or commited state. To determine the time a transaction has been sitting in an interrupted state it might be worth adding a create_at timestamp.

But for some cases you might need to undo (rollback) a transaction due to the application canceling the transaction or because it cannot be recovered (for example if the one of the accounts does not exist during the transaction).

There are two points in the two-phase commit we can rollback.

  1. If you have applied the transaction to the accounts you should not rollback. Instead create a new transaction and switch original source and destination fields.
  2. If you have created the transaction but have not yet applied it you can use the following steps.

First set the transaction state to canceling

var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "canceling"}});

Next let’s undo the transaction. Notice that a non-applied transaction means the transaction _id has not yet been removed from the pendingTransactions array.

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.source, pendingTransactions: transaction._id
  }, {
    $inc: {balance: transaction.value}, $pull: {pendingTransactions: transaction._id}
  });
col.update({
    name: transaction.destination, pendingTransactions: transaction._id
  }, {
    $inc: {balance: -transaction.value} , $pull: {pendingTransactions: transaction._id}
  });

Finally set the transaction state to canceled

var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "canceled"}});

Concurrent Transaction Application

Let’s imagine to applications A1 and A2 that both start processing the single transaction T1. Given that the transaction is still in initial then.

  1. A1 can apply T1 before A2 starts
  2. A2 will then apply T1 again because it does not appear as pending in the accounts documents.

You can avoid this by making is explicit in the transaction which application is handling it.

var col = db.getSisterDB("bank").transactions;
col.findAndModify({
    query: {state: "initial", application: {$exists: 0}}
  , update: {$set: {state: "pending", application: "A1"}}
  , new: true});

findAndModify will retrieve and update the document atomically. This guarantees that only a single application can tag a transaction as being processed by it. In this case a transaction in the initial state is marked as being processed by A1 if the application field does not exist.

If the transaction fails or needs to be rolled back, you can retrieve the pending transactions for a specific application A1.

var col = db.getSisterDB("bank").transactions;
col.transactions.find({application: "A1", state: "pending"});

Notes

Real world applications will likely be more complex requiring updating more than just the balance of the accounts. The application might need to update pending credits, pending debits as well as to assure that the account has sufficient balance to cover the transaction.

If these fields are part of the account document they can still occur within a single update ensuing an atomic update of all the fields.