Error Handling

Error handling using the .NET driver is a requirement for a fault-tolerant system. While MongoDB supports automatic replica set failover and the driver supports multiple mongos instances, this doesn’t make a program immune to errors.

Almost all errors will occur while attempting to perform an operation on the server. In other words, you will not receive an error when constructing a MongoClient, getting a database, or getting a collection. This is because we are connecting to servers in the background and continually trying to reconnect when a problem occurs. Only when you attempt to perform an operation do those errors become apparent.
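As a minimal sketch of this behavior, the following helper builds a client, a database, and a collection without any error, and only observes a failure once it runs a ping command. The class and method names, the "test" and "people" names, and the shortened server selection timeout are illustrative assumptions.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class DeferredErrorExample
{
    // Returns true if a ping succeeds, false if no suitable server could be selected in time.
    public static async Task<bool> CanReachServerAsync(string connectionString, TimeSpan serverSelectionTimeout)
    {
        var settings = MongoClientSettings.FromUrl(new MongoUrl(connectionString));
        // Assumption: shorten the default 30 second wait so the sketch fails quickly.
        settings.ServerSelectionTimeout = serverSelectionTimeout;

        // None of these calls performs an operation on the server, so none of them
        // throws when the server is unreachable.
        var client = new MongoClient(settings);
        var database = client.GetDatabase("test");
        var collection = database.GetCollection<BsonDocument>("people");

        try
        {
            // The first real operation is where a connectivity problem becomes visible.
            await database.RunCommandAsync<BsonDocument>(new BsonDocument("ping", 1));
            return true;
        }
        catch (TimeoutException ex)
        {
            Console.WriteLine("Server selection failed: " + ex.Message);
            return false;
        }
    }
}
```

Calling CanReachServerAsync("mongodb://idonotexist.invalid:27017", TimeSpan.FromSeconds(1)) would be expected to return false after roughly one second rather than throwing during construction.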

There are a few types of errors you will see.

Server Selection Errors

Even when some servers are available, it might not be possible to satisfy a request. For example, a read preference might use tag sets that no server matches, or a write might be attempted against a replica set whose primary is unavailable. Both of these would result in a TimeoutException. Below is an example exception (formatted for readability) raised when attempting to insert into a replica set without a primary.

System.TimeoutException: A timeout occured after 30000ms selecting a server using 
CompositeServerSelector{ 
    Selectors = 
        WritableServerSelector, 
        LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } 
}. 

Client view of cluster state is 
{ 
    ClusterId : "1", 
    Type : "ReplicaSet", 
    State : "Connected", 
    Servers : [
        { 
            ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/clover:30000" }", 
            EndPoint: "Unspecified/clover:30000", 
            State: "Disconnected", 
            Type: "Unknown" 
        }, 
        { 
            ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/clover:30001" }", 
            EndPoint: "Unspecified/clover:30001", 
            State: "Connected", 
            Type: "ReplicaSetSecondary", 
            WireVersionRange: "[0, 3]" 
        }, 
        { 
            ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/clover:30002" }", 
            EndPoint: "Unspecified/clover:30002", 
            State: "Disconnected", 
            Type: "Unknown"
        }, 
    ] 
}.

Inspecting this error will tell you that we were attempting to find a writable server. The servers that are known are clover:30000, clover:30001, and clover:30002. Of these, clover:30000 and clover:30002 are disconnected and clover:30001 is connected and a secondary. By default, we wait for 30 seconds to fulfill the request. During this time, we are trying to connect to clover:30000 and clover:30002 in hopes that one of them becomes available and takes the role of primary. In this case, we timed out waiting.

Note
Even though the above scenario prevents any writes from occurring, reads can still be sent to clover:30001 as long as the read preference is Secondary, SecondaryPreferred, or Nearest.
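A minimal sketch of directing reads to a secondary is shown below. It assumes an existing IMongoCollection of BsonDocument; the class and method names are hypothetical, and SecondaryPreferred is just one of the read preferences listed in the note.

```csharp
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class SecondaryReadExample
{
    // Returns a view of the same collection whose reads may be served by a secondary.
    // The original collection instance is left unchanged.
    public static IMongoCollection<BsonDocument> ForSecondaryReads(IMongoCollection<BsonDocument> collection)
    {
        return collection.WithReadPreference(ReadPreference.SecondaryPreferred);
    }

    // Reads one document; this can succeed while the primary is unavailable,
    // provided a secondary is connected.
    public static Task<BsonDocument> ReadAnyAsync(IMongoCollection<BsonDocument> collection)
    {
        return ForSecondaryReads(collection).Find(new BsonDocument()).FirstOrDefaultAsync();
    }
}
```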

Connection Errors

A server may be unavailable for a variety of reasons. Each of these manifests as a TimeoutException because, over time, it is entirely possible for the state of the servers to change in such a way that a matching server becomes available, so the driver keeps waiting until server selection times out.

Important
There is nothing that can be done at runtime by your application to resolve these problems. The best thing to do is to catch them, log them, and raise a notification so that someone in your ops team can handle them.

Note

If you are having trouble discovering why the driver can’t connect, enabling network tracing on the System.Net.Sockets source may help discover the problem.

Server is unavailable

When a server is not listening at the location specified, the driver can’t connect to it.

System.TimeoutException: A timeout occured after 30000ms selecting a server using
CompositeServerSelector{ 
    Selectors = 
        WritableServerSelector, 
        LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } 
}. 

Client view of cluster state is 
{ 
    ClusterId : "1", 
    Type : "Unknown", 
    State : "Disconnected", 
    Servers : [
        { 
            ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/localhost:27017" }", 
            EndPoint: "Unspecified/localhost:27017", 
            State: "Disconnected", 
            Type: "Unknown", 
            HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. 
---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 127.0.0.1:27017
   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult).
   ... snip ..."
        }
    ]
}

We see that we are attempting to connect to localhost:27017, but “No connection could be made because the target machine actively refused it”.

Fixing this problem either involves starting a server at the specified location or restarting your application with the correct host and port.

DNS problems

When DNS is misconfigured, or the hostname provided is not registered, resolution from hostname to IP address may fail.

System.TimeoutException: A timeout occured after 30000ms selecting a server using 
CompositeServerSelector{ 
    Selectors = 
        WritableServerSelector, 
        LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } 
}. 

Client view of cluster state is 
{ 
    ClusterId : "1", 
    Type : "Unknown", 
    State : "Disconnected", 
    Servers : [
        { 
            ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/idonotexist:27017" }", 
            EndPoint: "Unspecified/idonotexist:27017", 
            State: "Disconnected", 
            Type: "Unknown", 
            HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. 
---> System.Net.Sockets.SocketException: No such host is known
   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
   ... snip ..."
        }
    ]
}

We see that we are attempting to connect to idonotexist:27017, but “No such host is known”.

Fixing this problem involves bringing up a server at the specified location, fixing DNS to resolve the host correctly, or modifying your hosts file.

Replica Set Misconfiguration

DNS problems might be seen when a replica set is misconfigured. It is imperative that the hosts in your replica set configuration be DNS resolvable from the client. Just because the replica set members can talk to one another does not mean that your application servers can also talk to the replica set members.

Warning
Even when providing a seed list with resolvable host names, if the replica set configuration uses unresolvable host names, the driver will fail to connect.

Taking too long to respond

When the latency between the driver and the server is too great, the driver may give up.

System.TimeoutException: A timeout occured after 30000ms selecting a server using
CompositeServerSelector{ 
    Selectors = 
        WritableServerSelector, 
        LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } 
}.

Client view of cluster state is 
{ 
    ClusterId : "1", 
    Type : "Unknown", 
    State : "Disconnected", 
    Servers : [
        {
            ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/somefaroffplace:27017" }", 
            EndPoint: "Unspecified/somefaroffplace:27017", 
            State: "Disconnected", 
            Type: "Unknown",
            HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. 
---> System.TimeoutException: Timed out connecting to Unspecified/somefaroffplace:27017. Timeout was 00:00:30.
   at MongoDB.Driver.Core.Connections.TcpStreamFactory.<ConnectAsync>d__7.MoveNext()
   ... snip ..."
        }
    ]
}

We see that attempting to connect to somefaroffplace:27017 failed because we “Timed out connecting to Unspecified/somefaroffplace:27017. Timeout was 00:00:30.”

The default connection timeout is 30 seconds and can be changed using the MongoClientSettings.ConnectTimeout property or the connectTimeout option on a connection string.
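Both approaches can be sketched as follows. The 5 second value and the class and method names are illustrative assumptions, and the connection string option is written here as connectTimeoutMS with a value in milliseconds.

```csharp
using System;
using MongoDB.Driver;

public static class ConnectTimeoutExample
{
    // Sets the timeout through the MongoClientSettings.ConnectTimeout property.
    public static MongoClientSettings FromProperty()
    {
        var settings = MongoClientSettings.FromUrl(new MongoUrl("mongodb://localhost:27017"));
        settings.ConnectTimeout = TimeSpan.FromSeconds(5);
        return settings;
    }

    // Sets the same timeout through a connection string option.
    public static MongoClientSettings FromConnectionString()
    {
        var url = new MongoUrl("mongodb://localhost:27017/?connectTimeoutMS=5000");
        return MongoClientSettings.FromUrl(url);
    }
}
```

Either settings object can then be passed to the MongoClient constructor.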

Authentication Errors

When the credentials or the authentication mechanism is incorrect, the application will fail to connect.

System.TimeoutException: A timeout occured after 30000ms selecting a server using
CompositeServerSelector{ 
    Selectors = 
        WritableServerSelector, 
        LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } 
}. 
        
Client view of cluster state is 
{ 
    ClusterId : "1", 
    Type : "Unknown", 
    State : "Disconnected", 
    Servers : [
        {
            ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/localhost:27017" }", 
            EndPoint: "Unspecified/localhost:27017",
            State: "Disconnected", 
            Type: "Unknown", 
            HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. 
 ---> MongoDB.Driver.MongoAuthenticationException: Unable to authenticate using sasl protocol mechanism SCRAM-SHA-1. 
 ---> MongoDB.Driver.MongoCommandException: Command saslStart failed: Authentication failed..
   ... snip ..."
        }
    ]
}

We see that attempting to connect to localhost:27017 failed because the driver was “Unable to authenticate using sasl protocol mechanism SCRAM-SHA-1.” Any other authentication mechanism supported by MongoDB could appear in place of SCRAM-SHA-1.

Fixing this problem either involves adding the specified user to the server or restarting the application with the correct credentials and mechanisms.

Operation Errors

After successfully selecting a server to run an operation against, errors are still possible. Unlike connection errors, it is sometimes possible to take action at runtime.

Note

Most of the exceptions that are thrown from an operation inherit from MongoException. In many cases, they also inherit from MongoServerException. The server exception contains a ConnectionId which can be used to tie the operation back to a specific server and a specific connection on that server. It is then possible to correlate the error you are seeing in your application with an error in the server logs.
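One way to capture that information is sketched below. The wrapper name TryRunAsync and the use of Console for logging are assumptions; a real application would use its own logging facility and would likely rethrow rather than return a flag.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Driver;

public static class ServerErrorLogging
{
    // Runs an operation and logs driver exceptions. Returns false if one was caught.
    public static async Task<bool> TryRunAsync(Func<Task> operation)
    {
        try
        {
            await operation();
            return true;
        }
        catch (MongoServerException ex)
        {
            // ConnectionId identifies the server and the connection that reported the error,
            // which helps when searching the server logs.
            Console.WriteLine("Server error on connection " + ex.ConnectionId + ": " + ex.Message);
            return false;
        }
        catch (MongoException ex)
        {
            // Other driver exceptions carry no ConnectionId.
            Console.WriteLine("Driver error: " + ex.Message);
            return false;
        }
    }
}
```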

Connection Errors

A server may go down after it was selected, but before the operation was executed. These will always manifest as a MongoConnectionException. Inspecting the inner exception will provide the actual details of the error.

MongoDB.Driver.MongoConnectionException: An exception occurred while receiving a message from the server. 
---> System.IO.IOException: Unable to read data from the transport connection: An established connection was aborted by the software in your host machine. 
---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine
   at System.Net.Sockets.Socket.BeginReceive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
   at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
   --- End of inner exception stack trace ---
... snip ...

We see from this exception that a transport connection that was successfully open at one point has been aborted.

There are too many forms of this type of exception to enumerate. In general, it is not safe to retry operations that threw a MongoConnectionException unless the operation was idempotent. Simply getting an exception of this type doesn’t give any insight into whether the operation was received by the server or what happened in the server if it was received.
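For operations known to be idempotent (for example, a read), a bounded retry can be sketched as follows. The helper name, the attempt limit, and the linear back-off are assumptions rather than driver features, and the caller remains responsible for passing only idempotent operations.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Driver;

public static class IdempotentRetry
{
    // Retries the operation when a MongoConnectionException is thrown, up to maxAttempts in total.
    // Any other exception, and the final connection failure, propagates to the caller.
    public static async Task<T> ExecuteAsync<T>(Func<Task<T>> idempotentOperation, int maxAttempts)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException("maxAttempts");
        }

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await idempotentOperation();
            }
            catch (MongoConnectionException) when (attempt < maxAttempts)
            {
                // Brief pause before trying again (assumed policy: 100 ms per attempt so far).
                await Task.Delay(TimeSpan.FromMilliseconds(100 * attempt));
            }
        }
    }
}
```

A call such as IdempotentRetry.ExecuteAsync(() => collection.Find(filter).FirstOrDefaultAsync(), 3) would retry the read at most twice; collection and filter are placeholders for your own objects.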

Write Exceptions

When performing a write, it is possible to receive a MongoWriteException. This exception has two important properties, WriteError and WriteConcernError.

Write Error

A write error means that there was an error applying the write. The cause could be many different things. The WriteError contains a number of properties which may help in the diagnosis of the problem. The Code property will indicate specifically what went wrong. For general categories of errors, the driver also provides a helpful Category property which classifies certain codes.

MongoDB.Driver.MongoWriteException: A write operation resulted in an error. E11000 duplicate key error index: test.people.$_id_ dup key: { : 0 } 
---> MongoDB.Driver.MongoBulkWriteException`1[MongoDB.Bson.BsonDocument]: A bulk write operation resulted in one or more errors. E11000 duplicate key error index: test.people.$_id_ dup key: { : 0 }
   at MongoDB.Driver.MongoCollectionImpl`1.<BulkWriteAsync>d__11.MoveNext() in :\projects\mongo-csharp-driver\src\MongoDB.Driver\MongoCollectionImpl.cs:line 16

We see from this exception that we’ve attempted to insert a document with a duplicate _id field. In this case, the write error’s Category would be DuplicateKey.
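Handling this case at runtime might look like the following sketch. It assumes the duplicate key category is exposed as ServerErrorCategory.DuplicateKey; the class and method names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class DuplicateKeyExample
{
    // Inserts a document. Returns false if the write was rejected as a duplicate key.
    // Any other write failure propagates to the caller.
    public static async Task<bool> TryInsertAsync(IMongoCollection<BsonDocument> collection, BsonDocument document)
    {
        try
        {
            await collection.InsertOneAsync(document);
            return true;
        }
        catch (MongoWriteException ex)
            when (ex.WriteError != null && ex.WriteError.Category == ServerErrorCategory.DuplicateKey)
        {
            // Code is the server's numeric error code; Category groups related codes.
            Console.WriteLine("Duplicate key (code " + ex.WriteError.Code + "): " + ex.WriteError.Message);
            return false;
        }
    }
}
```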

Write Concern Error

A write concern error indicates that the server was unable to guarantee the write operation to the level specified. See the server’s documentation for more information.

Bulk Write Exceptions

A MongoBulkWriteException will occur when using InsertManyAsync or BulkWriteAsync. This exception is a rollup of the individual write errors. It also includes a write concern error and a Result property.

MongoDB.Driver.MongoBulkWriteException`1[MongoDB.Bson.BsonDocument]: A bulk write operation resulted in one or more errors. 
  E11000 duplicate key error index: test.people.$_id_ dup key: { : 0 }
   at MongoDB.Driver.MongoCollectionImpl`1.<BulkWriteAsync>d__11.MoveNext() in c:\projects\mongo-csharp-driver\src\MongoDB.Driver\MongoCollectionImpl.cs:line 166

Above, we see that a duplicate key exception occurred. In this case, two writes existed in the batch. Inspecting the WriteErrors property would allow the identification of which write failed.
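As a sketch of that inspection, the helper below returns the positions of the writes that failed. Each entry in WriteErrors carries an Index that refers back to the position of the request in the batch; the helper name and the Console output are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class BulkWriteErrorExample
{
    // Inserts the documents and returns the batch positions of any writes that failed.
    public static async Task<List<int>> InsertAndReportFailuresAsync(
        IMongoCollection<BsonDocument> collection,
        IReadOnlyList<BsonDocument> documents)
    {
        var failedIndexes = new List<int>();
        try
        {
            await collection.InsertManyAsync(documents);
        }
        catch (MongoBulkWriteException<BsonDocument> ex)
        {
            foreach (var error in ex.WriteErrors)
            {
                // Index is the zero-based position of the failed write within the batch.
                Console.WriteLine("Write " + error.Index + " failed (code " + error.Code + "): " + error.Message);
                failedIndexes.Add(error.Index);
            }

            if (ex.WriteConcernError != null)
            {
                Console.WriteLine("Write concern error: " + ex.WriteConcernError.Message);
            }
        }
        return failedIndexes;
    }
}
```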

Ordered Writes

Bulk writes are allowed to be processed either in order or without order. When processing in order, the first error that occurs will stop the processing of all subsequent writes. In this case, all of the unprocessed writes will be available via the UnprocessedRequests property.
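The ordering choice can be sketched with InsertManyOptions, as below. The helper names are hypothetical. With ordered processing, the writes that were skipped after the first failure are counted from UnprocessedRequests.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class OrderedWriteExample
{
    // Builds insert options for ordered (stop at first error) or unordered processing.
    public static InsertManyOptions CreateOptions(bool ordered)
    {
        return new InsertManyOptions { IsOrdered = ordered };
    }

    // Inserts the documents and returns how many requests were never attempted.
    public static async Task<int> InsertAsync(
        IMongoCollection<BsonDocument> collection,
        IEnumerable<BsonDocument> documents,
        bool ordered)
    {
        try
        {
            await collection.InsertManyAsync(documents, CreateOptions(ordered));
            return 0;
        }
        catch (MongoBulkWriteException<BsonDocument> ex)
        {
            // For ordered writes, everything after the first failure ends up here.
            Console.WriteLine(ex.UnprocessedRequests.Count + " write(s) were not processed.");
            return ex.UnprocessedRequests.Count;
        }
    }
}
```

The unprocessed requests can be resubmitted once the cause of the first failure has been addressed.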