Error Handling
Error handling is a requirement for any fault-tolerant system built on the .NET driver. While MongoDB supports automatic replica set failover and the driver supports connecting to multiple mongos instances, neither makes a program immune to errors.
Almost all errors will occur while attempting to perform an operation on the server. In other words, you will not receive an error when constructing a MongoClient, getting a database, or getting a collection. This is because we are connecting to servers in the background and continually trying to reconnect when a problem occurs. Only when you attempt to perform an operation do those errors become apparent.
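As a minimal sketch of this behavior, the helper below builds a client, a database handle, and a collection handle without performing any I/O. The connection string passed in, the database name test, and the collection name people are placeholders chosen for illustration (they echo the test.people namespace that appears in the exceptions later on this page), and a plain mongodb:// connection string is assumed.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

public static class LazyConnectionSketch
{
    // None of these calls talks to a server, so none of them reports a
    // connectivity problem, even if nothing is listening at the address.
    public static IMongoCollection<BsonDocument> GetPeople(string connectionString)
    {
        var client = new MongoClient(connectionString);
        var database = client.GetDatabase("test");              // placeholder name
        return database.GetCollection<BsonDocument>("people");  // placeholder name
    }
}
```

The first call that actually performs an operation, for example await people.InsertOneAsync(new BsonDocument("x", 1)), is where a connectivity error would surface.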
There are a few types of errors you will see.
Server Selection Errors
Even when some servers are available, it might not be possible to satisfy a request. For example, a read preference might use tag sets that no server matches, or a write might be attempted against a replica set whose primary is unavailable. Both of these would result in a TimeoutException. Below is an example exception (formatted for readability) when attempting to insert into a replica set without a primary.
System.TimeoutException: A timeout occurred after 30000ms selecting a server using
CompositeServerSelector{
Selectors =
WritableServerSelector,
LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 }
}.
Client view of cluster state is
{
ClusterId : "1",
Type : "ReplicaSet",
State : "Connected",
Servers : [
{
ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/clover:30000" }",
EndPoint: "Unspecified/clover:30000",
State: "Disconnected",
Type: "Unknown"
},
{
ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/clover:30001" }",
EndPoint: "Unspecified/clover:30001",
State: "Connected",
Type: "ReplicaSetSecondary",
WireVersionRange: "[0, 3]"
},
{
ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/clover:30002" }",
EndPoint: "Unspecified/clover:30002",
State: "Disconnected",
Type: "Unknown"
},
]
}.
Inspecting this error will tell you that we were attempting to find a writable server. The known servers are clover:30000, clover:30001, and clover:30002. Of these, clover:30000 and clover:30002 are disconnected, and clover:30001 is connected and a secondary. By default, we wait for 30 seconds to fulfill the request. During this time, we keep trying to connect to clover:30000 and clover:30002 in the hope that one of them becomes available and takes the role of primary. In this case, we timed out waiting.
Note
Even though the above scenario prevents any writes from occurring, reads can still be sent to clover:30001 as long as the read preference is Secondary, SecondaryPreferred, or Nearest.
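A sketch of how an application might keep reading in that situation follows. It assumes a BsonDocument collection and picks SecondaryPreferred, though any of the three read preferences named in the note would do. The class and method names are made up for this example.

```csharp
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class SecondaryReadSketch
{
    // Returns a view of the same collection whose reads may be served by a secondary.
    public static IMongoCollection<BsonDocument> AllowSecondaryReads(
        IMongoCollection<BsonDocument> collection)
    {
        return collection.WithReadPreference(ReadPreference.SecondaryPreferred);
    }

    // Reads one arbitrary document; an empty BsonDocument acts as an empty filter.
    public static Task<BsonDocument> ReadAnyAsync(IMongoCollection<BsonDocument> collection)
    {
        return AllowSecondaryReads(collection)
            .Find(new BsonDocument())
            .FirstOrDefaultAsync();
    }
}
```

Writes issued through either collection handle would still time out until a primary is available.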
Connection Errors
A server may be unavailable for a variety of reasons. Each of these will manifest as a TimeoutException rather than an immediate failure. This is because, over time, it is entirely possible for the state of the servers to change in such a way that a matching server becomes available, so the driver keeps waiting until server selection times out.
Important
There is nothing that can be done at runtime by your application to resolve these problems. The best thing to do is to catch them, log them, and raise a notification so that someone in your ops team can handle them.
Note
If you are having trouble discovering why the driver can’t connect, enabling network tracing on the System.Net.Sockets source may help discover the problem.
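The catch, log, and notify advice above could be sketched as a small wrapper around any driver operation. The notifyOps callback is a hypothetical hook (pager, e-mail, chat) and not part of the driver, and writing to standard error stands in for whatever logging the application already uses.

```csharp
using System;
using System.Threading.Tasks;

public static class ConnectivityGuard
{
    // Runs an operation. If it fails with a TimeoutException, the failure is
    // logged, the ops hook is called, and false is returned to the caller.
    public static async Task<bool> TryRunAsync(Func<Task> operation, Action<string> notifyOps)
    {
        try
        {
            await operation();
            return true;
        }
        catch (TimeoutException ex)
        {
            // The exception text carries the selector and the client's view
            // of the cluster state, as in the examples on this page.
            Console.Error.WriteLine(ex);
            notifyOps(ex.Message);   // hypothetical notification hook
            return false;
        }
    }
}
```

A call might look like await ConnectivityGuard.TryRunAsync(() => collection.InsertOneAsync(document), SendAlert), where SendAlert is whatever alerting function the application provides.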
Server is unavailable
When a server is not listening at the location specified, the driver can’t connect to it.
System.TimeoutException: A timeout occurred after 30000ms selecting a server using
CompositeServerSelector{
Selectors =
WritableServerSelector,
LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 }
}.
Client view of cluster state is
{
ClusterId : "1",
Type : "Unknown",
State : "Disconnected",
Servers : [
{
ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/localhost:27017" }",
EndPoint: "Unspecified/localhost:27017",
State: "Disconnected",
Type: "Unknown",
HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server.
---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 127.0.0.1:27017
at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult).
... snip ..."
}
]
}
We see that we are attempting to connect to localhost:27017, but “No connection could be made because the target machine actively refused it”.
Fixing this problem involves either starting a server at the specified location or restarting your application with the correct host and port.
DNS problems
When DNS is misconfigured, or the hostname provided is not registered, resolution from hostname to IP address may fail.
System.TimeoutException: A timeout occurred after 30000ms selecting a server using
CompositeServerSelector{
Selectors =
WritableServerSelector,
LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 }
}.
Client view of cluster state is
{
ClusterId : "1",
Type : "Unknown",
State : "Disconnected",
Servers : [
{
ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/idonotexist:27017" }",
EndPoint: "Unspecified/idonotexist:27017",
State: "Disconnected",
Type: "Unknown",
HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server.
---> System.Net.Sockets.SocketException: No such host is known
at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
... snip ..."
}
]
}
We see that we are attempting to connect to idonotexist:27017, but “No such host is known”.
Fixing this problem involves bringing up a server at the specified location, fixing DNS to resolve the host correctly, or modifying your hosts file.
Replica Set Misconfiguration
DNS problems might be seen when a replica set is misconfigured. It is imperative that the hosts in your replica set configuration be DNS resolvable from the client. Just because the replica set members can talk to one another does not mean that your application servers can also talk to the replica set members.
Warning
Even when providing a seed list with resolvable host names, if the replica set configuration uses unresolvable host names, the driver will fail to connect.
Taking too long to respond
When the latency between the driver and the server is too great, the driver may give up.
System.TimeoutException: A timeout occurred after 30000ms selecting a server using
CompositeServerSelector{
Selectors =
WritableServerSelector,
LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 }
}.
Client view of cluster state is
{
ClusterId : "1",
Type : "Unknown",
State : "Disconnected",
Servers : [
{
ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/somefaroffplace:27017" }",
EndPoint: "Unspecified/somefaroffplace:27017",
State: "Disconnected",
Type: "Unknown",
HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server.
---> System.TimeoutException: Timed out connecting to Unspecified/somefaroffplace:27017. Timeout was 00:00:30.
at MongoDB.Driver.Core.Connections.TcpStreamFactory.<ConnectAsync>d__7.MoveNext()
... snip ...
}
]
}
We see that attempting to connect to somefaroffplace:27017 failed because we “Timed out connecting to Unspecified/somefaroffplace:27017. Timeout was 00:00:30.”
The default connection timeout is 30 seconds and can be changed using the MongoClientSettings.ConnectTimeout property or the connectTimeout option on a connection string.
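As an illustration, the connection timeout could be set in code roughly as follows. The helper name and the five-second value in the usage note are arbitrary choices for this sketch.

```csharp
using System;
using MongoDB.Driver;

public static class ConnectTimeoutSketch
{
    // Parses the connection string, then overrides the connection timeout.
    public static MongoClientSettings WithConnectTimeout(string connectionString, TimeSpan timeout)
    {
        var settings = MongoClientSettings.FromUrl(new MongoUrl(connectionString));
        settings.ConnectTimeout = timeout;
        return settings;
    }
}
```

A client would then be created with new MongoClient(ConnectTimeoutSketch.WithConnectTimeout("mongodb://somefaroffplace:27017", TimeSpan.FromSeconds(5))). On the connection string side, the same effect should be available as mongodb://somefaroffplace:27017/?connectTimeoutMS=5000, using the millisecond spelling of the option from the standard connection string format.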
Authentication Errors
When the credentials or the authentication mechanism is incorrect, the application will fail to connect.
System.TimeoutException: A timeout occurred after 30000ms selecting a server using
CompositeServerSelector{
Selectors =
WritableServerSelector,
LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 }
}.
Client view of cluster state is
{
ClusterId : "1",
Type : "Unknown",
State : "Disconnected",
Servers : [
{
ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/localhost:27017" }",
EndPoint: "Unspecified/localhost:27017",
State: "Disconnected",
Type: "Unknown",
HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server.
---> MongoDB.Driver.MongoAuthenticationException: Unable to authenticate using sasl protocol mechanism SCRAM-SHA-1.
---> MongoDB.Driver.MongoCommandException: Command saslStart failed: Authentication failed..
... snip ...
}
]
}
We see that attempting to connect to localhost:27017 failed because the driver was “Unable to authenticate using sasl protocol mechanism SCRAM-SHA-1.” Any other authentication mechanism supported by MongoDB could appear in place of SCRAM-SHA-1.
Fixing this problem involves either adding the specified user to the server or restarting the application with the correct credentials and mechanism.
Operation Errors
After successfully selecting a server to run an operation against, errors are still possible. Unlike connection errors, it is sometimes possible to take action at runtime.
Note
Most of the exceptions that are thrown from an operation inherit from MongoException. In many cases, they also inherit from MongoServerException. The server exception contains a ConnectionId which can be used to tie the operation back to a specific server and a specific connection on that server. It is then possible to correlate the error you are seeing in your application with an error in the server logs.
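One way to put the ConnectionId to use is to include it in the application's log line, as in the sketch below. The class name and message format are invented for this example, and the exact text produced for the connection identifier depends on the driver version.

```csharp
using System;
using MongoDB.Driver;

public static class ServerErrorLogging
{
    // Builds a log line naming the server endpoint and connection involved,
    // so the entry can be matched against the server's own log.
    public static string Describe(MongoServerException ex)
    {
        var connectionId = ex.ConnectionId;   // may be null in some cases
        var endPoint = connectionId == null ? null : connectionId.ServerId.EndPoint;
        return string.Format("MongoDB error on {0} (connection {1}): {2}",
            endPoint, connectionId, ex.Message);
    }
}
```

It would typically be called from a catch (MongoServerException ex) block around the operation.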
Connection Errors
A server may go down after it was selected, but before the operation was executed. Such failures will always manifest as a MongoConnectionException. Inspecting the inner exception will provide the actual details of the error.
MongoDB.Driver.MongoConnectionException: An exception occurred while receiving a message from the server.
---> System.IO.IOException: Unable to read data from the transport connection: An established connection was aborted by the software in your host machine.
---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine at System.Net.Sockets.Socket.BeginReceive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
--- End of inner exception stack trace ---
... snip ...
We see from this exception that a transport connection that was successfully open at one point has been aborted.
There are too many forms of this type of exception to enumerate. In general, it is not safe to retry operations that threw a MongoConnectionException unless the operation was idempotent. Simply getting an exception of this type doesn’t give any insight into whether the operation was received by the server or what happened in the server if it was received.
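A retry wrapper restricted to idempotent operations might look like the following sketch. The attempt count and the linear back-off are arbitrary assumptions, and deciding whether an operation is idempotent is left to the caller.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Driver;

public static class IdempotentRetry
{
    // Retries only on MongoConnectionException, and only up to maxAttempts.
    // The caller must pass an operation that is safe to run more than once.
    public static async Task<T> RunAsync<T>(Func<Task<T>> idempotentOperation, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await idempotentOperation();
            }
            catch (MongoConnectionException) when (attempt < maxAttempts)
            {
                // Brief pause before the next attempt; the final failure propagates.
                await Task.Delay(TimeSpan.FromMilliseconds(100 * attempt));
            }
        }
    }
}
```

A read such as collection.Find(filter).FirstOrDefaultAsync() is a natural candidate. An update that increments a counter is not, because the first attempt may already have been applied.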
Write Exceptions
When performing a write, it is possible to receive a MongoWriteException. This exception has two important properties, WriteError and WriteConcernError.
Write Error
A write error means that there was an error applying the write. There are many possible causes. The WriteError contains a number of properties which may help in diagnosing the problem. The Code property indicates specifically what went wrong. For general categories of errors, the driver also provides a Category property which classifies certain codes.
MongoDB.Driver.MongoWriteException: A write operation resulted in an error. E11000 duplicate key error index: test.people.$_id_ dup key: { : 0 }
---> MongoDB.Driver.MongoBulkWriteException`1[MongoDB.Bson.BsonDocument]: A bulk write operation resulted in one or more errors. E11000 duplicate key error index: test.people.$_id_ dup key: { : 0 }
at MongoDB.Driver.MongoCollectionImpl`1.<BulkWriteAsync>d__11.MoveNext() in :\projects\mongo-csharp-driver\src\MongoDB.Driver\MongoCollectionImpl.cs:line 16
We see from this exception that we’ve attempted to insert a document with a duplicate _id field. In this case, the write error would contain the category DuplicateKey.
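Because a write error can sometimes be handled at runtime, an application might treat a duplicate key as an expected outcome rather than a failure. The sketch below assumes a BsonDocument collection and that the category is exposed as ServerErrorCategory.DuplicateKey. The class and method names are made up for this example.

```csharp
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class DuplicateKeySketch
{
    // Returns false when the insert is rejected as a duplicate key.
    // Any other write error, or a write concern error, still propagates.
    public static async Task<bool> TryInsertAsync(
        IMongoCollection<BsonDocument> collection, BsonDocument document)
    {
        try
        {
            await collection.InsertOneAsync(document);
            return true;
        }
        catch (MongoWriteException ex)
            when (ex.WriteError != null
                  && ex.WriteError.Category == ServerErrorCategory.DuplicateKey)
        {
            return false;
        }
    }
}
```

For codes that have no category, ex.WriteError.Code can be compared against the server's error codes instead.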
Write Concern Error
A write concern error indicates that the server was unable to guarantee the write operation to the level specified. See the server’s documentation for more information.
Bulk Write Exceptions
A MongoBulkWriteException can occur when using InsertManyAsync or BulkWriteAsync. This exception is a rollup of the individual write errors. It also includes a write concern error and a Result property.
MongoDB.Driver.MongoBulkWriteException`1[MongoDB.Bson.BsonDocument]: A bulk write operation resulted in one or more errors.
E11000 duplicate key error index: test.people.$_id_ dup key: { : 0 }
at MongoDB.Driver.MongoCollectionImpl`1.<BulkWriteAsync>d__11.MoveNext() in c :\projects\mongo-csharp-driver\src\MongoDB.Driver\MongoCollectionImpl.cs:line 166
Above, we see that a duplicate key exception occurred. In this case, two writes existed in the batch. Inspecting the WriteErrors property would allow the identification of which write failed.
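To make that concrete, the sketch below turns each entry in WriteErrors into a line that names the position of the failed request within the batch. It assumes a BsonDocument collection. The class name and message format are invented for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

public static class BulkErrorReport
{
    // Maps each failed write back to its zero-based position in the batch.
    public static IEnumerable<string> Describe(MongoBulkWriteException<BsonDocument> ex)
    {
        return ex.WriteErrors.Select(e => string.Format(
            "request #{0} failed with code {1} ({2}): {3}",
            e.Index, e.Code, e.Category, e.Message));
    }
}
```

It would be called from a catch (MongoBulkWriteException<BsonDocument> ex) block around InsertManyAsync or BulkWriteAsync.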
Ordered Writes
Bulk writes can be processed either in order or without order. When processing in order, the first error that occurs will stop the processing of all subsequent writes. In this case, all of the unprocessed writes will be available via the UnprocessedRequests property.
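A sketch of an ordered bulk insert that hands back the requests that were never attempted follows. It assumes a BsonDocument collection and that ordering is selected through BulkWriteOptions.IsOrdered. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class OrderedBulkSketch
{
    // Inserts the documents in order. If a write fails, returns the requests
    // that were skipped as a result; returns an empty list on full success.
    public static async Task<IReadOnlyList<WriteModel<BsonDocument>>> InsertInOrderAsync(
        IMongoCollection<BsonDocument> collection, IEnumerable<BsonDocument> documents)
    {
        var requests = documents
            .Select(d => (WriteModel<BsonDocument>)new InsertOneModel<BsonDocument>(d))
            .ToList();
        try
        {
            await collection.BulkWriteAsync(requests, new BulkWriteOptions { IsOrdered = true });
            return new List<WriteModel<BsonDocument>>();
        }
        catch (MongoBulkWriteException<BsonDocument> ex)
        {
            // The failed write itself is reported in ex.WriteErrors, not here.
            return ex.UnprocessedRequests;
        }
    }
}
```

The returned requests could be logged or resubmitted once the cause of the failure has been addressed. Setting IsOrdered to false instead allows the remaining writes to be attempted after a failure.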