Aggregation Framework

The aggregation pipeline is a framework for data aggregation, modeled on the concept of data processing pipelines.

Prerequisites

  • The examples below require a restaurants collection in the test database. To create and populate the collection, follow the directions on GitHub.

  • Include the following import statements:

     import com.mongodb.Block;
     import com.mongodb.client.MongoClients;
     import com.mongodb.client.MongoClient;
     import com.mongodb.client.MongoCollection;
     import com.mongodb.client.MongoDatabase;
     import com.mongodb.client.model.Aggregates;
     import com.mongodb.client.model.Accumulators;
     import com.mongodb.client.model.Projections;
     import com.mongodb.client.model.Filters;
         
     import org.bson.Document;

     import java.util.Arrays;
    
  • Include the following code, which the examples in this tutorial use to print the results of each aggregation:

     Block<Document> printBlock = new Block<Document>() {
         @Override
         public void apply(final Document document) {
             System.out.println(document.toJson());
         }
     };
    

Connect to a MongoDB Deployment

Connect to a MongoDB deployment, then declare and define a MongoDatabase instance and a MongoCollection instance.

For example, the following code connects to a standalone MongoDB deployment running on localhost on port 27017, then defines database to refer to the test database and collection to refer to the restaurants collection:

MongoClient mongoClient = MongoClients.create();
MongoDatabase database = mongoClient.getDatabase("test");
MongoCollection<Document> collection = database.getCollection("restaurants");

For additional information on connecting to MongoDB, see Connect to MongoDB.

Perform Aggregation

To perform aggregation, pass a list of aggregation stages to the MongoCollection.aggregate() method. The Java driver provides the Aggregates helper class that contains builders for aggregation stages.

In the following example, the aggregation pipeline

  • First uses a $match stage to filter for documents whose categories array field contains the element Bakery. The example uses Aggregates.match to build the $match stage.

  • Then uses a $group stage to group the matching documents by the stars field, accumulating a count of documents for each distinct value of stars. The example uses Aggregates.group to build the $group stage and Accumulators.sum, from the driver's Accumulators helper class, to build the accumulator expression.

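    collection.aggregate(
        Arrays.asList(
            // $match: keep documents whose categories array contains "Bakery"
            Aggregates.match(Filters.eq("categories", "Bakery")),
            // $group: one output document per stars value, with a running count
            Aggregates.group("$stars", Accumulators.sum("count", 1))
        )
    ).forEach(printBlock);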
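
To make the result of the two stages concrete, the following is a minimal in-memory sketch of the same logic written with Java streams. It does not use the driver or contact a server. The Restaurant class and the sample data are hypothetical stand-ins, and the sketch assumes stars holds an integer. A TreeMap is used only to make the printed order predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class MatchGroupSketch {

    // Hypothetical stand-in for a restaurants document; not a driver type.
    static final class Restaurant {
        final String name;
        final int stars;
        final List<String> categories;

        Restaurant(String name, int stars, String... categories) {
            this.name = name;
            this.stars = stars;
            this.categories = Arrays.asList(categories);
        }
    }

    // filter(...) plays the role of the $match stage;
    // groupingBy(..., counting()) plays the role of $group with a sum of 1.
    static Map<Integer, Long> countBakeriesByStars(List<Restaurant> restaurants) {
        return restaurants.stream()
                .filter(r -> r.categories.contains("Bakery"))
                .collect(Collectors.groupingBy(r -> r.stars, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        List<Restaurant> sample = Arrays.asList(
                new Restaurant("A", 4, "Bakery", "Cafe"),
                new Restaurant("B", 4, "Bakery"),
                new Restaurant("C", 5, "Bakery", "Deli"),
                new Restaurant("D", 5, "Pizza"));
        System.out.println(countBakeriesByStars(sample));
    }
}
```

Each map entry corresponds to one output document of the pipeline: the key plays the role of _id (the stars value) and the value plays the role of count.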
    

Use Aggregation Expressions

For $group accumulator expressions, the Java driver provides the Accumulators helper class. For other aggregation expressions, manually build the expression Document.

In the following example, the aggregation pipeline uses a $project stage to return only the name field and the calculated field firstCategory whose value is the first element in the categories array. The example uses Aggregates.project and various Projections methods to build the $project stage.

    collection.aggregate(
        Arrays.asList(
            Aggregates.project(
                Projections.fields(
                    Projections.excludeId(),
                    Projections.include("name"),
                    Projections.computed(
                        "firstCategory",
                        new Document("$arrayElemAt", Arrays.asList("$categories", 0))
                    )
                )
            )
        )
    ).forEach(printBlock);
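
As a rough illustration of what this $project stage produces for a single document, the sketch below applies the same three rules to a plain java.util.Map: drop _id, keep name, and compute firstCategory from the first element of categories. It is a simplified stand-in that does not use the driver. The sample document is made up, and the handling of a missing or empty categories field here (the computed field is simply omitted) is an assumption of the sketch that may not match the server in every edge case.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ProjectSketch {

    // Applies the projection to one document represented as a Map.
    static Map<String, Object> project(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>();
        // Projections.excludeId(): _id is never copied to the output.
        // Projections.include("name"): copy name when present.
        if (doc.containsKey("name")) {
            out.put("name", doc.get("name"));
        }
        // Projections.computed("firstCategory", ...): element 0 of categories.
        Object categories = doc.get("categories");
        if (categories instanceof List && !((List<?>) categories).isEmpty()) {
            out.put("firstCategory", ((List<?>) categories).get(0));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample document for illustration only.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", 1);
        doc.put("name", "Sample Bakery");
        doc.put("stars", 4);
        doc.put("categories", Arrays.asList("Bakery", "Cafe"));
        System.out.println(project(doc));
    }
}
```

Fields not named in the projection, such as stars, do not appear in the output, which mirrors how an inclusion projection behaves.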