Understanding MongoDB Aggregation: $out and $merge

nghiadinhtrong

MongoDB’s aggregation framework is a powerful tool for processing and transforming data in collections. Among its many stages, $out and $merge are the two that persist the results of an aggregation pipeline back into the database. Both let you save transformed data for later use, but they have distinct use cases and behaviors.

What is the $out Stage?

The $out stage writes the results of an aggregation pipeline directly into a collection, and it must be the last stage in the pipeline. It is useful when you need to create a new dataset or completely replace the contents of an existing collection with the pipeline’s output.

Key Features:

Replaces Entire Collection: When $out targets an existing collection, it replaces all of that collection’s documents with the aggregation results; the collection’s existing indexes are kept.
Creates a New Collection: If the target collection does not exist, MongoDB will create it automatically.
Atomic Operation: The new contents are swapped in atomically once the pipeline succeeds. If the pipeline fails, an existing target collection is left unchanged.
Not Compatible with Sharded Outputs: $out cannot write to a sharded collection, although the collection it reads from may be sharded.
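
To make the replace-and-swap behavior concrete, here is a minimal in-memory sketch in plain JavaScript. It is a conceptual model only, not MongoDB’s implementation: a Map stands in for the database, and the names outStage and produceResults are made up for illustration.

```javascript
// Conceptual model of $out: a Map of collection name -> array of documents.
// outStage and produceResults are hypothetical names, not MongoDB APIs.
function outStage(database, target, produceResults) {
  // Build the full result set first. If this throws, the target is never touched.
  const results = produceResults();
  // Only after success is the target swapped for the new contents in one step.
  database.set(target, results);
}

const database = new Map([
  ["product_totals", [{ _id: "stale", totalSales: 1 }]],
]);

// A successful run replaces every document in the target.
outStage(database, "product_totals", () => [{ _id: "apple", totalSales: 30 }]);
const afterSuccess = database.get("product_totals");

// A failing run leaves the previous contents in place.
let failed = false;
try {
  outStage(database, "product_totals", () => {
    throw new Error("pipeline failed");
  });
} catch (err) {
  failed = true;
}
const afterFailure = database.get("product_totals");

console.log(afterSuccess, failed, afterFailure);
```

The stale document is gone after the first call, and the failed second call does not disturb what the first one wrote.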

Example of $out:

Suppose you have a collection named sales and you want to create a new collection summarizing total sales by product:

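db.sales.aggregate([
  // Sum the amount of every sale, producing one result document per product.
  { $group: { _id: "$product", totalSales: { $sum: "$amount" } } },
  // Write the results to product_totals, replacing any existing contents.
  { $out: "product_totals" }
]);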

This pipeline groups sales data by product and writes the summarized data into a new collection called product_totals.

What is the $merge Stage?

The $merge stage also writes aggregation results to a collection, but it provides more flexibility than $out. Instead of simply overwriting the target collection, $merge allows you to update, insert, or replace specific documents in the target collection based on customizable criteria.

Key Features:

Flexible Update Behavior: $merge can insert new documents, update existing ones, or replace documents entirely.
Customizable Match Logic: The on option specifies which field or fields identify a matching document in the target collection. It defaults to _id (plus the shard key fields when the target is sharded).
Supports Sharded Collections: Unlike $out, $merge supports writing to sharded collections.
Preserves Existing Data: $merge never drops or replaces the target collection as a whole. Documents that match nothing in the pipeline output are left untouched.
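
The per-document behavior is controlled by the whenMatched and whenNotMatched options. The following plain JavaScript sketch models their string values in memory. It is an illustration under simplifying assumptions, not MongoDB’s implementation: mergeStage is a hypothetical helper, documents are matched on _id only, and the pipeline form of whenMatched is not covered.

```javascript
// Conceptual model of $merge for string-valued whenMatched / whenNotMatched.
// mergeStage is a hypothetical helper, not a MongoDB API.
function mergeStage(target, results, options = {}) {
  const { whenMatched = "merge", whenNotMatched = "insert" } = options;
  const merged = target.map((doc) => ({ ...doc })); // work on a copy
  for (const result of results) {
    const index = merged.findIndex((doc) => doc._id === result._id);
    if (index === -1) {
      if (whenNotMatched === "insert") merged.push({ ...result });
      else if (whenNotMatched === "fail") throw new Error(`no match for ${result._id}`);
      // "discard": the result is dropped
      continue;
    }
    if (whenMatched === "merge") merged[index] = { ...merged[index], ...result };
    else if (whenMatched === "replace") merged[index] = { ...result };
    else if (whenMatched === "fail") throw new Error(`match found for ${result._id}`);
    // "keepExisting": the existing document stays as it is
  }
  return merged;
}

const existing = [
  { _id: "apple", totalSales: 10, category: "fruit" },
  { _id: "pear", totalSales: 5, category: "fruit" },
];
const results = [
  { _id: "apple", totalSales: 30 },
  { _id: "kiwi", totalSales: 7 },
];

const mergedDocs = mergeStage(existing, results);
const replacedDocs = mergeStage(existing, results, { whenMatched: "replace" });
console.log(mergedDocs);
console.log(replacedDocs);
```

With "merge", the apple document keeps its category field and gets the new totalSales; with "replace", it loses category because the whole document is swapped. In both cases pear is untouched and kiwi is inserted. One difference from this model: the real stage does not revert documents it has already written if a later document triggers "fail".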

Example:

Continuing with the sales example, let’s update a product_summary collection with total sales data for each product:

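db.sales.aggregate([
  // One result document per product, keyed by _id.
  { $group: { _id: "$product", totalSales: { $sum: "$amount" } } },
  { $merge: {
      into: "product_summary",    // target collection
      whenMatched: "merge",       // merge result fields into the existing document
      whenNotMatched: "insert"    // add a new document when no match is found
    }
  }
]);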

In this case:

If a product already exists in the product_summary collection (matched on _id), its totalSales field is overwritten with the newly computed value, and any other fields in that document are kept.
If a product does not exist, it will be inserted as a new document.
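
Note that whenMatched: "merge" overwrites totalSales rather than adding to it. If the pipeline only processes a new batch of sales, whenMatched can instead be given as a small pipeline that accumulates onto the stored value. The sketch below assumes product_summary already stores a numeric totalSales; the soldAt field, the cutoff date, and the buildIncrementalPipeline helper are hypothetical and only there for illustration. Inside the whenMatched pipeline, $$new refers to the incoming result document.

```javascript
// Builds a pipeline that adds a new batch of sales onto existing totals.
// `soldAt` is a hypothetical date field on the sales documents.
function buildIncrementalPipeline(since) {
  return [
    { $match: { soldAt: { $gte: since } } },
    { $group: { _id: "$product", totalSales: { $sum: "$amount" } } },
    {
      $merge: {
        into: "product_summary",
        // Pipeline form: $totalSales is the stored value, $$new is the incoming result.
        whenMatched: [
          { $set: { totalSales: { $add: ["$totalSales", "$$new.totalSales"] } } },
        ],
        whenNotMatched: "insert",
      },
    },
  ];
}

// Hypothetical cutoff for the batch being processed.
const incrementalPipeline = buildIncrementalPipeline(new Date("2024-01-01"));

// `db` only exists inside mongosh; the guard lets the file load elsewhere too.
if (typeof db !== "undefined") {
  db.sales.aggregate(incrementalPipeline);
}
```

Running this pipeline twice over the same batch would count those sales twice, so the cutoff has to advance between runs.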

Key Differences Between $out and $merge

Feature	$out	$merge
Target Collection	Fully replaced	Selective insert, update, or replace
Sharded Target Collection	Not supported	Supported
Use Case	Overwrite or create a new dataset	Incrementally update a dataset
Granularity	Operates on the entire collection	Operates on individual documents

Best Practices

Use $out for Simpler Workflows: If you need to create or completely replace a collection, $out is straightforward and efficient.
Use $merge for Incremental Updates: If your use case involves updating or augmenting an existing dataset without erasing the entire collection, $merge is the better choice.
Test in Non-Production Environments: Both stages modify data directly. Test your pipelines in a staging environment to avoid unintended data loss or corruption.
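Ensure Indexing: $merge requires a unique index on the fields it matches on. The default match field, _id, always has one. If you match on other fields through the on option, create a unique index on those fields in the target collection before running the pipeline.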
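
As a sketch of the indexing point, suppose product_summary is keyed by a product field instead of _id (an assumption made for this example; the variable names are also illustrative). Because $merge then matches on a field other than _id, the target needs a unique index on that field first.

```javascript
// Sketch: key product_summary by a `product` field instead of _id.
const summaryIndexSpec = { product: 1 };
const summaryIndexOptions = { unique: true };

const summaryPipeline = [
  { $group: { _id: "$product", totalSales: { $sum: "$amount" } } },
  // Move the group key into `product` and drop _id so inserted documents get a fresh one.
  { $project: { _id: 0, product: "$_id", totalSales: 1 } },
  {
    $merge: {
      into: "product_summary",
      on: "product",
      whenMatched: "merge",
      whenNotMatched: "insert",
    },
  },
];

// `db` only exists inside mongosh; the guard lets the file load elsewhere too.
if (typeof db !== "undefined") {
  // The unique index must exist before the $merge stage runs.
  db.product_summary.createIndex(summaryIndexSpec, summaryIndexOptions);
  db.sales.aggregate(summaryPipeline);
}
```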
