diff --git a/snooty.toml b/snooty.toml index 96dd2c2e9..55b428e51 100644 --- a/snooty.toml +++ b/snooty.toml @@ -11,9 +11,9 @@ toc_landing_pages = [ "/connect", "/security", "/security/authentication", + "/aggregation-tutorials", "/data-formats", "/connect/connection-options", - "/aggregation", "/crud", "/crud/query", "crud/update", diff --git a/source/aggregation.txt b/source/aggregation.txt index d1509fd75..eafa24d31 100644 --- a/source/aggregation.txt +++ b/source/aggregation.txt @@ -1,9 +1,9 @@ .. _node-aggregation: .. _nodejs-aggregation: -====================== -Aggregation Operations -====================== +=========== +Aggregation +=========== .. facet:: :name: genre @@ -18,27 +18,18 @@ Aggregation Operations :depth: 2 :class: singlecol -.. toctree:: - :titlesonly: - :maxdepth: 1 - - Pipeline Stages - .. _nodejs-aggregation-overview: Overview -------- -In this guide, you can learn how to use the {+driver-long+} to perform -**aggregation operations**. - -Aggregation operations process data in your MongoDB collections and return -computed results. The MongoDB Aggregation framework is modeled on the concept of -data processing pipelines. Documents enter a pipeline comprised of one or more -stages, and this pipeline transforms the documents into an aggregated result. +In this guide, you can learn how to use **aggregation operations** in +the MongoDB Node.js driver. -To learn more about the aggregation stages supported by the {+driver-short+}, -see :ref:`node-aggregation-pipeline-stages`. +Aggregation operations are expressions you can use to produce reduced +and summarized results in MongoDB. MongoDB's aggregation framework +allows you to create a pipeline that consists of one or more stages, +each of which performs a specific operation on your data. .. _node-aggregation-tutorials: @@ -51,67 +42,114 @@ see :ref:`node-aggregation-pipeline-stages`. Analogy ~~~~~~~ -The aggregation pipeline is similar to an automobile factory assembly line. 
An -assembly line has stations with specialized tools that are used to perform -specific tasks. For example, when building a car, the assembly line begins with -a frame. As the car frame moves though the assembly line, each station assembles -a separate part. The result is a transformed final product, the finished car. - -The *aggregation pipeline* is the assembly line, the *aggregation stages* are -the assembly stations, the *expression operators* are the specialized tools, and -the *aggregated result* is the finished product. - -Compare Aggregation and Find Operations ---------------------------------------- - -The following table lists the different tasks you can perform with find -operations compared to what you can achieve with aggregation operations. The -aggregation framework provides expanded functionality that allows you to -transform and manipulate your data. - -.. list-table:: - :header-rows: 1 - :widths: 50 50 - - * - Find Operations - - Aggregation Operations - - * - | Select *certain* documents to return - | Select *which* fields to return - | Sort the results - | Limit the results - | Count the results - - | Select *certain* documents to return - | Select *which* fields to return - | Sort the results - | Limit the results - | Count the results - | Group the results - | Rename fields - | Compute new fields - | Summarize data - | Connect and merge data sets - -Server Limitations ------------------- - -Consider the following :manual:`limitations -` when performing aggregation operations: - -- Returned documents must not violate the :manual:`BSON document size limit - ` of 16 megabytes. -- Pipeline stages have a memory limit of 100 megabytes by default. If required, - you can exceed this limit by enabling the `AllowDiskUse - `__ - property of the ``AggregateOptions`` object that you pass to the - ``aggregate()`` method. 
- -Additional information ----------------------- - -To view a full list of expression operators, see :manual:`Aggregation Operators -` in the {+mdb-server+} manual. - -To learn about explaining MongoDB aggregation operations, see :manual:`Explain -Results ` and :manual:`Query Plans -` in the {+mdb-server+} manual. +You can think of the aggregation pipeline as similar to an automobile factory. +Automobile manufacturing requires the use of assembly stations organized +into assembly lines. Each station has specialized tools, such as +drills and welders. The factory transforms and +assembles the initial parts and materials into finished products. + +The **aggregation pipeline** is the assembly line, **aggregation +stages** are the assembly stations, and **expression operators** are the +specialized tools. + +Comparing Aggregation and Query Operations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Using query operations, such as the ``find()`` method, you can perform the following actions: + +- Select *which documents* to return +- Select *which fields* to return +- Sort the results + +Using aggregation operations, you can perform the following actions: + +- Perform all query operations +- Rename fields +- Calculate fields +- Summarize data +- Group values + +Aggregation operations have some :manual:`limitations `: + +- Returned documents must not violate the :manual:`BSON-document size limit ` + of 16 megabytes. + +- Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this + limit by setting the ``allowDiskUse`` property of ``AggregateOptions`` to ``true``. See + the `AggregateOptions API documentation <{+api+}/interfaces/AggregateOptions.html>`__ + for more details. + +.. important:: $graphLookup exception + + The :manual:`$graphLookup + ` stage has a strict + memory limit of 100 megabytes and will ignore ``allowDiskUse``. 
+ +References +~~~~~~~~~~ + +To view a full list of expression operators, see :manual:`Aggregation +Operators ` in the Server manual. + +To learn about assembling an aggregation pipeline and view examples, see +:manual:`Aggregation Pipeline ` in the +Server manual. + +To learn more about creating pipeline stages, see :manual:`Aggregation +Stages ` in the Server manual. + +Runnable Examples +----------------- + +The example uses sample data about restaurants. The following code +inserts data into the ``restaurants`` collection of the ``aggregation`` +database: + +.. literalinclude:: /code-snippets/aggregation/agg.js + :start-after: begin data insertion + :end-before: end data insertion + :language: javascript + :dedent: + +.. tip:: + + For more information on connecting to your MongoDB deployment, see the :doc:`Connection Guide `. + +Aggregation Example +~~~~~~~~~~~~~~~~~~~ + +To perform an aggregation, pass a list of aggregation stages to the +``collection.aggregate()`` method. + +In the example, the aggregation pipeline uses the following aggregation stages: + +- A :manual:`$match ` stage to filter for documents whose + ``categories`` array field contains the element ``Bakery``. + +- A :manual:`$group ` stage to group the matching documents by the ``stars`` + field, accumulating a count of documents for each distinct value of ``stars``. + +.. literalinclude:: /code-snippets/aggregation/agg.js + :start-after: begin aggregation + :end-before: end aggregation + :language: javascript + :dedent: + +This example produces the following output: + +.. code-block:: json + :copyable: false + + { _id: 4, count: 2 } + { _id: 3, count: 1 } + { _id: 5, count: 1 } + +For more information, see the `aggregate() API documentation <{+api+}/classes/Collection.html#aggregate>`__. + +Additional Examples +~~~~~~~~~~~~~~~~~~~ + +You can find another aggregation pipeline example in the `Aggregation +Framework with Node.js Tutorial +`_ +blog post on the MongoDB website. 
diff --git a/source/aggregation/pipeline-stages.txt b/source/aggregation/pipeline-stages.txt deleted file mode 100644 index b6c4343a5..000000000 --- a/source/aggregation/pipeline-stages.txt +++ /dev/null @@ -1,318 +0,0 @@ -.. _node-aggregation-pipeline-stages: - -=========================== -Aggregation Pipeline Stages -=========================== - -.. contents:: On this page - :local: - :backlinks: none - :depth: 2 - :class: singlecol - -.. facet:: - :name: genre - :values: reference - -.. meta:: - :keywords: node.js, code example, transform, pipeline - :description: Learn the different possible stages of the aggregation pipeline in the Node.js Driver. - -Overview ------------- - -In this guide, you can learn how to create an aggregation pipeline and pipeline -stages by using methods in the {+driver-long+}. - -Build an Aggregation Pipeline ------------------------------ - -You can use the {+driver-short+} to build an aggregation pipeline by creating a -pipeline variable or passing aggregation stages directly into the aggregation -method. See the following examples to learn more about each of these approaches. - -.. tabs:: - - .. tab:: Create a Pipeline - :tabid: pipeline-definition - - .. code-block:: javascript - - // Defines the aggregation pipeline - const pipeline = [ - { $match: { ... } }, - { $group: { ... } } - ]; - - // Executes the aggregation pipeline - const results = collection.aggregate(pipeline); - - .. tab:: Direct Aggregation - :tabid: pipeline-direct - - .. code-block:: javascript - - // Defines and executes the aggregation pipeline - collection.aggregate([ - { $match: { ... } }, - { $group: { ... } } - ]); - -Aggregation Stage Methods -------------------------- - -The following table lists the stages in the aggregation pipeline. To learn more -about an aggregation stage and see a code example in a {+environment+} application, -follow the link from the stage name to its reference page in the {+mdb-server+} -manual. - -.. 
list-table:: - :header-rows: 1 - :widths: 30 70 - - * - Stage - - Description - - * - :manual:`$addFields ` - - Adds new fields to documents. Outputs documents that contain both the - existing fields from the input documents and the newly added fields. - - ``$set`` is an alias for ``$addFields``. - - * - :manual:`$bucket ` - - Categorizes incoming documents into groups, called buckets, - based on a specified expression and bucket boundaries. - - * - :manual:`$bucketAuto ` - - Categorizes incoming documents into a specific number of - groups, called buckets, based on a specified expression. - Bucket boundaries are automatically determined in an attempt - to evenly distribute the documents into the specified number - of buckets. - - * - :manual:`$changeStream ` - - Returns a change stream cursor for the collection. - - Instead of being passed to the ``aggregate()`` method, - ``$changeStream`` uses the ``watch()`` method on a ``Collection`` - object. - - * - :manual:`$changeStreamSplitLargeEvent - ` - - Splits large change stream events that exceed 16 MB into smaller fragments returned - in a change stream cursor. - - Instead of being passed to the ``aggregate()`` method, - ``$changeStreamSplitLargeEvent`` uses the ``watch()`` method on a - ``Collection`` object. - - * - :manual:`$collStats ` - - Returns statistics regarding a collection or view. - - * - :manual:`$count ` - - Returns a count of the number of documents at this stage of - the aggregation pipeline. - - * - :manual:`$currentOp ` - - Returns a stream of documents containing information on active and - dormant operations and any inactive sessions that are holding locks as - part of a transaction. - - * - :manual:`$densify ` - - Creates new documents in a sequence of documents where certain values in a - field are missing. - - * - :manual:`$documents ` - - Returns literal documents from input expressions. 
- - * - :manual:`$facet ` - - Processes multiple aggregation pipelines - within a single stage on the same set - of input documents. Enables the creation of multi-faceted - aggregations capable of characterizing data across multiple - dimensions, or facets, in a single stage. - - * - :manual:`$geoNear ` - - Returns documents in order of nearest to farthest from a - specified point. This method adds a field to output documents - that contains the distance from the specified point. - - * - :manual:`$graphLookup ` - - Performs a recursive search on a collection. This method adds - a new array field to each output document that contains the traversal - results of the recursive search for that document. - - * - :manual:`$group ` - - Groups input documents by a specified identifier expression and applies - the accumulator expressions, if specified, to each group. Consumes all - input documents and outputs one document per each distinct group. The - output documents contain only the identifier field and, if specified, - accumulated fields. - - * - :manual:`$indexStats ` - - Returns statistics regarding the use of each index for the collection. - - * - :manual:`$limit ` - - Passes the first *n* documents unmodified to the pipeline, where *n* is - the specified limit. For each input document, outputs either one document - (for the first *n* documents) or zero documents (after the first *n* - documents). - - * - :manual:`$listSampledQueries ` - - Lists sampled queries for all collections or a specific collection. Only - available for collections with :manual:`Queryable Encryption - ` enabled. - - * - :manual:`$listSearchIndexes ` - - Returns information about existing :ref:`Atlas Search indexes - ` on a specified collection. - - * - :manual:`$lookup ` - - Performs a left outer join to another collection in the - *same* database to filter in documents from the "joined" - collection for processing. 
- - * - :manual:`$match ` - - Filters the document stream to allow only matching documents - to pass unmodified into the next pipeline stage. - For each input document, outputs either one document (a match) or zero - documents (no match). - - * - :manual:`$merge ` - - Writes the resulting documents of the aggregation pipeline to - a collection. The stage can incorporate (insert new - documents, merge documents, replace documents, keep existing - documents, fail the operation, process documents with a - custom update pipeline) the results into an output - collection. To use this stage, it must be - the last stage in the pipeline. - - * - :manual:`$out ` - - Writes the resulting documents of the aggregation pipeline to - a collection. To use this stage, it must be - the last stage in the pipeline. - - * - :manual:`$project ` - - Reshapes each document in the stream, such as by adding new - fields or removing existing fields. For each input document, - outputs one document. - - * - :manual:`$redact ` - - Reshapes each document in the stream by restricting the content for each - document based on information stored in the documents themselves. - Incorporates the functionality of ``$project`` and ``$match``. Can be used - to implement field level redaction. For each input document, outputs - either one or zero documents. - - * - :manual:`$replaceRoot ` - - Replaces a document with the specified embedded document. The - operation replaces all existing fields in the input document, - including the ``_id`` field. Specify a document embedded in - the input document to promote the embedded document to the - top level. - - The ``$replaceWith`` stage is an alias for the ``$replaceRoot`` stage. - - * - :manual:`$replaceWith ` - - Replaces a document with the specified embedded document. - The operation replaces all existing fields in the input document, including - the ``_id`` field. 
Specify a document embedded in the input document to promote - the embedded document to the top level. - - The ``$replaceWith`` stage is an alias for the ``$replaceRoot`` stage. - - * - :manual:`$sample ` - - Randomly selects the specified number of documents from its - input. - - * - :manual:`$search ` - - Performs a full-text search of the field or fields in an - :atlas:`Atlas ` - collection. - - This stage is available only for MongoDB Atlas clusters, and is not - available for self-managed deployments. To learn more, see - :atlas:`Atlas Search Aggregation Pipeline Stages - ` in the Atlas documentation. - - * - :manual:`$searchMeta ` - - Returns different types of metadata result documents for the - :atlas:`Atlas Search ` query against an - :atlas:`Atlas ` - collection. - - This stage is available only for MongoDB Atlas clusters, - and is not available for self-managed deployments. To learn - more, see :atlas:`Atlas Search Aggregation Pipeline Stages - ` in the Atlas documentation. - - * - :manual:`$set ` - - Adds new fields to documents. Like the ``Project()`` method, - this method reshapes each - document in the stream by adding new fields to - output documents that contain both the existing fields - from the input documents and the newly added fields. - - * - :manual:`$setWindowFields ` - - Groups documents into windows and applies one or more - operators to the documents in each window. - - * - :manual:`$skip ` - - Skips the first *n* documents, where *n* is the specified skip - number, and passes the remaining documents unmodified to the - pipeline. For each input document, outputs either zero - documents (for the first *n* documents) or one document (if - after the first *n* documents). - - * - :manual:`$sort ` - - Reorders the document stream by a specified sort key. The documents remain unmodified. - For each input document, outputs one document. 
- - * - :manual:`$sortByCount ` - - Groups incoming documents based on the value of a specified - expression, then computes the count of documents in each - distinct group. - - * - :manual:`$unionWith ` - - Combines pipeline results from two collections into a single - result set. - - * - :manual:`$unset ` - - Removes/excludes fields from documents. - - ``$unset`` is an alias for ``$project`` that removes fields. - - * - :manual:`$unwind ` - - Deconstructs an array field from the input documents to - output a document for *each* element. Each output document - replaces the array with an element value. For each input - document, outputs *n* Documents, where *n* is the number of - array elements. *n* can be zero for an empty array. - - * - :manual:`$vectorSearch ` - - Performs an :abbr:`ANN (Approximate Nearest Neighbor)` or - :abbr:`ENN (Exact Nearest Neighbor)` search on a - vector in the specified field of an - :atlas:`Atlas ` collection. - - This stage is available only for MongoDB Atlas clusters, and is not - available for self-managed deployments. To learn more, see - :ref:`Atlas Vector Search `. - -API Documentation -~~~~~~~~~~~~~~~~~ - -To learn more about assembling an aggregation pipeline, see :manual:`Aggregation -Pipeline ` in the {+mdb-server+} manual. - -To learn more about creating pipeline stages, see :manual:`Aggregation Stages -` in the {+mdb-server+} manual. - -For more information about the methods and classes used on this page, see the -following API documentation: - -- `Collection <{+api+}/classes/Collection.html>`__ -- `aggregate() <{+api+}/classes/Collection.html#aggregate>`__ -- `watch() <{+api+}/classes/Collection.html#watch>`__ -- `AggregateOptions <{+api+}/interfaces/AggregateOptions.html>`__ -
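Review note: the deleted pipeline-stages table describes ``$unwind`` as outputting one document per array element, with zero documents for an empty array. Since that description is being removed, here is a hedged plain-JavaScript sketch of those semantics on hypothetical in-memory documents (not the driver's implementation, just an illustration of the behavior the table documented):

```javascript
// Simulates $unwind: for each input document, emit one output document
// per element of the named array field, replacing the array with the element.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    for (const value of doc[field] ?? []) {
      out.push({ ...doc, [field]: value });
    }
  }
  return out;
}

const input = [
  { _id: 1, sizes: ["S", "M"] },
  { _id: 2, sizes: [] }, // empty array: produces zero output documents
];
const unwound = unwind(input, "sizes");

console.log(unwound);
// [ { _id: 1, sizes: 'S' }, { _id: 1, sizes: 'M' } ]
```

In MongoDB itself this is the ``{ $unwind: "$sizes" }`` stage passed to ``aggregate()``; server-side options such as ``preserveNullAndEmptyArrays`` change the empty-array behavior and are not modeled here.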