NoSQL Database Aggregation

Aggregation of NoSql datasets is an important feature in many applications. restdb.io supports queries with both grouping and aggregation of data sets. This is very helpful in developing custom reports, visual charts, data analysis etc. The table below shows all aggregation and grouping functions:

FunctionFormatCommentExample
MinMIN:fieldReturns objecth={"$aggregate":["MIN:score"]}
MaxMAX:fieldReturns objecth={"$aggregate":["MAX:score"]}
AvgAVG:fieldReturns valueh={"$aggregate":["AVG:score"]}
SumSUM:fieldReturns valueh={"$aggregate":["SUM:score"]}
CountCOUNT:propertyReturns value with chosen property nameh={"$aggregate":["COUNT:nplayers"]}
Groupby$groupby: ["field", ...]Returns "groupkey":[array]h={"$groupby":["category"]}
Groupby (dates)$groupby: ["$YEAR:field", ...]Predefined values for: $YEAR, $MONTH, $DAY, $HOUR, $SECh={"$groupby":["$YEAR:registered"]}
Groupby (dates with formats)$groupby: ["$DATE:format", ...]Format strings for: ss, hh, mm, dd, MM, YY. All formats at momentjs.comh={"$groupby":["$DATE.MMM:registered"]}
Grand totals$aggregate-grand-total: trueRecursive aggregation functions of groupsh={"$groupby":["category"], "$aggregate": ["AVG:score"], "$aggregate-grand-total": true}

Examples

In the following examples we'll be using a data collection of players in the database:

https://rdb-simpledb.restdb.io/rest/players

We've used the random data generator to populate the players collection with 999 items. The collection data items looks like this example extract:

    [
      {
        "_id": "56ebbfe7db116334000004f0",
        "name": "Jack_Nicolas",
        "email": "Mavis_Maggio@hotmail.com",
        "score": 794,
        "category": "cum",
        "team": "Earum voluptates",
        "registered": "1991-09-03T00:00:00.000Z"
      },
      {
        "_id": "56ebbfe7db116334000004f1",
        "name": "Christiansen.Guadalupe",
        "email": "Gerhold_Twila@yahoo.com",
        "score": 489,
        "category": "nihil",
        "team": "Animi perspiciatis",
        "registered": "2011-05-22T00:00:00.000Z"
      },
      ...
    ]

Basic aggregation of data

Data aggregation is done by using standard functions on a data selection (i.e. Query). The aggregation parameters are passed in as query parameters or as query hints. We'll be using query hints in the following examples. The following example queries shows how to use aggregation functions and what results they produce.

Finding the minimum or maximum

https://rdb-simpledb.restdb.io/rest/players?h={"$aggregate":["MIN:score"]}

The MIN function searches all items in the player collection and returns the item with the lowest score:

    {
      "MIN score": {
        "_id": "56ebbfe7db1163340000073b",
        "name": "Tillman_Ronny",
        "email": "Orville_Ledner@gmail.com",
        "score": 1,
        "category": "eum",
        "team": "Sunt alias id",
        "registered": "2000-03-07T00:00:00.000Z"
      }
    }

The MAX function does the opposite :)

Finding the average

https://rdb-simpledb.restdb.io/rest/players?h={"$aggregate":["AVG:score"]}

The AVG function searches all items in the player collection and returns the average score. Note the difference from the MIN function that returns the full data item, the AVG function only returns the value:

    {
      "AVG score": 495.5825825826
    }

Finding the sum

https://rdb-simpledb.restdb.io/rest/players?h={"$aggregate":["SUM:score"]}

The SUM function searches all items in the player collection and returns the total sum of all scores in the query:

    {
      "SUM score": 495087
    }

Counting items

https://rdb-simpledb.restdb.io/rest/players?h={"$aggregate":["COUNT:nitems"]}

The COUNT function searches all items in the player collection and returns the count of all items in the query. Note that the COUNT function can have any value, it's just returned as a property name for the value, in this case "ntitems":

    {
      "COUNT nitems": 999
    }

Basic grouping of data

Data sets can be grouped by any data field contained in the resulting query. In our example players collection we have a category field, lets group data based on that:

https://rdb-simpledb.restdb.io/rest/players?h={"$groupby": ["category"]}

This query results in a grouped data set, where the group field is the key and the group is an array.

    {
      "a": [
        {
          "_id": "56ebbfe7db11633400000526",
          "name": "Porter_Rath",
          "email": "Padberg_Kellen@gmail.com",
          "score": 867,
          "category": "a",
         ...
        },
        {
          "_id": "56ebbfe7db116334000006e4",
          "name": "Shields.Jermain",
          "email": "Marielle.Torphy@Dicki.us",
          "score": 478,
          "category": "a",
         ...
        }
      ],
      "ab": [
        {
          "_id": "56ebbfe7db11633400000875",
          "name": "Everardo_Wiegand",
          "email": "Lucie.Powlowski@hotmail.com",
          "score": 985,
          "category": "ab",
         ...
        },
        ...
      ],
      ...
    }

Nested grouping of data

Data sets can also be grouped in i hierarchy of other groups. This example shows how to group players into registration year, registration month and category:

https://rdb-simpledb.restdb.io/rest/players?h={"$groupby":["$YEAR:registered","$MONTH:registered", "category"]}

The resulting data set is a hierarchy of key:[key:[array]] values, a bit complicated, but powerful:

    {
      "1969": {
        "11": {
          "ullam": [
            {
              "_id": "56ebbfe7db116334000005f8",
              "name": "Dovie.Abshire",
              "email": "Hayden_Sporer@Royal.org",
              "score": 708,
              "category": "ullam",
             ...
            }
          ]
        }
      },
      "1970": {
        "0": {
          "vitae": [
            {
              "_id": "56ebbfe7db11633400000636",
              "name": "Ivy_Rodriguez",
              "email": "Heaven_Kovacek@yahoo.com",
              "score": 863,
              "category": "vitae",
             ...
            }
          ],
          "molestias": [
            {
              "_id": "56ebbfe7db11633400000751",
              "name": "Lindgren_Icie",
              "email": "Rocky_Labadie@Ritchie.org",
              "score": 506,
              "category": "molestias",
             ...
            }
          ]
          ...
        },
        "1": {
          "enim": [
            {
              "_id": "56ebbfe7db1163340000089b",
              "name": "Pierce.Brown",
              "email": "Keeling_Berneice@yahoo.com",
              "score": 868,
              "category": "enim",
             ...
            }
          ]
        },
        "2": {
          "odit": [
            {
              "_id": "56ebbfe7db1163340000066e",
              "name": "Gudrun.Raynor",
              "email": "Filiberto.Feil@yahoo.com",
              "score": 911,
              "category": "odit",
              ...
            }
          ]
          ...
        },
      ...
    ...

Grouping and aggregation together

The example above can also be performed with aggregation of data within each group:

https://rdb-simpledb.restdb.io/rest/players?h={"$groupby":["$YEAR:registered","$MONTH:registered", "category"], "$aggregate": ["AVG:score"]}

The resulting data set is the same hierarchy as above, but the data items has been replaced with the aggregated values from the function:

    {
      "1969": {
        "11": {
          "ullam": {
            "AVG score": 708
          }
        }
      },
      "1970": {
        "0": {
          "vitae": {
            "AVG score": 863
          },
          "molestias": {
            "AVG score": 506
          },
          "magnam": {
            "AVG score": 94
          },
          "veritatis": {
            "AVG score": 681
          }
        },
        "1": {
          "enim": {
            "AVG score": 868
          }
        },
        ...

The $aggregate and $groupby functions can be used with all the other Query api tools, read the docs at https://restdb.io/docs/querying-with-the-api.