Skip to main content

Group By

Overview

Group Aggregate means aggregating multiple rows into a single row by processing measures across one or more dimensions. When multiple dimensions are selected, aggregation is performed at the finest granularity among those dimensions.

For example, in retail sales analysis, data with the same product category can be merged together and the corresponding total sales amount can be calculated.

User Guide

Steps

  1. Drag the Group Aggregate operator from the ETL operator area to the canvas on the right.
  2. Click the Group Aggregate operator and drag fields into the Dimensions and Measures sections.
  3. Click the added fields to set field aliases and choose aggregation methods as needed.
  4. At the current node, click Preview to confirm the data result.

Detailed Explanation

The following example shows how to configure Regional Quantity_Sold.

Aggregate sales quantities by store region and group them together. Preview before aggregation:

  1. Drag the Group By operator from the ETL operator area to the canvas on the right and connect it to the upstream node.

  2. Click the Group By operator. The left panel becomes the current operator configuration area. Rename it based on business needs, for example Regional Quantity_Sold.

  3. Drag Region into the Dimensions section, then click the field and configure its alias as needed.

    Notes

    The default aggregation for fields in the Measures section is Count for text fields and Sum for numeric fields.

  4. Drag Quantity_Sold into the Metric section, click the field, choose Sum as the aggregation method, and configure the field alias if needed.

For error details, see The ETL Group Aggregate node reports a missing field, but the field is actually present.

For error details, see Type mismatch field.

We support 7 aggregation methods, including but not limited to Sum, Minimum, and Maximum.

Aggregation MethodPurposeUsage ScenarioExample
SumAdd measure values under the specified dimension to calculate a totalWhen measures can be accumulatedMonthly total sales, daily website visits
MinimumGet the minimum measure value under the specified dimensionWhen measures have a meaningful minimum valueLowest sales price for each product, monthly minimum temperature
MaximumGet the maximum measure value under the specified dimensionWhen measures have a meaningful maximum valueHighest sales price for each product, monthly maximum temperature
AverageCalculate the average measure value under the specified dimensionWhen measures can be averagedAverage monthly sales, average weekly user logins
CountCount the number of records under the specified dimensionWhen you need to know how many records exist under a dimensionCount of records under a specific dimension
Distinct CountCount the number of distinct values under the specified dimensionWhen you need to know how many different values exist under a dimensionNumber of distinct products sold each month, distinct customers in each region
None---
  1. Click Preview to preview the data result and ensure the aggregated data matches expectations and does not contain errors or abnormal values.

For other data processing operators used later, see Getting Started.