Group Aggregation

1. Overview

Group aggregation combines multiple rows of data into one row per group: rows that share the same values in the selected dimension fields are merged, and the selected value fields are summarized with an aggregation method such as sum or count. When multiple dimensions are selected, aggregation is performed at the finest granularity, that is, one output row is produced for each distinct combination of dimension values.

For example, when analyzing sales in the retail industry, rows with the same product category are merged and the total sales amount is calculated for each category.
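
The idea can be sketched in a few lines of plain Python, independent of the product itself. The field names (`category`, `region`, `sales`), the sample rows, and the helper `group_sum` are all made up for illustration; this is a minimal sketch of grouping and summing, not the operator's actual implementation.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are illustrative only.
rows = [
    {"category": "Snacks", "region": "East", "sales": 120.0},
    {"category": "Snacks", "region": "West", "sales": 80.0},
    {"category": "Drinks", "region": "East", "sales": 200.0},
    {"category": "Snacks", "region": "East", "sales": 50.0},
]


def group_sum(rows, dimensions, value):
    """Sum `value` for every distinct combination of the `dimensions` fields."""
    totals = defaultdict(float)
    for row in rows:
        # The group key is the tuple of dimension values for this row.
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[value]
    return dict(totals)


# One dimension: one output row per product category.
by_category = group_sum(rows, ["category"], "sales")

# Two dimensions: one output row per (category, region) combination.
by_category_region = group_sum(rows, ["category", "region"], "sales")
```

With a single dimension the four input rows collapse into two groups; with two dimensions they collapse into three, one for each distinct combination, which is what aggregating at the finest granularity means here.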

2. Operation Steps

  1. Drag the "Group Aggregation" operator from the data flow operator area into the right canvas editing area;

  2. Click the "Group Aggregation" operator and drag fields into the dimension bar and value bar;

  3. Click a field you dragged in, set a field alias as needed, and select an aggregation method;

  4. At the current node, click "Preview" to confirm the data results.
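
The steps above amount to choosing dimension fields, value fields, an alias for each, and an aggregation method per value field. The following Python sketch models that as a configuration dictionary. The dictionary layout, the `METHODS` table, and `run_group_aggregation` are hypothetical names chosen for illustration and do not reflect the product's internal format.

```python
from collections import defaultdict

# Hypothetical operator configuration (illustrative layout, not the product's format).
config = {
    "dimensions": [{"field": "category", "alias": "Product Category"}],
    "values": [
        {"field": "sales", "alias": "Total Sales", "method": "sum"},
        {"field": "sales", "alias": "Largest Sale", "method": "max"},
    ],
}

# Only three methods are wired up to keep the sketch short.
METHODS = {"sum": sum, "min": min, "max": max}


def run_group_aggregation(rows, config):
    """Group rows by the dimension fields and aggregate each value field."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[d["field"]] for d in config["dimensions"])
        groups[key].append(row)

    output = []
    for key, members in groups.items():
        # Dimension columns are emitted under their aliases.
        out = {d["alias"]: k for d, k in zip(config["dimensions"], key)}
        # Each value column is aggregated with its configured method.
        for v in config["values"]:
            out[v["alias"]] = METHODS[v["method"]]([m[v["field"]] for m in members])
        output.append(out)
    return output


rows = [
    {"category": "Snacks", "sales": 120},
    {"category": "Drinks", "sales": 200},
    {"category": "Snacks", "sales": 50},
]

# Roughly what a "Preview" of the node would show for this sample input.
preview = run_group_aggregation(rows, config)
```

The same source field can appear more than once in the value list with different aliases and methods, which is why the alias rather than the field name is used as the output column name in this sketch.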

3. Specific Case

The following uses the configuration of "Regional Revenue" as an example.

The goal is to merge the revenue data for each store region, including first-tier market revenue, second-tier market revenue, and other revenue.

  1. Drag the "Group Aggregation" operator from the ETL operator area into the right canvas editing area and connect it to the upstream node;

  2. Click the "Group Aggregation" operator. The left area becomes the current operator configuration area. Rename it according to business needs, such as "Regional Revenue";

  3. Drag "Store Region" into the dimension bar, click the field, and set a field alias as needed;

    Note

    The default aggregation method for value bar fields is count for text type and sum for numeric type.

  4. Drag "Revenue" into the value bar, click the field, select sum as the aggregation method, and set a field alias as needed;

    Seven aggregation methods are supported, as listed in the table below.

| Aggregation Method | Purpose | Usage Scenario | Example |
| --- | --- | --- | --- |
| Sum | Add up the metric values under the specified dimensions to calculate a total | When metric values can be accumulated | Monthly total sales, daily website visits |
| Minimum | Get the minimum metric value under the specified dimensions | When a minimum is meaningful for the metric | Minimum selling price of each product, minimum monthly temperature |
| Maximum | Get the maximum metric value under the specified dimensions | When a maximum is meaningful for the metric | Maximum selling price of each product, maximum monthly temperature |
| Average | Calculate the average metric value under the specified dimensions | When metric values can be averaged | Monthly average sales, weekly average user logins |
| Count | Count the number of data records under the specified dimensions | When you need to know how many data records are under a certain dimension | Number of orders per month, number of stores per region |
| Distinct Count | Count the number of deduplicated data records under the specified dimensions | When you need to know the number of different values under a certain dimension | Number of different products sold per month, number of different customers per region |
| No Processing | - | - | - |

  5. Click "Preview" to check that the grouped and aggregated data meets expectations and contains no errors or abnormal values.
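
To make the aggregation methods and the default rule from the note concrete, here is a small Python sketch applied to a made-up version of the Regional Revenue case. The sample rows, the region and store names, and the helpers `default_method` and `aggregate` are assumptions for illustration only. "No Processing" is left out because the table gives no description of its behavior.

```python
from collections import defaultdict
from statistics import mean

# Six of the seven listed methods; "No Processing" is omitted (undocumented above).
AGGREGATIONS = {
    "sum": sum,
    "min": min,
    "max": max,
    "average": mean,
    "count": len,
    "distinct_count": lambda values: len(set(values)),
}


def default_method(values):
    """Default from the note: sum for numeric fields, count for text fields."""
    if all(isinstance(v, (int, float)) for v in values):
        return "sum"
    return "count"


def aggregate(rows, dimension, value, method=None):
    """Group `rows` by `dimension` and aggregate `value` within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[value])
    result = {}
    for key, values in groups.items():
        chosen = method or default_method(values)
        result[key] = AGGREGATIONS[chosen](values)
    return result


# Hypothetical input rows for the "Regional Revenue" node.
rows = [
    {"Store Region": "East", "Store": "A", "Revenue": 300},
    {"Store Region": "East", "Store": "B", "Revenue": 500},
    {"Store Region": "North", "Store": "C", "Revenue": 200},
    {"Store Region": "North", "Store": "C", "Revenue": 100},
]

regional_revenue = aggregate(rows, "Store Region", "Revenue", "sum")
average_revenue = aggregate(rows, "Store Region", "Revenue", "average")
distinct_stores = aggregate(rows, "Store Region", "Store", "distinct_count")

# No method given: the numeric field falls back to sum, the text field to count.
default_numeric = aggregate(rows, "Store Region", "Revenue")
default_text = aggregate(rows, "Store Region", "Store")
```

Comparing `default_text` with `distinct_stores` shows the difference between Count and Distinct Count: the North region has two records but only one distinct store.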