Row to Column with ETL
Background
Sometimes, we need to use a certain primary key to expand a field with fixed categories as column names, converting multiple rows of data into columns—this is the row-to-column effect. Guandata ETL supports this with both the Row-to-Column feature and SQL input.
Practical Example
Original Data
Here is a data table with 5 fields and 9 rows. The "Project" field has several fixed categories.
Department | SKU | Month | Project | Value |
---|---|---|---|---|
Ops Dept 5 | HHHH001 | 202012 | Target Price | 112 |
Ops Dept 5 | HHHH001 | 202012 | Target Daily Sales | 212 |
Ops Dept 5 | HHHH001 | 202012 | Target Conversion | 312 |
Ops Dept 5 | HHHH001 | 202011 | Target Price | 111 |
Ops Dept 5 | HHHH001 | 202011 | Target Daily Sales | 211 |
Ops Dept 5 | HHHH001 | 202011 | Target Conversion | 311 |
Ops Dept 5 | HHHH001 | 202010 | Target Price | 110 |
Ops Dept 5 | HHHH001 | 202010 | Target Daily Sales | 210 |
Ops Dept 5 | HHHH001 | 202010 | Target Conversion | 310 |
Expected Result
Department | SKU | Month | Target Price | Target Daily Sales | Target Conversion |
---|---|---|---|---|---|
Ops Dept 5 | HHHH001 | 202011 | 111 | 211 | 311 |
Ops Dept 5 | HHHH001 | 202012 | 112 | 212 | 312 |
Ops Dept 5 | HHHH001 | 202010 | 110 | 210 | 310 |
Solution
Method 1: Add "Row to Column" in ETL

- Select the operation column and fill column
In this case, we want to convert the "Project" and "Value" fields from rows to columns. "Project" becomes the new column name, and "Value" fills the new columns.

- Select the aggregation type for the value

- Add primary key

- Create new columns
Creating new columns means setting the mapping between the new column names and the original row values. If unchanged, you can directly check "new column name same as original row value."

Method 2: Process with SQL

- Use the pivot function, separate multiple column names with commas.
SELECT * FROM input1
PIVOT(SUM(`[Value to fill in new column]`) FOR `[Column to convert]` in ('[Column Name]','[Column Name]'...))

- The final result.

Notes
- In the PIVOT function, the field name must follow FOR directly. If you add a table path, it will cause an error on submission.

- Dynamic column names are not recommended. For example, when creating dynamic column names, you can preview and run successfully, but when ETL runs automatically on a schedule, if the column names change, the output dataset will report an error, and the created cards will show missing fields.
