View Dataset
Overview
Feature Description
View Dataset, formerly called Dynamic Dataset, is Guandata's parameterized dataset capability built on Spark SQL. It supports dynamic joins and calculations across non-direct-connection datasets, which makes it a query and computation feature suited to complex ad hoc analysis scenarios.
With View Datasets, users can reorganize one or more non-direct-connection datasets, excluding real-time datasets, through Spark SQL to create a new dataset. Spark SQL provides rich data processing functions for complex joins, data preprocessing, and other advanced scenarios. Dynamic parameters can also be added to SQL statements to support dynamic calculations.
Use Cases
- PSD Calculation in Chain Retail
PSD (per-store daily sales) divides sales amount by active store days. The numerator, sales amount, is aggregated by date, store, and SKU from the sales amount table. The denominator, active store days, is aggregated by date and store from the store activity table and must not be summed at the SKU level. If you join the original tables directly, either the data volume grows dramatically or the aggregated result becomes inaccurate. The simplest and most accurate approach is to aggregate the two tables separately and then join the results. Guandata View Datasets support automatic multi-dataset joins: by injecting parameters into custom SQL, users can aggregate each source table at its own level and then join the result sets to calculate PSD.
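The difference between joining the raw tables and aggregating first can be seen in a small sketch. It runs plain SQL through Python's built-in sqlite3 module rather than Spark SQL, and the table names (sales, store_activity), column names, and sample rows are all hypothetical; a real View Dataset would reference its own selected datasets instead.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE sales (sale_date TEXT, store_id TEXT, sku TEXT, amount REAL);
    CREATE TABLE store_activity (sale_date TEXT, store_id TEXT, is_active INTEGER);

    INSERT INTO sales VALUES
      ('2024-01-01', 'S1', 'A', 100), ('2024-01-01', 'S1', 'B', 50),
      ('2024-01-01', 'S2', 'A', 80),  ('2024-01-02', 'S1', 'A', 70);
    INSERT INTO store_activity VALUES
      ('2024-01-01', 'S1', 1), ('2024-01-01', 'S2', 1),
      ('2024-01-02', 'S1', 1), ('2024-01-02', 'S2', 1);
    """)
    return conn


def psd_naive_join(conn):
    """Join the raw tables first: the denominator only counts store days
    on which the SKU actually sold, so PSD is overstated."""
    return conn.execute("""
        SELECT s.sku, SUM(s.amount) / SUM(a.is_active) AS psd
        FROM sales s
        JOIN store_activity a
          ON a.sale_date = s.sale_date AND a.store_id = s.store_id
        GROUP BY s.sku
        ORDER BY s.sku
    """).fetchall()


def psd_aggregate_then_join(conn):
    """Aggregate each table at its own level, then join the two results."""
    return conn.execute("""
        WITH sku_sales AS (
            SELECT sku, SUM(amount) AS sales_amount
            FROM sales
            GROUP BY sku
        ),
        store_days AS (
            SELECT SUM(is_active) AS active_store_days
            FROM store_activity
        )
        SELECT k.sku, k.sales_amount / d.active_store_days AS psd
        FROM sku_sales k
        CROSS JOIN store_days d
        ORDER BY k.sku
    """).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print("naive join:         ", psd_naive_join(conn))
    print("aggregate then join:", psd_aggregate_then_join(conn))
```

In this sample data there are four active store days in total, so SKU B with 50 in sales has a PSD of 12.5. The naive join divides by the single store day on which B sold and reports 50 instead.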
- Other Scenarios
  - Calculating period-over-period percentage changes for operational metrics
  - Analyzing the distribution of customer purchase frequency within dynamic time ranges
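As an illustration of the purchase frequency scenario, the sketch below counts how many customers made one, two, three, or more purchases inside a date window that is supplied at query time. Each customer's count depends on the window, so the result has to be recomputed whenever the parameters change. The example uses Python's built-in sqlite3 module, and its named placeholders (:start_date, :end_date) only stand in for Guandata dynamic parameters, whose actual syntax is not shown here. The orders table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Inner query: purchases per customer inside the requested window.
# Outer query: how many customers share each purchase count.
FREQUENCY_SQL = """
WITH per_customer AS (
    SELECT customer_id, COUNT(*) AS purchase_count
    FROM orders
    WHERE order_date BETWEEN :start_date AND :end_date
    GROUP BY customer_id
)
SELECT purchase_count, COUNT(*) AS customer_count
FROM per_customer
GROUP BY purchase_count
ORDER BY purchase_count
"""


def build_demo_orders():
    """Create an in-memory orders table with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [
            ("C1", "2024-01-03"), ("C1", "2024-01-10"), ("C1", "2024-02-05"),
            ("C2", "2024-01-15"),
            ("C3", "2024-02-20"),
        ],
    )
    return conn


def frequency_distribution(conn, start_date, end_date):
    """Return (purchase_count, customer_count) rows for the given window.
    Dates are ISO strings, so BETWEEN compares them correctly as text."""
    params = {"start_date": start_date, "end_date": end_date}
    return conn.execute(FREQUENCY_SQL, params).fetchall()


if __name__ == "__main__":
    conn = build_demo_orders()
    print(frequency_distribution(conn, "2024-01-01", "2024-01-31"))
    print(frequency_distribution(conn, "2024-01-01", "2024-02-29"))
```

Widening the window from January to January through February moves customer C1 from the two-purchase bucket to the three-purchase bucket and adds C3 to the one-purchase bucket.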
User Guide
Create a View Dataset
Entry point: Data Preparation > Datasets > New Dataset > Application > View Dataset.

Select the Data Table
On the View Dataset configuration page, click Add Dataset and select one or more data tables, then click Confirm.
To maintain performance, we recommend selecting no more than two datasets.

Enter Dynamic Query SQL
After selecting the relevant datasets, enter the Dynamic Query SQL. The Dataset Fields and optional Dynamic Parameters listed on the left can be referenced in the SQL as needed to support flexible parameter passing and query execution. After writing the SQL, click Preview to verify that the data is correct.
Users can choose whether to use the system default table names or the dataset names. The default setting uses the system-generated table names. After switching to dataset names, you cannot switch back to the system default table names.
- Use System Default Table Names: the system generates INPUT as the table name
- Use Dataset Names: the dataset keeps its original table name or dataset alias
Note: View Datasets are intended for scenarios with dynamic parameters. If your SQL does not include dynamic parameters, we recommend using ETL for data processing instead. Using View Datasets where they are not needed may add unpredictable performance overhead to card queries.

To make the business meaning of the view easier to understand, you can assign an alias to the view name.

Sometimes users define aliases for certain fields in a dataset. If you want to hide those aliases in the dataset field list, click Show Original Field Names Only.
Enter Dataset Information
After the data preview succeeds, specify the dataset name and storage location, add a description if needed, and click Confirm Creation to create the dataset.

Manage the Created Dataset
After a View Dataset is created successfully, it can be found in the corresponding folder. Click the dataset to enter its details page, where you can view the overview, related Cards, and model structure, and edit the configuration on each tab.

Set the Preview Timeout Limit
Users can set a preview timeout limit for View Datasets under Admin Center > System Settings > General Settings > Runtime Parameters. The default value is 60 seconds.
The limit steers users toward the recommended scenarios for View Datasets and reduces unnecessary performance overhead.

This setting does not affect View Datasets that were created previously.

Practical Examples
For more examples, see Using a View Dataset to Count Foot Traffic and View Dataset Usage Methods and Case Sharing.