Input Dataset

Overview

The Input Dataset operator is the starting point of an ETL workflow. It supplies the data for extraction, the first ETL stage, and makes that data available for downstream processing.

It supports rapid integration of heterogeneous multi-source data through multiple inputs, allowing users to combine data from different sources and structures easily.

User Guide

Steps

  1. Drag the Input Dataset operator from the ETL operator panel into the canvas editor on the right.
  2. Click the Input Dataset operator and choose the target dataset.
  3. Click Confirm to load the dataset.
  4. Optionally configure preview rules for the input dataset.
  5. Add other operator nodes as needed to build the full data processing flow.

Detailed Description

Note
  • Input datasets can come from file data, database datasets (excluding direct-connection databases and View Datasets), and output datasets of other Smart ETL flows.
  • Make sure you have usage permission and Row- and Column-Level Permissions for the source dataset.

The following example uses an Excel file dataset.

  1. Drag the Input Dataset operator from the ETL operator panel into the canvas editor on the right.

  2. Click the Input Dataset operator, enter the dataset name Mock Data 6e as the search term, and choose Flat. You can select either Flat or Directory search as needed.

  3. Locate the Orders Table dataset and click Confirm. The full folder path of the dataset is displayed to help you identify the correct dataset more quickly.

    Note

    When switching between Flat and Directory during search, the entered search term and dataset type filter are cleared. After directory-based search, entering a subfolder or returning to a parent folder also clears the search term and filter settings.

    On the Flat page, dataset information is display-only and cannot be edited. To edit it, go to the Data Center.

| Search Mode | Description |
| --- | --- |
| Flat Search | All datasets under the root directory are treated as peers. Enter search content to find the corresponding dataset directly. The panel on the right also shows details such as the folder path of the selected dataset. |
| Directory Search | All folders and datasets in the current directory are treated as peers. Enter search content to search for folders and datasets. After entering a subfolder, the search box is cleared and subsequent searches apply only within the current folder. |

  4. After the dataset is imported successfully, the left configuration panel displays the dataset type, storage path, and detailed field information.

  5. Optionally configure preview rules for the input dataset:

    • Full Data
    • Partial Data - Limit Row Count
    • Partial Data - Set Filter Conditions
    Note

    To improve ETL preview performance, use partial data where possible. This setting affects only the preview data, not the final result of the flow.
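
The three preview rules can be pictured with a small Python sketch. This is an illustrative model only, not the product's implementation or API: the `preview` and `run_flow` functions, the mode names, and the sample rows are all hypothetical. It shows the point made in the note, that a preview rule narrows what is displayed while the actual run still processes every row.

```python
from typing import Callable, Optional

Row = dict  # one record, e.g. {"order_id": 1, "amount": 10}


def preview(
    rows: list,
    mode: str = "full",
    row_limit: Optional[int] = None,
    condition: Optional[Callable[[Row], bool]] = None,
) -> list:
    """Return the rows a preview would show (hypothetical model).

    mode="full"   -> Full Data
    mode="limit"  -> Partial Data - Limit Row Count
    mode="filter" -> Partial Data - Set Filter Conditions
    """
    if mode == "full":
        return list(rows)
    if mode == "limit":
        if row_limit is None:
            raise ValueError("row_limit is required for mode='limit'")
        return rows[:row_limit]
    if mode == "filter":
        if condition is None:
            raise ValueError("condition is required for mode='filter'")
        return [r for r in rows if condition(r)]
    raise ValueError(f"unknown preview mode: {mode}")


def run_flow(rows: list) -> list:
    """The actual run ignores the preview rule and uses all rows."""
    return list(rows)


# Hypothetical sample data standing in for an input dataset.
orders = [{"order_id": i, "amount": i * 10} for i in range(1, 6)]

full = preview(orders)
limited = preview(orders, mode="limit", row_limit=2)
filtered = preview(orders, mode="filter", condition=lambda r: r["amount"] >= 30)
final = run_flow(orders)
```

In this model, `limited` and `filtered` contain fewer rows than `orders`, while `final` always has all five.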

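
The difference between Flat and Directory search in the dataset picker can also be sketched in Python. The `Folder` class, both functions, and the folder layout below are hypothetical, and case-insensitive substring matching is an assumption; the source does not say how search terms are matched. The sketch only illustrates the scoping: Flat search looks at every dataset under the root and reports its folder path, while Directory search looks only at the folders and datasets directly inside the current folder.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical folder node holding dataset names and subfolders."""
    name: str
    datasets: list = field(default_factory=list)
    subfolders: list = field(default_factory=list)


def flat_search(folder: Folder, term: str, parent_path: str = "") -> list:
    """Search all datasets below `folder`; return (dataset, folder path) pairs."""
    path = f"{parent_path}/{folder.name}"
    hits = [(d, path) for d in folder.datasets if term.lower() in d.lower()]
    for sub in folder.subfolders:
        hits.extend(flat_search(sub, term, path))
    return hits


def directory_search(current: Folder, term: str) -> tuple:
    """Search only the folders and datasets directly inside `current`."""
    t = term.lower()
    folders = [f.name for f in current.subfolders if t in f.name.lower()]
    datasets = [d for d in current.datasets if t in d.lower()]
    return folders, datasets


# Hypothetical layout: the placement of these datasets is invented.
sales = Folder("Sales", datasets=["Orders Table", "Refunds"])
root = Folder("Root", datasets=["Mock Data 6e"], subfolders=[sales])

flat_hits = flat_search(root, "orders")
root_hits = directory_search(root, "orders")
sales_hits = directory_search(sales, "orders")
```

Here the flat search finds Orders Table together with its folder path, the directory search at the root finds nothing, and the same search run inside the Sales folder finds the dataset.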

Replace ETL Datasets

Replace a Single ETL Dataset

Click the imported dataset, then click Replace in the left information panel. Select the new dataset and click Confirm.


Replace ETL Datasets in Batch

During data development, datasets sometimes need to be replaced with new ones. ETL supports batch replacement, which makes the process as convenient as switching datasets on Cards.

  • On the details page of a non-direct-connection dataset, open the Associated Creation tab and choose Switch Dataset on the right. Multiple ETL flows can be selected, up to 200 at a time.

  • After switching datasets, you still need to check whether ETL node field names remain consistent and correct them manually if needed.

  • After the switch, the system returns a success or failure notification, and you can jump to the new dataset details page for review.
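
Two of the points above lend themselves to a short offline helper: the limit of 200 ETL flows per switch, and the manual check that field names still match. The Python sketch below is a hypothetical aid, not part of the product. The function names, flow IDs, and field names are invented, and the field lists are assumed to be copied by hand from the dataset details pages.

```python
MAX_FLOWS_PER_SWITCH = 200  # limit stated for one batch switch


def split_into_batches(flow_ids: list, size: int = MAX_FLOWS_PER_SWITCH) -> list:
    """Split a list of ETL flow IDs into groups small enough for one switch."""
    return [flow_ids[i:i + size] for i in range(0, len(flow_ids), size)]


def compare_fields(old_fields: list, new_fields: list) -> dict:
    """Report field names that differ between the old and new dataset.

    "missing": fields the ETL nodes may still reference but the new dataset lacks.
    "added":   fields that exist only in the new dataset.
    """
    old, new = set(old_fields), set(new_fields)
    return {"missing": sorted(old - new), "added": sorted(new - old)}


# Hypothetical example: 450 flows to switch, and one renamed field.
flow_ids = [f"flow-{n}" for n in range(450)]
groups = split_into_batches(flow_ids)

diff = compare_fields(
    ["order_id", "amount", "region"],
    ["order_id", "amount", "area"],
)
```

Any name listed under `missing` points to an ETL node that probably needs manual correction after the switch.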

If you plan to use other data processing operators afterward, see Getting Started.