Dataflow Supports Smart ETL Import

Overview

To ease the transition from Smart ETL to data development, Dataflow provides one-click Smart ETL import. This feature migrates the ETL tasks you created into the new platform in full, including task configuration, operator dependencies, and referenced datasets.

Procedure

  1. Open the data development module, select the folder where the Offline Dev task should be created, and create a new Offline Dev task.

  2. Drag in a Dataflow Node, then double-click it or click Edit to enter the dataflow development page.

  3. Click ... > Smart ETL Import in the upper-right corner. The dialog lists the ETLs on which the task owner has at least user permission. Select the ETL to migrate.

    Note

    Importing an ETL overwrites all existing nodes and configurations in the current dataflow.

  4. After the import completes, all nodes are migrated into the dataflow, including operator configuration, operator dependencies, and input datasets. Output datasets are not migrated: create them again or select existing datasets. Only Offline Dev datasets are supported as output datasets.

    Output datasets support both full and incremental updates. When the data structure of upstream nodes changes, automatic schema updates are supported and missing fields can be added to existing tables automatically.

  5. Switch downstream ETL inputs: Before migration, ETL output dataset A may serve as the input dataset of downstream Smart ETL tasks. After migration, data is written to Offline Dev dataset B, and ETL dataset A no longer receives data. One-click switching replaces input dataset A with B in the downstream ETLs.

    Select the target Offline Dev dataset.

    Check whether any fields are missing. If everything is correct, click Confirm to complete the migration.

    To reduce manual steps, switch the input datasets of downstream ETL tasks before migrating those tasks.

  6. Switch card datasets: Cards may originally use ETL output dataset A. After migration, replace it with Offline Dev dataset B. One-click switching is supported for card datasets as well.

    Batch creation or one-click switching for all cards is supported.

    Select the target Offline Dev dataset.

    Check whether any fields are missing. If everything is correct, click Confirm to complete the migration.

  7. Scheduling configuration: After all ETLs are migrated, reconfigure scheduling. In Offline Dev, configure event scheduling for all input datasets of the original dataflows.
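
Steps 4 to 6 ask you to check for missing fields before confirming a dataset switch, and mention that missing fields can be added to existing tables automatically. The Python sketch below illustrates the idea of that check outside the product. It is not the platform's API: the `Schema` type, the function names, and the sample field names are all hypothetical, and a schema is assumed to be a simple mapping from field name to type name.

```python
from typing import Dict, List

# Hypothetical schema representation: field name -> type name.
Schema = Dict[str, str]


def missing_fields(source: Schema, target: Schema) -> List[str]:
    """Return fields present in source but absent from target, in source order."""
    return [name for name in source if name not in target]


def add_missing_fields(upstream: Schema, table: Schema) -> Schema:
    """Return a copy of table extended with fields that exist only upstream.

    Fields already in the table keep their current type; nothing is removed.
    """
    updated = dict(table)
    for name, type_name in upstream.items():
        updated.setdefault(name, type_name)
    return updated


# Assumed example data: ETL dataset A and Offline Dev dataset B.
etl_dataset_a: Schema = {"order_id": "string", "amount": "double", "region": "string"}
offline_dataset_b: Schema = {"order_id": "string", "amount": "double"}

gaps = missing_fields(etl_dataset_a, offline_dataset_b)
merged = add_missing_fields(etl_dataset_a, offline_dataset_b)

print(gaps)    # ['region']
print(merged)  # {'order_id': 'string', 'amount': 'double', 'region': 'string'}
```

An empty result from `missing_fields` corresponds to the "everything is correct" case in which you would click Confirm. A non-empty result means the target dataset lacks fields that downstream ETLs or cards may still reference.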