Quick Start

This article introduces the Data Development workflow to help you quickly understand and start using DataFlow.

  1. Integrate multiple data sources:
    Supports heterogeneous data sources such as business systems, data warehouses, file-based data, and APIs, enabling flexible full and incremental synchronization.
  2. Offline data development:
    1. Provides low-barrier visual ETL and Dataflow Node orchestration, and supports extended task types such as Python scripts and Shell commands to improve development efficiency. Loop control, Conditional Branch, and Subprocess are also available for efficient task orchestration. For details, see Offline Development Task.
    2. Provides minute-level Near Real-Time Scheduling to ensure data timeliness and supports event-driven scheduling to avoid empty task runs. For details, see [Task Scheduling](06-Task Scheduling.md).
  3. Operations management:
    Provides visual tools such as the Instance Runtime Gantt Chart and Workflow Tree View to monitor task status and quickly identify abnormal nodes. You can then use Node Logs to trace and resolve issues. Rerun and Recover Failed Tasks let you repair tasks and data to keep data accurate. For details, see [Task Monitor](08-Task Monitor.md).
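
To make the difference between full and incremental synchronization in step 1 concrete, here is a minimal Python sketch of the general idea. It is not DataFlow's API or implementation: the `updated_at` column, the `full_sync` and `incremental_sync` helpers, and the sample rows are all hypothetical, and the watermark approach shown is just one common way to do incremental synchronization.

```python
from datetime import datetime

# Hypothetical source rows; `updated_at` is an assumed change-tracking column.
SOURCE_ROWS = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 2)},
    {"id": 3, "updated_at": datetime(2024, 1, 3)},
]


def full_sync(rows):
    """Copy every row, regardless of when it last changed."""
    return list(rows)


def incremental_sync(rows, watermark):
    """Copy only rows changed after the watermark and return the new watermark."""
    changed = [r for r in rows if r["updated_at"] > watermark]
    # If nothing changed, keep the old watermark.
    new_watermark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_watermark


# The last run synchronized everything up to 2024-01-01.
changed, watermark = incremental_sync(SOURCE_ROWS, datetime(2024, 1, 1))
print([r["id"] for r in changed])  # [2, 3]
```

A full synchronization is simpler but copies every row on every run. The incremental variant copies only what changed since the stored watermark, which is why it relies on the source exposing some change marker such as a timestamp.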