Task Flow

Task Flow is Databend Cloud's built-in workflow orchestration feature. It lets you define, schedule, and monitor SQL-based data pipelines as directed acyclic graphs (DAGs). Each node in the graph is a Task — a SQL statement with its own schedule, dependencies, and execution settings. A Flow groups multiple tasks together and manages their execution order automatically.

Overview

Task Flow replaces the legacy Task List with a more powerful model:

Feature | Legacy Task List | Task Flow
Single SQL task
Multi-task DAG
Visual graph editor
Version history
Stream-based triggers
Bulk operations

Key Concepts

Task

A Task is the smallest unit of work. It contains:

  • A SQL statement to execute
  • A schedule (manual, interval, or cron)
  • Optional dependencies on other tasks or streams
  • Advanced settings (failure threshold, result cache, min execution interval)

Flow

A Flow is a named collection of tasks with dependency relationships. Databend Cloud automatically determines execution order based on the DAG structure. A flow has:

  • A name and an assigned warehouse
  • One or more tasks with defined dependencies
  • A lifecycle: Created → Started → Suspended → Resumed → Dropped

DAG (Directed Acyclic Graph)

A DAG is the dependency graph between tasks. If Task B depends on Task A, Databend Cloud runs Task A first and triggers Task B only after Task A succeeds. Cycles are not allowed.
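
The ordering rule above can be illustrated with a topological sort. The following is a minimal sketch in Python, not Databend Cloud's implementation: the task names and the `execution_order` helper are hypothetical, and the `requires` mapping only mimics what the Require Tasks field expresses.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical flow: each key is a task, each value is the set of tasks it requires.
requires = {
    "load_raw": set(),
    "clean": {"load_raw"},
    "aggregate": {"clean"},
    "report": {"clean", "aggregate"},
}


def execution_order(requires):
    """Return one valid run order, or raise ValueError if the graph has a cycle."""
    try:
        # static_order() yields every task after all of the tasks it requires.
        return list(TopologicalSorter(requires).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


order = execution_order(requires)
print(order)
```

A graph such as `{"a": {"b"}, "b": {"a"}}` raises `ValueError` in this sketch, which mirrors the rule that cycles are not allowed.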

Getting Started

Creating a Task Flow

  1. Navigate to Data > Task & Flows in the left sidebar.
  2. Click Create in the top-right corner.
  3. In the flow modal:
    • Enter a Flow Name.
    • Select a Warehouse to run the tasks on.
  4. Click Add Task to Flow to add your first task.

Configuring a Task

In the task form, fill in the following:

Basic Settings

Field | Description
Task Name | Unique name within the flow
Schedule | When to run: Manual, Interval (e.g. every 5 minutes), or Cron expression
Timezone | Timezone for cron schedule evaluation
SQL | The SQL statement to execute
Comment | Optional description

Dependencies

Field | Description
Require Tasks | Other tasks that must complete before this task runs
Require Stream | A database stream that must have new data before this task triggers

Advanced Options

Field | Description
Suspend Task After Num Failures | Automatically suspend the task after N consecutive failures (0 = never)
Enable Query Result Cache | Cache query results to avoid redundant computation
Min Execute Seconds | Minimum interval between executions (5s / 10s / 15s / 30s)

  1. Click Save to add the task to the flow.
  2. Repeat to add more tasks. Use Require Tasks to define dependencies between them.
  3. Click Publish to create the flow.

Note: Only account_admin or the flow creator can edit or delete a flow.
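
To make the Suspend Task After Num Failures setting concrete, here is a small Python sketch of a consecutive-failure check. It is an illustration under assumptions: the `should_suspend` helper and the list-of-booleans run history are hypothetical, and how Databend Cloud counts failures internally is not described here.

```python
def should_suspend(run_results, threshold):
    """Decide whether a task would be suspended.

    run_results: list of booleans, oldest first; True means the run succeeded.
    threshold:   the N from Suspend Task After Num Failures (0 = never suspend).
    """
    if threshold == 0:
        return False
    consecutive = 0
    # Walk backwards from the most recent run and stop at the first success.
    for succeeded in reversed(run_results):
        if succeeded:
            break
        consecutive += 1
    return consecutive >= threshold


print(should_suspend([True, False, False, False], 3))
print(should_suspend([False, True, False, False], 3))
```

In this sketch only the trailing run of failures counts, so a single success resets the counter.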

Visualizing the Flow

After creating a flow, click its name to open the details page. The Latest Run tab shows the DAG visualization.

Each node displays:

  • Task name
  • Latest execution status (color-coded)
  • Execution time range
  • Error message (if failed)

Status colors:

Color | Status
Blue border | Scheduled
Green border | Succeeded
Red border | Failed
Light blue border | Executing
Gray border | Cancelled / Waiting

Managing Flows

Flow Actions

From the Task & Flows list, each row has an action menu with:

Action | Description
Edit | Modify flow name, warehouse, or tasks
Suspend | Pause all scheduled executions
Resume | Re-enable scheduled executions
Execute Once | Trigger an immediate one-time run
View Runs History | See all past executions
View Versions History | Browse and compare previous versions
Delete | Permanently remove the flow

Bulk Operations

Select multiple flows using the checkboxes, then use the bulk action menu to:

  • Suspend all selected flows
  • Resume all selected flows
  • Drop all selected flows

Monitoring Executions

Runs History

Click Runs History on the details page to see all past executions:

Column | Description
Task Name | Which task ran
Warehouse | Warehouse used
State | Scheduled / Executing / Succeeded / Failed / Cancelled
SQL | The SQL that was executed (with Query ID link)
Scheduled Time | When the run was triggered
Completed Time | When the run finished
Comment | Task comment

Failed or cancelled runs show an error tooltip. You can click the error to view details or create a support ticket.

Global Task History

Navigate to Data > Task History to see executions across all flows in your organization. You can filter by:

  • Task names (multi-select)
  • Time range (Last 2 days, Last 3 days)

Version Control

Every time you publish changes to a flow, Databend Cloud saves a new version. To access version history:

  1. Open the flow details page.
  2. Click the Versions History tab.

Comparing Versions

  1. Select two versions using the checkboxes.
  2. Click Compare.
  3. A side-by-side SQL diff drawer opens showing what changed between the two versions.
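
If you want to reproduce a comparison outside the UI, for example on SQL copied from two versions, a plain text diff gives similar information. The sketch below uses Python's standard `difflib` and produces a unified diff rather than the side-by-side view of the drawer. The `sql_diff` helper and the sample statements are hypothetical.

```python
import difflib


def sql_diff(old_sql, new_sql):
    """Return unified-diff lines describing the change from old_sql to new_sql."""
    return list(
        difflib.unified_diff(
            old_sql.splitlines(),
            new_sql.splitlines(),
            fromfile="older version",
            tofile="newer version",
            lineterm="",  # the inputs have no trailing newlines, so add none
        )
    )


old = "SELECT id, amount\nFROM orders\nWHERE amount > 0"
new = "SELECT id, amount\nFROM orders\nWHERE amount > 100"
diff = sql_diff(old, new)
print("\n".join(diff))
```

Lines prefixed with `-` exist only in the older text and lines prefixed with `+` only in the newer one.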

Reverting to a Previous Version

  1. Select a version from the list.
  2. Click Revert.
  3. Confirm the action in the dialog.

The flow is restored to the selected version and a new version entry is created.
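
The revert behavior described above is append-only: history is never rewritten, and the restored content becomes the newest entry. A minimal Python model of that idea follows. The `revert` helper and the list-based history are hypothetical and say nothing about how Databend Cloud stores versions.

```python
def revert(history, version_number):
    """Append a copy of the chosen version and return the new version number.

    history: list of SQL strings where index 0 holds version 1.
    """
    if not 1 <= version_number <= len(history):
        raise ValueError(f"unknown version: {version_number}")
    # Earlier entries stay untouched; the restored content becomes the latest version.
    history.append(history[version_number - 1])
    return len(history)


history = ["SELECT 1", "SELECT 2", "SELECT 3"]
new_version = revert(history, 1)
print(new_version, history)
```

Because the earlier entries remain in place, a revert can itself be undone by reverting to the version that was current before it.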

Scheduling Reference

Schedule Types

Manual: The task only runs when triggered via Execute Once. No automatic scheduling.

Interval: Run every N minutes/hours. Example: EVERY 5 MINUTE.

Cron: Standard cron expression with timezone support. Example: 0 9 * * 1-5 (weekdays at 9am).
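
To check how the five-field example `0 9 * * 1-5` reads, the following Python sketch tests whether a given local time matches an expression. It is a simplified matcher written for illustration, not the scheduler Databend Cloud uses: it handles only `*`, single numbers, ranges, and comma lists, and it requires every field to match (classic cron treats restricted day-of-month and day-of-week fields as alternatives). The helper names are hypothetical, and the datetime is assumed to already be in the task's configured timezone.

```python
from datetime import datetime


def _field_matches(field, value):
    """Match one cron field: '*', a number, a range 'a-b', or a comma list of those."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr, when):
    """Return True if `when` matches a five-field cron expression."""
    minute, hour, day, month, weekday = expr.split()
    # Cron numbers weekdays Sunday = 0 ... Saturday = 6; isoweekday() gives Monday = 1 ... Sunday = 7.
    cron_weekday = when.isoweekday() % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )


monday_9am = datetime(2024, 1, 1, 9, 0)    # 1 January 2024 was a Monday
saturday_9am = datetime(2024, 1, 6, 9, 0)
print(cron_matches("0 9 * * 1-5", monday_9am))
print(cron_matches("0 9 * * 1-5", saturday_9am))
```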

Stream-Based Triggers

If a task has a Require Stream dependency, it only executes when the specified stream has unconsumed data. This is useful for building event-driven pipelines that react to table changes (CDC).

Best Practices

  • Start simple: Create a single-task flow first to validate your SQL before adding dependencies.
  • Use streams for CDC pipelines: Combine stream triggers with MERGE INTO statements to build incremental data pipelines.
  • Set failure thresholds: Use Suspend Task After Num Failures to prevent runaway retries from consuming warehouse credits.
  • Enable result cache: For tasks that query the same data repeatedly, enable Query Result Cache to reduce compute costs.
  • Use version history: Before making significant changes, note the current version number so you can revert if needed.
  • Separate warehouses by workload: Assign heavier transformation tasks to a larger warehouse and lightweight tasks to a smaller one.

Permissions

Role | Create | Edit | Delete | View
account_admin | | ✅ (any) | ✅ (any) |
Creator | | ✅ (own) | ✅ (own) |
Other users | | | |