Loading Data via Pipelines

A pipeline in Databend Cloud automatically discovers new or updated files in object storage and loads them into a table. Pipelines are recommended in scenarios such as the following:

You have a large number of CSV or Parquet files in your bucket and want to load them into Databend Cloud in one go for further analysis.
Data such as billing records is delivered to your bucket automatically, and you want it loaded into Databend Cloud automatically for visualization and further analysis.
You have a continuous stream of user behavior logs written to your object storage and want them loaded into Databend Cloud automatically for further analysis.

Note: There is no limit on the number of pipelines you can create for your organization. However, a pipeline requires a warehouse to run, so running one incurs costs. For more information about warehouse pricing, see Warehouse Pricing.

Creating a Pipeline

To create a pipeline in Databend Cloud, you must first create the table that will receive the imported data. The table schema must match the structure of the data to be imported for the pipeline to work properly.
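
Before creating the pipeline, it can help to check locally that a sample file lines up with the target table. The following is a minimal sketch, not part of Databend Cloud: the function name `compare_header`, the column names, and the sample data are all hypothetical, and it assumes a CSV file with a header row. It compares column names and their order only, not data types.

```python
import csv
import io


def compare_header(csv_text, table_columns):
    """Compare a CSV header row with the expected table columns.

    Returns a dict listing columns missing from the file, columns the
    table does not expect, and whether names and order match exactly.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    return {
        "missing": [c for c in table_columns if c not in header],
        "unexpected": [c for c in header if c not in table_columns],
        "same_order": header == list(table_columns),
    }


# Hypothetical target table columns and a small sample file.
table_columns = ["user_id", "event", "ts"]
sample = "user_id,event,timestamp\n1,login,2024-01-01T00:00:00Z\n"

report = compare_header(sample, table_columns)
print(report)
```

In this sample, the file names its third column `timestamp` while the hypothetical table expects `ts`, so the report flags one missing and one unexpected column. Resolving that kind of mismatch before creating the pipeline avoids a failed load.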

To create a pipeline:

On the Data page, navigate to and select your target table, then select the Pipeline tab on the right.

Click Configuration to open the pipeline setup page, then follow the instructions to create a pipeline.

Click OK. Data loading begins only if all the connection information is accurate. After loading completes, you can view the import logs on the page.

Activating or Deactivating a Pipeline

Once created successfully, a pipeline is active by default. It periodically checks your object storage for file changes and automatically loads the changed files into the table in Databend Cloud until you deactivate it.

To deactivate a pipeline, go to the Pipeline tab and toggle the Active button.
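
The behavior described above can be pictured with a small conceptual model. This is only an illustrative sketch, not how Databend Cloud implements pipelines: the class `PipelineModel`, its `poll` method, and the file names are hypothetical. It also assumes that a "file change" means a file key that has not been seen before, so it does not model modified files.

```python
class PipelineModel:
    """Toy model of a pipeline that loads each new file once while active."""

    def __init__(self):
        self.active = True   # a pipeline is active by default after creation
        self.loaded = set()  # keys of files that have already been loaded

    def poll(self, listing):
        """Simulate one periodic check; return the files loaded in this round."""
        if not self.active:
            return []  # a deactivated pipeline loads nothing
        new_files = sorted(set(listing) - self.loaded)
        self.loaded.update(new_files)
        return new_files


pipeline = PipelineModel()
first = pipeline.poll(["logs/a.csv", "logs/b.csv"])
second = pipeline.poll(["logs/a.csv", "logs/b.csv", "logs/c.csv"])

pipeline.active = False  # corresponds to toggling the Active button off
third = pipeline.poll(["logs/a.csv", "logs/b.csv", "logs/c.csv", "logs/d.csv"])

print(first, second, third)
```

In this model, the second check picks up only `logs/c.csv`, and nothing is loaded after deactivation even though `logs/d.csv` has arrived.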
