Kafka
Apache Kafka is an open-source distributed event streaming platform that allows you to publish and subscribe to streams of records. It is designed for high-throughput, fault-tolerant handling of real-time data feeds. Because Kafka decouples the applications that produce data from those that consume it, it is a common choice for building data pipelines and stream processing applications.
Databend provides the following plugins and tools for data ingestion from Kafka topics:
databend-kafka-connect
databend-kafka-connect is a Kafka Connect sink connector plugin built specifically for Databend. It transfers data from Kafka topics directly into Databend tables, enabling real-time ingestion with minimal configuration. Key features of databend-kafka-connect include:
- Automatically creates tables in Databend based on the data schema.
- Supports both Append Only and Upsert write modes.
- Automatically adjusts the schema of Databend tables as the structure of incoming data changes.
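The difference between the Append Only and Upsert write modes can be illustrated with a small in-memory sketch. This is not the connector's implementation or configuration: the function names `append_rows` and `upsert_rows`, the `id` key column, and the sample events are all hypothetical, chosen only to show how the same stream of records would end up stored under each mode.

```python
from typing import Any

Row = dict[str, Any]


def append_rows(table: list[Row], rows: list[Row]) -> None:
    """Append-only semantics: every incoming record becomes a new row."""
    table.extend(dict(r) for r in rows)


def upsert_rows(table: list[Row], rows: list[Row], key: str) -> None:
    """Upsert semantics: replace the row with a matching key, else insert."""
    # Map each existing key value to its position in the table.
    index = {row[key]: i for i, row in enumerate(table)}
    for r in rows:
        if r[key] in index:
            table[index[r[key]]] = dict(r)
        else:
            index[r[key]] = len(table)
            table.append(dict(r))


# Hypothetical records, as they might arrive from a Kafka topic.
events = [
    {"id": 1, "status": "created"},
    {"id": 2, "status": "created"},
    {"id": 1, "status": "shipped"},
]

append_table: list[Row] = []
append_rows(append_table, events)

upsert_table: list[Row] = []
upsert_rows(upsert_table, events, key="id")

print(len(append_table))  # one row per event
print(upsert_table)       # one row per distinct id, holding the latest values
```

In this sketch, append-only keeps the full history of events, while upsert keeps only the latest state for each key. Which mode fits depends on whether the target table is meant to be an event log or a current-state view.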
To download databend-kafka-connect and learn more about the plugin, visit the GitHub repository and refer to the README for detailed instructions.
bend-ingest-kafka
bend-ingest-kafka is a high-performance ingestion tool that loads data from Kafka topics into Databend tables. Key features of bend-ingest-kafka include:
- Supports two modes: JSON Transform Mode, which maps Kafka JSON data directly to Databend tables based on the data schema, and Raw Mode, which ingests raw Kafka data while capturing complete Kafka record metadata.
- Provides configurable batch size and batch interval settings, so you can balance ingestion latency against throughput.
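Batching by size and interval generally means that buffered records are written out when either limit is reached, whichever comes first. The sketch below shows that idea in isolation. It is not bend-ingest-kafka's code: the `Batcher` class, its `max_size` and `max_interval` parameters, and the injected `clock` are hypothetical names used for illustration, and the tool's real option names should be taken from its README.

```python
import time
from typing import Any, Callable

Record = dict[str, Any]


class Batcher:
    """Buffers records and flushes when the batch is full or too old."""

    def __init__(
        self,
        flush: Callable[[list[Record]], None],
        max_size: int,
        max_interval: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_size = max_size
        self._max_interval = max_interval
        self._clock = clock
        self._buffer: list[Record] = []
        self._batch_started = clock()

    def add(self, record: Record) -> None:
        # The interval is measured from the first record of the current batch.
        if not self._buffer:
            self._batch_started = self._clock()
        self._buffer.append(record)
        self._maybe_flush()

    def tick(self) -> None:
        """Call periodically so a quiet topic still gets flushed."""
        self._maybe_flush()

    def _maybe_flush(self) -> None:
        if not self._buffer:
            return
        full = len(self._buffer) >= self._max_size
        stale = self._clock() - self._batch_started >= self._max_interval
        if full or stale:
            self._flush(self._buffer)
            self._buffer = []


# Demonstration with a fake clock so the behaviour is deterministic.
now = [0.0]
batches: list[list[Record]] = []
batcher = Batcher(
    flush=batches.append, max_size=3, max_interval=5.0, clock=lambda: now[0]
)

for offset in range(4):
    batcher.add({"offset": offset})  # the third record triggers a size flush

now[0] = 6.0    # simulate 6 seconds passing with no new records
batcher.tick()  # the remaining record is flushed because the batch is stale

print([len(b) for b in batches])
```

A larger batch size usually improves throughput because fewer, bigger writes reach the database, while a shorter interval bounds how long a record can wait before it becomes visible in the target table.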
To download bend-ingest-kafka and learn more about the tool, visit the GitHub repository and refer to the README for detailed instructions.