Apache Iceberg

Introduced or updated: v1.2.668

Databend can integrate with an Apache Iceberg catalog, which makes Iceberg's metadata and storage management available from within Databend for data management and analytics.

Datatype Mapping

This table maps data types between Apache Iceberg and Databend. Please note that Databend does not currently support Iceberg data types that are not listed in the table.

| Apache Iceberg | Databend |
| --- | --- |
| BOOLEAN | BOOLEAN |
| INT | INT32 |
| LONG | INT64 |
| DATE | DATE |
| TIMESTAMP/TIMESTAMPTZ | TIMESTAMP |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| STRING/BINARY | STRING |
| DECIMAL | DECIMAL |
| ARRAY<TYPE> | ARRAY, supports nesting |
| MAP<KEYTYPE, VALUETYPE> | MAP |
| STRUCT<COL1: TYPE1, COL2: TYPE2, ...> | TUPLE |
| LIST | ARRAY |

Managing Catalogs

Databend provides the following commands for managing catalogs:

CREATE CATALOG

Defines and establishes a new catalog in the Databend query engine.

Syntax

CREATE CATALOG <catalog_name>
TYPE=ICEBERG
CONNECTION=(
TYPE='<connection_type>'
ADDRESS='<address>'
WAREHOUSE='<warehouse_location>'
"<connection_parameter>"='<connection_parameter_value>'
"<connection_parameter>"='<connection_parameter_value>'
...
);

| Parameter | Required? | Description |
| --- | --- | --- |
| <catalog_name> | Yes | The name of the catalog you want to create. |
| TYPE | Yes | Specifies the catalog type. For Iceberg, set to ICEBERG. |
| CONNECTION | Yes | The connection parameters for the Iceberg catalog. |
| TYPE (inside CONNECTION) | Yes | The connection type. For Iceberg, it is typically set to rest for a REST-based connection. |
| ADDRESS | Yes | The address or URL of the Iceberg service (e.g., http://127.0.0.1:8181). |
| WAREHOUSE | Yes | The location of the Iceberg warehouse, usually an S3 bucket or compatible object storage system. |
| <connection_parameter> | Yes | Connection parameters for establishing connections with external storage. The required parameters vary with the storage service and authentication method. See the table below for a full list of the available parameters. |

| Connection Parameter | Description |
| --- | --- |
| s3.endpoint | S3 endpoint. |
| s3.access-key-id | S3 access key ID. |
| s3.secret-access-key | S3 secret access key. |
| s3.session-token | S3 session token, required when using temporary credentials. |
| s3.region | S3 region. |
| client.region | Region to use for the S3 client; takes precedence over s3.region. |
| s3.path-style-access | S3 path-style access. |
| s3.sse.type | S3 Server-Side Encryption (SSE) type. |
| s3.sse.key | S3 SSE key. If the encryption type is kms, this is a KMS key ID. If the encryption type is custom, this is a base-64 AES256 symmetric key. |
| s3.sse.md5 | S3 SSE MD5 checksum. |
| client.assume-role.arn | ARN of the IAM role to assume instead of using the default credential chain. |
| client.assume-role.external-id | Optional external ID used to assume an IAM role. |
| client.assume-role.session-name | Optional session name used to assume an IAM role. |
| s3.allow-anonymous | Option to allow anonymous access (e.g., for public buckets/folders). |
| s3.disable-ec2-metadata | Option to disable loading credentials from EC2 metadata (typically used with s3.allow-anonymous). |
| s3.disable-config-load | Option to disable loading configuration from config files and environment variables. |
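
As a rough illustration of how several of these connection parameters might be combined, the sketch below creates a catalog against an S3-compatible store using static credentials. The catalog name ctl_static, the placeholder credential values, and the 'true' value for s3.path-style-access are assumptions made for illustration; substitute whatever your storage service actually requires.

```sql
-- Sketch only: the catalog name, credentials, and flag values are placeholders.
CREATE CATALOG ctl_static
TYPE=ICEBERG
CONNECTION=(
    TYPE='rest'
    ADDRESS='http://127.0.0.1:8181'
    WAREHOUSE='s3://iceberg-tpch'
    "s3.endpoint"='http://127.0.0.1:9000'
    "s3.region"='us-east-1'
    "s3.access-key-id"='<your-access-key-id>'
    "s3.secret-access-key"='<your-secret-access-key>'
    -- Assumed value format; often needed for MinIO-style endpoints.
    "s3.path-style-access"='true'
);
```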

SHOW CREATE CATALOG

Returns the detailed configuration of a specified catalog, including its type and storage parameters.

Syntax

SHOW CREATE CATALOG <catalog_name>;
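
For instance, assuming a catalog named ctl already exists (the name used in the usage example on this page), its definition could be inspected as follows. The output is not shown here because it depends on how the catalog was created.

```sql
-- Assumes a catalog named ctl has already been created.
SHOW CREATE CATALOG ctl;
```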

SHOW CATALOGS

Shows all the created catalogs.

Syntax

SHOW CATALOGS [LIKE '<pattern>']
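
A minimal sketch of both forms of the command. The pattern 'ctl%' is only an illustration and assumes the usual SQL LIKE wildcards apply.

```sql
-- List every catalog known to the current deployment.
SHOW CATALOGS;

-- List only catalogs whose names start with "ctl" (illustrative pattern).
SHOW CATALOGS LIKE 'ctl%';
```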

USE CATALOG

Switches the current session to the specified catalog.

Syntax

USE CATALOG <catalog_name>
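
As a sketch, a session might switch to an Iceberg catalog, look at what it contains, and then switch back. The catalog name ctl is taken from the usage example on this page; the name default for Databend's built-in catalog is an assumption here.

```sql
-- Switch the session to the Iceberg catalog (assumes ctl exists).
USE CATALOG ctl;

-- List the databases exposed by the current catalog.
SHOW DATABASES;

-- Switch back to the built-in catalog (assumed to be named "default").
USE CATALOG default;
```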

Iceberg Table Functions

Databend provides table functions for querying Iceberg metadata, which let you inspect the snapshots and manifests of an Iceberg table.

Usage Examples

This example shows how to create an Iceberg catalog using a REST-based connection, specifying the service address, warehouse location (S3), and optional parameters like AWS region and custom endpoint:

CREATE CATALOG ctl
TYPE=ICEBERG
CONNECTION=(
TYPE='rest'
ADDRESS='http://127.0.0.1:8181'
WAREHOUSE='s3://iceberg-tpch'
"s3.region"='us-east-1'
"s3.endpoint"='http://127.0.0.1:9000'
);
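
Once such a catalog exists, its tables can presumably be addressed with a fully qualified catalog.database.table name. The sketch below assumes the warehouse contains a database named tpch with a table named lineitem; both names are hypothetical and should be replaced with ones that exist in your warehouse.

```sql
-- Hypothetical database (tpch) and table (lineitem) inside the ctl catalog.
SELECT COUNT(*) FROM ctl.tpch.lineitem;

-- Preview a few rows without switching the session's current catalog.
SELECT * FROM ctl.tpch.lineitem LIMIT 5;
```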