Skip to main content

Querying TSV Files in Stage

Syntax:

Tutorial

This guide queries TSV (called TEXT in Databend v1.2.890-nightly and later) files from a stage. The examples keep TSV for compatibility with older server versions.

Step 1. Create an External Stage

Create an external stage with your own S3 bucket and credentials where your TSV files are stored.

CREATE STAGE tsv_query_stage
URL = 's3://load/tsv/'
CONNECTION = (
ACCESS_KEY_ID = '<your-access-key-id>'
SECRET_ACCESS_KEY = '<your-secret-access-key>'
);

Step 2. Create Custom TSV File Format

CREATE FILE FORMAT tsv_query_format
TYPE = TSV,
RECORD_DELIMITER = '\n',
FIELD_DELIMITER = ',',
COMPRESSION = AUTO;

Step 3. Query TSV Files

SELECT $1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv'
);

If the TSV files is compressed with gzip, we can use the following query:

SELECT $1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv[.]gz'
);

Query with Metadata

Query TSV files directly from a stage, including metadata columns like METADATA$FILENAME and METADATA$FILE_ROW_NUMBER:

SELECT
METADATA$FILENAME,
METADATA$FILE_ROW_NUMBER,
$1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv'
);
Try Databend Cloud for FREE

Multimodal, object-storage-native warehouse for BI, vectors, search, and geo.

Snowflake-compatible SQL with automatic scaling.

Sign up and get $200 in credits.

Try it today