Skip to main content

Querying TSV Files in Stage

Query TSV Files in Stage

Syntax:

SELECT [<alias>.]$<col_position> [, $<col_position> ...] 
FROM {@<stage_name>[/<path>] [<table_alias>] | '<uri>' [<table_alias>]}
[(
[<connection_parameters>],
[ PATTERN => '<regex_pattern>'],
[ FILE_FORMAT => 'TSV| <custom_format_name>'],
[ FILES => ( '<file_name>' [ , '<file_name>' ] [ , ... ] ) ]
)]
Tips

TSV doesn't have schema information, so we can only query the columns $<col_position> [, $<col_position> ...] by position.

Tutorial

Step 1. Create an External Stage

Create an external stage with your own S3 bucket and credentials where your TSV files are stored.

CREATE STAGE tsv_query_stage 
URL = 's3://load/tsv/'
CONNECTION = (
ACCESS_KEY_ID = '<your-access-key-id>'
SECRET_ACCESS_KEY = '<your-secret-access-key>'
);

Step 2. Create Custom TSV File Format

CREATE FILE FORMAT tsv_query_format 
TYPE = TSV,
RECORD_DELIMITER = '\n',
FIELD_DELIMITER = ',',
COMPRESSION = AUTO;

Step 3. Query TSV Files

SELECT $1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv'
);

If the TSV files is compressed with gzip, we can use the following query:

SELECT $1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv[.]gz'
);
Explore Databend Cloud for FREE
Low-cost
Fast Analytics
Easy Data Ingestion
Elastic Scaling
Try it today