Skip to main content

Querying TSV Files in Stage

Query TSV Files in Stage

Syntax:

SELECT [<alias>.]$<col_position> [, $<col_position> ...] 
FROM {@<stage_name>[/<path>] [<table_alias>] | '<uri>' [<table_alias>]}
[(
[<connection_parameters>],
[ PATTERN => '<regex_pattern>'],
[ FILE_FORMAT => 'TSV| <custom_format_name>'],
[ FILES => ( '<file_name>' [ , '<file_name>' ] [ , ... ] ) ]
)]
Tips

Query Return Content Explanation:

  • Return Format: Individual column values as strings by default
  • Access Method: Use positional references $<col_position> (e.g., $1, $2, $3)
  • Example: SELECT $1, $2, $3 FROM @stage_name
  • Key Features:
    • Columns accessed by position, not by name
    • Each $<col_position> refers to a single column, not the whole row
    • Type casting required for non-string operations (e.g., CAST($1 AS INT))
    • No embedded schema information in TSV files

Tutorial

Step 1. Create an External Stage

Create an external stage with your own S3 bucket and credentials where your TSV files are stored.

CREATE STAGE tsv_query_stage 
URL = 's3://load/tsv/'
CONNECTION = (
ACCESS_KEY_ID = '<your-access-key-id>'
SECRET_ACCESS_KEY = '<your-secret-access-key>'
);

Step 2. Create Custom TSV File Format

CREATE FILE FORMAT tsv_query_format 
TYPE = TSV,
RECORD_DELIMITER = '\n',
FIELD_DELIMITER = ',',
COMPRESSION = AUTO;

Step 3. Query TSV Files

SELECT $1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv'
);

If the TSV files is compressed with gzip, we can use the following query:

SELECT $1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv[.]gz'
);

Query with Metadata

Query TSV files directly from a stage, including metadata columns like METADATA$FILENAME and METADATA$FILE_ROW_NUMBER:

SELECT
METADATA$FILENAME,
METADATA$FILE_ROW_NUMBER,
$1, $2, $3
FROM @tsv_query_stage
(
FILE_FORMAT => 'tsv_query_format',
PATTERN => '.*[.]tsv'
);