Skip to main content

Querying CSV Files in Stage

Syntax:

Tutorial

Step 1. Create an External Stage

Create an external stage with your own S3 bucket and credentials where your CSV files are stored.

CREATE STAGE csv_query_stage 
URL = 's3://load/csv/'
CONNECTION = (
ACCESS_KEY_ID = '<your-access-key-id>'
SECRET_ACCESS_KEY = '<your-secret-access-key>'
);

Step 2. Create Custom CSV File Format

CREATE FILE FORMAT csv_query_format 
TYPE = CSV,
RECORD_DELIMITER = '\n',
FIELD_DELIMITER = ',',
COMPRESSION = AUTO,
SKIP_HEADER = 1; -- Skip first line when querying if the CSV file has header

Step 3. Query CSV Files

SELECT $1, $2, $3
FROM @csv_query_stage
(
FILE_FORMAT => 'csv_query_format',
PATTERN => '.*[.]csv'
);

If the CSV files is compressed with gzip, we can use the following query:

SELECT $1, $2, $3
FROM @csv_query_stage
(
FILE_FORMAT => 'csv_query_format',
PATTERN => '.*[.]csv[.]gz'
);

Query with Metadata

Query CSV files directly from a stage, including metadata columns like METADATA$FILENAME and METADATA$FILE_ROW_NUMBER:

SELECT
METADATA$FILENAME,
METADATA$FILE_ROW_NUMBER,
$1, $2, $3
FROM @csv_query_stage
(
FILE_FORMAT => 'csv_query_format',
PATTERN => '.*[.]csv'
);