Querying Parquet Files in Stage
Syntax:
Tutorial
Step 1. Create an External Stage
Create an external stage with your own S3 bucket and credentials where your Parquet files are stored.
CREATE STAGE parquet_query_stage 
URL = 's3://load/parquet/' 
CONNECTION = (
    ACCESS_KEY_ID = '<your-access-key-id>' 
    SECRET_ACCESS_KEY = '<your-secret-access-key>'
);
Step 2. Create Custom Parquet File Format
CREATE FILE FORMAT parquet_query_format TYPE = PARQUET;
- More Parquet file format options refer to Parquet File Format Options
Step 3. Query Parquet Files
query with colum names:
SELECT *
FROM @parquet_query_stage
(
    FILE_FORMAT => 'parquet_query_format',
    PATTERN => '.*[.]parquet'
);
query with path expressions:
SELECT $1
FROM @parquet_query_stage
(
    FILE_FORMAT => 'parquet_query_format',
    PATTERN => '.*[.]parquet'
);
Query with Metadata
Query Parquet files directly from a stage, including metadata columns like METADATA$FILENAME and METADATA$FILE_ROW_NUMBER:
SELECT
    METADATA$FILENAME,
    METADATA$FILE_ROW_NUMBER,
    *
FROM @parquet_query_stage
(
    FILE_FORMAT => 'parquet_query_format',
    PATTERN => '.*[.]parquet'
);