Skip to main content

Querying & Transforming

Databend enables direct querying of staged files without loading data into tables first. Query files from any stage type (user, internal, external) or directly from object storage and HTTPS URLs. Ideal for data inspection, validation, and transformation before or after loading.

Syntax

SELECT [<alias>.]<column> [, <column> ...] | [<alias>.]$<col_position> [, $<col_position> ...]
FROM {@<stage_name>[/<path>] [<table_alias>] | '<uri>' [<table_alias>]}
[(
[<connection_parameters>],
[ PATTERN => '<regex_pattern>'],
[ FILE_FORMAT => 'CSV | TSV | NDJSON | PARQUET | ORC | Avro | <custom_format_name>'],
[ FILES => ( '<file_name>' [ , '<file_name>' ... ])],
[ CASE_SENSITIVE => true | false ]
)]

Parameters Overview

Key parameters for controlling data access and parsing:

ParameterDescription
FILE_FORMATFile format type (CSV, TSV, NDJSON, PARQUET, ORC, Avro)
PATTERNRegex pattern to filter files
FILESExplicit list of files to query
CASE_SENSITIVEColumn name case sensitivity (Parquet)
table_aliasAlias for referencing staged files
$col_positionColumn selection by position (1-based)
connection_parametersExternal storage connection details
uriURI for remote files

Supported File Formats

File FormatReturn FormatAccess MethodExampleGuide
ParquetNative data typesDirect column namesSELECT id, name FROMQuerying Parquet Files
ORCNative data typesDirect column namesSELECT id, name FROMQuerying ORC Files
CSVString valuesPositional references $<position>SELECT $1, $2 FROMQuerying CSV Files
TSVString valuesPositional references $<position>SELECT $1, $2 FROMQuerying TSV Files
NDJSONVariant objectPath expressions $1:<field>SELECT $1:id, $1:name FROMQuerying NDJSON Files
AvroVariant objectPath expressions $1:<field>SELECT $1:id, $1:name FROMQuerying Avro Files
Explore Databend Cloud for FREE
Low-cost
Fast Analytics
Easy Data Ingestion
Elastic Scaling
Try it today