Skip to main content

COPY INTO <location>

Introduced or updated: v1.2.647

COPY INTO allows you to unload data from a table or query into one or more files in one of the following locations:

  • User / Internal / External stages: See What is Stage? to learn about stages in Databend.
  • Buckets or containers created in a storage service.

See also: COPY INTO <table>

Syntax

COPY INTO { internalStage | externalStage | externalLocation }
FROM { [<database_name>.]<table_name> | ( <query> ) }
[ FILE_FORMAT = (
FORMAT_NAME = '<your-custom-format>'
| TYPE = { CSV | TSV | NDJSON | PARQUET } [ formatTypeOptions ]
) ]
[ copyOptions ]
[ VALIDATION_MODE = RETURN_ROWS ]
[ DETAILED_OUTPUT = true | false ]

internalStage

internalStage ::= @<internal_stage_name>[/<path>]

externalStage

externalStage ::= @<external_stage_name>[/<path>]

externalLocation

externalLocation ::=
's3://<bucket>[<path>]'
CONNECTION = (
<connection_parameters>
)

For the connection parameters available for accessing Amazon S3-like storage services, see Connection Parameters.

FILE_FORMAT

See Input & Output File Formats for details.

copyOptions

copyOptions ::=
[ SINGLE = true | false ]
[ MAX_FILE_SIZE = <num> ]
[ OVERWRITE = true | false ]
[ INCLUDE_QUERY_ID = true | false ]
[ USE_RAW_PATH = true | false ]
ParameterDefaultDescription
SINGLEfalseWhen true, the command unloads data into one single file.
MAX_FILE_SIZE67108864 bytes (64 MB)The maximum size (in bytes) of each file to be created. Effective when SINGLE is false.
OVERWRITEfalseWhen true, existing files with the same name at the target path will be overwritten. Note: OVERWRITE = true requires USE_RAW_PATH = true and INCLUDE_QUERY_ID = false.
INCLUDE_QUERY_IDtrueWhen true, a unique UUID will be included in the exported file names.
USE_RAW_PATHfalseWhen true, the exact user-provided path (including the full file name) will be used for exporting the data. If set to false, the user must provide a directory path.

DETAILED_OUTPUT

Determines whether a detailed result of the data unloading should be returned, with the default value set to false. For more information, see Output.

Output

COPY INTO provides a summary of the data unloading results with these columns:

ColumnDescription
rows_unloadedThe number of rows successfully unloaded to the destination.
input_bytesThe total size, in bytes, of the data read from the source table during the unload operation.
output_bytesThe total size, in bytes, of the data written to the destination.

When DETAILED_OUTPUT is set to true, COPY INTO provides results with the following columns. This assists in locating the unloaded files, especially when using MAX_FILE_SIZE to separate the unloaded data into multiple files.

ColumnDescription
file_nameThe name of the unloaded file.
file_sizeThe size of the unloaded file in bytes.
row_countThe number of rows contained in the unloaded file.

Examples

In this section, the provided examples make use of the following table and data:

-- Create sample table
CREATE TABLE canadian_city_population (
city_name VARCHAR(50),
population INT
);

-- Insert sample data
INSERT INTO canadian_city_population (city_name, population)
VALUES
('Toronto', 2731571),
('Montreal', 1704694),
('Vancouver', 631486),
('Calgary', 1237656),
('Ottawa', 934243),
('Edmonton', 972223),
('Quebec City', 542298),
('Winnipeg', 705244),
('Hamilton', 536917),
('Halifax', 403390);

Example 1: Unloading to Internal Stage

This example unloads data to an internal stage:

-- Create an internal stage
CREATE STAGE my_internal_stage;

-- Unload data from the table to the stage using the PARQUET file format
COPY INTO @my_internal_stage
FROM canadian_city_population
FILE_FORMAT = (TYPE = PARQUET);

┌────────────────────────────────────────────┐
│ rows_unloaded │ input_bytes │ output_bytes │
├───────────────┼─────────────┼──────────────┤
10211572
└────────────────────────────────────────────┘

LIST @my_internal_stage;

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ name │ size │ md5 │ last_modified │ creator │
├─────────────────────────────────────────────────────────────────┼────────┼──────────────────┼───────────────────────────────┼──────────────────┤
│ data_abe520a3-ee88-488c-9221-b07c562c9a30_0000_00000000.parquet │ 572NULL2024-01-18 16:20:48.979 +0000NULL
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Example 2: Unloading to Compressed File

This example unloads data into a compressed file:

-- Create an internal stage
CREATE STAGE my_internal_stage;

-- Unload data from the table to the stage using the CSV file format with gzip compression
COPY INTO @my_internal_stage
FROM canadian_city_population
FILE_FORMAT = (TYPE = CSV COMPRESSION = gzip);

┌────────────────────────────────────────────┐
│ rows_unloaded │ input_bytes │ output_bytes │
├───────────────┼─────────────┼──────────────┤
10182168
└────────────────────────────────────────────┘

LIST @my_internal_stage;

┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ name │ size │ md5 │ last_modified │ creator │
├────────────────────────────────────────────────────────────────┼────────┼──────────────────┼───────────────────────────────┼──────────────────┤
│ data_7970afa5-32e3-4e7d-b793-e42a2a82a8e6_0000_00000000.csv.gz │ 168NULL2024-01-18 16:27:01.663 +0000NULL
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

-- COPY INTO also works with custom file formats. See below:
-- Create a custom file format named my_cs_gzip with CSV format and gzip compression
CREATE FILE FORMAT my_csv_gzip TYPE = CSV COMPRESSION = gzip;

-- Unload data from the table to the stage using the custom file format my_cs_gzip
COPY INTO @my_internal_stage
FROM canadian_city_population
FILE_FORMAT = (FORMAT_NAME = 'my_csv_gzip');

┌────────────────────────────────────────────┐
│ rows_unloaded │ input_bytes │ output_bytes │
├───────────────┼─────────────┼──────────────┤
10182168
└────────────────────────────────────────────┘

LIST @my_internal_stage;

┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ name │ size │ md5 │ last_modified │ creator │
├────────────────────────────────────────────────────────────────┼────────┼──────────────────┼───────────────────────────────┼──────────────────┤
│ data_d006ba1c-0609-46d7-a67b-75c7078d86ff_0000_00000000.csv.gz │ 168NULL2024-01-18 16:29:29.721 +0000NULL
│ data_7970afa5-32e3-4e7d-b793-e42a2a82a8e6_0000_00000000.csv.gz │ 168NULL2024-01-18 16:27:01.663 +0000NULL
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Example 3: Unloading to Bucket

This example unloads data into a bucket on MinIO:

-- Unload data from the table to a bucket named 'databend' on MinIO using the PARQUET file format
COPY INTO 's3://databend'
CONNECTION = (
ENDPOINT_URL = 'http://localhost:9000/',
ACCESS_KEY_ID = 'ROOTUSER',
SECRET_ACCESS_KEY = 'CHANGEME123',
region = 'us-west-2'
)
FROM canadian_city_population
FILE_FORMAT = (TYPE = PARQUET);

┌────────────────────────────────────────────┐
│ rows_unloaded │ input_bytes │ output_bytes │
├───────────────┼─────────────┼──────────────┤
10211572
└────────────────────────────────────────────┘

Alt text