How Databend Copy-Free Data Sharing Works

Different teams need different parts of the same data. Traditional solutions copy data multiple times - expensive and hard to maintain.

Databend's ATTACH TABLE solves this elegantly: create multiple "views" of the same data without copying it. This leverages Databend's true compute-storage separation - whether using cloud storage or on-premise object storage: store once, access everywhere.

Think of ATTACH TABLE like computer shortcuts - they point to the original file without duplicating it.

                Object Storage (S3, MinIO, Azure, etc.)
                         ┌─────────────┐
                         │ Your Data   │
                         └──────┬──────┘
                                │
        ┌───────────────────────┼───────────────────────┐
        │                       │                       │
        ▼                       ▼                       ▼
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│ Marketing   │         │  Finance    │         │   Sales     │
│ Team View   │         │ Team View   │         │ Team View   │
└─────────────┘         └─────────────┘         └─────────────┘

How to Use ATTACH TABLE

Step 1: Find your data location

SELECT snapshot_location FROM FUSE_SNAPSHOT('default', 'company_sales');
-- Result: 1/23351/_ss/... → Data at s3://your-bucket/1/23351/

Step 2: Create team-specific views

-- Marketing: Customer behavior analysis
ATTACH TABLE marketing_view (customer_id, product, amount, order_date) 
's3://your-bucket/1/23351/' CONNECTION = (ACCESS_KEY_ID = 'xxx', SECRET_ACCESS_KEY = 'yyy');

-- Finance: Revenue tracking
ATTACH TABLE finance_view (order_id, amount, profit, order_date) 
's3://your-bucket/1/23351/' CONNECTION = (ACCESS_KEY_ID = 'xxx', SECRET_ACCESS_KEY = 'yyy');

-- HR: Employee info without salaries
ATTACH TABLE hr_employees (employee_id, name, department) 
's3://data/1/23351/' CONNECTION = (...);

-- Development: Production structure without sensitive data
ATTACH TABLE dev_customers (customer_id, country, created_date) 
's3://data/1/23351/' CONNECTION = (...);

Step 3: Query independently

-- Marketing analyzes trends
SELECT product, COUNT(*) FROM marketing_view GROUP BY product;

-- Finance tracks profit
SELECT order_date, SUM(profit) FROM finance_view GROUP BY order_date;

Key Benefits

Real-Time Updates: When source data changes, all attached tables see it instantly

INSERT INTO company_sales VALUES (1001, 501, 'Laptop', 1299.99, 299.99, 'user@email.com', '2025-01-20');
SELECT COUNT(*) FROM marketing_view WHERE order_date = '2024-01-20'; -- Returns: 1

Column-Level Security: Teams only see what they need - Marketing can't see profit, Finance can't see customer emails

Strong Consistency: Never read partial updates, always see complete snapshots - perfect for financial reporting and compliance

Full Performance: All indexes work automatically, same speed as regular tables

Why This Matters

Traditional Approach	Databend ATTACH TABLE
Multiple data copies	Single copy shared by all
ETL delays, sync issues	Real-time, always current
Complex maintenance	Zero maintenance
More copies = more security risk	Fine-grained column access
Slower due to data movement	Full optimization on original data

How It Works Under the Hood

Query: SELECT product, SUM(amount) FROM marketing_view GROUP BY product

┌─────────────────────────────────────────────────────────────────┐
│                    Query Execution Flow                         │
└─────────────────────────────────────────────────────────────────┘

    User Query
        │
        ▼
┌───────────────────┐    ┌─────────────────────────────────────┐
│ 1. Read Snapshot  │───►│ s3://bucket/1/23351/_ss/            │
│    Metadata       │    │ Get current table state             │
└───────────────────┘    └─────────────────────────────────────┘
        │
        ▼
┌───────────────────┐    ┌─────────────────────────────────────┐
│ 2. Apply Column   │───►│ Filter: customer_id, product,       │
│    Filter         │    │         amount, order_date          │
└───────────────────┘    └─────────────────────────────────────┘
        │
        ▼
┌───────────────────┐    ┌─────────────────────────────────────┐
│ 3. Check Stats &  │───►│ • Segment min/max values            │
│    Indexes        │    │ • Bloom filters                     │
└───────────────────┘    │ • Aggregate indexes                 │
        │                └─────────────────────────────────────┘
        ▼
┌───────────────────┐    ┌─────────────────────────────────────┐
│ 4. Smart Data     │───►│ Skip irrelevant blocks              │
│    Fetching       │    │ Download only needed data from _b/  │
└───────────────────┘    └─────────────────────────────────────┘
        │
        ▼
┌───────────────────┐    ┌─────────────────────────────────────┐
│ 5. Local          │───►│ Full optimization & parallelism     │
│    Execution      │    │ Process with all available indexes  │
└───────────────────┘    └─────────────────────────────────────┘
        │
        ▼
    Results: Product sales summary

Multiple Databend clusters can execute this flow simultaneously without coordination - true compute-storage separation in action.

ATTACH TABLE represents a fundamental shift: from copying data for each use case to one copy with many views. Whether in cloud or on-premise environments, Databend's architecture enables powerful, efficient data sharing while maintaining enterprise-grade consistency and security.

How Databend Copy-Free Data Sharing Works

How to Use ATTACH TABLE

Key Benefits

Why This Matters

How It Works Under the Hood

Join our growing community

GitHub

Slack

X(Twitter)

YouTube

Explore Databend Cloud for FREE

What is Data Sharing?​

How to Use ATTACH TABLE​

Key Benefits​

Why This Matters​

How It Works Under the Hood​

Join our growing community

GitHub

Slack

X(Twitter)

YouTube

Explore Databend Cloud for FREE

What is Data Sharing?

How to Use ATTACH TABLE

Key Benefits

Why This Matters

How It Works Under the Hood