Skip to main content

How Databend Copy-Free Data Sharing Works

What is Data Sharing?

Different teams need different parts of the same data. Traditional solutions copy data multiple times - expensive and hard to maintain.

Databend's ATTACH TABLE solves this elegantly: create multiple "views" of the same data without copying it. This leverages Databend's true compute-storage separation - whether using cloud storage or on-premise object storage: store once, access everywhere.

Think of ATTACH TABLE like computer shortcuts - they point to the original file without duplicating it.

                Object Storage (S3, MinIO, Azure, etc.)
┌─────────────┐
│ Your Data │
└──────┬──────┘

┌───────────────────────┼───────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Marketing │ │ Finance │ │ Sales │
│ Team View │ │ Team View │ │ Team View │
└─────────────┘ └─────────────┘ └─────────────┘

How to Use ATTACH TABLE

Step 1: Find your data location

SELECT snapshot_location FROM FUSE_SNAPSHOT('default', 'company_sales');
-- Result: 1/23351/_ss/... → Data at s3://your-bucket/1/23351/

Step 2: Create team-specific views

-- Marketing: Customer behavior analysis
ATTACH TABLE marketing_view (customer_id, product, amount, order_date)
's3://your-bucket/1/23351/' CONNECTION = (ACCESS_KEY_ID = 'xxx', SECRET_ACCESS_KEY = 'yyy');

-- Finance: Revenue tracking
ATTACH TABLE finance_view (order_id, amount, profit, order_date)
's3://your-bucket/1/23351/' CONNECTION = (ACCESS_KEY_ID = 'xxx', SECRET_ACCESS_KEY = 'yyy');

-- HR: Employee info without salaries
ATTACH TABLE hr_employees (employee_id, name, department)
's3://data/1/23351/' CONNECTION = (...);

-- Development: Production structure without sensitive data
ATTACH TABLE dev_customers (customer_id, country, created_date)
's3://data/1/23351/' CONNECTION = (...);

Step 3: Query independently

-- Marketing analyzes trends
SELECT product, COUNT(*) FROM marketing_view GROUP BY product;

-- Finance tracks profit
SELECT order_date, SUM(profit) FROM finance_view GROUP BY order_date;

Key Benefits

Real-Time Updates: When source data changes, all attached tables see it instantly

INSERT INTO company_sales VALUES (1001, 501, 'Laptop', 1299.99, 299.99, 'user@email.com', '2025-01-20');
SELECT COUNT(*) FROM marketing_view WHERE order_date = '2024-01-20'; -- Returns: 1

Column-Level Security: Teams only see what they need - Marketing can't see profit, Finance can't see customer emails

Strong Consistency: Never read partial updates, always see complete snapshots - perfect for financial reporting and compliance

Full Performance: All indexes work automatically, same speed as regular tables

Why This Matters

Traditional ApproachDatabend ATTACH TABLE
Multiple data copiesSingle copy shared by all
ETL delays, sync issuesReal-time, always current
Complex maintenanceZero maintenance
More copies = more security riskFine-grained column access
Slower due to data movementFull optimization on original data

How It Works Under the Hood

Query: SELECT product, SUM(amount) FROM marketing_view GROUP BY product

┌─────────────────────────────────────────────────────────────────┐
│ Query Execution Flow │
└─────────────────────────────────────────────────────────────────┘

User Query


┌───────────────────┐ ┌─────────────────────────────────────┐
│ 1. Read Snapshot │───►│ s3://bucket/1/23351/_ss/ │
│ Metadata │ │ Get current table state │
└───────────────────┘ └─────────────────────────────────────┘


┌───────────────────┐ ┌─────────────────────────────────────┐
│ 2. Apply Column │───►│ Filter: customer_id, product, │
│ Filter │ │ amount, order_date │
└───────────────────┘ └─────────────────────────────────────┘


┌───────────────────┐ ┌─────────────────────────────────────┐
│ 3. Check Stats & │───►│ • Segment min/max values │
│ Indexes │ │ • Bloom filters │
└───────────────────┘ │ • Aggregate indexes │
│ └─────────────────────────────────────┘

┌───────────────────┐ ┌─────────────────────────────────────┐
│ 4. Smart Data │───►│ Skip irrelevant blocks │
│ Fetching │ │ Download only needed data from _b/ │
└───────────────────┘ └─────────────────────────────────────┘


┌───────────────────┐ ┌─────────────────────────────────────┐
│ 5. Local │───►│ Full optimization & parallelism │
│ Execution │ │ Process with all available indexes │
└───────────────────┘ └─────────────────────────────────────┘


Results: Product sales summary

Multiple Databend clusters can execute this flow simultaneously without coordination - true compute-storage separation in action.

ATTACH TABLE represents a fundamental shift: from copying data for each use case to one copy with many views. Whether in cloud or on-premise environments, Databend's architecture enables powerful, efficient data sharing while maintaining enterprise-grade consistency and security.

Explore Databend Cloud for FREE
Low-cost
Fast Analytics
Easy Data Ingestion
Elastic Scaling
Try it today