Window Functions
Overview
Window functions perform calculations across a set of related rows while returning one result per input row. Unlike aggregate functions, window functions don't collapse rows into a single output.
Key characteristics:
- Operate on a "window" of rows related to the current row
- Return one value per input row (no grouping/collapsing)
- Can access values from other rows in the window
- Support partitioning and ordering for flexible calculations
Window Function Categories
Databend supports two main categories of window functions:
1. Dedicated Window Functions
These functions are specifically designed for window operations.
Ranking Functions:
| Function | Description | Ties Handling | Example Output |
|---|---|---|---|
| ROW_NUMBER | Sequential numbering | Always unique | 1, 2, 3, 4, 5 |
| RANK | Ranking with gaps | Same rank, gaps after | 1, 2, 2, 4, 5 |
| DENSE_RANK | Ranking without gaps | Same rank, no gaps | 1, 2, 2, 3, 4 |
Distribution Functions:
| Function | Description | Range | Example Output |
|---|---|---|---|
| PERCENT_RANK | Relative rank as percentage | 0.0 to 1.0 | 0.0, 0.25, 0.5, 0.75, 1.0 |
| CUME_DIST | Cumulative distribution | 0.0 to 1.0 | 0.2, 0.4, 0.6, 0.8, 1.0 |
| NTILE | Divide into N buckets | 1 to N | 1, 1, 2, 2, 3, 3 |
Value Access Functions:
| Function | Description | Use Case |
|---|---|---|
| FIRST_VALUE | First value in window | Get highest/earliest value |
| LAST_VALUE | Last value in window | Get lowest/latest value |
| NTH_VALUE | Nth value in window | Get specific positioned value |
| LAG | Previous row value | Compare with previous |
| LEAD | Next row value | Compare with next |
Aliases:
| Function | Alias For |
|---|---|
| FIRST | FIRST_VALUE |
| LAST | LAST_VALUE |
2. Aggregate Functions Used as Window Functions
These are standard aggregate functions that can be used with the OVER clause to perform window operations.
| Function | Description | Window Frame Support | Example |
|---|---|---|---|
| SUM | Calculates sum over window | ✓ | SUM(sales) OVER (PARTITION BY region ORDER BY date) |
| AVG | Calculates average over window | ✓ | AVG(score) OVER (ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) |
| COUNT | Counts rows over window | ✓ | COUNT(*) OVER (PARTITION BY department) |
| MIN | Returns minimum value in window | ✓ | MIN(price) OVER (PARTITION BY category) |
| MAX | Returns maximum value in window | ✓ | MAX(price) OVER (PARTITION BY category) |
| ARRAY_AGG | Collects values into array | ARRAY_AGG(product) OVER (PARTITION BY category) | |
| STDDEV_POP | Population standard deviation | ✓ | STDDEV_POP(value) OVER (PARTITION BY group) |
| STDDEV_SAMP | Sample standard deviation | ✓ | STDDEV_SAMP(value) OVER (PARTITION BY group) |
| MEDIAN | Median value | ✓ | MEDIAN(response_time) OVER (PARTITION BY server) |
Conditional Variants
| Function | Description | Window Frame Support | Example |
|---|---|---|---|
| COUNT_IF | Conditional count | ✓ | COUNT_IF(status = 'complete') OVER (PARTITION BY dept) |
| SUM_IF | Conditional sum | ✓ | SUM_IF(amount, status = 'paid') OVER (PARTITION BY customer) |
| AVG_IF | Conditional average | ✓ | AVG_IF(score, passed = true) OVER (PARTITION BY class) |
| MIN_IF | Conditional minimum | ✓ | MIN_IF(temp, location = 'outside') OVER (PARTITION BY day) |
| MAX_IF | Conditional maximum | ✓ | MAX_IF(speed, vehicle = 'car') OVER (PARTITION BY test) |
Basic Syntax
All window functions follow this pattern:
FUNCTION() OVER (
[ PARTITION BY column ]
[ ORDER BY column ]
[ window_frame ]
)
- PARTITION BY: Divides data into groups
- ORDER BY: Sorts rows within each partition
- window_frame: Defines which rows to include (optional)
Common Use Cases
- Ranking: Create leaderboards and top-N lists
- Analytics: Calculate running totals, moving averages, percentiles
- Comparison: Compare current vs previous/next values
- Grouping: Divide data into buckets without losing detail
For detailed syntax and examples, see individual function documentation above.