Window Functions
Overview
Window functions perform calculations across a set of related rows while returning one result per input row. Unlike aggregate functions, window functions don't collapse rows into a single output.
Key characteristics:
- Operate on a "window" of rows related to the current row
- Return one value per input row (no grouping/collapsing)
- Can access values from other rows in the window
- Support partitioning and ordering for flexible calculations
Window Function Categories
Databend supports two main categories of window functions:
1. Dedicated Window Functions
These functions are specifically designed for window operations.
Ranking Functions:
Function | Description | Ties Handling | Example Output |
---|---|---|---|
ROW_NUMBER | Sequential numbering | Always unique | 1, 2, 3, 4, 5 |
RANK | Ranking with gaps | Same rank, gaps after | 1, 2, 2, 4, 5 |
DENSE_RANK | Ranking without gaps | Same rank, no gaps | 1, 2, 2, 3, 4 |
Distribution Functions:
Function | Description | Range | Example Output |
---|---|---|---|
PERCENT_RANK | Relative rank as percentage | 0.0 to 1.0 | 0.0, 0.25, 0.5, 0.75, 1.0 |
CUME_DIST | Cumulative distribution | 0.0 to 1.0 | 0.2, 0.4, 0.6, 0.8, 1.0 |
NTILE | Divide into N buckets | 1 to N | 1, 1, 2, 2, 3, 3 |
Value Access Functions:
Function | Description | Use Case |
---|---|---|
FIRST_VALUE | First value in window | Get highest/earliest value |
LAST_VALUE | Last value in window | Get lowest/latest value |
NTH_VALUE | Nth value in window | Get specific positioned value |
LAG | Previous row value | Compare with previous |
LEAD | Next row value | Compare with next |
Aliases:
Function | Alias For |
---|---|
FIRST | FIRST_VALUE |
LAST | LAST_VALUE |
2. Aggregate Functions Used as Window Functions
These are standard aggregate functions that can be used with the OVER clause to perform window operations.
Function | Description | Window Frame Support | Example |
---|---|---|---|
SUM | Calculates sum over window | ✓ | SUM(sales) OVER (PARTITION BY region ORDER BY date) |
AVG | Calculates average over window | ✓ | AVG(score) OVER (ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) |
COUNT | Counts rows over window | ✓ | COUNT(*) OVER (PARTITION BY department) |
MIN | Returns minimum value in window | ✓ | MIN(price) OVER (PARTITION BY category) |
MAX | Returns maximum value in window | ✓ | MAX(price) OVER (PARTITION BY category) |
ARRAY_AGG | Collects values into array | ARRAY_AGG(product) OVER (PARTITION BY category) | |
STDDEV_POP | Population standard deviation | ✓ | STDDEV_POP(value) OVER (PARTITION BY group) |
STDDEV_SAMP | Sample standard deviation | ✓ | STDDEV_SAMP(value) OVER (PARTITION BY group) |
MEDIAN | Median value | ✓ | MEDIAN(response_time) OVER (PARTITION BY server) |
Conditional Variants
Function | Description | Window Frame Support | Example |
---|---|---|---|
COUNT_IF | Conditional count | ✓ | COUNT_IF(status = 'complete') OVER (PARTITION BY dept) |
SUM_IF | Conditional sum | ✓ | SUM_IF(amount, status = 'paid') OVER (PARTITION BY customer) |
AVG_IF | Conditional average | ✓ | AVG_IF(score, passed = true) OVER (PARTITION BY class) |
MIN_IF | Conditional minimum | ✓ | MIN_IF(temp, location = 'outside') OVER (PARTITION BY day) |
MAX_IF | Conditional maximum | ✓ | MAX_IF(speed, vehicle = 'car') OVER (PARTITION BY test) |
Basic Syntax
All window functions follow this pattern:
FUNCTION() OVER (
[ PARTITION BY column ]
[ ORDER BY column ]
[ window_frame ]
)
- PARTITION BY: Divides data into groups
- ORDER BY: Sorts rows within each partition
- window_frame: Defines which rows to include (optional)
Common Use Cases
- Ranking: Create leaderboards and top-N lists
- Analytics: Calculate running totals, moving averages, percentiles
- Comparison: Compare current vs previous/next values
- Grouping: Divide data into buckets without losing detail
For detailed syntax and examples, see individual function documentation above.