Window Functions

Overview

A window function operates on a group ("window") of related rows. For each input row, a window function returns one output row that depends on the specific row passed to the function and the values of the other rows in the window.
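The "one output row per input row" behavior is easiest to see next to a plain GROUP BY. The sketch below is illustrative only: it runs standard SQL through Python's built-in sqlite3 module as a stand-in engine (not Databend), and the `employees` table and its values are hypothetical.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Databend; the table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "eng", 120), ("Bob", "eng", 100), ("Cid", "ops", 80), ("Dee", "ops", 60)],
)

# GROUP BY collapses each department into a single row.
grouped = conn.execute(
    "SELECT department, AVG(salary) FROM employees "
    "GROUP BY department ORDER BY department"
).fetchall()

# The window version keeps every input row and attaches the department average to it.
windowed = conn.execute(
    """
    SELECT name, department, salary,
           AVG(salary) OVER (PARTITION BY department) AS dept_avg
    FROM employees
    ORDER BY department, name
    """
).fetchall()

print(grouped)   # two rows, one per department
print(windowed)  # four rows, one per employee
```

The grouped query returns one row per department, while the windowed query returns all four employee rows, each carrying the average of its own department.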

There are two main types of order-sensitive window functions:

Rank-related functions: List information based on the "rank" of a row. For example, when stores are ranked in descending order by profit per year, the store with the most profit is ranked 1, the second-most profitable store is ranked 2, and so on.
Window frame functions: Enable you to perform rolling operations, such as calculating a running total or a moving average, on a subset of the rows in the window.
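Both types can be sketched on the store-profit scenario described above. This is a minimal illustration, assuming a hypothetical `store_profit` table and using Python's sqlite3 module as a stand-in engine rather than Databend itself; the SQL is standard window syntax.

```python
import sqlite3

# Assumption: SQLite stands in for Databend; store_profit and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_profit (store TEXT, year INTEGER, profit INTEGER)")
conn.executemany(
    "INSERT INTO store_profit VALUES (?, ?, ?)",
    [("A", 2023, 50), ("B", 2023, 70), ("C", 2023, 30),
     ("A", 2024, 90), ("B", 2024, 40), ("C", 2024, 60)],
)

# Rank-related: rank stores within each year, most profitable first.
ranked = conn.execute(
    """
    SELECT year, store, profit,
           RANK() OVER (PARTITION BY year ORDER BY profit DESC) AS profit_rank
    FROM store_profit
    ORDER BY year, profit_rank
    """
).fetchall()

# Window frame: running total of each store's profit across years.
running = conn.execute(
    """
    SELECT store, year, profit,
           SUM(profit) OVER (
               PARTITION BY store ORDER BY year
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_profit
    FROM store_profit
    ORDER BY store, year
    """
).fetchall()

for row in ranked:
    print(row)
for row in running:
    print(row)
```

The ranking restarts at 1 for each year because of PARTITION BY year, and the running total grows only over the rows from the start of each store's partition up to the current row.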

Window Function Categories

Databend supports two main categories of window functions:

1. Dedicated Window Functions

These functions are specifically designed for window operations and provide ranking, navigation, and value analysis capabilities.

Function	Description	Example
RANK	Returns rank with gaps	`RANK() OVER (ORDER BY salary DESC)` → `1, 2, 2, 4, ...`
DENSE_RANK	Returns rank without gaps	`DENSE_RANK() OVER (ORDER BY salary DESC)` → `1, 2, 2, 3, ...`
ROW_NUMBER	Returns sequential row number	`ROW_NUMBER() OVER (ORDER BY hire_date)` → `1, 2, 3, 4, ...`
CUME_DIST	Returns cumulative distribution	`CUME_DIST() OVER (ORDER BY score)` → `0.2, 0.4, 0.8, 1.0, ...`
PERCENT_RANK	Returns relative rank (0-1)	`PERCENT_RANK() OVER (ORDER BY score)` → `0.0, 0.25, 0.75, ...`
NTILE	Divides rows into N groups	`NTILE(4) OVER (ORDER BY score)` → `1, 1, 2, 2, 3, 3, 4, 4, ...`
FIRST_VALUE	Returns first value in window	`FIRST_VALUE(product) OVER (PARTITION BY category ORDER BY sales)`
LAST_VALUE	Returns last value in window	`LAST_VALUE(product) OVER (PARTITION BY category ORDER BY sales)`
NTH_VALUE	Returns Nth value in window	`NTH_VALUE(product, 2) OVER (PARTITION BY category ORDER BY sales)`
LEAD	Accesses value from subsequent row	`LEAD(price, 1) OVER (ORDER BY date)` → next day's price
LAG	Accesses value from previous row	`LAG(price, 1) OVER (ORDER BY date)` → previous day's price
FIRST	Returns first value (alias for FIRST_VALUE)	`FIRST(product) OVER (PARTITION BY category ORDER BY sales)`
LAST	Returns last value (alias for LAST_VALUE)	`LAST(product) OVER (PARTITION BY category ORDER BY sales)`
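The difference between RANK, DENSE_RANK, and ROW_NUMBER only shows up when values tie. The following sketch reproduces the `1, 2, 2, 4` and `1, 2, 2, 3` patterns from the table and adds LAG and LEAD for navigation. It assumes a hypothetical `employees` table and uses Python's sqlite3 module as a stand-in for Databend, so engine-specific details may differ.

```python
import sqlite3

# Assumption: SQLite stands in for Databend; the employees data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 100), ("Bob", 90), ("Cid", 90), ("Dee", 80)],  # Bob and Cid tie
)

rows = conn.execute(
    """
    SELECT name, salary,
           RANK()       OVER (ORDER BY salary DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY salary DESC)       AS dense_rnk,
           -- name is added as a tiebreaker so row numbers are deterministic
           ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num,
           LAG(salary, 1)  OVER (ORDER BY salary DESC, name) AS prev_salary,
           LEAD(salary, 1) OVER (ORDER BY salary DESC, name) AS next_salary
    FROM employees
    ORDER BY salary DESC, name
    """
).fetchall()

for row in rows:
    print(row)
```

After the tie, RANK skips to 4 while DENSE_RANK continues with 3. LAG and LEAD return NULL (None in Python) at the first and last rows because there is no neighboring row to read from.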

2. Aggregate Functions Used as Window Functions

These are standard aggregate functions that can be used with the OVER clause to perform window operations.

Function	Description	Window Frame Support	Example
SUM	Calculates sum over window	✓	`SUM(sales) OVER (PARTITION BY region ORDER BY date)`
AVG	Calculates average over window	✓	`AVG(score) OVER (ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)`
COUNT	Counts rows over window	✓	`COUNT(*) OVER (PARTITION BY department)`
MIN	Returns minimum value in window	✓	`MIN(price) OVER (PARTITION BY category)`
MAX	Returns maximum value in window	✓	`MAX(price) OVER (PARTITION BY category)`
ARRAY_AGG	Collects values into array		`ARRAY_AGG(product) OVER (PARTITION BY category)`
STDDEV_POP	Population standard deviation	✓	`STDDEV_POP(value) OVER (PARTITION BY group_id)`
STDDEV_SAMP	Sample standard deviation	✓	`STDDEV_SAMP(value) OVER (PARTITION BY group_id)`
MEDIAN	Median value	✓	`MEDIAN(response_time) OVER (PARTITION BY server)`
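To illustrate how a window frame changes what an aggregate sees, the sketch below combines the `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW` frame from the AVG example with an unframed `COUNT(*)` over a partition. The `readings` table is hypothetical, and Python's sqlite3 module is used as a stand-in engine rather than Databend.

```python
import sqlite3

# Assumption: SQLite stands in for Databend; the readings data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, region TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, "east", 10), (2, "east", 20), (3, "west", 30), (4, "west", 40), (5, "east", 50)],
)

rows = conn.execute(
    """
    SELECT id, score,
           -- framed: average of the current row and up to two rows before it
           AVG(score) OVER (ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg,
           -- unframed: counts every row in the same region
           COUNT(*) OVER (PARTITION BY region) AS region_rows
    FROM readings
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

The first two rows average over fewer than three values because the frame is cut off at the start of the window, whereas `region_rows` is the same for every row of a region.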

Conditional Variants

Function	Description	Window Frame Support	Example
COUNT_IF	Conditional count	✓	`COUNT_IF(status = 'complete') OVER (PARTITION BY dept)`
SUM_IF	Conditional sum	✓	`SUM_IF(amount, status = 'paid') OVER (PARTITION BY customer)`
AVG_IF	Conditional average	✓	`AVG_IF(score, passed = true) OVER (PARTITION BY class)`
MIN_IF	Conditional minimum	✓	`MIN_IF(temp, location = 'outside') OVER (PARTITION BY day)`
MAX_IF	Conditional maximum	✓	`MAX_IF(speed, vehicle = 'car') OVER (PARTITION BY test)`
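The conditional variants can be read as an aggregate that only considers rows matching a condition. The sketch below is an approximation, not Databend code: SQLite has no `COUNT_IF` or `SUM_IF`, so it emulates them with `CASE` expressions inside `COUNT` and `SUM`. The `orders` table is hypothetical, and Databend's handling of partitions with no matching rows may differ from this emulation.

```python
import sqlite3

# Assumption: SQLite stands in for Databend and lacks the *_IF functions,
# so CASE expressions emulate COUNT_IF / SUM_IF. The orders data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("acme", 100, "paid"), ("acme", 50, "open"), ("acme", 25, "paid"),
     ("zen", 40, "paid"), ("zen", 10, "open")],
)

rows = conn.execute(
    """
    SELECT customer, amount, status,
           -- emulates COUNT_IF(status = 'paid') OVER (PARTITION BY customer)
           COUNT(CASE WHEN status = 'paid' THEN 1 END)
               OVER (PARTITION BY customer) AS paid_count,
           -- emulates SUM_IF(amount, status = 'paid') OVER (PARTITION BY customer)
           SUM(CASE WHEN status = 'paid' THEN amount END)
               OVER (PARTITION BY customer) AS paid_total
    FROM orders
    ORDER BY customer, amount DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Every row keeps its own `amount` and `status`, while `paid_count` and `paid_total` reflect only the paid orders of that row's customer.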