Aggregate Functions
This page provides a comprehensive overview of aggregate functions in Databend, organized by functionality for easy reference.
Basic Aggregation
| Function | Description | Example | 
|---|---|---|
| COUNT | Counts the number of rows or non-NULL values | COUNT(*)→10 | 
| COUNT_DISTINCT | Counts distinct values | COUNT(DISTINCT city)→5 | 
| APPROX_COUNT_DISTINCT | Approximates count of distinct values | APPROX_COUNT_DISTINCT(user_id)→9955 | 
| SUM | Calculates the sum of values | SUM(sales)→1250.75 | 
| AVG | Calculates the average of values | AVG(temperature)→72.5 | 
| MIN | Returns the minimum value | MIN(price)→9.99 | 
| MAX | Returns the maximum value | MAX(price)→99.99 | 
| ANY_VALUE | Returns any value from the group | ANY_VALUE(status)→'active' | 
Conditional Aggregation
| Function | Description | Example | 
|---|---|---|
| COUNT_IF | Counts rows that match a condition | COUNT_IF(price > 100)→5 | 
| SUM_IF | Sums values that match a condition | SUM_IF(amount, status = 'completed')→750.25 | 
| AVG_IF | Averages values that match a condition | AVG_IF(score, passed = true)→85.6 | 
| MIN_IF | Returns minimum where condition is true | MIN_IF(temp, location = 'outside')→45.2 | 
| MAX_IF | Returns maximum where condition is true | MAX_IF(speed, vehicle = 'car')→120.5 | 
Statistical Functions
| Function | Description | Example | 
|---|---|---|
| VAR_POP / VARIANCE_POP | Population variance | VAR_POP(height)→10.25 | 
| VAR_SAMP / VARIANCE_SAMP | Sample variance | VAR_SAMP(height)→12.3 | 
| STDDEV_POP | Population standard deviation | STDDEV_POP(height)→3.2 | 
| STDDEV_SAMP | Sample standard deviation | STDDEV_SAMP(height)→3.5 | 
| COVAR_POP | Population covariance | COVAR_POP(x, y)→2.5 | 
| COVAR_SAMP | Sample covariance | COVAR_SAMP(x, y)→2.7 | 
| KURTOSIS | Measures peakedness of distribution | KURTOSIS(values)→2.1 | 
| SKEWNESS | Measures asymmetry of distribution | SKEWNESS(values)→0.2 | 
Percentile and Distribution
| Function | Description | Example | 
|---|---|---|
| MEDIAN | Calculates the median value | MEDIAN(response_time)→125 | 
| MODE | Returns the most frequent value | MODE(category)→'electronics' | 
| QUANTILE_CONT | Continuous interpolation quantile | QUANTILE_CONT(0.95)(response_time)→350.5 | 
| QUANTILE_DISC | Discrete quantile | QUANTILE_DISC(0.5)(age)→35 | 
| QUANTILE_TDIGEST | Approximate quantile using t-digest | QUANTILE_TDIGEST(0.9)(values)→95.2 | 
| QUANTILE_TDIGEST_WEIGHTED | Weighted t-digest quantile | QUANTILE_TDIGEST_WEIGHTED(0.5)(values, weights)→50.5 | 
| MEDIAN_TDIGEST | Approximate median using t-digest | MEDIAN_TDIGEST(response_time)→124.5 | 
| HISTOGRAM | Creates histogram buckets | HISTOGRAM(10)(values)→[{...}] | 
Array and Collection Aggregation
| Function | Description | Example | 
|---|---|---|
| ARRAY_AGG | Collects values into an array | ARRAY_AGG(product)→['A', 'B', 'C'] | 
| GROUP_ARRAY_MOVING_AVG | Moving average over array | GROUP_ARRAY_MOVING_AVG(3)(values)→[null, null, 3.0, 6.0, 9.0] | 
| GROUP_ARRAY_MOVING_SUM | Moving sum over array | GROUP_ARRAY_MOVING_SUM(2)(values)→[null, 3, 7, 11, 15] | 
String Aggregation
| Function | Description | Example | 
|---|---|---|
| GROUP_CONCAT | Concatenates values with separator | GROUP_CONCAT(city, ', ')→'New York, London, Tokyo' | 
| STRING_AGG | Concatenates strings with separator | STRING_AGG(tag, ',')→'red,green,blue' | 
| LISTAGG | Concatenates values with separator | LISTAGG(name, ', ')→'Alice, Bob, Charlie' | 
JSON Aggregation
| Function | Description | Example | 
|---|---|---|
| JSON_ARRAY_AGG | Aggregates values as JSON array | JSON_ARRAY_AGG(name)→'["Alice", "Bob", "Charlie"]' | 
| JSON_OBJECT_AGG | Creates JSON object from key-value pairs | JSON_OBJECT_AGG(name, score)→'{"Alice": 95, "Bob": 87}' | 
Argument Selection
| Function | Description | Example | 
|---|---|---|
| ARG_MAX | Returns value of expr1 at maximum expr2 | ARG_MAX(name, score)→'Alice' | 
| ARG_MIN | Returns value of expr1 at minimum expr2 | ARG_MIN(name, score)→'Charlie' | 
Funnel Analysis
| Function | Description | Example | 
|---|---|---|
| RETENTION | Calculates retention rates | RETENTION(action = 'signup', action = 'purchase')→[100, 40] | 
| WINDOWFUNNEL | Searches for event sequences within time window | WINDOWFUNNEL(1800)(timestamp, event='view', event='click', event='purchase')→2 |