跳到主要内容

Aggregate Functions

This page provides a comprehensive overview of aggregate functions in Databend, organized by functionality for easy reference.

Basic Aggregation

FunctionDescriptionExample
COUNTCounts the number of rows or non-NULL valuesCOUNT(*)10
COUNT_DISTINCTCounts distinct valuesCOUNT(DISTINCT city)5
APPROX_COUNT_DISTINCTApproximates count of distinct valuesAPPROX_COUNT_DISTINCT(user_id)9955
SUMCalculates the sum of valuesSUM(sales)1250.75
AVGCalculates the average of valuesAVG(temperature)72.5
MINReturns the minimum valueMIN(price)9.99
MAXReturns the maximum valueMAX(price)99.99
ANYReturns any value from the groupANY(status)'active'

Conditional Aggregation

FunctionDescriptionExample
COUNT_IFCounts rows that match a conditionCOUNT_IF(price > 100)5
SUM_IFSums values that match a conditionSUM_IF(amount, status = 'completed')750.25
AVG_IFAverages values that match a conditionAVG_IF(score, passed = true)85.6
MIN_IFReturns minimum where condition is trueMIN_IF(temp, location = 'outside')45.2
MAX_IFReturns maximum where condition is trueMAX_IF(speed, vehicle = 'car')120.5

Statistical Functions

FunctionDescriptionExample
VAR_POP / VARIANCE_POPPopulation varianceVAR_POP(height)10.25
VAR_SAMP / VARIANCE_SAMPSample varianceVAR_SAMP(height)12.3
STDDEV_POPPopulation standard deviationSTDDEV_POP(height)3.2
STDDEV_SAMPSample standard deviationSTDDEV_SAMP(height)3.5
COVAR_POPPopulation covarianceCOVAR_POP(x, y)2.5
COVAR_SAMPSample covarianceCOVAR_SAMP(x, y)2.7
KURTOSISMeasures peakedness of distributionKURTOSIS(values)2.1
SKEWNESSMeasures asymmetry of distributionSKEWNESS(values)0.2

Percentile and Distribution

FunctionDescriptionExample
MEDIANCalculates the median valueMEDIAN(response_time)125
MODEReturns the most frequent valueMODE(category)'electronics'
QUANTILE_CONTContinuous interpolation quantileQUANTILE_CONT(0.95)(response_time)350.5
QUANTILE_DISCDiscrete quantileQUANTILE_DISC(0.5)(age)35
QUANTILE_TDIGESTApproximate quantile using t-digestQUANTILE_TDIGEST(0.9)(values)95.2
QUANTILE_TDIGEST_WEIGHTEDWeighted t-digest quantileQUANTILE_TDIGEST_WEIGHTED(0.5)(values, weights)50.5
MEDIAN_TDIGESTApproximate median using t-digestMEDIAN_TDIGEST(response_time)124.5
HISTOGRAMCreates histogram bucketsHISTOGRAM(10)(values)[{...}]

Array and Collection Aggregation

FunctionDescriptionExample
ARRAY_AGGCollects values into an arrayARRAY_AGG(product)['A', 'B', 'C']
GROUP_ARRAY_MOVING_AVGMoving average over arrayGROUP_ARRAY_MOVING_AVG(3)(values)[null, null, 3.0, 6.0, 9.0]
GROUP_ARRAY_MOVING_SUMMoving sum over arrayGROUP_ARRAY_MOVING_SUM(2)(values)[null, 3, 7, 11, 15]

String Aggregation

FunctionDescriptionExample
GROUP_CONCATConcatenates values with separatorGROUP_CONCAT(city, ', ')'New York, London, Tokyo'
STRING_AGGConcatenates strings with separatorSTRING_AGG(tag, ',')'red,green,blue'
LISTAGGConcatenates values with separatorLISTAGG(name, ', ')'Alice, Bob, Charlie'

JSON Aggregation

FunctionDescriptionExample
JSON_ARRAY_AGGAggregates values as JSON arrayJSON_ARRAY_AGG(name)'["Alice", "Bob", "Charlie"]'
JSON_OBJECT_AGGCreates JSON object from key-value pairsJSON_OBJECT_AGG(name, score)'{"Alice": 95, "Bob": 87}'

Argument Selection

FunctionDescriptionExample
ARG_MAXReturns value of expr1 at maximum expr2ARG_MAX(name, score)'Alice'
ARG_MINReturns value of expr1 at minimum expr2ARG_MIN(name, score)'Charlie'

Funnel Analysis

FunctionDescriptionExample
RETENTIONCalculates retention ratesRETENTION(action = 'signup', action = 'purchase')[100, 40]
WINDOWFUNNELSearches for event sequences within time windowWINDOWFUNNEL(1800)(timestamp, event='view', event='click', event='purchase')2