Structured & Semi-Structured Functions
Structured and semi-structured functions in Databend enable efficient processing of arrays, objects, maps, JSON, and other structured data formats. These functions provide comprehensive capabilities for creating, parsing, querying, transforming, and manipulating structured and semi-structured data.
JSON Functions
Parsing & Validation
| Function | Description | Example |
|---|---|---|
| PARSE_JSON | Parses a JSON string into a variant value | PARSE_JSON('[1,2,3]') |
| CHECK_JSON | Validates if a string is valid JSON | CHECK_JSON('{"a":1}') |
| JSON_TYPEOF | Returns the type of a JSON value | JSON_TYPEOF(PARSE_JSON('[1,2,3]')) |
Path-based Querying
| Function | Description | Example |
|---|---|---|
| JSON_PATH_EXISTS | Checks if a JSON path exists | JSON_PATH_EXISTS(json_obj, '$.name') |
| JSON_PATH_QUERY | Queries JSON data using JSONPath | JSON_PATH_QUERY(json_obj, '$.items[*]') |
| JSON_PATH_QUERY_ARRAY | Queries JSON data and returns results as an array | JSON_PATH_QUERY_ARRAY(json_obj, '$.items') |
| JSON_PATH_QUERY_FIRST | Returns the first result from a JSON path query | JSON_PATH_QUERY_FIRST(json_obj, '$.items[*]') |
| JSON_PATH_MATCH | Matches JSON values against a path pattern | JSON_PATH_MATCH(json_obj, '$.age') |
| JQ | Advanced JSON processing using jq syntax | JQ('.name', json_obj) |
Value Extraction
| Function | Description | Example |
|---|---|---|
| GET | Gets a value from a JSON object by key or array by index | GET(PARSE_JSON('[1,2,3]'), 0) |
| GET_PATH | Gets a value from a JSON object using a path expression | GET_PATH(json_obj, 'user.name') |
| GET_IGNORE_CASE | Gets a value with case-insensitive key matching | GET_IGNORE_CASE(json_obj, 'NAME') |
| JSON_EXTRACT_PATH_TEXT | Extracts text value from JSON using path | JSON_EXTRACT_PATH_TEXT(json_obj, 'name') |
Transformation & Output
| Function | Description | Example |
|---|---|---|
| JSON_TO_STRING | Converts a JSON value to a string | JSON_TO_STRING(PARSE_JSON('{"a":1}')) |
| JSON_PRETTY | Formats JSON with proper indentation | JSON_PRETTY(PARSE_JSON('{"a":1}')) |
| STRIP_NULL_VALUE | Removes null values from JSON | STRIP_NULL_VALUE(PARSE_JSON('{"a":1,"b":null}')) |
Array/Object Expansion
| Function | Description | Example |
|---|---|---|
| JSON_EACH | Expands JSON object into key-value pairs | JSON_EACH(PARSE_JSON('{"a":1,"b":2}')) |
| JSON_ARRAY_ELEMENTS | Expands JSON array into individual elements | JSON_ARRAY_ELEMENTS(PARSE_JSON('[1,2,3]')) |
Array Functions
| Function | Description | Example |
|---|---|---|
| ARRAY | Builds an array from expressions | ARRAY(1, 2, 3) |
| ARRAY_CONSTRUCT | Creates an array from individual values | ARRAY_CONSTRUCT(1, 2, 3) |
| RANGE | Generates an array of sequential numbers | RANGE(1, 5) |
| ARRAY_GENERATE_RANGE | Generates a sequence with optional step | ARRAY_GENERATE_RANGE(0, 6, 2) |
| GET | Gets an element from an array by index | GET([1,2,3], 0) |
| ARRAY_GET | Alias for GET function | ARRAY_GET([1,2,3], 1) |
| CONTAINS | Checks if an array contains a specific value | CONTAINS([1,2,3], 2) |
| ARRAY_CONTAINS | Checks if an array contains a specific value | ARRAY_CONTAINS([1,2,3], 2) |
| ARRAY_SIZE | Returns array length (alias: ARRAY_LENGTH) | ARRAY_SIZE([1,2,3]) |
| ARRAY_COUNT | Counts the non-NULL elements | ARRAY_COUNT([1,NULL,2]) |
| ARRAY_ANY | Returns the first non-NULL entry | ARRAY_ANY([NULL,'a','b']) |
| ARRAY_APPEND | Appends an element to the end of an array | ARRAY_APPEND([1,2], 3) |
| ARRAY_PREPEND | Prepends an element to the beginning of an array | ARRAY_PREPEND([2,3], 1) |
| ARRAY_INSERT | Inserts an element at a specific position | ARRAY_INSERT([1,3], 1, 2) |
| ARRAY_REMOVE | Removes all occurrences of a specified element | ARRAY_REMOVE([1,2,2,3], 2) |
| ARRAY_REMOVE_FIRST | Removes the first element from an array | ARRAY_REMOVE_FIRST([1,2,3]) |
| ARRAY_REMOVE_LAST | Removes the last element from an array | ARRAY_REMOVE_LAST([1,2,3]) |
| ARRAY_CONCAT | Concatenates multiple arrays | ARRAY_CONCAT([1,2], [3,4]) |
| ARRAY_SLICE | Extracts a portion of an array | ARRAY_SLICE([1,2,3,4], 1, 2) |
| SLICE | Alias for ARRAY_SLICE function | SLICE([1,2,3,4], 1, 2) |
| ARRAYS_ZIP | Combines multiple arrays element-wise | ARRAYS_ZIP([1,2], ['a','b']) |
| ARRAY_DISTINCT | Returns unique elements from an array | ARRAY_DISTINCT([1,2,2,3]) |
| ARRAY_UNIQUE | Alias for ARRAY_DISTINCT function | ARRAY_UNIQUE([1,2,2,3]) |
| ARRAY_INTERSECTION | Returns common elements between arrays | ARRAY_INTERSECTION([1,2,3], [2,3,4]) |
| ARRAY_EXCEPT | Returns elements in first array but not in second | ARRAY_EXCEPT([1,2,3], [2,3]) |
| ARRAY_OVERLAP | Checks if arrays have common elements | ARRAY_OVERLAP([1,2], [2,3]) |
| ARRAY_TRANSFORM | Applies a function to each array element | ARRAY_TRANSFORM([1,2,3], x -> x * 2) |
| ARRAY_FILTER | Filters array elements based on a condition | ARRAY_FILTER([1,2,3,4], x -> x > 2) |
| ARRAY_REDUCE | Reduces array to a single value using aggregation | ARRAY_REDUCE([1,2,3], 0, (acc, x) -> acc + x) |
| ARRAY_AGGREGATE | Aggregates array elements using a function | ARRAY_AGGREGATE([1,2,3], 'sum') |
| ARRAY_SUM | Sum of numeric values | ARRAY_SUM([1,2,3]) |
| ARRAY_AVG | Average of numeric values | ARRAY_AVG([1,2,3]) |
| ARRAY_MEDIAN | Median of numeric values | ARRAY_MEDIAN([1,3,2]) |
| ARRAY_MIN | Minimum value | ARRAY_MIN([1,2,3]) |
| ARRAY_MAX | Maximum value | ARRAY_MAX([1,2,3]) |
| ARRAY_STDDEV_POP | Population standard deviation | ARRAY_STDDEV_POP([1,2,3]) |
| ARRAY_STDDEV_SAMP | Sample standard deviation | ARRAY_STDDEV_SAMP([1,2,3]) |
| ARRAY_KURTOSIS | Excess kurtosis | ARRAY_KURTOSIS([1,2,3,4]) |
| ARRAY_SKEWNESS | Skewness | ARRAY_SKEWNESS([1,2,3,10]) |
| ARRAY_APPROX_COUNT_DISTINCT | Approximate distinct count | ARRAY_APPROX_COUNT_DISTINCT([1,1,2]) |
| ARRAY_SORT | Sorts values; variants control order/nulls | ARRAY_SORT([3,1,2]) |
| ARRAY_TO_STRING | Joins array elements | ARRAY_TO_STRING(['a','b'], ',') |
| ARRAY_COMPACT | Removes null values from an array | ARRAY_COMPACT([1, NULL, 2, NULL, 3]) |
| ARRAY_FLATTEN | Flattens nested arrays into a single array | ARRAY_FLATTEN([[1,2], [3,4]]) |
| ARRAY_REVERSE | Reverses the order of array elements | ARRAY_REVERSE([1,2,3]) |
| ARRAY_INDEXOF | Returns the index of first occurrence of an element | ARRAY_INDEXOF([1,2,3,2], 2) |
| UNNEST | Expands an array into individual rows | UNNEST([1,2,3]) |
Object Functions
| Function | Description | Example |
|---|---|---|
| OBJECT_CONSTRUCT | Creates a JSON object from key-value pairs | OBJECT_CONSTRUCT('name', 'John', 'age', 30) |
| OBJECT_CONSTRUCT_KEEP_NULL | Creates a JSON object keeping null values | OBJECT_CONSTRUCT_KEEP_NULL('a', 1, 'b', NULL) |
| OBJECT_KEYS | Returns all keys from a JSON object as an array | OBJECT_KEYS(PARSE_JSON('{"a":1,"b":2}')) |
| OBJECT_INSERT | Inserts or updates a key-value pair in a JSON object | OBJECT_INSERT(json_obj, 'new_key', 'value') |
| OBJECT_DELETE | Removes a key-value pair from a JSON object | OBJECT_DELETE(json_obj, 'key_to_remove') |
| OBJECT_PICK | Creates a new object with only specified keys | OBJECT_PICK(json_obj, 'name', 'age') |
Map Functions
| Function | Description | Example |
|---|---|---|
| MAP_CAT | Combines multiple maps into a single map | MAP_CAT({'a':1}, {'b':2}) |
| MAP_KEYS | Returns all keys from a map as an array | MAP_KEYS({'a':1, 'b':2}) |
| MAP_VALUES | Returns all values from a map as an array | MAP_VALUES({'a':1, 'b':2}) |
| MAP_SIZE | Returns the number of key-value pairs in a map | MAP_SIZE({'a':1, 'b':2}) |
| MAP_CONTAINS_KEY | Checks if a map contains a specific key | MAP_CONTAINS_KEY({'a':1}, 'a') |
| MAP_INSERT | Inserts a key-value pair into a map | MAP_INSERT({'a':1}, 'b', 2) |
| MAP_DELETE | Removes a key-value pair from a map | MAP_DELETE({'a':1, 'b':2}, 'b') |
| MAP_TRANSFORM_KEYS | Applies a function to each key in a map | MAP_TRANSFORM_KEYS(map, k -> UPPER(k)) |
| MAP_TRANSFORM_VALUES | Applies a function to each value in a map | MAP_TRANSFORM_VALUES(map, v -> v * 2) |
| MAP_FILTER | Filters key-value pairs based on a predicate | MAP_FILTER(map, (k, v) -> v > 10) |
| MAP_PICK | Creates a new map with only specified keys | MAP_PICK({'a':1, 'b':2, 'c':3}, 'a', 'c') |
Type Conversion Functions
| Function | Description | Example |
|---|---|---|
| AS_BOOLEAN | Converts a VARIANT value to BOOLEAN | AS_BOOLEAN(PARSE_JSON('true')) |
| AS_INTEGER | Converts a VARIANT value to BIGINT | AS_INTEGER(PARSE_JSON('42')) |
| AS_FLOAT | Converts a VARIANT value to DOUBLE | AS_FLOAT(PARSE_JSON('3.14')) |
| AS_DECIMAL | Converts a VARIANT value to DECIMAL | AS_DECIMAL(PARSE_JSON('12.34')) |
| AS_STRING | Converts a VARIANT value to STRING | AS_STRING(PARSE_JSON('"hello"')) |
| AS_BINARY | Converts a VARIANT value to BINARY | AS_BINARY(TO_BINARY('abcd')::VARIANT) |
| AS_DATE | Converts a VARIANT value to DATE | AS_DATE(TO_DATE('2025-10-11')::VARIANT) |
| AS_ARRAY | Converts a VARIANT value to ARRAY | AS_ARRAY(PARSE_JSON('[1,2,3]')) |
| AS_OBJECT | Converts a VARIANT value to OBJECT | AS_OBJECT(PARSE_JSON('{"a":1}')) |
Type Predicate Functions
| Function | Description | Example |
|---|---|---|
| IS_ARRAY | Checks if a JSON value is an array | IS_ARRAY(PARSE_JSON('[1,2,3]')) |
| IS_OBJECT | Checks if a JSON value is an object | IS_OBJECT(PARSE_JSON('{"a":1}')) |
| IS_STRING | Checks if a JSON value is a string | IS_STRING(PARSE_JSON('"hello"')) |
| IS_INTEGER | Checks if a JSON value is an integer | IS_INTEGER(PARSE_JSON('42')) |
| IS_FLOAT | Checks if a JSON value is a floating-point number | IS_FLOAT(PARSE_JSON('3.14')) |
| IS_BOOLEAN | Checks if a JSON value is a boolean | IS_BOOLEAN(PARSE_JSON('true')) |
| IS_NULL_VALUE | Checks if a JSON value is null | IS_NULL_VALUE(PARSE_JSON('null')) |