COSINE_DISTANCE
Calculates the cosine distance between two vectors, measuring how dissimilar they are.
Syntax
COSINE_DISTANCE(vector1, vector2)
Arguments
vector1
: First vector (ARRAY(FLOAT NOT NULL))vector2
: Second vector (ARRAY(FLOAT NOT NULL))
Returns
Returns a FLOAT value between 0 and 1:
- 0: Identical vectors (completely similar)
- 1: Orthogonal vectors (completely dissimilar)
Description
The cosine distance measures the dissimilarity between two vectors based on the angle between them, regardless of their magnitude. The function:
- Verifies that both input vectors have the same length
- Computes the sum of element-wise products (dot product) of the two vectors
- Calculates the square root of the sum of squares for each vector (vector magnitudes)
- Returns
1 - (dot_product / (magnitude1 * magnitude2))
The mathematical formula implemented is:
cosine_distance(v1, v2) = 1 - (Σ(v1ᵢ * v2ᵢ) / (√Σ(v1ᵢ²) * √Σ(v2ᵢ²)))
Where v1ᵢ and v2ᵢ are the elements of the input vectors.
info
This function performs vector computations within Databend and does not rely on external APIs.
Examples
Create a table with vector data:
CREATE OR REPLACE TABLE vectors (
id INT,
vec ARRAY(FLOAT NOT NULL)
);
INSERT INTO vectors VALUES
(1, [1.0000, 2.0000, 3.0000]),
(2, [1.0000, 2.2000, 3.0000]),
(3, [4.0000, 5.0000, 6.0000]);
Find the vector most similar to [1, 2, 3]:
SELECT
vec,
COSINE_DISTANCE(vec, [1.0000, 2.0000, 3.0000]) AS distance
FROM
vectors
ORDER BY
distance ASC
LIMIT 1;
+-------------------------+----------+
| vec | distance |
+-------------------------+----------+
| [1.0000,2.2000,3.0000] | 0.0 |
+-------------------------+----------+