CREATE NGRAM INDEX

Introduced or updated: v1.2.726

ENTERPRISE EDITION FEATURE

NGRAM INDEX is an Enterprise Edition feature. Contact Databend Support for a license.

Creates an Ngram index on a column for a table.

Syntax

-- Create an Ngram index on an existing table
CREATE [OR REPLACE] NGRAM INDEX [IF NOT EXISTS] <index_name>
ON [<database>.]<table_name>(<column>)
[gram_size = <number>] [bloom_size = <number>]

-- Create an Ngram index when creating a table
CREATE [OR REPLACE] TABLE <table_name> (
    <column_definitions>,
    NGRAM INDEX <index_name> (<column>)
        [gram_size = <number>] [bloom_size = <number>]
)...

gram_size (defaults to 3) specifies the length of each character-based substring (n-gram) when the column text is indexed. For example, with gram_size = 3, the text "hello world" would be split into overlapping substrings like:
```
"hel", "ell", "llo", "lo ", "o w", " wo", "wor", "orl", "rld"
```
bloom_size specifies the size in bytes of the Bloom filter bitmap used to accelerate string matching within each block of data. It controls the trade-off between index accuracy and memory usage:
- A larger bloom_size reduces false positives in string lookups, improving query precision at the cost of more memory.
- A smaller bloom_size saves memory but may increase false positives.
- If not explicitly set, the default is 1,048,576 bytes (1m) per indexed column per block. The valid range is from 512 bytes to 10,485,760 bytes (10m).

Examples

Creating a Table with NGRAM Index

CREATE TABLE articles (
    id INT,
    title VARCHAR,
    content STRING,
    NGRAM INDEX idx_content (content)
);

Creating an NGRAM Index on an Existing Table

CREATE TABLE products (
    id INT,
    name VARCHAR,
    description STRING
);

CREATE NGRAM INDEX idx_description
ON products(description);

Viewing Indexes

SHOW INDEXES;

Result:

┌─────────────────┬───────┬──────────┬─────────────────────────┬──────────────────────────┐
│ name            │ type  │ original │ definition              │ created_on               │
├─────────────────┼───────┼──────────┼─────────────────────────┼──────────────────────────┤
│ idx_content     │ NGRAM │          │ articles(content)       │ 2025-05-13 01:22:34.123  │
│ idx_description │ NGRAM │          │ products(description)   │ 2025-05-13 01:23:45.678  │
└─────────────────┴───────┴──────────┴─────────────────────────┴──────────────────────────┘

Using NGRAM Index

-- Create a table with NGRAM index
CREATE TABLE phrases (
    id INT,
    text STRING,
    NGRAM INDEX idx_text (text)
);

-- Insert sample data
INSERT INTO phrases VALUES
(1, 'apple banana cherry'),
(2, 'banana date fig'),
(3, 'cherry elderberry fig'),
(4, 'date grape kiwi');

-- Query using fuzzy matching with the NGRAM index
SELECT * FROM phrases WHERE text LIKE '%banana%';

Result:

┌────┬─────────────────────┐
│ id │ text                │
├────┼─────────────────────┤
│  1 │ apple banana cherry │
│  2 │ banana date fig     │
└────┴─────────────────────┘

Dropping an NGRAM Index

DROP NGRAM INDEX idx_text ON phrases;

CREATE NGRAM INDEX

Syntax

Examples

Creating a Table with NGRAM Index

Creating an NGRAM Index on an Existing Table

Viewing Indexes

Using NGRAM Index

Dropping an NGRAM Index

Join our growing community

GitHub

Slack

X(Twitter)

YouTube

Explore Databend Cloud for FREE

Syntax​

Examples​

Creating a Table with NGRAM Index​

Creating an NGRAM Index on an Existing Table​

Viewing Indexes​

Using NGRAM Index​

Dropping an NGRAM Index​

Join our growing community

GitHub

Slack

X(Twitter)

YouTube

Explore Databend Cloud for FREE

Syntax

Examples

Creating a Table with NGRAM Index

Creating an NGRAM Index on an Existing Table

Viewing Indexes

Using NGRAM Index

Dropping an NGRAM Index