Skip to main content

REGEXP_SUBSTR

Returns the substring of the string expr that matches the regular expression specified by the pattern pat, NULL if there is no match. If expr or pat is NULL, the return value is NULL.

  • REGEXP_SUBSTR does not support extracting capture groups (subpatterns defined by parentheses ()). It returns the entire matched substring instead of specific captured groups.
SELECT REGEXP_SUBSTR('abc123', '(\w+)(\d+)');
-- Returns 'abc123' (the entire match), not 'abc' or '123'.

-- Alternative Solution: Use string functions like SUBSTRING and REGEXP_INSTR to manually extract the desired portion of the string:
SELECT SUBSTRING('abc123', 1, REGEXP_INSTR('abc123', '\d+') - 1);
-- Returns 'abc' (extracts the part before the digits).
SELECT SUBSTRING('abc123', REGEXP_INSTR('abc123', '\d+'));
-- Returns '123' (extracts the digits).
  • REGEXP_SUBSTR does not support the e parameter (used in Snowflake to extract capture groups) or the group_num parameter for specifying which capture group to return.
SELECT REGEXP_SUBSTR('abc123', '(\w+)(\d+)', 1, 1, 'e', 1);
-- Error: Databend does not support the 'e' parameter or capture group extraction.

-- Alternative Solution: Use string functions like SUBSTRING and LOCATE to manually extract the desired substring, or preprocess the data with external tools (e.g., Python) to extract capture groups before querying.
SELECT SUBSTRING(
REGEXP_SUBSTR('letters:abc,numbers:123', 'letters:[a-z]+,numbers:[0-9]+'),
LOCATE('letters:', 'letters:abc,numbers:123') + 8,
LOCATE(',', 'letters:abc,numbers:123') - (LOCATE('letters:', 'letters:abc,numbers:123') + 8)
);
-- Returns 'abc'

Syntax

REGEXP_SUBSTR(<expr>, <pat[, pos[, occurrence[, match_type]]]>)

Arguments

ArgumentsDescription
exprThe string expr that to be matched
patThe regular expression
posOptional. The position in expr at which to start the search. If omitted, the default is 1.
occurrenceOptional. Which occurrence of a match to search for. If omitted, the default is 1.
match_typeOptional. A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE().

Return Type

VARCHAR

Examples

SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
+----------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+') |
+----------------------------------------+
| abc |
+----------------------------------------+

SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
+----------------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3) |
+----------------------------------------------+
| ghi |
+----------------------------------------------+

SELECT REGEXP_SUBSTR('周 周周 周周周 周周周周', '周+', 2, 3);
+------------------------------------------------------------------+
| REGEXP_SUBSTR('周 周周 周周周 周周周周', '周+', 2, 3) |
+------------------------------------------------------------------+
| 周周周周 |
+------------------------------------------------------------------+

Explore Databend Cloud for FREE
Low-cost
Fast Analytics
Easy Data Ingestion
Elastic Scaling
Try it today