regexCount Guide

Overview

regexCount(value, regexPattern) counts matches of a JavaScript (ECMAScript) regular expression in a string and returns a number (0, 1, 2, …). It’s ideal for richer checks such as case-insensitivity, alternatives (x|y|z), word boundaries (\b), and quantified patterns.

Scope: regexCount counts matches. It does not extract matched text or transform data.

Data type of output: Number

Syntax

regexCount(value, regexPattern)
  • value: string (e.g. 'text') or variable (e.g. %TEXT)

  • regexPattern: a string or variable containing a JavaScript-flavoured (ECMAScript) regex in /pattern/flags format, e.g. '/\b\d{5}(-\d{4})?\b/g'

    • Regex pattern must be wrapped in forward slashes e.g. '/pattern/'

    • Flags can be added after the closing forward-slash e.g. '/pattern/flags'

    • Common flags:

      • g – global (count all matches)

      • i – case-insensitive

Example

regexCount(%TEXT, '/cat/') //given a value of "My favourite animal is a cat." it will return 1.

To turn the count into a boolean, so that the expression passes or fails, compare it with a threshold:

regexCount(%TEXT, '/cat/')>=1 //Given the same sentence it will return true.
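To experiment with a pattern outside the platform, the behaviour described above can be approximated in plain TypeScript. This is a minimal sketch, not the engine's implementation: countMatches is a hypothetical helper, and it assumes that only the first match is counted when the g flag is absent (consistent with the Troubleshooting section).

```typescript
// Hypothetical local stand-in for regexCount, for experimenting with patterns.
// It is an approximation and not the engine's actual implementation.
function countMatches(value: string, regexPattern: string): number {
  // Split '/pattern/flags' into the pattern body and the trailing flags.
  const parts = /^\/([\s\S]+)\/([a-z]*)$/.exec(regexPattern);
  if (parts === null) {
    throw new Error("Regex must be wrapped in slashes, e.g. /abc/g");
  }
  const regex = new RegExp(parts[1], parts[2]);
  const matches = value.match(regex);
  if (matches === null) {
    return 0;
  }
  // Assumption: every match is counted with the g flag, otherwise only the first.
  return regex.global ? matches.length : 1;
}

const sentence = "My favourite animal is a cat.";
console.log(countMatches(sentence, "/cat/"));      // 1
console.log(countMatches(sentence, "/cat/") >= 1); // true
```

Inside a TypeScript string literal a backslash has to be doubled (for example "/\\bcat\\b/"), which is not needed in the single-quoted expressions shown in this guide.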

Regex Library

Access a collection of example regex patterns to support learning or to incorporate into your knowledge graph logic.

Use case
Regex (includes escaping and single quotes)
Example input/output

Count US Zip code

'/\b\d{5}(-\d{4})?\b/g'

12345 // returns 1

12345-6789 // returns 1

1234 // returns 0

Count UK postcodes

'/\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b/gi'

SW1A 1AA // returns 1

Sw1a 1aa // returns 1

D343 EEF // returns 0

Count email addresses

'/[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}/gi'

john.smith@example.com // returns 1

john.smith@example // returns 0

Count UK National Insurance Number

'/[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]/g'

AB123456C, ZZ000000A, invalidNINumber // returns 2

Count occurrences of words from a list

'/\b(apples|bananas|grapes|oranges|pears)\b/g'

I like apples, grapes, and oranges // returns 3

Count dates (YYYY-MM-DD format)

'/\d{4}-\d{2}-\d{2}/g'

Start: 2023-12-01, End: 2024-01-20, Next: 2025-06-11 // returns 3

Validate that a string contains only letters and spaces

'/^[A-Za-z ]+$/'

John Smith // returns 1

John M. Smith // returns 0
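A few of the library patterns can be checked locally with native JavaScript regex literals, which use the same pattern and flags without the surrounding quotes. The sketch below is illustrative only: countWith is a hypothetical stand-in for regexCount, and it assumes all matches are counted only when the g flag is set.

```typescript
// Counts matches of a native RegExp; a hypothetical stand-in for regexCount.
function countWith(regex: RegExp, value: string): number {
  const matches = value.match(regex);
  if (matches === null) {
    return 0;
  }
  // Assumption: all matches are counted only when the g flag is set.
  return regex.global ? matches.length : 1;
}

// [pattern, input, expected count], using examples from the library above.
const libraryCases: [RegExp, string, number][] = [
  [/\b\d{5}(-\d{4})?\b/g, "12345", 1],
  [/\b\d{5}(-\d{4})?\b/g, "12345-6789", 1],
  [/\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b/gi, "Sw1a 1aa", 1],
  [/\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b/gi, "D343 EEF", 0],
  [/[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]/g, "AB123456C, ZZ000000A, invalidNINumber", 2],
  [/\b(apples|bananas|grapes|oranges|pears)\b/g, "I like apples, grapes, and oranges", 3],
  [/\d{4}-\d{2}-\d{2}/g, "Start: 2023-12-01, End: 2024-01-20, Next: 2025-06-11", 3],
  [/^[A-Za-z ]+$/, "John Smith", 1],
  [/^[A-Za-z ]+$/, "John M. Smith", 0],
];

for (const [regex, input, expected] of libraryCases) {
  const actual = countWith(regex, input);
  console.log(input + " -> " + actual + " (expected " + expected + ")");
}
```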

Troubleshooting

Problem
Fix

The following expression resulted in an error message

Function: regexCount(%S, 'cat')

Error message: Expression error at position 0 - In regexCount - Regex must be wrapped in slashes, e.g. `/abc/g`

Wrap the regex in forward slashes

regexCount(%S, '/cat/')

The following expression found 0 matches when using a subject of cat:

regexCount(%S, '/Cat | cat/')

Remove the spaces within the regex

regexCount(%S, '/Cat|cat/')

The following found 0 matches when using a subject of Cat (note: uppercase C):

regexCount(%S, '/cat/')

Add the case-insensitive flag i to ignore case

regexCount(%S, '/cat/i')

The following expression returns 1 when a count of 2 was expected:

regexCount('The weather was sunny and warm', '/hot|warm|nice|sunny|pleasant/i')

Add the global flag g to count all matches

regexCount('The weather was sunny and warm', '/hot|warm|nice|sunny|pleasant/gi')
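The three matching problems above (stray spaces, case sensitivity and a missing g flag) can be reproduced with native JavaScript regex literals. This is a rough sketch under the assumption that regexCount follows standard ECMAScript matching; countLiteral is a hypothetical helper, not part of the product.

```typescript
// Hypothetical stand-in for regexCount using native RegExp literals.
function countLiteral(regex: RegExp, value: string): number {
  const matches = value.match(regex);
  if (matches === null) {
    return 0;
  }
  return regex.global ? matches.length : 1;
}

// Spaces inside the pattern are literal characters that must be matched.
console.log(countLiteral(/Cat | cat/, "cat")); // 0
console.log(countLiteral(/Cat|cat/, "cat"));   // 1

// Matching is case-sensitive unless the i flag is set.
console.log(countLiteral(/cat/, "Cat"));  // 0
console.log(countLiteral(/cat/i, "Cat")); // 1

// Without the g flag, counting stops at the first match.
const weather = "The weather was sunny and warm";
console.log(countLiteral(/hot|warm|nice|sunny|pleasant/i, weather));  // 1
console.log(countLiteral(/hot|warm|nice|sunny|pleasant/gi, weather)); // 2
```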

Error messages

If the engine hits an error at runtime, the regexCount function returns a negative number instead of a match count.

The current error codes are listed below:

Error code
Name
Description

-1

Unsafe

Vulnerable regex pattern

-2

Evaluation failed

Error at runtime

-3

Invalid syntax

Regex failed to compile

-4

Pattern exceeded

Pattern length > max allowed (max 500 characters)

-5

Input exceeded

Input string too long (max 10,000 characters)

-6

Invalid format

Not in `/pattern/flags` format

-7

Check timeout

ReDoS (excessive processing time) analysis timed out

During development, consider creating a dedicated relationship and rule for testing the function's inputs and outputs before embedding it in your business logic.

This will enable you to validate the regex function and surface any error codes.
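When reviewing test results, it can help to translate a raw result into a readable message. The sketch below is one possible way to do that outside the platform; describeResult and regexCountErrors are hypothetical names, and the codes and names are copied from the table above.

```typescript
// Error codes and names copied from the error table above.
const regexCountErrors: { [code: string]: string } = {
  "-1": "Unsafe",
  "-2": "Evaluation failed",
  "-3": "Invalid syntax",
  "-4": "Pattern exceeded",
  "-5": "Input exceeded",
  "-6": "Invalid format",
  "-7": "Check timeout",
};

// Hypothetical helper: turns a regexCount result into a readable message.
function describeResult(count: number): string {
  if (count >= 0) {
    return count + " match(es)";
  }
  const errorName = regexCountErrors[String(count)];
  return errorName !== undefined
    ? "Error " + count + ": " + errorName
    : "Unknown error code " + count;
}

console.log(describeResult(2));  // 2 match(es)
console.log(describeResult(-6)); // Error -6: Invalid format
```

Because every error code is negative, a check such as regexCount(%TEXT, '/cat/') >= 1 evaluates to false for an error as well as for zero matches, so it is worth inspecting the raw number while testing.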

External resources

Below is a selection of external tools that can be used to learn, build and test regular expressions before incorporating them into your knowledge graph.

A recommended workflow for those unfamiliar with regex includes:

  1. Use your favourite LLM (e.g. ChatGPT or Claude) to build a regex based on your requirements

  2. Use Regex tester to test this regex with the data you expect to be using. Ask for refinements where required.

  3. Create a knowledge graph with one relationship of (Data) > has regexCounts > (Number of matches) (Data must be string. Number of matches must be a number)

  4. Create a rule with a single expression containing regexCount with the result being assigned to the object (%O).

  5. Query this relationship, passing test data in as the subject, and confirm the number of matches returned as the result.

Regex in the Evidence Tree

To make the evidence tree easier to understand, evidence text (also known as alt-text) can be added to expressions that contain a regular expression.

Without evidence text:

With evidence text:

Advanced: Building a dynamic regex pattern

Using an intermediary expression you can compose a regex pattern containing dynamic data in one expression, then use it in another.

The example below shows a rule that infers a person can be onboarded to a company if their profile contains an email address from one of its approved suppliers.

This builds a regex pattern to match an email address structure, but only when it contains an approved supplier name in the email domain.

(Person) > approved for onboarding > True
when
%S > being onboarded to > %COMPANY
%S > has profile > %PROFILE                                        //%PROFILE contains unstructured data including name and contact details
%COMPANY > has approved suppliers > %APPROVED_SUPPLIER            //get approved supplier name
'/\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*' + %APPROVED_SUPPLIER + '\.[A-Za-z]{2,}\b/gi'    //regex pattern assigned to %REGEX variable
regexCount(%PROFILE, %REGEX) >= 1                                 //check if profile contains a valid email with approved supplier in the domain

Breaking down this dynamic regex, the goal was to create the following regex pattern (assuming Rainbird is the approved supplier):

/\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*Rainbird\.[A-Za-z]{2,}\b/gi

To insert Rainbird as a variable, the string needs to be split up either side of Rainbird, with Rainbird replaced with a variable. This is achieved using string concatenation:

'first part of string' + %VARIABLE + 'second part of string'

'/\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*' + %VARIABLE + '\.[A-Za-z]{2,}\b/gi'

The output of this expression (the string containing a regex pattern) must be assigned to a variable that can be passed into the regexCount function.

String concatenation is not supported as a nested operation inside the regexCount function. The pattern MUST be built in a separate expression and then passed into regexCount as a variable.

When creating a dynamic regex pattern, make sure the resulting pattern is wrapped in forward slashes and conforms to the /pattern/flags structure.
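The concatenation and the final format check can be tried out in TypeScript before building the rule. This is a sketch under assumptions: the supplier name and profile text are made-up test data, and the format check is a simple approximation of the /pattern/flags requirement rather than the engine's own validation.

```typescript
// Hypothetical supplier name standing in for %APPROVED_SUPPLIER.
const supplier = "Rainbird";

// Same concatenation as above. Backslashes are doubled only because this
// is a TypeScript string literal.
const dynamicPattern =
  "/\\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\\.)*" + supplier + "\\.[A-Za-z]{2,}\\b/gi";

// Check the result still conforms to /pattern/flags, then split it.
const patternParts = /^\/([\s\S]+)\/([a-z]*)$/.exec(dynamicPattern);
if (patternParts === null) {
  throw new Error("Pattern is not in /pattern/flags format");
}
const supplierRegex = new RegExp(patternParts[1], patternParts[2]);

// Illustrative profile text (made-up data).
const profile = "Jane Doe, jane.doe@mail.rainbird.example, 01234 567890";
const found = profile.match(supplierRegex);

console.log(dynamicPattern);
console.log(found === null ? 0 : found.length); // 1
```

Any regex metacharacters in the inserted value (for example a full stop in a supplier name) are interpreted as regex syntax, so it is worth testing with realistic supplier names.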
