Callbacks are a powerful feature that allows you to provide a CFML function to be executed every time a regex match is made, which enables you to run your own logic that can override the default behaviour.
There are two types of callbacks. Boolean Callbacks return either true or false to say "perform default behaviour" or "don't perform it", and are used with Match and Split actions. String Callbacks are used with Replace and return the value to use as the replacement text. Both types receive identical arguments; the only difference is how their returned value is used.
A callback function must be a UDF or an object method (i.e. it cannot be a built-in function), and must be passed as a variable reference rather than as a name string: do not surround it with quotes, and in tag form wrap it in hashes.
A callback function will receive the following named arguments:
Name | Type | Passed | Notes |
---|---|---|---|
Pos | Char Position | always | The position which the current match starts at. |
Len | Integer | always | The length of the current match (can be zero). |
Match | String | always | The text of the current match. |
Groups | Array | always | An array containing numbered group information. |
NamedGroups | Struct | sometimes | A struct containing named group information, if GroupNames passed to calling function. |
Data | Struct | sometimes | A struct containing passed-in data, if CallbackData passed to calling function. |
The callback function is called using named arguments via argumentcollection, so order is not important, but the names must match exactly.
The Groups and NamedGroups arguments both contain the same data, but the former uses backreference positions, whilst the latter uses the names supplied to the original calling function for the keys of the structure.
Each group element within both these arguments contains a structure with keys Pos, Len, and Match with the values for that particular group.
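As an illustration of reading group information, here is a minimal sketch of a replace callback that swaps two captured groups. It assumes the calling regex has two capturing groups separated by a colon (for example `(\w+):(\w+)`) and that `Groups[1]` corresponds to backreference 1; the function name `SwapGroupsCallback` is hypothetical.

```cfml
<!--- Hypothetical callback: assumes a regex with two capturing groups, e.g. (\w+):(\w+) --->
<cffunction name="SwapGroupsCallback" returntype="String" output="false">
    <cfargument name="Match"  type="String" required="true" hint="The text of the match." />
    <cfargument name="Groups" type="Array"  required="true" hint="Array of group information." />

    <!--- If the expected groups are absent, return the match unchanged. --->
    <cfif ArrayLen(Arguments.Groups) LT 2>
        <cfreturn Arguments.Match />
    </cfif>

    <!--- Each group element is a struct with Pos, Len, and Match keys. --->
    <cfreturn Arguments.Groups[2].Match & ':' & Arguments.Groups[1].Match />
</cffunction>
```

If group names were supplied to the calling function, the same values should be reachable through NamedGroups using those names as struct keys instead of numeric positions.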
Below are a couple of example functions which you can use as a base to create your own callback functions.
A sample boolean callback function which doesn't change behaviour (i.e. always returns true):
<cffunction name="BooleanCallback" returntype="Boolean" output="false">
<cfargument name="Pos" type="Numeric" required hint="The start position of the match." />
<cfargument name="Len" type="Numeric" required hint="The length of the match." />
<cfargument name="Match" type="String" required hint="The text of the match." />
<cfargument name="Groups" type="Array" required hint="Array of group information." />
<cfargument name="NamedGroups" type="Struct" required="false" hint="Struct of named group information." />
<cfargument name="Data" type="Struct" required="false" hint="Struct containing passed-in data." />
<cfreturn true />
</cffunction>
A sample replace callback function which doesn't change the result (i.e. always returns same text as was matched):
<cffunction name="ReplaceCallback" returntype="String" output="false">
<cfargument name="Pos" type="Numeric" required hint="The start position of the match." />
<cfargument name="Len" type="Numeric" required hint="The length of the match." />
<cfargument name="Match" type="String" required hint="The text of the match." />
<cfargument name="Groups" type="Array" required hint="Array of group information." />
<cfargument name="NamedGroups" type="Struct" required="false" hint="Struct of named group information." />
<cfargument name="Data" type="Struct" required="false" hint="Struct containing passed-in data." />
<cfreturn Arguments.Match />
</cffunction>
Note that you do not need to declare every argument with cfargument tags; the function receives them all regardless, so you can declare only the ones you are using. (Since they are provided as named arguments, you don't strictly need the cfargument tags at all.)
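Because NamedGroups and Data are only passed sometimes, a callback that reads them may want to check for their presence first. The following is a sketch of a boolean callback that declares only the arguments it uses and guards its access to Data; the function name `MinLengthCallback` and the `MinLength` key are hypothetical and stand in for whatever struct you pass as CallbackData.

```cfml
<!--- Hypothetical callback: accepts a match only if it meets a caller-supplied minimum length. --->
<cffunction name="MinLengthCallback" returntype="Boolean" output="false">
    <cfargument name="Len"  type="Numeric" required="true"  hint="The length of the match." />
    <cfargument name="Data" type="Struct"  required="false" hint="Struct containing passed-in data." />

    <!--- Data is only present when CallbackData was passed, so guard before reading it. --->
    <cfif StructKeyExists(Arguments,'Data') AND StructKeyExists(Arguments.Data,'MinLength')>
        <cfreturn Arguments.Len GTE Arguments.Data.MinLength />
    </cfif>

    <!--- No usable data supplied: keep the default behaviour. --->
    <cfreturn true />
</cffunction>
```

Returning true in the fallback branch means the callback leaves the default behaviour intact whenever no minimum length is supplied.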
Callback functions can be used any time you want to apply CFML logic to a match, and avoid having to use complicated and messy hacks like replacement tokens.
For example, if you had a document containing codes which needed to be updated to use newer codes from a database, you might do this:
<cfset DocText = RegexReplace
( '\b[A-Z]{2}-\d{4}\b'
, DocText
, CodeReplaceFunc
) />
<cffunction name="CodeReplaceFunc" returntype="String" output="false">
<cfargument name="Pos" type="Numeric" required hint="The start position of the match." />
<cfargument name="Match" type="String" required hint="The text of the match." />
<cfif Application.Codes.isOld(Arguments.Match)>
<cfset var NewCode = Application.Codes.lookupNewCode(Arguments.Match) />
<cflog
file = "DocCodeReplace"
text = "Converted '#Arguments.Match#' at #Arguments.Pos# to '#NewCode#'"
/>
<cfreturn NewCode />
<cfelse>
<cflog
file = "DocCodeReplace"
text = "Skipped '#Arguments.Match#' at #Arguments.Pos#"
/>
<cfreturn Arguments.Match />
</cfif>
</cffunction>
This example shows how you might extract a list of country codes from a document. The regex filters down possible candidates, before using an existing function to confirm which ones are valid.
The end result is that the Countries array contains only valid countries.
<cfregex match
variable = "Countries"
text = #DocText#
callback = #CountryCheckFunc#
>
## Lookbehind to ensure whitespace, start of string, or other valid prefix.
(?<=\s|\A|["'])
## Negative lookahead to exclude unused codes (AAA-AAZ,QMA-QZZ,XAA-XZZ,ZZA-ZZZ)
(?!AA|ZZ|X|Q[M-Z])
## Any three uppercase letters (except as excluded above).
[A-Z]{3}
## Lookahead to ensure whitespace, end of string, or other valid suffix
(?=\s|\z|\b[^/:])
</cfregex>
<cffunction name="CountryCheckFunc" returntype="Boolean" output="false">
<cfargument name="Match" type="String" required />
<cfreturn Application.Countries.isValid(ISO=Arguments.Match) />
</cffunction>