
Matches

The matches action returns a boolean indicating whether the supplied regex matches the entire input text.

An optional returntype selects a different check, as detailed under Return Types below; with count, the result is a number rather than a boolean.

Since v0.4, Regex objects also provide shortcut methods that suffix the returntype onto the method name for cleaner syntax, e.g.:

RegexObj.matches( Input , 'exact' )   => RegexObj.matchesExact( Input )
RegexObj.matches( Input , 'partial' ) => RegexObj.matchesPartial( Input )
RegexObj.matches( Input , 'start' )   => RegexObj.matchesStart( Input )
RegexObj.matches( Input , 'end' )     => RegexObj.matchesEnd( Input )
RegexObj.matches( Input , 'count' )   => RegexObj.matchesCount( Input )
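
As a minimal sketch of the equivalence above, the following calls two return types both ways on the same pattern. It uses only the methods listed above; the sample text and variable names are illustrative.

```cfml
<cfset Input = "The quick fox jumps over the lazy brown dog." />
<cfset TtheRx = new Regex( '[Tt]he' ) />

<!--- Long form and shortcut form: each pair should give the same result. --->
<cfdump var=#TtheRx.matches( Input , 'count' )# />
<cfdump var=#TtheRx.matchesCount( Input )# />

<cfdump var=#TtheRx.matches( Input , 'start' )# />
<cfdump var=#TtheRx.matchesStart( Input )# />
```

Going by the return type definitions, both count calls should output 2 and both start calls should output true.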

Return Types

exact

Performs an exact match, returning true only if the regex matches the entire text.

partial

Returns true if the regex matches anywhere within the text.

start

Returns true only if the regex match starts at the first character of the text.

end

Returns true only if the regex match ends at the last character of the text.

count

Returns the number of times a partial match is found anywhere within the text.
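
To compare the five return types side by side, here is a small sketch that runs one pattern against one string. The pattern, text, and variable names are made up for illustration, and the expected results in the comments follow from the definitions above rather than from a captured run.

```cfml
<cfset Sample = "cat, hat, bat" />
<cfset AtWordRx = new Regex( '\w+at' ) />

<!--- exact: the pattern covers a single word, not the whole text. Expected: false --->
<cfdump var=#AtWordRx.matches( Sample , 'exact' )# />

<!--- partial: "cat" is found within the text. Expected: true --->
<cfdump var=#AtWordRx.matches( Sample , 'partial' )# />

<!--- start: "cat" begins at the first character. Expected: true --->
<cfdump var=#AtWordRx.matches( Sample , 'start' )# />

<!--- end: "bat" finishes at the last character. Expected: true --->
<cfdump var=#AtWordRx.matches( Sample , 'end' )# />

<!--- count: "cat", "hat" and "bat" are each a partial match. Expected: 3 --->
<cfdump var=#AtWordRx.matches( Sample , 'count' )# />
```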

Object

Arguments

Name | Type | Required | Default | Notes
Text | String | yes | n/a | The text to test the regex against.
ReturnType | Enum (exact,partial,start,end,count) | no | "exact" | See Return Types section for details.

Usage Examples

<cfset Input = "The quick fox jumps over the lazy brown dog." />
<cfset NineWordsRx = new Regex( '(?:\w+\W){9}' ) />
<cfset ThreeWordsRx = new Regex( '(?:\w+\W){3}' ) />
<cfset TtheRx = new Regex( '[Tt]he' ) />


<cfdump var=#NineWordsRx.matches( Input )# />
Outputs: true

<cfdump var=#ThreeWordsRx.matches( Input )# />
Outputs: false

<cfdump var=#ThreeWordsRx.matches( Input , 'partial' )# />
Outputs: true

<cfdump var=#ThreeWordsRx.matches( Input , 'count' )# />
Outputs: 3

<cfdump var=#TtheRx.matches( Input , 'partial' )# />
Outputs: true

<cfdump var=#TtheRx.matches( Input , 'count' )# />
Outputs: 2

<cfdump var=#TtheRx.matches( Input , 'start' )# />
Outputs: true

<cfdump var=#TtheRx.matches( Input , 'end' )# />
Outputs: false

<cfset DogRx = new Regex('dog\.') />
<cfdump var=#DogRx.matches( Input , 'end' )# />
Outputs: true

<cfset TheRx = new Regex('the') />
<cfdump var=#TheRx.matches( Input , 'start' )# />
Outputs: false

<cfset TheRx = new Regex('the','case_insensitive') />
<cfdump var=#TheRx.matches( Input , 'start' )# />
Outputs: true

Tag

Attributes

Name | Type | Required | Default | Notes
Variable | VarName | no | "cfregex" | The variable the result is assigned to.
Text | String | yes | n/a | The text to test the regex against.
ReturnType | Enum (exact,partial,start,end,count) | no | "exact" | See Return Types section for details.
Modes | StringList | no | none | List of regex modes to apply to the pattern.

Usage Examples

<cfset Input = "The quick fox jumps over the lazy brown dog." />

<cfregex matches variable="isExactMatch" text=#Input# >
    (?:\w+\W){9}
</cfregex>
<cfdump var=#isExactMatch# />
Outputs: true

<cfregex matches variable="isExactMatch" text=#Input# >
    (?:\w+\W){3}
</cfregex>
<cfdump var=#isExactMatch# />
Outputs: false

<cfregex matches variable="isPartialMatch" text=#Input# returntype="partial" >
    (?:\w+\W){3}
</cfregex>
<cfdump var=#isPartialMatch# />
Outputs: true

<cfregex matches variable="MatchCount" text=#Input# returntype="count" >
    (?:\w+\W){3}
</cfregex>
<cfdump var=#MatchCount# />
Outputs: 3

<cfregex matches variable="isPartialMatch" text=#Input# returntype="partial" >
    [Tt]he
</cfregex>
<cfdump var=#isPartialMatch# />
Outputs: true

<cfregex matches variable="MatchCount" text=#Input# returntype="count" >
    [Tt]he
</cfregex>
<cfdump var=#MatchCount# />
Outputs: 2

<cfregex matches variable="isStartMatch" text=#Input# returntype="start" >
    [Tt]he
</cfregex>
<cfdump var=#isStartMatch# />
Outputs: true

<cfregex matches variable="isEndMatch" text=#Input# returntype="end" >
    [Tt]he
</cfregex>
<cfdump var=#isEndMatch# />
Outputs: false

<cfregex matches variable="isEndMatch" text=#Input# returntype="end" >
    dog\.
</cfregex>
<cfdump var=#isEndMatch# />
Outputs: true

<cfregex matches variable="isStartMatch" text=#Input# returntype="start" >
    the
</cfregex>
<cfdump var=#isStartMatch# />
Outputs: false

<cfregex matches variable="isStartMatch" text=#Input# returntype="start" modes="case_insensitive" >
    the
</cfregex>
<cfdump var=#isStartMatch# />
Outputs: true

Function

Arguments

Name | Type | Required | Default | Notes
Pattern | RegexString | yes | n/a | The regex pattern to compile into a Regex Object.
Text | String | yes | n/a | The text to test the regex against.
ReturnType | Enum (exact,partial,start,end,count) | no | "exact" | See Return Types section for details.
Modes | StringList | no | none | List of regex modes to apply to the pattern.

Usage Examples

<cfset Input = "The quick fox jumps over the lazy brown dog." />

<cfdump var=#RegexMatches( '(?:\w+\W){9}' , Input )# />
Outputs: true

<cfdump var=#RegexMatches( '(?:\w+\W){3}' , Input )# />
Outputs: false

<cfdump var=#RegexMatches( '(?:\w+\W){3}' , Input , 'partial' )# />
Outputs: true

<cfdump var=#RegexMatches( '(?:\w+\W){3}' , Input , 'count' )# />
Outputs: 3

<cfdump var=#RegexMatches( '[Tt]he' , Input , 'partial' )# />
Outputs: true

<cfdump var=#RegexMatches( '[Tt]he' , Input , 'count' )# />
Outputs: 2

<cfdump var=#RegexMatches( '[Tt]he' , Input , 'start' )# />
Outputs: true

<cfdump var=#RegexMatches( '[Tt]he' , Input , 'end' )# />
Outputs: false

<cfdump var=#RegexMatches( 'dog\.' , Input , 'end' )# />
Outputs: true

<cfdump var=#RegexMatches( 'the' , Input , 'start' )# />
Outputs: false

<cfdump var=#RegexMatches( 'the' , Input , 'start' , 'case_insensitive' )# />
Outputs: true

Practical Examples

Example 1

Checking for a UK postcode:

<cfif NOT RegexMatches
    ( '[A-Z]{1,2}[0-9][A-Z]?[0-9]{1,2}[A-Z]{2}'
    , RegexReplace(Form.Postcode,'\s|-','')
    )>
    <cfset Error = "Unrecognised postcode." />
</cfif>

(This is a simplified regex; actual UK postcode validation is more complicated.)
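
The check above relies on the default exact return type, which is what makes it usable for validation. A hedged sketch of the difference, using a made-up input with stray characters around an otherwise valid postcode (the variable names are illustrative):

```cfml
<cfset PostcodePattern = '[A-Z]{1,2}[0-9][A-Z]?[0-9]{1,2}[A-Z]{2}' />
<cfset Candidate = 'xxSW1A1AAxx' />

<!--- Default (exact): the whole text must match, so the stray characters should make this false. --->
<cfdump var=#RegexMatches( PostcodePattern , Candidate )# />

<!--- partial: the pattern is found inside the text, so this should be true and would let bad input through. --->
<cfdump var=#RegexMatches( PostcodePattern , Candidate , 'partial' )# />
```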

Example 2

Validating that a password has enough non-alphanumeric characters:

<cfif RegexMatches('\W',Form.Password,'count') LT 3>
    <cfset Error = "Passwords must contain at least three non-alphanumeric characters" />
</cfif>

Example 3

To identify files that contain var-scoped variables, so that they can be converted to use the local scope, you might use:

<cfset FindVarRx = new Regex
    ( '<cfset\s+var\s+'
      & '|'
      & '<cfscript>[^<v]+(?:(?!</cfscript>).)*?(?<=\n\t{0,5})var\s+'
    , 'CASE_INSENSITIVE,DOTALL'
    ) />

<cfset VarFiles = StructNew() />
<cfset Total = 0 />
<cfdirectory name="Files" recurse directory="/codebase" filter="*.cfm|*.cfc" />

<cfloop query="Files">
    <cfif Files.Type NEQ 'FILE'><cfcontinue/></cfif>

    <!--- The Directory column has no trailing slash, so add one before the file name. --->
    <cfset FilePath = Files.Directory & '/' & Files.Name />

    <cfset Found = FindVarRx.matches
        ( Text       : FileRead(FilePath)
        , ReturnType : 'count'
        ) />

    <cfif Found >
        <cfset VarFiles[FilePath] = Found />
        <cfset Total += Found />
    </cfif>
</cfloop>

<cfif Total >
    <cfoutput>Found #Total# Vars in #StructCount(VarFiles)# of #Files.RecordCount# files.</cfoutput>
    <cfdump var=#VarFiles# />
<cfelse>
    <cfoutput>No vars found in #Files.RecordCount# files.</cfoutput>
</cfif>

(Note: this demonstrates how the method might be used; the regex itself may return false positives, e.g. vars inside comments or strings.)

Example 4

The following example verifies that the start of a string matches one of five doctype definitions.

<cfregex
    action     = "matches"
    returntype = "start"
    variable   = "isSupportedDoctype"
    text       = #HtmlContent#
    >
    ## HTML5 DocType
        (?i:<!doctype\ html\s*>)

    |

    ## XHTML
        <!DOCTYPE\s+html\s+PUBLIC\s+"-//W3C//DTD\ XHTML\ 1\.
            (?:
                ## 1.0 Strict or Transitional
                0\ (Strict|Transitional)//EN"
                \s+
                "http://www\.w3\.org/TR/xhtml1/DTD/xhtml1-(?i:\1)
            |
                ## XHTML 1.1
                1//EN"
                \s+
                "http://www\.w3\.org/TR/xhtml11/DTD/xhtml11
            )
        \.dtd"\s*>

    |

    ## HTML 4 Strict
        <!DOCTYPE\s+HTML\s+PUBLIC\s+"-//W3C//DTD\ HTML\ 4\.01//EN"
        \s+
        "http://www\.w3\.org/TR/html4/strict\.dtd"
        \s*>
</cfregex>