Regular Expressions
Regular expressions are patterns built from special characters that match or capture portions of a field. This section covers the special characters supported by Check Point and the rules that govern them.
Metacharacters
Some metacharacters are recognized anywhere in a pattern, except within square brackets; other metacharacters are recognized only in square brackets.
The Check Point set of regular expressions has been enhanced for R70 and higher.
Metacharacter | Meaning | Supported in earlier versions?
\ (backslash) | escape character, and other meanings | partial
[ ] (square brackets) | character class definition | yes
( ) (parentheses) | subpattern | yes
{ } (curly brackets) | min/max quantifier | no
. (dot) | match any character | yes
? (question mark) | zero or one quantifier | yes
* (asterisk) | zero or more quantifier | yes
+ (plus) | one or more quantifier | yes
| (vertical bar) | start alternative branch | yes
^ (circumflex anchor) | anchor pattern to beginning of buffer | yes
$ (dollar anchor) | anchor pattern to end of buffer | yes
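The following is a minimal sketch of how the anchors, the vertical bar, parentheses, and the dot behave. It uses Python's re module only as a stand-in engine, which is an assumption: the Check Point engine may differ in details, and the sample strings are made up for illustration.

```python
import re

# ^ and $ anchor the pattern to the beginning and end of the input.
anchored = re.compile(r"^GET$")
matches_exact = anchored.search("GET") is not None
matches_longer = anchored.search("GET /index.html") is not None

# ( ) groups a subpattern and | starts an alternative branch.
method = re.compile(r"^(GET|POST) ")
get_match = method.search("GET /index.html")
put_match = method.search("PUT /index.html")

# . matches any single character (in Python, any character except a newline).
dot_hits = re.findall(r"c.r", "car cbr c-r cr")

print(matches_exact, matches_longer)   # True False
print(get_match.group(1), put_match)   # GET None
print(dot_hits)                        # ['car', 'cbr', 'c-r']
```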
Square Brackets
Square brackets ([ ]) define a character class, which matches a single character in the string.
Inside a character class, only these metacharacters have special meaning:
- backslash ( \ ) - general escape character.
- hyphen ( - ) - character range.
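As an illustration of character classes, the sketch below shows a range, an escaped character, and metacharacters that become literal inside the brackets. Python's re module is assumed as a stand-in engine, and the sample strings are hypothetical.

```python
import re

# [0-9a-f] matches exactly one character: a digit or a lowercase letter a-f.
hex_hits = re.findall(r"[0-9a-f]", "x7g-b")

# A backslash escapes characters inside the class:
# [\]\-] matches a literal ']' or a literal '-'.
punct_hits = re.findall(r"[\]\-]", "a]b-c")

# Metacharacters such as . and * are ordinary characters inside a class.
literal_hits = re.findall(r"[.*]", "a.b*c")

print(hex_hits)      # ['7', 'b']
print(punct_hits)    # [']', '-']
print(literal_hits)  # ['.', '*']
```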
Backslash
The meaning of the backslash (\) character depends on the context. Not all of the uses described below are supported in earlier versions.
In R70 and above, backslash escapes metacharacters inside and outside character classes.
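A short sketch of escaping a metacharacter outside a character class, assuming Python's re module as a stand-in engine. The version-like strings are made up for the example.

```python
import re

# Unescaped, the dot matches any character, so "1x5" is accepted.
unescaped_hit = re.fullmatch(r"1.5", "1x5") is not None

# Escaped, the dot matches only a literal '.'.
escaped_hit = re.fullmatch(r"1\.5", "1x5") is not None
escaped_literal = re.fullmatch(r"1\.5", "1.5") is not None

print(unescaped_hit, escaped_hit, escaped_literal)  # True False True
```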
Encoding Non-Printable Characters
To use non-printable characters in a pattern, use one of these escape sequences.
Character | Description
\a | alarm; the BEL character (hex 07)
\cx | "control-x", where x is any character
\e | escape (hex 1B)
\f | formfeed (hex 0C)
\n | newline (hex 0A)
\r | carriage return (hex 0D)
\t | tab (hex 09)
\ddd | character with octal code ddd
\xhh | character with hex code hh
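The sketch below exercises some of these escapes with Python's re module, used here as a stand-in engine (an assumption). Python's re does not support \e or \cx, so those two are left out; the sample strings are hypothetical.

```python
import re

# \t matches a tab (hex 09).
tab_split = re.split(r"\t", "name\tvalue")

# \xhh matches the character with hex code hh: 41 is 'A', 42 is 'B'.
hex_match = re.fullmatch(r"\x41\x42", "AB")

# \ddd matches the character with octal code ddd: 101 is 'A'.
octal_match = re.fullmatch(r"\101", "A")

# \r\n matches a carriage return followed by a newline.
crlf = re.search(r"\r\n", "line one\r\nline two")

# \a matches the BEL character (hex 07).
bell_found = re.search(r"\a", "ring\x07") is not None

print(tab_split)                                     # ['name', 'value']
print(hex_match is not None, octal_match is not None)  # True True
print(crlf.start(), bell_found)                      # 8 True
```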
Specifying Character Types
To match a type of character in a pattern, use one of these escape sequences.
Character | Description
\d | any decimal digit [0-9]
\D | any character that is not a decimal digit
\s | any whitespace character
\S | any character that is not whitespace
\w | any word character (underscore or alphanumeric character)
\W | any non-word character (not underscore or alphanumeric)
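A minimal sketch of the character type escapes, assuming Python's re module as a stand-in engine. The re.ASCII flag is passed so that \d means [0-9] as in the table above (by default Python also accepts other Unicode digits); the sample text is made up.

```python
import re

text = "user_42 logged in"

# \d: decimal digits; \D: everything that is not a decimal digit.
digits = re.findall(r"\d", text, flags=re.ASCII)
only_digits = re.sub(r"\D", "", text, flags=re.ASCII)

# \w: word characters (alphanumeric or underscore), here in runs of one or more.
words = re.findall(r"\w+", text, flags=re.ASCII)

# \s: whitespace; \S: runs of non-whitespace.
spaces = re.findall(r"\s", text)
tokens = re.findall(r"\S+", text)

# \W: any character that is not a word character.
non_word = re.findall(r"\W", "a-b c", flags=re.ASCII)

print(digits, only_digits)  # ['4', '2'] 42
print(words)                # ['user_42', 'logged', 'in']
print(len(spaces), tokens)  # 2 ['user_42', 'logged', 'in']
print(non_word)             # ['-', ' ']
```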
Quantifiers
Various metacharacters indicate how many instances of a character, character set, or character class should be matched. A quantifier must not follow another quantifier or an opening parenthesis, and must not be the first character of the expression.
A quantifier can follow any of these items:
- a literal data character
- an escape such as \d that matches a single character
- a character class
- a sub-pattern in parentheses
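The four cases in the list above can be sketched as follows, again assuming Python's re module as a stand-in engine. The last part shows that this engine rejects a quantifier placed as the first character of the expression; the exact error reported by other engines may differ.

```python
import re

# Quantifier after a literal data character: only the 'b' is repeated.
after_literal = re.fullmatch(r"ab{2}", "abb") is not None

# Quantifier after an escape that matches a single character.
after_escape = re.fullmatch(r"\d{3}", "123") is not None

# Quantifier after a character class.
after_class = re.fullmatch(r"[xyz]{2}", "zx") is not None

# Quantifier after a subpattern in parentheses: the whole group is repeated.
after_group = re.fullmatch(r"(ab){2}", "abab") is not None

# A quantifier as the first character has nothing to repeat.
try:
    re.compile(r"*abc")
    leading_quantifier_ok = True
except re.error:
    leading_quantifier_ok = False

print(after_literal, after_escape, after_class, after_group)  # True True True True
print(leading_quantifier_ok)                                  # False
```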
Curly Brackets
Curly brackets { } are general repetition quantifiers. They specify the minimum and maximum number of permitted matches.
{n,m} - at least n times, but not more than m times
For example: a{2,4} matches aa, aaa, or aaaa, but not a or aaaaa
{n} - exactly n times
{n,} - at least n times, with no maximum limit
For example:
\d{8} matches exactly 8 digits
[aeiou]{3,} matches at least 3 successive vowels, but may match many more
Note - A closing curly bracket '}' that is not preceded by an opening curly bracket '{' is treated as a simple character.
It is good practice to escape it with a backslash, '\}', when using a closing curly bracket as a simple character.
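The curly bracket examples above can be checked with a short sketch. Python's re module is assumed as a stand-in engine, fullmatch is used so that the whole string must match, and the sample strings are hypothetical.

```python
import re

# a{2,4}: between two and four 'a' characters.
range_q = re.compile(r"a{2,4}")
accepted = [s for s in ["a", "aa", "aaa", "aaaa", "aaaaa"] if range_q.fullmatch(s)]

# \d{8}: exactly eight digits.
eight = re.compile(r"\d{8}")
eight_ok = eight.fullmatch("20240131") is not None
seven_ok = eight.fullmatch("2024013") is not None

# [aeiou]{3,}: at least three successive vowels, as many as are available.
vowels = re.compile(r"[aeiou]{3,}")
long_run = vowels.search("queueing").group()
short_run = vowels.search("book")

# An escaped closing curly bracket is a simple character.
literal_brace = re.fullmatch(r"a\}", "a}") is not None

print(accepted)             # ['aa', 'aaa', 'aaaa']
print(eight_ok, seven_ok)   # True False
print(long_run, short_run)  # ueuei None
print(literal_brace)        # True
```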
Question Marks
Outside a character class, a question mark (?) matches zero or one occurrence of the preceding item. It is the same as using {0,1}.
For example: c([ab]?)r matches car, cbr, and cr
Inside a character class, it matches a question mark: [?] matches ? (question mark).
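A minimal sketch of the question mark example, assuming Python's re module as a stand-in engine. It also checks the {0,1} equivalence and the literal meaning inside a character class.

```python
import re

candidates = ["car", "cbr", "cr", "cabr"]

# c([ab]?)r: an optional single 'a' or 'b' between 'c' and 'r'.
optional = re.compile(r"c([ab]?)r")
matched = [s for s in candidates if optional.fullmatch(s)]

# When the optional item is absent, the captured group is empty.
captured_empty = optional.fullmatch("cr").group(1)

# {0,1} accepts the same strings.
equivalent = re.compile(r"c([ab]{0,1})r")
same = [s for s in candidates if equivalent.fullmatch(s)]

# Inside a character class, ? is a literal question mark.
literal_q = re.findall(r"[?]", "why?")

print(matched)               # ['car', 'cbr', 'cr']
print(repr(captured_empty))  # ''
print(same == matched)       # True
print(literal_q)             # ['?']
```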
Asterisk
Outside a character class, an asterisk (*) matches zero or more occurrences of the preceding item. It is the same as using {0,}.
For example: c([ab]*)r matches car, cbr, cr, cabr, and caaabbbr
Inside a character class, it matches an asterisk: [*] matches * (asterisk).
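The asterisk example can be sketched as below, with Python's re module assumed as a stand-in engine. The dictionary shows what the subpattern in parentheses captures for each input.

```python
import re

# c([ab]*)r: any number of 'a' or 'b' characters, including none.
star = re.compile(r"c([ab]*)r")
captured = {s: star.fullmatch(s).group(1) for s in ["cr", "car", "cabr", "caaabbbr"]}

# Characters outside the class are still not allowed between 'c' and 'r'.
rejected = star.fullmatch("cxr")

# Inside a character class, * is a literal asterisk.
literal_star = re.findall(r"[*]", "2*3")

print(captured)      # {'cr': '', 'car': 'a', 'cabr': 'ab', 'caaabbbr': 'aaabbb'}
print(rejected)      # None
print(literal_star)  # ['*']
```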
Plus
Outside a character class, a plus (+) matches one or more occurrences of the preceding item. It is the same as using {1,}.
For example: c([ab]+)r matches character strings such as car, cbr, cabr, caaabbbr; but not cr
Inside a character class, it matches a plus: [+] matches + (plus).
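A brief sketch of the plus example, assuming Python's re module as a stand-in engine. Unlike the asterisk, the plus requires at least one occurrence, so cr is rejected.

```python
import re

# c([ab]+)r: one or more 'a' or 'b' characters between 'c' and 'r'.
plus = re.compile(r"c([ab]+)r")
results = {s: plus.fullmatch(s) is not None
           for s in ["car", "cbr", "cabr", "caaabbbr", "cr"]}

# Inside a character class, + is a literal plus sign.
literal_plus = re.findall(r"[+]", "1+1")

print(results)       # {'car': True, 'cbr': True, 'cabr': True, 'caaabbbr': True, 'cr': False}
print(literal_plus)  # ['+']
```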