Regular Expressions

Regular expressions are patterns built from special characters that match or capture portions of a field. This section covers the special characters supported by Check Point and the rules that govern them.

In This Section

Metacharacters

Square Brackets

Backslash

Quantifiers

Metacharacters

Some metacharacters are recognized anywhere in a pattern, except within square brackets; other metacharacters are recognized only in square brackets.

The Check Point set of regular expressions has been enhanced for R70 and higher.

Metacharacter            Meaning                                  Supported before R70?
\ (backslash)            escape character, and other meanings     partial
[ ] (square brackets)    character class definition               yes
( ) (parentheses)        subpattern                               yes
{ } (curly brackets)     min/max quantifier                       no
. (dot)                  match any character                      yes
? (question mark)        zero or one quantifier                   yes
* (asterisk)             zero or more quantifier                  yes
+ (plus)                 one or more quantifier                   yes
| (vertical bar)         start alternative branch                 yes
^ (circumflex anchor)    anchor pattern to beginning of buffer    yes
$ (dollar anchor)        anchor pattern to end of buffer          yes
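
To make the table concrete, here is a minimal sketch that exercises the anchors, the dot, alternation, and a subpattern. It uses Python's `re` module as a stand-in for the Check Point engine, which is an assumption: the two engines may differ in details. The sample line is made up for illustration.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.
# The request line below is made-up sample data.
line = "GET /index.html HTTP/1.1"

# ^ anchors the pattern to the beginning of the buffer.
starts_with_get = re.search(r"^GET", line) is not None

# $ anchors the pattern to the end of the buffer. The dot in "1.1" is
# escaped so that it matches a literal dot instead of any character.
ends_with_version = re.search(r"HTTP/1\.[01]$", line) is not None

# | starts an alternative branch; ( ) groups the branches as a subpattern.
method = re.search(r"^(GET|POST) ", line).group(1)

print(starts_with_get, ends_with_version, method)
```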

Square Brackets

Square brackets ([ ]) define a character class, which matches a single character in the string against the set of characters listed between the brackets.

Inside a character class, only these metacharacters have special meaning:

  • backslash ( \ ) - general escape character.
  • hyphen ( - ) - character range.
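
The sketch below illustrates ranges and escaping inside a character class. It uses Python's `re` module as an assumed stand-in for the Check Point engine, so treat the results as illustrative rather than authoritative.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.

# The hyphen defines a range; [0-9a-f] combines two ranges in one class.
hex_digit = re.compile(r"[0-9a-f]")

# Inside the class, a backslash escapes the hyphen so it is matched literally.
digit_or_hyphen = re.compile(r"[0-9\-]")

# The dot has no special meaning inside a class: [.] matches only a dot.
literal_dot = re.compile(r"[.]")

print(bool(hex_digit.fullmatch("c")))        # a character inside the range
print(bool(hex_digit.fullmatch("g")))        # a character outside the range
print(bool(digit_or_hyphen.fullmatch("-")))  # the escaped hyphen
print(bool(literal_dot.fullmatch("x")))      # not a dot, so no match
```

Each class consumes exactly one character, which is why `fullmatch` is tested against one-character strings.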

Backslash

The meaning of the backslash (\) character depends on the context. Not all of the uses described below are supported in versions earlier than R70.

In R70 and above, backslash escapes metacharacters inside and outside character classes.
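
As a rough illustration of escaping, the following sketch compares an unescaped pattern with an escaped one, outside and inside a character class. Python's `re` module is used as an assumed stand-in for the R70 engine, and the sample strings are made up.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.
text = "price: 3.50 (USD)"  # made-up sample data

# Unescaped, the dot matches any character.
loose = re.compile(r"3.50")

# Escaped, the dot and the parentheses match themselves.
strict = re.compile(r"3\.50 \(USD\)")

# Inside a class, the backslash escapes ] and \ so both are literal members.
bracket_or_backslash = re.compile(r"[\]\\]")

print(bool(loose.search("3x50")))             # dot accepts the "x"
print(bool(strict.search(text)))              # literal dot and parentheses
print(bool(strict.search("3x50 (USD)")))      # "x" is not a literal dot
print(bool(bracket_or_backslash.fullmatch("]")))
```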

Encoding Non-Printable Characters

To match non-printable characters in a pattern, use these escape sequences:

Character    Description
\a           alarm; the BEL character (hex 07)
\cx          "control-x", where x is any character
\e           escape (hex 1B)
\f           formfeed (hex 0C)
\n           newline (hex 0A)
\r           carriage return (hex 0D)
\t           tab (hex 09)
\ddd         character with octal code ddd
\xhh         character with hex code hh
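
The sketch below shows that a named escape, a hex escape, and an octal escape can all describe the same character. It uses Python's `re` module as an assumed stand-in for the Check Point engine. The \cx and \e forms are left out because Python's `re` does not accept them.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.
tab = "\t"  # hex 09, octal 011

# Three spellings of the tab character: named, hex, and octal.
patterns = [r"\t", r"\x09", r"\011"]
results = [re.fullmatch(p, tab) is not None for p in patterns]
print(results)

# \r\n matches a carriage return immediately followed by a newline.
crlf = re.search(r"\r\n", "line one\r\nline two")
print(crlf.span())
```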

Specifying Character Types

To match a type of character rather than one specific character, use these escape sequences:

Character    Description
\d           any decimal digit [0-9]
\D           any character that is not a decimal digit
\s           any whitespace character
\S           any character that is not whitespace
\w           any word character (underscore or alphanumeric character)
\W           any non-word character (not underscore or alphanumeric)
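
A minimal sketch of the character types, using Python's `re` module as an assumed stand-in for the Check Point engine. The `re.ASCII` flag is set so that \d and \w are limited to ASCII, matching the [0-9] definition in the table. The sample string is made up.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.
sample = "id_42 = 0x1F;"  # made-up sample data

digits = re.findall(r"\d", sample, flags=re.ASCII)            # single decimal digits
words = re.findall(r"\w+", sample, flags=re.ASCII)            # runs of word characters
non_words = re.findall(r"\W", sample, flags=re.ASCII)         # single non-word characters
non_space_runs = re.findall(r"\S+", sample, flags=re.ASCII)   # runs without whitespace

print(digits)
print(words)
print(non_words)
print(non_space_runs)
```

Note that "F" is a word character but not a decimal digit, so it appears in `words` and not in `digits`.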

Quantifiers

Quantifiers are metacharacters that indicate how many instances of a character, character set, or character class should be matched. A quantifier must not follow another quantifier or an opening parenthesis, and it must not be the first character of the expression.

A quantifier can follow any of these items:

  • a literal data character
  • an escape such as \d that matches a single character
  • a character class
  • a sub-pattern in parentheses
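
The placement rules above can be checked with a small helper. This sketch uses Python's `re` module as an assumed stand-in; the Check Point engine may report such errors differently. The helper name `compiles` is hypothetical.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.

def compiles(pattern: str) -> bool:
    """Return True if the pattern is accepted, False if it is rejected."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A quantifier after a literal, an escape, a class, or a subpattern is valid.
valid = [r"a+", r"\d+", r"[ab]+", r"(ab)+"]

# A quantifier first in the expression, directly after an opening
# parenthesis, or directly after another quantifier is rejected.
invalid = [r"+a", r"(+a)", r"a**"]

print([compiles(p) for p in valid])
print([compiles(p) for p in invalid])
```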

Curly Brackets

Curly brackets { } are general repetition quantifiers. They specify a minimum and maximum number of permitted matches.

{n,m} - at least n times, but not more than m times

For example: a{2,4} matches aa, aaa, or aaaa, but not a or aaaaa

{n} - exactly n times

{n,} - at least n times, with no maximum limit

For example:

  • \d{8} matches exactly 8 digits
  • [aeiou]{3,} matches at least 3 successive vowels, but may match many more

Note - A closing curly bracket '}' that is not preceded by an opening curly bracket '{' is treated as a simple character.

It is good practice to use a backslash, '\}', when using a closing curly bracket as a simple character.
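
The curly-bracket examples in this section can be reproduced with the sketch below. It uses Python's `re` module as an assumed stand-in for the Check Point engine. The helper name `full` and the sample strings are made up for illustration.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.

def full(pattern: str, s: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return re.fullmatch(pattern, s) is not None

# a{2,4}: at least two and at most four repetitions of "a".
candidates = ["a", "aa", "aaa", "aaaa", "aaaaa"]
accepted = [s for s in candidates if full(r"a{2,4}", s)]

# \d{8}: exactly eight digits.
eight_ok = full(r"\d{8}", "20130415")
seven_ok = full(r"\d{8}", "2013041")

# [aeiou]{3,}: three or more successive vowels, as many as are available.
vowel_run = re.search(r"[aeiou]{3,}", "queueing").group()

# An escaped closing curly bracket is matched as a simple character.
literal_brace = full(r"a\}", "a}")

print(accepted, eight_ok, seven_ok, vowel_run, literal_brace)
```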

Question Marks

Outside a character class, a question mark (?) matches zero or one occurrence of the preceding item. It is the same as using {0,1}.

For example: c([ab]?)r matches car, cbr, and cr

Inside a character class, it matches a question mark: [?] matches ? (question mark).
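
A short sketch of the question mark in both positions, using Python's `re` module as an assumed stand-in for the Check Point engine:

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.
pattern = re.compile(r"c([ab]?)r")

# Zero or one "a" or "b" is allowed between "c" and "r", so "cabr" fails.
matched = {s: pattern.fullmatch(s) is not None
           for s in ["car", "cbr", "cr", "cabr"]}
print(matched)

# Inside a character class, the question mark is an ordinary character.
literal_question = re.fullmatch(r"[?]", "?") is not None
print(literal_question)
```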

Asterisk

Outside a character class, an asterisk (*) matches zero or more occurrences of the preceding item. It is the same as using {0,}.

For example: c([ab]*)r matches car, cbr, cr, cabr, and caaabbbr

Inside a character class, it matches an asterisk: [*] matches * (asterisk).
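
The sketch below shows what the subpattern captures for each of the example strings. It uses Python's `re` module as an assumed stand-in for the Check Point engine.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.
pattern = re.compile(r"c([ab]*)r")

# Group 1 holds whatever run of "a" and "b" sat between "c" and "r".
# For "cr" the run is empty, because * also permits zero repetitions.
captured = {s: pattern.fullmatch(s).group(1)
            for s in ["car", "cbr", "cr", "cabr", "caaabbbr"]}
print(captured)

# A character outside the class still prevents a match.
rejected = pattern.fullmatch("cxr") is None

# Inside a character class, the asterisk is an ordinary character.
literal_asterisk = re.fullmatch(r"[*]", "*") is not None
print(rejected, literal_asterisk)
```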

Plus

Outside a character class, a plus (+) matches one or more occurrences of the preceding item. It is the same as using {1,}.

For example: c([ab]+)r matches character strings such as car, cbr, cabr, caaabbbr; but not cr

Inside a character class, it matches a plus: [+] matches + (plus).
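
To compare the three shorthand quantifiers with their curly-bracket equivalents, the sketch below runs each pair against the same sample strings. It uses Python's `re` module as an assumed stand-in for the Check Point engine. The helper name `accepted_by` is hypothetical.

```python
import re

# Assumption: Python's re module stands in for the Check Point engine.
samples = ["cr", "car", "cbr", "cabr", "caaabbbr"]

# Each shorthand quantifier paired with its curly-bracket equivalent.
pairs = [
    (r"c([ab]?)r", r"c([ab]{0,1})r"),
    (r"c([ab]*)r", r"c([ab]{0,})r"),
    (r"c([ab]+)r", r"c([ab]{1,})r"),
]

def accepted_by(pattern: str) -> list:
    """Return the sample strings that match the pattern in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

for short_form, brace_form in pairs:
    # Both spellings accept exactly the same sample strings.
    assert accepted_by(short_form) == accepted_by(brace_form)
    print(short_form, accepted_by(short_form))

# Inside a character class, the plus is an ordinary character.
literal_plus = re.fullmatch(r"[+]", "+") is not None
```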

 
©2013 Check Point Software Technologies Ltd. All rights reserved.