Regular expressions are written between two forward slashes (/) and special characters are escaped with a backslash (\). The special characters (which must be escaped to be matched literally) are: . | ( ) [ ] { } + \ ^ $ * ?
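
As a minimal sketch of the escaping rule, assuming Ruby (which the #{variable} interpolation and the /o modifier described further down suggest), the following compares an unescaped and an escaped dot. The sample strings and variable names are made up for illustration.

```ruby
# An unescaped dot matches any character, so this pattern also accepts "3x14".
loose = /3.14/
# Escaping the dot with a backslash restricts the match to a literal ".".
strict = /3\.14/

samples = ["3.14", "3x14"]
loose_hits  = samples.map { |s| s.match?(loose) }   # => [true, true]
strict_hits = samples.map { |s| s.match?(strict) }  # => [true, false]

# Regexp.escape inserts the backslashes for an arbitrary string.
escaped = Regexp.escape("a+b?")
literal = Regexp.new(escaped)
found   = "1: a+b? 2: aab".scan(literal)            # => ["a+b?"]
```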
Characters | Description |
---|---|
. | Match any character |
| | An OR operator, match either the sequence before or after this |
( ) | Start and end a subsequence (grouping) |
[ ] | Start and end a character class |
{ } | Match intervals, applied to the preceding element. {a,b}: match at least a and at most b times. {a,}: match at least a times. {a}: match exactly a times. |
\ | Escape character |
^ | Match the beginning of a line. When used at the beginning of a character class, negates it. |
$ | Match the end of a line |
+ | Match one or more times ({1,}) |
+? | Match one or more times ({1,}), non-greedy: match the shortest possible string |
* | Match zero or more times ({0,}) |
*? | Match zero or more times ({0,}), non-greedy: match the shortest possible string |
? | Match zero or one time ({0,1}) |
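
The difference between greedy and non-greedy quantifiers, and the interval and alternation syntax from the table, may be easier to see in a short Ruby sketch. The input strings are invented for illustration.

```ruby
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ takes as much as it can, so the match runs to the last ">".
greedy = html[/<.+>/]    # => "<b>bold</b> and <i>italic</i>"
# Non-greedy: .+? takes as little as it can, so the match ends at the first ">".
lazy = html[/<.+?>/]     # => "<b>"

# Intervals: four digits, then two groups of exactly two digits.
date = /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/
date_ok  = "2024-01-31".match?(date)   # => true
date_bad = "24-1-31".match?(date)      # => false

# ? makes the preceding element optional.
spellings = "color colour".scan(/colou?r/)   # => ["color", "colour"]

# Grouping with alternation, followed by an optional "s".
pet = "I have dogs"[/(cat|dog)s?/]           # => "dogs"
```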
Class | Switch | Match description |
---|---|---|
[0-9] | \d | Decimal digit character |
[^0-9] | \D | Not a decimal digit character |
[ \t\r\n\f\v] | \s | Whitespace character |
[^ \t\r\n\f\v] | \S | Not a whitespace character |
[A-Za-z0-9_] | \w | Word character (alpha, numeric, and underscore) |
[^A-Za-z0-9_] | \W | Not a word character |
[:alnum:] | | Alphanumeric characters ([A-Za-z0-9]) |
[:alpha:] | | Uppercase and lowercase letters ([A-Za-z]) |
[:blank:] | | Space or tab character |
[:space:] | | Whitespace characters |
[:digit:] | | Decimal digit characters |
[:lower:] | | Lowercase letters ([a-z]) |
[:upper:] | | Uppercase letters ([A-Z]) |
[:print:] | | Any printable character, including space |
[:graph:] | | Printable characters excluding space |
[:punct:] | | Punctuation characters: any printable character excluding alphanumerics and space |
[:cntrl:] | | Control characters (0x00 to 0x1F and 0x7F) |
[:xdigit:] | | Hexadecimal digits ([0-9a-fA-F]) |
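
A short Ruby sketch of the classes above. Note that in Ruby a POSIX name such as [:upper:] is only valid inside a character class, so it appears as [[:upper:]]. The sample strings are invented for illustration.

```ruby
line = "Order 66:\tshipped_ok"

# Shorthand switches.
digits = line.scan(/\d+/)    # => ["66"]
words  = line.scan(/\w+/)    # => ["Order", "66", "shipped_ok"]
parts  = line.split(/\s+/)   # => ["Order", "66:", "shipped_ok"]

# POSIX names go inside an outer pair of brackets.
uppers = line.scan(/[[:upper:]]/)          # => ["O"]
marks  = "Hi, there!".scan(/[[:punct:]]/)  # => [",", "!"]
hex    = "0xBEEF"[/[[:xdigit:]]+$/]        # => "BEEF"

# A leading ^ negates a class: keep everything that is not a digit.
letters_only = "a1b2c3".scan(/[^0-9]/).join   # => "abc"
```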
Expression modifiers are placed after the closing slash of the expression and influence general matching behaviour.
Modifier | Description |
---|---|
/i | Match everything case-insensitively. |
/m | Multiline mode: the dot also matches newlines. (In Ruby, ^ and $ match line starts and endings with or without this modifier.) |
/o | Perform inline substitutions (#{variable}) only once on creation. Normally, the variable is inserted on every evaluation. |
/x | Extended mode: whitespace and comments in the pattern are ignored. |
/[single character] | Set the character encoding; the character should be one of n, e, u or s: none, EUC, UTF-8 or SJIS. |
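
The modifiers can be tried out with a small Ruby sketch like the one below. The sample text and the `time` pattern are hypothetical and only serve to show the effect of /i, /m and /x.

```ruby
text = "First line\nsecond LINE"

# /i ignores case.
case_hits = text.scan(/line/i)        # => ["line", "LINE"]

# Without /m the dot stops at a newline; with /m it crosses it.
no_m   = text[/First.*LINE/]          # => nil
with_m = text[/First.*LINE/m]         # => "First line\nsecond LINE"

# ^ matches at the start of every line, even without /m.
line_starts = text.scan(/^[a-z]+/i)   # => ["First", "second"]

# /x allows the pattern to be laid out with whitespace and comments.
time = /
  ([0-9]{2})   # hours
  :
  ([0-9]{2})   # minutes
/x
hours, minutes = "at 09:45".match(time).captures   # => ["09", "45"]
```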