Regular expressions are written between two forward slashes (/), and special characters
are escaped with a backslash (\). The special characters that must be escaped to be
matched literally are: . | ( ) [ ] { } + \ ^ $ * ?
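
As a small illustration of the escaping rule above, the following sketch compares an escaped and an unescaped dot. The sample strings and variable names are made up for this example; `Regexp.escape` is the standard helper that escapes every special character in a string.

```ruby
# An unescaped dot matches any character; an escaped dot matches only ".".
unescaped = /1.5/
escaped   = /1\.5/

unescaped_hits_other = unescaped.match?("125")  # true: "." matched the "2"
escaped_hits_other   = escaped.match?("125")    # false: no literal dot present
escaped_hits_literal = escaped.match?("1.5")    # true

# Regexp.escape builds the escaped form of an arbitrary literal string.
price_pattern = Regexp.new(Regexp.escape("$9.99 (approx.)"))
price_found   = price_pattern.match?("Total: $9.99 (approx.)")  # true
```

Escaping by hand is fine for short literals; `Regexp.escape` is the safer choice when the text comes from a variable.
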
| Characters | Description |
|---|---|
| . | Match any character |
| \| | An OR operator: match either the sequence before or the sequence after it |
| ( ) | Start and end a subsequence (grouping) |
| [ ] | Start and end a character class |
| { } | Match intervals. {a,b}: match at least a and at most b times. {a,}: match a or more times. {a}: match exactly a times. |
| \ | Escape character |
| ^ | Match the beginning of a line. When used at the beginning of a character class, negates it. |
| $ | Match the end of a line |
| + | Match one or more times ({1,}) |
| +? | Match one or more times ({1,}), non-greedy: match the smallest possible pattern |
| * | Match zero or more times ({0,}) |
| *? | Match zero or more times ({0,}), non-greedy: match the smallest possible pattern |
| ? | Match zero or one time ({0,1}) |
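
The operators in the table can be tried out with a few one-liners. This is only a sketch; the sample strings and variable names are invented for illustration.

```ruby
html = "<b>bold</b> and <i>italic</i>"

# Greedy "+" grabs as much as possible, non-greedy "+?" as little as possible.
greedy = html[/<.+>/]    # the whole string, up to the last ">"
lazy   = html[/<.+?>/]   # "<b>"

# Intervals: between two and three "a" characters, preferring the longer match.
interval = "aaaaa"[/a{2,3}/]   # "aaa"

# "*" allows zero occurrences, "+" requires at least one, "?" makes one optional.
zero_or_more = "ac"[/ab*c/]                 # "ac"
one_or_more  = "ac"[/ab+c/]                 # nil
optional     = /colou?r/.match?("color")    # true

# Grouping with alternation: the group captures whichever branch matched.
captured = /gr(a|e)y/.match("grey")[1]      # "e"

# "^" and "$" anchor to the start and end of a line within the string.
line_start = "first\nsecond" =~ /^second$/  # 6
```
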

| Class | Shorthand | Description |
|---|---|---|
| [0-9] | \d | Decimal digit character |
| [^0-9] | \D | Not a decimal digit character |
| [ \t\r\n\f\v] | \s | Whitespace character |
| [^ \t\r\n\f\v] | \S | Not a whitespace character |
| [A-Za-z0-9_] | \w | Word character (alpha, numeric, and underscore) |
| [^A-Za-z0-9_] | \W | Not a word character |
| [:alnum:] | | Alphanumeric characters ([A-Za-z0-9]) |
| [:alpha:] | | Uppercase and lowercase letters ([A-Za-z]) |
| [:blank:] | | Space or tab character |
| [:space:] | | Whitespace characters |
| [:digit:] | | Decimal digit characters |
| [:lower:] | | Lowercase letters ([a-z]) |
| [:upper:] | | Uppercase letters ([A-Z]) |
| [:print:] | | Any printable character, including space |
| [:graph:] | | Printable characters excluding space |
| [:punct:] | | Punctuation characters: any printable character excluding alphanumerics and space |
| [:cntrl:] | | Control characters (0x00 to 0x1F and 0x7F) |
| [:xdigit:] | | Hexadecimal digits ([0-9a-fA-F]) |
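
A short sketch of the classes in use, with made-up sample text. Note that in Ruby a POSIX class such as [:upper:] is written inside a bracket expression, so the pattern reads `[[:upper:]]`.

```ruby
text = "Order #42: 3 items, ref A7_x"

# Shorthand classes
digits = text.scan(/\d+/)   # ["42", "3", "7"]
words  = text.scan(/\w+/)   # ["Order", "42", "3", "items", "ref", "A7_x"]

# Negated shorthand: strip everything that is not a digit.
only_digits = "a1b2".gsub(/\D/, "")   # "12"

# Whitespace class: split on any run of spaces, tabs, or newlines.
fields = "a \t b\n".split(/\s+/)      # ["a", "b"]

# POSIX classes go inside an enclosing bracket expression.
uppercase = text.scan(/[[:upper:]]/)               # ["O", "A"]
hex_runs  = "ff00aa zz".scan(/[[:xdigit:]]+/)      # ["ff00aa"]
```
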

Expression modifiers are placed after the closing slash of the expression and influence general matching behaviour.

| Modifier | Description |
|---|---|
| /i | Match everything case-insensitively. |
| /m | Multiline mode: the dot also matches newlines. In Ruby, ^ and $ match line starts and endings with or without this modifier. |
| /o | Perform inline substitutions (#{variable}) only once, on creation. Normally, the variable is inserted on every evaluation. |
| /x | Extended mode: whitespace and comments in the pattern are ignored. |
| /n, /e, /u, /s | Set the character encoding: none, EUC, UTF-8, or SJIS respectively. |
| / |
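
The effect of the /i, /m, and /x modifiers can be checked with a minimal sketch like the one below. The patterns and sample strings are illustrative only; the /o and encoding modifiers are left out.

```ruby
# /i ignores case.
case_insensitive = /ruby/i.match?("RUBY")   # true

# Without /m the dot stops at a newline; with /m it matches it.
dot_plain     = /a.b/.match?("a\nb")        # false
dot_multiline = /a.b/m.match?("a\nb")       # true

# /x lets a pattern be spread over several lines with comments.
float = /
  ^
  [+-]?      # optional sign
  \d+        # integer part
  (\.\d+)?   # optional fraction
  $
/x

float_ok  = float.match?("-3.14")   # true
float_bad = float.match?("3.")      # false: a fraction needs digits after the dot
```

Modifiers can be combined, for example `/pattern/imx`.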