Regex reference/cheatsheet.
# Regular Expression Reference
## Anchors
^ Matches at the start of string or start of line if multi-line
mode is enabled. Many regex implementations have multi-line mode
enabled by default.
$ Matches at the end of string or end of line if multi-line mode
is enabled. Many regex implementations have multi-line mode
enabled by default.
\A Matches at the start of the search string.
\Z Matches at the end of the search string, or before a newline at
the end of the string.
\z Matches at the end of the search string.
\b Matches at word boundaries.
\B Matches anywhere but word boundaries.
## Character Classes (can be used in ranges)
. Matches any character except newline (matches newline in single-line)
\s Matches white space characters.
\S Matches anything but white space characters.
\d Matches digits. Equivalent to [0-9].
\D Matches anything but digits. Equivalent to [^0-9].
\w Matches letters, digits and underscores. Equivalent to [A-Za-z0-9_].
\W Matches anything but letters, digits and underscores.
\xff Matches ASCII hexadecimal character ff.
\x{ffff} Matches UTF-8 hexadecimal character ffff.
\cA Matches ASCII control character ^A (case insensitive).
\132 Matches ASCII octal character 132.
## Groups
(foo|bar) Matches pattern foo or bar.
(foo) Define a group (or subpattern) consisting of pattern foo.
Matches within the group can be referenced in a replacement
using a back reference.
(?<foo>bar) Define a named group named "foo" consisting of pattern bar.
Matches within the group can be referenced in a replacement
using the back reference $foo.
(?:foo) Define a passive group consisting of pattern foo. Passive
groups cannot be referenced in a replacement using a
back reference.
(?>foo+)bar Define an atomic group consisting of pattern foo+. Once foo+
has been matched, the regex engine will not try to find other
variable length matches of foo+ in order to find a match
followed by a match of bar. Atomic groups may be used for
performance reasons.
## Bracket Expressions
[adf] Matches characters a or d or f.
[^adf] Matches anything but characters a, d and f.
[a-f] Match any lowercase letter between a and f inclusive.
[A-F] Match any uppercase letter between A and F inclusive.
[0-9] Match any digit between 0 and 9 inclusive.
## Quantifiers
*? Zero or more, lazy. Matches will be as small as possible.
+ One or more. Matches will be as large as possible.
+? One or more, lazy. Matches will be as small as possible.
? Zero or one. Matches will be as large as possible.
?? Zero or one, lazy. Matches will be as small as possible.
{2} Two exactly.
{2,} Two or more. Matches will be as large as possible.
{2,}? Two or more, lazy. Matches will be as small as possible.
{2,4} Two, three or four. Matches will be as large as possible.
{2,4}? Two, three or four, lazy. Matches will be as small as possible.
## Special Characters
\ Escape character.
\n Matches newline.
\t Matches tab.
\r Matches carriage return.
\v Matches form feed/page break.
## Assertions
foo(?=bar) Lookahead assertion. The pattern foo will only match if
followed by a match of pattern bar.
foo(?!bar) Negative lookahead assertion. The pattern foo will only match
if not followed by a match of pattern bar.
(?<=foo)bar Look behind assertion. The pattern bar will only match if
preceded by a match of pattern foo.
(?<!foo)bar Negative look behind assertion. The pattern bar will only match
if not preceded by a match of pattern foo.
## POSIX Character Classes
[:upper:] Matches uppercase letters. Equivalent to A-Z.
[:lower:] Matches lowercase letters. Equivalent to a-z.
[:alpha:] Matches letters. Equivalent to A-Za-z.
[:alnum:] Matches letters and digits. Equivalent to A-Za-z0-9.
[:ascii:] Matches ASCII characters. Equivalent to \x00-\x7f.
[:word:] Matches letters, digits and underscores. Equivalent to \w.
[:digit:] Matches digits. Equivalent to 0-9.
[:xdigit:] Matches characters that can be used in hexadecimal codes.
[:punct:] Matches punctuation.
[:blank:] Matches space and tab. Equivalent to [ \t].
[:space:] Matches space, tab and newline. Equivalent to \s.
[:cntrl:] Matches control characters. Equivalent to [\x00-\x1F\x7F].
[:graph:] Matches printed characters. Equivalent to [\x21-\x7E].
[:print:] Matches printed characters and spaces. Equivalent to [\x21-\x7E ].
## Back References (used in replacements)
$3 Matched string within the third non-passive group.
$0 or $& Entire matched string.
$foo Matched string within the group named "foo".
## Case Modifiers (used in replacements)
\u Make the next character in the replacement uppercase.
\l Make the next character in the replacement lowercase.
\U Make the remaining characters in the replacement uppercase.
\L Make the remaining characters in the replacement lowercase.
## Modifiers
(?i) Case insensitive mode. Make the remainder of the pattern or
sub-pattern case insensitive.
(?m) Multi-line mode. Make $ and ^ in the remainder of the pattern
or sub-pattern match before/after newline.
(?s) Single-line mode. Make the . (dot) in the remainder of the
pattern or sub-pattern match newline.
(?x) Free spacing mode. Ignore white space in the remainder of the
pattern or sub-pattern.