regexp / help

pattern

.
Matches any character except line breaks, unless the s (dotall) flag is set.
 
\w
Matches any word character.
(alphanumeric & underscore; doesn't include extended word characters such as accented characters)
\W
Matches any character that is not a word character.
!(alphanumeric & underscore)
\d
Matches any digit character.
(0-9)
\D
Matches any character that is not a digit character.
!(0-9)
\s
Matches any whitespace character.
(spaces, tabs, line breaks)
\S
Matches any character that is not a whitespace character.
!(spaces, tabs, line breaks)
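
A minimal sketch of the shorthand classes above, assuming Python's re module (an assumption; the sheet itself names no language). In Python 3, \w also matches accented letters by default, so re.ASCII is passed to get the narrower behaviour described here.

```python
import re

text = "id_7 costs 42\tcafé"

# re.ASCII restricts \w, \d and \s to their ASCII definitions,
# so the accented "é" does not count as a word character here.
words = re.findall(r"\w+", text, flags=re.ASCII)
digits = re.findall(r"\d+", text, flags=re.ASCII)
non_space = re.findall(r"\S+", text, flags=re.ASCII)

print(words)      # ['id_7', 'costs', '42', 'caf']
print(digits)     # ['7', '42']
print(non_space)  # ['id_7', 'costs', '42', 'café']
```
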
[ABC]
Matches any single character in the set.
Matches defense or defence : defen[cs]e
[^ABC]
Matches any single character that is not in the set.
 
[a-z]
Matches any single character in the range.
(a-z)
[a-zA-Z]
Matches any single character in the range.
(a-z or A-Z)
[^f-m]
Matches any single character that is not in the range.
!(f-m)
[0-9]
Matches any single character in the range.
(0-9)
[\w'-]
Matches any word character, single quote, or hyphen.
word or ' or -
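
The sets and ranges above can be tried out as follows. This is a sketch assuming Python's re module; the sample strings are made up for illustration.

```python
import re

# A set matches exactly one character out of the listed ones.
spellings = re.findall(r"defen[cs]e", "defense and defence")

# Ranges can be combined inside one set.
letters = re.findall(r"[a-zA-Z]+", "abc123DEF")

# A negated range: only "o" lies outside f-m in "golf".
outside_range = re.findall(r"[^f-m]", "golf")

# A hyphen placed last in the set is a literal hyphen.
tokens = re.findall(r"[\w'-]+", "rock-'n'-roll isn't dead")

print(spellings)      # ['defense', 'defence']
print(letters)        # ['abc', 'DEF']
print(outside_range)  # ['o']
print(tokens)         # ["rock-'n'-roll", "isn't", 'dead']
```
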
\t
Tab character.
 
\r
Carriage return character.
 
\n
Line break character.
 
\xFF
Matches the character with the given hexadecimal code (here FF).
 
\\
Matches a \ character.
 
\.
Matches a . character.
 
\+
Matches a + character.
 
\*
Matches a * character.
 
\?
Matches a ? character.
 
\^
Matches a ^ character.
 
\$
Matches a $ character.
 
\[
Matches a [ character.
 
\]
Matches a ] character.
 
\(
Matches a ( character.
 
\)
Matches a ) character.
 
\|
Matches a | character.
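
A short sketch of why the escapes above matter, assuming Python's re module. re.escape is Python's helper for escaping a whole literal string; it is not part of this sheet.

```python
import re

# Escaped $ and . are matched literally.
prices = re.findall(r"\$\d+\.\d\d", "cost: $4.50 or $12.00")

# Unescaped, "a.b*c" treats . and * as metacharacters.
loose = re.search(r"a.b*c", "aXbbc")

# Escaped, the same text only matches itself.
literal = re.escape("a.b*c")
hit = re.search(literal, "xa.b*cy")
miss = re.search(literal, "aXbbc")

print(prices)           # ['$4.50', '$12.00']
print(loose.group())    # aXbbc
print(hit.group())      # a.b*c
print(miss)             # None
```
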
 
\b
Matches a word boundary position: the position between a word character and a non-word character (such as whitespace), or the beginning or end of the string next to a word character.
 
\B
Matches any position that is not a word boundary.
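
To see where \b and \B apply, the sketch below lists match start positions. It assumes Python's re module, and the sample text is invented.

```python
import re

text = "cat concat cat's category"

# \bcat\b: "cat" as a whole word (the apostrophe is a non-word character).
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \Bcat: "cat" that does not start at a word boundary (inside "concat").
inside_words = [m.start() for m in re.finditer(r"\Bcat", text)]

print(whole_words)   # [0, 11]
print(inside_words)  # [7]
```
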
 
^
Matches the beginning of the string.
Match a capital letter at the beginning of a string : ^[A-Z]
$
Matches the end of the string.
Match a period or exclamation mark at the end of a string : [\.!]$
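
The two anchor examples above, sketched with Python's re module (an assumption). Inside a set the period needs no escape, so [.!] and [\.!] behave the same.

```python
import re

starts_upper = bool(re.search(r"^[A-Z]", "Hello world."))
starts_lower = bool(re.search(r"^[A-Z]", "hello world."))
ends_marked = bool(re.search(r"[.!]$", "Hello world!"))
ends_plain = bool(re.search(r"[.!]$", "Hello world"))

print(starts_upper, starts_lower)  # True False
print(ends_marked, ends_plain)     # True False
```
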
(?=ABC)
Positive lookahead. Matches a group after your main expression without including it in the result.
 
(?!ABC)
Negative lookahead. Specifies a group that cannot match after your main expression.
(i.e. if it matches, the result is discarded)
(?<=ABC)
Positive lookbehind. Matches a group before your main expression without including it in the result.
 
(?<!ABC)
Negative lookbehind. Specifies a group that cannot match before your main expression.
(i.e. if it matches, the result is discarded)
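
A sketch of the four lookaround forms, assuming Python's re module, which only accepts fixed-width lookbehind. The \b in the negative cases stops the engine from backtracking into part of a number.

```python
import re

amounts = "10 USD, 20 EUR, 30 USD"
prices = "$5 and 7 and $12"

# Positive lookahead: digits followed by " USD" (the " USD" is not returned).
usd = re.findall(r"\d+(?= USD)", amounts)

# Negative lookahead: whole numbers NOT followed by " USD".
not_usd = re.findall(r"\b\d+\b(?! USD)", amounts)

# Positive lookbehind: digits preceded by "$".
with_dollar = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: whole numbers NOT preceded by "$".
without_dollar = re.findall(r"(?<!\$)\b\d+", prices)

print(usd)             # ['10', '30']
print(not_usd)         # ['20']
print(with_dollar)     # ['5', '12']
print(without_dollar)  # ['7']
```
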
?
Makes the preceding token optional. It will match 0 or 1 of the preceding token.
Matches color or colour : colou?r
*
Matches 0 or more of the preceding token. This is a greedy match.
(will match as many characters as possible before satisfying the next token)
*?
Matches 0 or more of the preceding token. This is a lazy match.
(will match as few characters as possible before satisfying the next token)
+
Matches 1 or more of the preceding token. This is a greedy match.
(will match as many characters as possible before satisfying the next token)
+?
Matches 1 or more of the preceding token. This is a lazy match.
(will match as few characters as possible before satisfying the next token)
{5}
Matches exactly 5 of the preceding token.
 
{2,5}
Matches 2 to 5 of the preceding token. This is a greedy match.
(will match as many characters as possible before satisfying the next token)
{2,5}?
Matches 2 to 5 of the preceding token. This is a lazy match.
(will match as few characters as possible before satisfying the next token)
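
Greedy and lazy quantifiers are easiest to tell apart side by side. The sketch below assumes Python's re module and invented sample strings.

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy .* runs to the last </b>; lazy .*? stops at the first one.
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

# {2,5} takes as many as allowed; {2,5}? takes as few as allowed.
most = re.match(r"\d{2,5}", "1234567").group()
fewest = re.match(r"\d{2,5}?", "1234567").group()

# +? still needs at least one token, while *? accepts zero.
one_lazy = re.match(r"a+?", "aaa").group()
plus_on_none = re.match(r"a+?", "bbb")
star_on_none = re.match(r"a*?", "bbb").group()

print(greedy)        # ['<b>one</b><b>two</b>']
print(lazy)          # ['<b>one</b>', '<b>two</b>']
print(most, fewest)  # 12345 12
print(one_lazy)      # a
print(plus_on_none)  # None
print(repr(star_on_none))  # ''
```
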
|
Alternation. Equivalent of "or". Matches the full expression before or after the |.
Matches any one of pear, kiwi, or plum : pear|kiwi|plum
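
A sketch of alternation, assuming Python's re module. The second pattern shows that | splits the full expression on each side, not just the neighbouring word.

```python
import re

fruit = re.findall(r"pear|kiwi|plum", "a plum, a pear, a fig")

# Reads as ("I like pear") or ("kiwi"), not "I like (pear or kiwi)".
split_whole = re.findall(r"I like pear|kiwi", "I like pear and kiwi; I like kiwi")

print(fruit)        # ['plum', 'pear']
print(split_whole)  # ['I like pear', 'kiwi', 'kiwi']
```
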

flags

i (PCRE_CASELESS)
Letters in the pattern match both upper and lower case letters.
 
m (PCRE_MULTILINE)
The "start of line" and "end of line" constructs match immediately following or immediately before any newline in the subject string, respectively, as well as at the very start and end.
If there are no "\n" characters in a subject string, or no occurrences of ^ or $ in a pattern, setting this modifier has no effect.
s (PCRE_DOTALL)
A dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded.
A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
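
The i, m and s flags above have direct counterparts in Python's re module (re.I, re.M, re.S). The sketch below uses those as a stand-in for the PCRE modifiers, on the assumption that both engines agree on inputs this simple.

```python
import re

text = "first line\nSecond line"

# i: case-insensitive matching.
caseless = re.findall(r"second", text, flags=re.I)

# m: ^ also matches right after each newline.
starts_default = re.findall(r"^\w+", text)
starts_multi = re.findall(r"^\w+", text, flags=re.M)

# s: the dot also matches newlines.
dot_default = re.search(r"first.*Second", text)
dot_all = re.search(r"first.*Second", text, flags=re.S)

# A negated class matches a newline regardless of the s flag.
negated = re.search(r"[^a]", "\n")

print(caseless)        # ['Second']
print(starts_default)  # ['first']
print(starts_multi)    # ['first', 'Second']
print(dot_default)     # None
print(dot_all is not None, negated is not None)  # True True
```
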
u (PCRE_UTF8)
Pattern strings are treated as UTF-8.
 
x (PCRE_EXTENDED)
Whitespace data characters in the pattern are totally ignored except when escaped or inside a character class, and characters between an unescaped # outside a character class and the next newline character, inclusive, are also ignored.
Whitespace characters may never appear within special character sequences in a pattern, for example within the sequence (?( which introduces a conditional subpattern.
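
Extended mode can be sketched with Python's re.X (re.VERBOSE), assumed here as the closest equivalent of the x modifier. The date pattern is a made-up example.

```python
import re

# Whitespace and # comments inside the pattern are ignored.
date = re.compile(r"""
    \d{4}    # year
    - \d{2}  # month
    - \d{2}  # day
""", re.X)

matched = date.fullmatch("2024-01-31") is not None

# An unescaped space is dropped; an escaped one or one in a set is kept.
dropped = re.fullmatch(r"a b", "ab", flags=re.X) is not None
escaped = re.fullmatch(r"a\ b", "a b", flags=re.X) is not None
in_set = re.fullmatch(r"a[ ]b", "a b", flags=re.X) is not None

print(matched, dropped, escaped, in_set)  # True True True True
```
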
A (PCRE_ANCHORED)
The pattern is forced to be "anchored", that is, it is constrained to match only at the start of the string which is being searched (the "subject string").
 
D (PCRE_DOLLAR_ENDONLY)
A dollar metacharacter in the pattern matches only at the end of the subject string. Without this modifier, a dollar also matches immediately before the final character if it is a newline (but not before any other newlines).
This modifier is ignored if the m modifier is set.
S
When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up the time taken for matching. If this modifier is set, then this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that do not have a single fixed starting character.
U (PCRE_UNGREEDY)
This modifier inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by "?".
 
X (PCRE_EXTRA)
Any backslash in a pattern that is followed by a letter that has no special meaning causes an error, thus reserving these combinations for future expansion.
There are at present no other features controlled by this modifier.

source : RegExr by Grant Skinner & php.net doc