Regular Expression Metacharacters

Metacharacters:
{}[]()^$.|*+? : Characters with a special meaning. To match one of these characters literally, precede it with a backslash \ (e.g. \$).
^ : Matches the starting position within the string. In line-based tools, it matches the starting position of any line.
. : Matches any single character. Many applications exclude newlines, and exactly which characters count as newlines is flavor-, character-encoding-, and platform-specific, but it is safe to assume that the line feed character is one of them. Within POSIX bracket expressions, the dot character matches a literal dot. For example, a.c matches "abc", etc., but [a.c] matches only "a", ".", or "c".
[ ] : A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z", as does [a-cx-z]. The - character is treated as a literal character if it is the last or the first (after the ^, if present) character within the brackets: [abc-], [-abc]. Note that backslash escapes are not allowed inside POSIX bracket expressions. The ] character can be included in a bracket expression if it is the first (after the ^, if present) character: []abc].
[^ ] : Matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than "a", "b", or "c". [^a-z] matches any single character that is not a lowercase letter from "a" to "z". As above, literal characters and ranges can be mixed.
$ : Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
( ) : Defines a marked subexpression. The string matched within the parentheses can be recalled later (see the next entry, \n). A marked subexpression is also called a block or capturing group. BRE mode requires \( \).
\n : Matches what the nth marked subexpression matched, where n is a digit from 1 to 9. Also known as a backreference. This construct is vaguely defined in the POSIX.2 standard. Some tools allow referencing more than nine capturing groups.

Quantification:
? : The question mark indicates zero or one occurrences of the preceding element. For example, colou?r matches both "color" and "colour".
* : The asterisk indicates zero or more occurrences of the preceding element. For example, ab*c matches "ac", "abc", "abbc", "abbbc", and so on.
+ : The plus sign indicates one or more occurrences of the preceding element. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".
{n} : The preceding item is matched exactly n times.
{min,} : The preceding item is matched min or more times.
{,max} : The preceding item is matched up to max times (not supported by every flavor).
{min,max} : The preceding item is matched at least min times, but not more than max times.

Character classes (the POSIX classes are written inside a bracket expression, e.g. [[:digit:]]):
[:digit:] : 0-9
[:alnum:] : A-Z, a-z, 0-9
[:alpha:] : A-Z, a-z
[:blank:] : <space>, <tab>
[:punct:] : [][!"#$%&'()*+,./:;<=>?@\^_`{|}~-]
[:space:] : whitespace characters [ \t\r\n\v\f]
[:upper:] : A-Z
[:lower:] : a-z
[:print:] : visible characters and <space>
\b : word boundary (a zero-width position between an alphanumeric character and a non-alphanumeric character)
\w : alphanumeric character, including _
\s : whitespace character
\d : 0-9
\W : inverse of \w
\S : inverse of \s
\D : inverse of \d

For further information, see the wiki page Regular expression.
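
The entries above can be tried out with a short script. The following is a minimal sketch, assuming Python's standard re module, whose flavor differs from POSIX in a few ways: groups are written ( ) without backslashes, and POSIX classes such as [:digit:] are not supported, so the shorthand \d is used instead. The function name regex_demo and all sample strings are hypothetical and chosen only for illustration.

```python
import re


def regex_demo():
    """Return a dict of small results illustrating the entries above."""
    text = "one two\nthree four"  # two lines, to show line-based anchors
    return {
        # Escaping: \$ matches a literal dollar sign.
        "escaped_dollar": re.findall(r"\$\d+", "cost: $5 or $12"),
        # Outside brackets the dot matches any character ...
        "dot": bool(re.fullmatch(r"a.c", "abc")),
        # ... inside brackets it is a literal dot ("b" is not matched).
        "dot_in_brackets": re.findall(r"[a.c]", "abc."),
        # Literals and ranges can be mixed in one bracket expression.
        "mixed_ranges": re.findall(r"[abcx-z]", "abcdwxyz"),
        # A leading ^ negates the bracket expression.
        "negated": re.findall(r"[^a-z]", "ab1-C"),
        # A trailing - is a literal hyphen.
        "literal_hyphen": re.findall(r"[abc-]", "a-d"),
        # By default ^ matches only at the start of the string;
        # re.MULTILINE gives the line-based behaviour for ^ and $.
        "string_start_only": re.findall(r"^\w+", text),
        "line_starts": re.findall(r"^\w+", text, flags=re.MULTILINE),
        "line_ends": re.findall(r"\w+$", text, flags=re.MULTILINE),
        # Backreference: \1 repeats whatever group 1 matched.
        "backref": re.search(r"(\w+) \1", "it is is fine").group(0),
        # Quantifiers.
        "optional": [bool(re.fullmatch(r"colou?r", w))
                     for w in ("color", "colour", "colouur")],
        "star": [bool(re.fullmatch(r"ab*c", w))
                 for w in ("ac", "abc", "abbbc")],
        "plus": [bool(re.fullmatch(r"ab+c", w)) for w in ("ac", "abc")],
        "bounded": [bool(re.fullmatch(r"a{2,3}", w))
                    for w in ("a", "aa", "aaa", "aaaa")],
        # Shorthand classes and the word boundary.
        "digits": re.findall(r"\d+", "room 42, floor 7"),
        "whole_word": re.findall(r"\bcat\b", "cat concat cats"),
        "split_nonword": re.split(r"\W+", "a, b;c"),
    }


if __name__ == "__main__":
    for name, result in regex_demo().items():
        print(f"{name}: {result!r}")
```

Other tools (grep, sed, awk, Perl) follow the same ideas but differ in details such as escaping rules and supported classes, so a pattern is best checked against the tool that will actually run it.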