Grep and Regular Expressions

Grep

I usually use egrep (the same as grep -E) instead of plain grep. The difference is that egrep treats ?, +, {, |, (, and ) as special characters, whereas plain grep treats them as ordinary characters. { and } are not portable: traditional egrep does not support them, but GNU egrep recognizes them as special.
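A minimal sketch of the difference, assuming a made-up two-line sample and using grep -E as the standard spelling of egrep. The variable names are only for illustration.

```shell
# Sample input: two spellings, one per line.
sample=$(printf 'color\ncolour\n')

# Extended syntax (egrep / grep -E): ? makes the preceding "u" optional,
# so both lines match.
ere_count=$(printf '%s\n' "$sample" | grep -E -c 'colou?r')

# Basic syntax (plain grep): ? is a literal question mark, so nothing matches.
# grep exits non-zero when there is no match, hence the "|| true".
bre_count=$(printf '%s\n' "$sample" | grep -c 'colou?r' || true)

echo "grep -E: $ere_count matching lines, grep: $bre_count matching lines"
```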

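Essential options are -H (print the filename with each match), which is useful when grep is run from find, -n (print line numbers) and -v (invert the match). -r (recursive search) is also useful when not using find. -R used to be exactly equivalent to -r; in current GNU grep, -R follows all symbolic links while -r follows only those given on the command line.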
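The options above can be tried on a scratch directory. This is only a sketch: the file names and contents are invented for the example, and it assumes a grep that supports -H and -r (GNU and BSD grep both do).

```shell
# Scratch directory with two small files (names are made up for the example).
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/a.txt"
printf 'gamma\nalpha beta\n' > "$dir/b.txt"

# With find, grep is handed one file at a time, so -H forces the
# "filename:" prefix; -n adds the line number.
with_find=$(find "$dir" -name '*.txt' -exec grep -H -n 'beta' {} \; | sort)

# -r: let grep walk the directory itself (filenames are printed automatically).
recursive=$(grep -r -n 'beta' "$dir" | sort)

# -v: keep only the lines that do NOT match.
inverted=$(grep -v 'alpha' "$dir/a.txt")

printf '%s\n' "$with_find"
rm -r "$dir"
```

Both the find form and the -r form should report the same two matches here, each as path:line:text.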

-P, --perl-regexp: to be tested

Regular Expressions

Basic
. Matches any single character
[ ] Bracket expression. Matches any one of the enclosed characters
( ) Group an expression to form a single item
Quantifiers
? Preceding item must match one or zero times
* Preceding item must match zero or more times
+ Preceding item must match one or more times
{n} Preceding item must match exactly n times
{n,} Preceding item must match n or more times
{n,m} Preceding item must match n to m times
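A quick way to check the quantifiers is to run each one against the same small word list. The sketch below assumes grep -E; the match helper and the sample words are hypothetical, and -x is used so that a pattern must cover the whole line.

```shell
# One candidate per line.
words=$(printf 'ac\nabc\nabbc\nabbbc\n')

# Hypothetical helper: list the words fully matched by pattern $1,
# joined with commas.
match() { printf '%s\n' "$words" | grep -E -x "$1" | paste -s -d ',' -; }

opt=$(match 'ab?c')        # zero or one "b":  ac,abc
star=$(match 'ab*c')       # zero or more:     ac,abc,abbc,abbbc
plus=$(match 'ab+c')       # one or more:      abc,abbc,abbbc
exact=$(match 'ab{2}c')    # exactly two:      abbc
atleast=$(match 'ab{2,}c') # two or more:      abbc,abbbc
range=$(match 'ab{1,2}c')  # one to two:       abc,abbc

# ( ) groups an expression so that a quantifier applies to the whole group.
group=$(printf 'ab\nabab\naba\n' | grep -E -x '(ab){2}')  # abab

echo "$opt | $star | $plus | $exact | $atleast | $range | $group"
```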
Anchors
^ Matches at the start of a line
$ Matches at the end of a line
Character classes
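\d Digit (Perl syntax: grep needs -P for it; otherwise use [0-9] or [[:digit:]])
\D Non-digit (Perl syntax, same restriction as \d)
\w Word character (letter, digit or underscore)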
\W Non-word character
\s Whitespace character
\S Non-whitespace character
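The classes can be tried out as follows. This sketch assumes GNU grep, where \w and \s are extensions to the extended syntax; for digits it uses the POSIX bracket class [[:digit:]] rather than \d, since \d is Perl syntax. The sample lines and the match helper are invented for the example.

```shell
# Four sample lines: letters, digits, a line with a space, a mixed token.
lines=$(printf 'abc\n42\nfoo bar\n_x9\n')

# Hypothetical helper: list the lines selected by the given grep options
# and pattern, joined with commas.
match() { printf '%s\n' "$lines" | grep -E "$@" | paste -s -d ',' -; }

digits=$(match -x '[[:digit:]]+')  # only digits:            42
wordy=$(match -x '\w+')            # only word characters:   abc,42,_x9
spaced=$(match '\s')               # contains whitespace:    foo bar
nonword=$(match '\W')              # contains a non-word character: foo bar

echo "$digits | $wordy | $spaced | $nonword"
```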

Other utils

sed

sed treats ( ), +, ? and { } as ordinary characters; they must be escaped with a backslash to act as operators. \( \) and \{ \} are standard, while \+ and \? are GNU extensions.
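A small sketch of the escaping rule, using a made-up input line. The last command relies on the -E option, which is not mentioned in these notes; it is an assumption that the local sed accepts it (GNU and BSD sed both do), and it switches sed to the egrep-style syntax.

```shell
# Unescaped braces are ordinary characters: the pattern matches the
# literal text "a{2}".
literal=$(printf 'a{2} aa\n' | sed 's/a{2}/X/')      # X aa

# Escaped braces act as the {n} quantifier: the pattern matches "aa".
counted=$(printf 'a{2} aa\n' | sed 's/a\{2\}/X/')    # a{2} X

# Assumption: with -E, sed uses extended syntax, so no escaping is needed.
extended=$(printf 'a{2} aa\n' | sed -E 's/a{2}/X/')  # a{2} X

printf '%s\n' "$literal" "$counted" "$extended"
```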

With the substitute command (s), one can use \1, \2… to refer to the strings previously captured with the \( \) operator.
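For instance, two captured words can be swapped. This is only an illustration; the input string is invented.

```shell
# \( \) captures each word; \2 and \1 put them back in reverse order.
swapped=$(printf 'John Smith\n' | sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2, \1/')

echo "$swapped"   # Smith, John
```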

notes/grep_and_regular_expressions.txt · Last modified: 2005/12/04 15:51 by nicko