NAME
grep, egrep, fgrep - Searches a file for a pattern
SYNOPSIS
grep [-c | -l | -q] [-bhinsvwxy] [-p paragraph_separator]
pattern | -e pattern [file ... ]
egrep [-c | -l | -q] [-bhinsvwxy] pattern | -e pattern
[file ... ]
fgrep [-c | -l | -q] [-bhinsvwxy] pattern | -e pattern
[file ... ]
The grep, egrep, and fgrep commands search the specified
files (standard input by default) for lines containing char-
acters that match the specified pattern, and then write
matching lines to standard output.
FLAGS
While most flags can be combined, some combinations result
in one flag overriding another. For example, if you specify
-n and -l, the output includes filenames only (as specified
by -l) and thus does not include line numbers (as specified
by -n).
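The override described above can be checked with a short sketch. The scratch directory and the file sample.txt are made up for illustration, and the sketch assumes that mktemp is available and that the installed grep behaves as this page describes.

```shell
# Hypothetical scratch file used only for this illustration.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\nalpha beta\n' > "$tmpdir/sample.txt"

# -n alone: each matching line is preceded by its line number.
numbered=$(grep -n 'alpha' "$tmpdir/sample.txt")
printf '%s\n' "$numbered"

# -n together with -l: only the filename is written, so the
# line numbers requested by -n do not appear.
names_only=$(grep -n -l 'alpha' "$tmpdir/sample.txt")
printf '%s\n' "$names_only"

rm -rf "$tmpdir"
```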
-b Precedes each line by the block number on which it was
found. Use this flag to help find disk block numbers by
context.
-c Displays only a count of matching lines.
-e pattern
Specifies a pattern. This works the same as a simple
pattern, but is useful when the pattern begins with a -
(dash).
-i Ignores the case of letters in locating pattern; that
is, uppercase and lowercase letters in the input are
considered to be identical. (Same as -y.)
-l Lists the name of each file with lines matching pattern.
Each filename is listed only once; filenames are
separated by newline characters.
-n Precedes each line with its relative line number in the
file.
-p paragraph_separator
Displays the entire paragraph containing matched lines.
Paragraphs are delimited by paragraph separators,
paragraph_separator, which are patterns in the same form
as the search pattern. Lines containing the paragraph
separators are used only as separators; they are never
included in the output. The default paragraph separator
is a blank line.
-q Suppresses all output except error messages. This is
useful for checking status.
-s Suppresses error messages about inaccessible files.
-v Displays all lines except those that match the specified
pattern. Useful for filtering unwanted lines out of a
file.
-w The expression is searched for as a word (the pattern
bracketed by nonalphanumeric characters or by the begin-
ning or end of the line). See ex(1).
-y Ignores the case of letters in locating pattern; that
is, uppercase and lowercase letters in the input are
considered to be identical. (Same as -i.)
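As a rough illustration of several of the flags above, the following sketch runs them against one small file. The file words.txt and its contents are hypothetical, and mktemp is assumed to be available.

```shell
# Hypothetical word list used only for this illustration.
tmpdir=$(mktemp -d)
printf 'cat\nCatalog\nconcatenate\nthe cat sat\n' > "$tmpdir/words.txt"

# -c: count the lines that contain "cat" (case matters).
count=$(grep -c 'cat' "$tmpdir/words.txt")

# -i with -c: the same count, ignoring case.
count_nocase=$(grep -i -c 'cat' "$tmpdir/words.txt")

# -w: "cat" must stand alone as a word.
whole_word=$(grep -w 'cat' "$tmpdir/words.txt")

# -v: every line that does NOT contain "cat".
inverted=$(grep -v 'cat' "$tmpdir/words.txt")

# -q: no output at all; only the exit status is used.
if grep -q 'cat' "$tmpdir/words.txt"; then
    found=yes
else
    found=no
fi

printf '%s %s %s\n' "$count" "$count_nocase" "$found"
rm -rf "$tmpdir"
```

Here the plain count is 3, the case-insensitive count is 4, -w selects only the lines cat and the cat sat, and -v leaves only Catalog.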
DESCRIPTION
Three versions of the grep command permit you to specify the
matching pattern in varying levels of complexity:
grep
The grep command searches for patterns that are limited reg-
ular expressions as described under Regular Expressions.
egrep
The egrep command searches for patterns that are full regu-
lar expressions, except for \( and \) and with the addition
of the following rules:
o A regular expression followed by a + (plus sign)
matches one or more occurrences of the regular expres-
sion.
o A regular expression followed by a ? (question mark)
matches zero or one occurrence of the regular expres-
sion.
o Two regular expressions separated by a | (vertical bar)
or by a newline character match either expression.
o A regular expression can be enclosed in ( )
(parentheses) for grouping.
The order of precedence of operators is [], then *, ?, and
+, then concatenation, then | and the newline character.
The egrep command uses a deterministic algorithm that needs
exponential space.
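The additional egrep operators can be tried with a sketch like the following. It uses grep -E in place of egrep, as the EXAMPLES section of this page does; that the installed grep accepts -E is an assumption. The file spell.txt is hypothetical. The ^ and $ anchors used here are described under Restricting What Patterns Match.

```shell
# Hypothetical word list used only for this illustration.
tmpdir=$(mktemp -d)
printf 'color\ncolour\ncolouur\ngray\ngrey\ngreen\n' > "$tmpdir/spell.txt"

# ? : zero or one occurrence of the preceding RE (here, u).
optional=$(grep -E '^colou?r$' "$tmpdir/spell.txt")

# + : one or more occurrences of the preceding RE.
one_or_more=$(grep -E '^colou+r$' "$tmpdir/spell.txt")

# ( ) and | : group two alternatives inside a longer pattern.
either=$(grep -E '^gr(a|e)y$' "$tmpdir/spell.txt")

printf '%s\n' "$optional" "$one_or_more" "$either"
rm -rf "$tmpdir"
```

The first pattern selects color and colour, the second selects colour and colouur, and the third selects gray and grey but not green.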
fgrep (obsolescent)
The fgrep command searches for patterns that are fixed
strings.
Command Usage
All versions of grep precede the matched line with the name
of the file containing it if you specify more than one file.
Lines are limited to 2048 bytes; longer lines are broken
into multiple lines of 2048 or fewer bytes. Paragraphs
(under the -p flag) are currently limited to a length of
5000 bytes.
Running grep on a non-text file (for example, an .o file)
produces unpredictable results and is discouraged.
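The filename prefix mentioned at the start of this section can be seen in a small sketch. The files one.c and two.c are hypothetical and are created in a scratch directory (mktemp is assumed to be available).

```shell
# Hypothetical source files used only for this illustration.
tmpdir=$(mktemp -d)
printf 'strcpy(a, b);\n' > "$tmpdir/one.c"
printf 'strlen(a);\nstrcpy(c, d);\n' > "$tmpdir/two.c"

# One file named: the matching line is written as is.
single=$(cd "$tmpdir" && grep 'strcpy' one.c)

# Two files named: each matching line is preceded by its filename.
multi=$(cd "$tmpdir" && grep 'strcpy' one.c two.c)

printf '%s\n' "$single" "$multi"
rm -rf "$tmpdir"
```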
Regular Expressions (REs)
The following REs match a single character:
character
An ordinary character (one other than one of the special
pattern-matching characters) matches itself.
. A . (dot) matches any single character except for the
newline character.
[string]
A string enclosed in [ ] (brackets) matches any one
character in that string. In addition, certain
pattern-matching characters have special meanings within
brackets:
^ If the first character of string is a ^ (circum-
flex), the RE [^string] matches any character except
the characters in string and the newline character.
A ^ has this special meaning only if it occurs first
in the string.
- You can use a - (dash) to indicate a range of con-
secutive characters. The characters that fall
within a range are determined by the current collat-
ing sequence, which is defined by the LC_COLLATE
environment variable. For example, [a-d] is
equivalent to [abcd] in the traditional ASCII col-
lating sequence, but if you were using French colla-
tion rules, it would also match accented letters that
collate within that range, such as à, â, and ç.
A range can include a multicharacter collating ele-
ment enclosed within bracket-period delimiters ([.
.]). These collating symbols are necessary for
languages that treat some strings as individual col-
lating elements. For example, in Spanish, the
strings ch and ll each are collating symbols (that
is, the Spanish primary sort order is a, b, c, ch,
d,...,k, l, ll, m,...). The bracket-period delim-
iters in the RE syntax distinguish multicharacter
collating elements from a list of the individual
characters that make up the element. When using
Spanish collation rules, [[.ch.]] is treated as an
RE matching the sequence ch, while [ch] is treated
as an RE matching c or h. In addition, [a-[.ch.]]
matches a, b, c, and ch.
A collating sequence can define equivalence classes
for characters. An equivalence class is a set of
collating elements that all sort to the same primary
location. They are enclosed within bracket-equal
delimiters ([= =]). An equivalence class generally
is designed to deal with primary-secondary sorting;
that is, for languages like French that define
groups of characters as sorting to the same primary
location, and then having a tie-breaking, secondary
sort. For example, if e, é, and è belong to the same
equivalence class, then [[=e=]fg], [[=é=]fg], and
[[=è=]fg] are each equivalent to [eéèfg].
The - (dash) character loses its special meaning if
it occurs first ([-string]), if it immediately fol-
lows an initial circumflex ([^-string]), or if it
appears last ([string-]) in the string.
] When the ] (right bracket) is the first character in
the string ([]string]) or when it immediately fol-
lows an initial circumflex ([^]string]), it is
treated as a part of the string rather than as the
string terminator.
\special_character
A \ (backslash) followed by a special pattern-matching
character matches the special character itself (as a
literal character). These special pattern-matching
characters are as follows:
. * [ \
Always special, except when they appear within [ ]
(brackets).
^ Special at the beginning of an entire pattern or
when it immediately follows the left bracket of a
pair of brackets ([^...]).
$ Special at the end of an entire pattern.
[: :]
A character class name enclosed in bracket-colon delim-
iters matches any of the set of characters in the named
class. Members of each of the sets are determined by the
current setting of the LC_CTYPE environment variable.
The supported classes are: alpha, upper, lower, digit,
alnum, xdigit, space, print, punct, graph, cntrl. Here
is an example of how to specify one of these classes:
[[:lower:]]
This matches any lowercase character for the current
locale.
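The single-character REs above can be exercised with a sketch such as the following. The file chars.txt is hypothetical, and LC_ALL=C is set on each command so that ranges and classes follow the traditional ASCII rules rather than the current locale.

```shell
# Hypothetical test lines used only for this illustration.
tmpdir=$(mktemp -d)
printf 'a1\nb-2\nC3\n]x\n9.\nzz\n' > "$tmpdir/chars.txt"

# [0-9]: a range of consecutive characters.
digits=$(LC_ALL=C grep '[0-9]' "$tmpdir/chars.txt")

# [^a-z]: any character except those in the range (used here
# with a leading ^ anchor to test the first character of a line).
not_lower=$(LC_ALL=C grep '^[^a-z]' "$tmpdir/chars.txt")

# []-]: ] placed first and - placed last are both literal.
literal=$(LC_ALL=C grep '[]-]' "$tmpdir/chars.txt")

# \. : a backslash makes the dot match itself.
dot=$(LC_ALL=C grep '\.' "$tmpdir/chars.txt")

# [[:upper:]]: a character class name in bracket-colon delimiters.
upper=$(LC_ALL=C grep '[[:upper:]]' "$tmpdir/chars.txt")

printf '%s\n' "$digits" "$not_lower" "$literal" "$dot" "$upper"
rm -rf "$tmpdir"
```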
Forming Patterns
The following rules describe how to form patterns from REs:
o An RE that consists of a single, ordinary character
matches that same character in a string.
o An RE followed by an * (asterisk) matches zero or more
occurrences of the character that the RE matches. For
example, the following pattern:
ab*cd
matches each of the following strings:
acd
abcd
abbcd
abbbcd
but not the following string:
abd
If there is any choice, the longest matching leftmost
string is chosen. For example, given the following
string:
122333444
the pattern .* matches 122333444, the pattern .*3
matches 122333, and the pattern .*2 matches 122.
o An RE followed by:
\{number\}
Matches exactly number occurrences of the character
matched by the RE.
\{number,\}
Matches at least number occurrences of the charac-
ter matched by the RE.
\{number1,number2\}
Matches any number of occurrences of the character
matched by the RE from number1 to number2,
inclusive.
The values of number1 and number2 must be integers
from 0 to 255, inclusive. Whenever a choice
exists, this pattern matches as many occurrences as
possible.
Note that if number is 0 (zero), pattern matches
the beginning of the line.
o You can combine REs into patterns that match strings
containing the same sequence of characters. For exam-
ple, AB*CD matches the string ABCD and [A-Za-z]*[0-9]*
matches any string that contains any combination of
ASCII alphabetic characters (including none), followed
by any combination of numerals (including none).
o The character sequence \(pattern\) matches pattern and
saves it into a numbered holding space. Using this
sequence, up to nine patterns can be saved on a line.
Counting from left to right on the line, the first pat-
tern saved is placed in the first holding space, the
second pattern is placed in the second holding space,
and so on.
The character sequence \n matches the nth saved pat-
tern, which is placed in the nth holding space. (The
value of n is a digit, 1-9.) Thus, the following pat-
tern:
\(A\)\(B\)C\2\1
matches the string ABCBA. You can nest patterns to be
saved in holding spaces. Whether the enclosed patterns
are nested or in a series, \n refers to the nth
occurrence, counting from the left, of the delimiting
characters, \).
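The rules in this section can be tried out with a sketch like the one below. The file pat.txt is hypothetical. Because grep selects whole lines, the longest-leftmost rule is shown with sed(1) instead, on the assumption that it applies the same matching rule to its substitution pattern.

```shell
# Hypothetical test lines used only for this illustration.
tmpdir=$(mktemp -d)
printf 'acd\nabcd\nabbcd\nabbbcd\nabd\nABCBA\nABCAB\n' > "$tmpdir/pat.txt"

# * : zero or more occurrences of b between a and cd.
star=$(grep 'ab*cd' "$tmpdir/pat.txt")

# \{1\} : exactly one b.
exactly=$(grep 'ab\{1\}cd' "$tmpdir/pat.txt")

# \{2,3\} : from two to three occurrences of b, inclusive.
range=$(grep 'ab\{2,3\}cd' "$tmpdir/pat.txt")

# \( \) saves a pattern; \2 and \1 match the saved text again.
backref=$(grep '\(A\)\(B\)C\2\1' "$tmpdir/pat.txt")

# Longest leftmost match: .*3 covers 122333, leaving 444.
longest=$(printf '122333444\n' | sed 's/.*3/X/')

printf '%s\n' "$star" "$exactly" "$range" "$backref" "$longest"
rm -rf "$tmpdir"
```

The back-reference pattern selects ABCBA but not ABCAB, and the sed substitution yields X444.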
Restricting What Patterns Match
A pattern can be restricted to match from the beginning of a
line, up to the end of the line, or the entire line:
o A ^ (circumflex) at the beginning of a pattern causes
the pattern to match only a string that begins in the
first character position on a line.
o A $ (dollar sign) at the end of a pattern causes that
pattern to match only if the last matched character is
the last character (not including the newline charac-
ter) on a line.
o The construction ^pattern$ restricts the pattern to
matching only an entire line.
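The three restrictions above can be compared side by side in a short sketch. The file log.txt and its contents are hypothetical.

```shell
# Hypothetical log lines used only for this illustration.
tmpdir=$(mktemp -d)
printf 'error\nerror: disk full\nfatal error\n' > "$tmpdir/log.txt"

# ^ : the match must start in the first character position.
at_start=$(grep '^error' "$tmpdir/log.txt")

# $ : the match must end at the last character of the line.
at_end=$(grep 'error$' "$tmpdir/log.txt")

# ^pattern$ : the pattern must account for the entire line.
whole_line=$(grep '^error$' "$tmpdir/log.txt")

printf '%s\n' "$at_start" "$at_end" "$whole_line"
rm -rf "$tmpdir"
```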
EXAMPLES
1. To search several files for a string of characters,
enter:
grep -F 'strcpy' *.c
This searches for the string strcpy in all files in the
current directory with names ending in .c.
2. To count the number of lines that match a pattern,
enter:
grep -c -F '{' pgm.c
grep -c -F '}' pgm.c
This displays the number of lines in pgm.c that contain
left and right braces.
If you do not put more than one { or } on a line in
your C programs, and if the braces are properly bal-
anced, then the two numbers displayed will be the same.
If the numbers are not the same, then you can display
the lines that contain braces in the order that they
occur in the file with the command:
grep -n -E '{|}' pgm.c
3. To display all lines in a file that begin with an ASCII
letter, enter:
grep '^[a-zA-Z]' pgm.s
Note that because grep -F searches only for fixed
strings and does not interpret pattern-matching charac-
ters, the following command causes grep to search only
for the literal string ^[a-zA-Z] in pgm.s:
grep -F '^[a-zA-Z]' pgm.s
4. To display all lines that contain ASCII letters in
parentheses or digits in parentheses (with spaces
optionally preceding and following the letters or
digits), but not letter-digit combinations in
parentheses, enter:
grep -E \
'\( *([a-zA-Z]*|[0-9]*) *\)' my.txt
This command displays lines in my.txt such as (y) or (
783902), but not (alpha19c).
Note that with grep -E, \( and \) match parentheses in
the text and ( and ) are special characters that group
parts of the pattern. With plain grep, the reverse is
true; use ( and ) to match parentheses and \( and \) to
group characters.
5. To display all lines that do not match a pattern,
enter:
grep -v '^#'
Because no file is named, this reads standard input and
displays all lines that do not begin with a # (number
sign).
6. To display the names of files that contain a pattern,
enter:
grep -l -F 'rose' *.list
This searches the files in the current directory that
end with .list and displays the names of those files
that contain at least one line containing the string
rose.
7. To display all lines that contain uppercase characters,
enter:
grep '[[:upper:]]' pgm.s
8. To display all lines that begin with a range of charac-
ters that includes a multicharacter collating symbol,
enter:
grep '^[a-[.ch.]]' pgm.s
With your locale set to a Spanish locale, this command
matches all lines that begin with a, b, c, or ch.
EXIT VALUES
The exit values of the grep, egrep, and fgrep commands are
as follows:
0 A match was found.
1 No match was found.
2 A syntax error was found or a file was inaccessible,
even if matches were found.
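A script can branch on these exit values. The sketch below captures each of the three values; the file flowers.list and the missing file no-such-file are hypothetical, and -q and -s keep the commands silent so that only the status is observed.

```shell
# Hypothetical data file used only for this illustration.
tmpdir=$(mktemp -d)
printf 'rose\ntulip\n' > "$tmpdir/flowers.list"

# Each variable starts at 0 and is overwritten only if grep fails.
found=0
grep -q 'rose' "$tmpdir/flowers.list" || found=$?

missing=0
grep -q 'orchid' "$tmpdir/flowers.list" || missing=$?

# The named file does not exist, so it is inaccessible.
inaccessible=0
grep -q -s 'rose' "$tmpdir/no-such-file" || inaccessible=$?

printf 'match=%s no-match=%s error=%s\n' "$found" "$missing" "$inaccessible"
rm -rf "$tmpdir"
```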
RELATED INFORMATION
Commands: ed(1)/red(1), ex(1), sed(1), sh(1).
"Using Internationalization Features" in the OSF/1 User's
Guide.