Solaris Advanced User's Guide

Searching Files

4

This chapter describes how to search directories and files for keywords and strings using the SunOS command grep.

Searching for Patterns with grep

To search for a particular character string in a file, use the grep command. The basic syntax of the grep command is:

  $ grep string file  

where string is the word or phrase you want to find, and file is the file to be searched.

Note - A string is one or more characters; a single letter is a string, as is a word or a sentence. Strings may include "white space," punctuation, and invisible (control) characters.

For example, to find Edgar Allan Poe's telephone extension, type grep, all or part of his name, and the file containing the information:

  $ grep Poe extensions  
  Edgar Allan Poe     x72836  
  $  

Note that more than one line may match the pattern you give:

  $ grep Allan extensions  
  David Allan         x76438  
  Edgar Allan Poe     x72836  
  $ grep Al extensions  
  Louisa May Alcott   x74236  
  David Allan         x76438  
  Edgar Allan Poe     x72836  
  $  

grep is case-sensitive; that is, you must match the pattern with respect to uppercase and lowercase letters:

  $ grep allan extensions  
  $ grep Allan extensions  
  David Allan         x76438  
  Edgar Allan Poe     x72836  
  $  

Note that grep found nothing on the first try because no entry in the file contains "allan" spelled with a lowercase "a."
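
When the capitalization of a name is not known in advance, grep's -i option asks it to ignore case. The following is a minimal sketch, assuming a throwaway copy of the extensions file built in a temporary directory; the variable names and the mktemp location are illustrative only.

```shell
# Build a small sample file modeled on the "extensions" file used above.
dir=$(mktemp -d)
printf '%s\n' \
  'Louisa May Alcott   x74236' \
  'David Allan         x76438' \
  'Edgar Allan Poe     x72836' > "$dir/extensions"

# Case-sensitive search: no entry contains lowercase "allan".
# grep exits with status 1 when nothing matches, hence the "|| true".
sensitive=$(grep allan "$dir/extensions" || true)

# -i asks grep to ignore case, so both "Allan" entries match.
insensitive=$(grep -i allan "$dir/extensions")

printf 'without -i: [%s]\n' "$sensitive"
printf 'with -i:\n%s\n' "$insensitive"
```

The first search prints an empty pair of brackets; the second prints the entries for David Allan and Edgar Allan Poe.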

grep as a Filter

grep is very often used as a "filter" with other commands. It allows you to filter out useless information from the output of commands. To use grep as a filter, you must pipe the output of the command through grep. The symbol for pipe is "|".
The following example displays files ending in ".ps" that were last modified in the month of May (the date shown by ls -l is the modification time, not the creation time):

  $ ls -l *.ps | grep May  

The first part of this command line,

  ls -l *.ps  

produces a list of files:

  $ ls -l *.ps  
  -rw-r--r--  1 elvis       7228 Apr 22 15:07 change.ps  
  -rw-r--r--  1 elvis       2356 May 22 12:56 clock.ps  
  -rw-r--r--  1 elvis       1567 Jun 22 12:56 cmdtool.ps  
  -rw-r--r--  1 elvis      10198 Jun 22 15:07 command.ps  
  -rw-r--r--  1 elvis       5644 May 22 15:07 buttons.ps  
  $  

The second part,

  | grep May  

pipes that list through grep, looking for the pattern May:

  $ ls -l *.ps | grep May  
  -rw-r--r--  1 elvis       2356 May 22 12:56 clock.ps  
  -rw-r--r--  1 elvis       5644 May 22 15:07 buttons.ps  
  $  
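
Because grep matches the pattern anywhere on the line, a filter such as grep May also selects lines where "May" appears in a file name or owner name. The sketch below feeds grep a saved listing instead of live ls output so the result is repeatable; the entry Mayday.ps is hypothetical and was added only to show the pitfall, and the narrower pattern ' May ' is one possible workaround, not the only one.

```shell
# Sample listing based on the example above, with one hypothetical
# entry (Mayday.ps, dated June) added to illustrate the pitfall.
listing='-rw-r--r--  1 elvis       7228 Apr 22 15:07 change.ps
-rw-r--r--  1 elvis       2356 May 22 12:56 clock.ps
-rw-r--r--  1 elvis       1567 Jun 22 12:56 Mayday.ps
-rw-r--r--  1 elvis       5644 May 22 15:07 buttons.ps'

# Plain pattern: also selects the June file, because its name contains "May".
loose=$(printf '%s\n' "$listing" | grep May)

# Pattern with a space on each side: in this listing only the month
# column has "May" surrounded by spaces.
strict=$(printf '%s\n' "$listing" | grep ' May ')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

Here the loose pattern selects three lines and the strict pattern selects two.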

grep with Multi-Word Strings

To find a pattern that is more than one word long, enclose the string with single or double quotation marks:

  $ grep "Louisa May" extensions  
  Louisa May Alcott     x74236  
  $  

grep can search for a string in groups of files. When you give it more than one file to search, it prints the name of the file, followed by a colon, then the line matching the pattern:

  $ grep ar *  
  actors:Humphrey Bogart  
  alaska:Alaska is the largest state in the United States.  
  wilde:book.  Books are well written or badly written.  
  $  
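
The file name prefix depends on how many files grep is asked to search, which can be seen by running the same pattern against one file and then against two. This is a small sketch with two sample files created in a temporary directory; the contents are taken from the example above and the directory itself is an assumption.

```shell
# Create two small sample files in a scratch directory.
dir=$(mktemp -d)
printf 'Humphrey Bogart\n' > "$dir/actors"
printf 'Alaska is the largest state in the United States.\n' > "$dir/alaska"

# One file named on the command line: matching lines are printed as-is.
one_file=$(cd "$dir" && grep ar actors)

# Two files named: each matching line is prefixed with "filename:".
two_files=$(cd "$dir" && grep ar actors alaska)

printf 'one file:\n%s\n\ntwo files:\n%s\n' "$one_file" "$two_files"
```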

Searching for Lines without a Certain String

To search for all the lines of a file that don't contain a certain string, use the -v option to grep. The following example shows how to find all of the lines in the user medici's home directory files that don't contain the letter e:

  $ ls  
  actors    alaska    hinterland    tutors    wilde  
  $ grep -v e *  
  actors:Mon Mar 14 10:00 PST 1936  
  wilde:That is all.  
  $  
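
One way to picture -v is that it selects exactly the lines the plain search would skip. The sketch below runs both forms against a two-line sample file; the file contents are borrowed from the examples above and the temporary directory is illustrative.

```shell
# A two-line sample file: one line contains the letter e, one does not.
dir=$(mktemp -d)
printf '%s\n' \
  'Books are well written or badly written.' \
  'That is all.' > "$dir/wilde"

# Lines that contain the letter e.
with_e=$(grep e "$dir/wilde")

# Lines that do not contain the letter e.
without_e=$(grep -v e "$dir/wilde")

printf 'with e:    %s\nwithout e: %s\n' "$with_e" "$without_e"
```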

More on grep

You can also use the grep command to search for targets defined as patterns using regular expressions. Regular expressions consist of letters and numbers, in addition to characters with special meaning to grep. These special characters, called metacharacters, also have special meaning to the system and need to be quoted or escaped. Whenever you use a grep regular expression at the command prompt, surround it with quotes, or escape metacharacters (such as & ! . * $ ? and \) with a backslash (\).
  • A caret (^) indicates the beginning of the line. So the command:

  $ grep '^b' list  

finds any line in the file list starting with "b."
  • A dollar-sign ($) indicates the end of the line. The command:

  $ grep 'b$' list  

displays any line in which "b" is the last character on the line. And the command:

  $ grep '^b$' list  

displays any line in list where "b" is the only character on the line.
  • Within a regular expression, dot (.) finds any single character. So the command:

  $ grep 'an.' list  

matches any line containing "an" followed by any one character, as in "any," "and," "management," and "plan" followed by a space (because spaces count, too).
  • When an asterisk (*) follows a character, grep interprets it as "zero or more instances of that character." When the asterisk follows a regular expression, grep interprets it as "zero or more instances of characters matching the pattern."

    Because it includes zero occurrences, usage of the asterisk is a little non-intuitive. Suppose you want to find all lines with the letters "qu" in them. Typing:

  $ grep 'qu*' list  

finds them, but it also finds any line containing a "q" that is not followed by "u," because "u*" can match zero occurrences of "u." To require at least one "u," repeat the character before the asterisk, as in 'quu*'. In the same way, to find all lines containing one or more occurrences of the letter "n," you could type:

  $ grep 'nn*' list  

which selects the same lines as the plain pattern 'n'. If you wanted to find all lines containing the pattern "nn," you would type:

  $ grep 'nnn*' list  

You may want to try the pattern 'n*' by itself to see what happens otherwise: because it can match zero occurrences, it matches every line.
  • To match zero or more occurrences of any character in list, type:

  $ grep '.*' list  
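
The anchor and asterisk patterns in the list above can be checked against a small word list. This is a sketch only: the words in the sample file are made up for the purpose, and the file lives in a temporary directory.

```shell
# A made-up word list, one entry per line.
dir=$(mktemp -d)
printf '%s\n' quit Iraq banner plan b 'a b' > "$dir/list"

# 'qu*' means "q followed by zero or more u", so a bare q matches too.
q_star=$(grep 'qu*' "$dir/list")      # selects: quit, Iraq

# 'quu*' means "q, one u, then zero or more u", so "qu" is required.
qu_plus=$(grep 'quu*' "$dir/list")    # selects: quit

# 'nnn*' requires at least two consecutive n characters.
double_n=$(grep 'nnn*' "$dir/list")   # selects: banner

# '^b$' matches only a line consisting of the single character b.
only_b=$(grep '^b$' "$dir/list")      # selects: b

printf '%s\n' "$q_star" "$qu_plus" "$double_n" "$only_b"
```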

Searching for Metacharacters

Suppose you want to find lines in the text that have a dollar sign ($) in them. Preceding the dollar sign in the regular expression with a backslash (\) tells grep to ignore (escape) its special meaning. This is true for the other metacharacters (& ! . * ? and \ itself) as well.
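For example, the command

  $ grep '^\.' file  

matches lines in file starting with a period, and is especially useful when searching for nroff or troff formatting requests (which begin with a period). The quotation marks keep the shell from removing the backslash before grep sees the pattern.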
The following table, Table 4-1, provides a list of the more commonly used search pattern elements you can use with grep.

Table 4-1 grep Search Pattern Elements

  Character   Matches
  ^           The beginning of a text line
  $           The end of a text line
  .           Any single character
  [...]       Any single character in the bracketed list or range
  [^...]      Any single character not in the list or range
  *           Zero or more occurrences of the preceding character or regular expression
  .*          Zero or more occurrences of any single character
  \           Escapes special meaning of next character

Note that these search characters can also be used in vi text editor searches.
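
The bracketed forms in Table 4-1 are not demonstrated elsewhere in this chapter, so a brief sketch may help. The sample lines below are invented for illustration, and the file is created in a temporary directory.

```shell
# Invented sample data: two extension numbers and two other lines.
dir=$(mktemp -d)
printf '%s\n' 'x72836' 'X72836' 'ext 7' 'none' > "$dir/list"

# [xX] matches either x or X; [0-9] matches any single digit.
ext=$(grep '^[xX][0-9]' "$dir/list")        # selects: x72836, X72836

# [^0-9] matches any single character that is not a digit, so
# '^[^0-9]*$' selects lines made up entirely of non-digits.
no_digits=$(grep '^[^0-9]*$' "$dir/list")   # selects: none

printf 'extensions:\n%s\n\nno digits:\n%s\n' "$ext" "$no_digits"
```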

Single or Double Quotes on Command Lines

As shown earlier, you use quotation marks to surround text that you want to be interpreted as one word. For example, you would type the following to use grep to search all files for the phrase "dang it, boys":

  $ grep "dang it, boys" *  

Single quotation marks (') can also be used to group multiword phrases into single units. Single quotation marks also make sure that certain characters, such as $, are interpreted literally. (In the C shell, the history metacharacter ! is always interpreted as such, even inside quotation marks, unless you escape it with a backslash.) In any case, it is a good idea to escape characters such as & ! $ ? . ; and \ when you want them taken as ordinary typographical characters.
For example, if you type:

  $ grep $ list  

you will see all the lines in list. However, if you type:

  $ grep '\$' list  

you will see only those lines with the "$" character in them.
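
To see why the quotation marks matter, it can help to look at what the shell actually hands to grep. The sketch below uses printf to show the argument with and without quotes, then runs the patterns from this section against a small sample file; the file contents and variable names are assumptions made for the example.

```shell
# Invented sample data: one line with a dollar sign, one troff-style request.
dir=$(mktemp -d)
printf '%s\n' 'cost: $5' 'no charge' '.PP' > "$dir/list"

# Unquoted, the shell removes the backslash before the command runs.
unquoted=$(printf '%s' \.)     # the command receives: .
quoted=$(printf '%s' '\.')     # the command receives: \.

# A lone $ is the end-of-line anchor, so every line matches.
all_lines=$(grep '$' "$dir/list" | wc -l | tr -d ' ')

# '\$' is a literal dollar sign.
dollar=$(grep '\$' "$dir/list")

# '^\.' is a literal period at the start of the line.
dot=$(grep '^\.' "$dir/list")

printf 'unquoted=%s quoted=%s\n' "$unquoted" "$quoted"
printf 'lines matching $: %s\n' "$all_lines"
printf 'dollar: %s\ndot: %s\n' "$dollar" "$dot"
```
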
For more information on the grep(1) command, refer to the man Pages(1): User Commands.