The grep command is one of the most useful commands on a Linux system for searching for text or patterns in a file.
Table of contents.
- Searching a file.
- Searching multiple files.
- Find whole words.
- Ignore case.
- Regex matching.
- List matching file names.
- Counting matches.
- Display line numbers.
- Limiting output.
Grep is an acronym for global regular expression print. We use it to search for a specific pattern of characters in a file.
This text pattern is referred to as a regular expression.
Once a match is found, the line containing it is printed as output.
grep <search string> <file name>
Throughout this article we shall run the grep commands on a sample file named test.txt.
Searching a file.
The basic use of grep is searching for strings in a file.
To search the test.txt file for the string worker we could write:
grep "worker" test.txt
The output of this command is every line that contains the string worker, with the match highlighted when color output is enabled.
Inversely, we could get the lines in the file that don't match the string worker by using the -v option as shown below:
grep -v "worker" test.txt
The output is all lines without the string worker.
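The two commands above can be tried on a small stand-in file. This is a minimal sketch: sample.txt and its contents are hypothetical, since the article's test.txt is not reproduced here, and the variable names are only for illustration.

```shell
# Hypothetical sample data standing in for the article's test.txt.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
101 Alice worker alice@gmail.com
102 Bob developer bob@yahoo.com
103 Carol worker carol@gmail.com
EOF

# Lines that contain the string "worker".
matching=$(grep "worker" "$tmpdir/sample.txt")
# Lines that do NOT contain the string "worker".
non_matching=$(grep -v "worker" "$tmpdir/sample.txt")

printf '%s\n' "$matching"
echo "---"
printf '%s\n' "$non_matching"

rm -r "$tmpdir"
```

Every line of the file lands in exactly one of the two results, so the plain search and the -v search together always account for the whole file.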
Searching multiple files.
We can also use grep to search for a string in multiple files by writing:
grep "search string" file1.txt file2.txt
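The output of this command is every line that contains the search string, and each line is prefixed with the name of the file it was found in.
We could also search all files in a directory by using the shell's * wildcard, which the shell expands to the file names in the current directory before grep runs.
grep "gmail" *
The output is every line containing gmail in the files of the current directory. Hidden files are not included in the expansion, and subdirectories are not searched unless the -r option is added.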
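A short sketch of searching several files at once follows. The file names and contents are hypothetical and are created in a temporary directory only for this demonstration.

```shell
# Hypothetical files created only for this demonstration.
tmpdir=$(mktemp -d)
printf 'alice@gmail.com\nbob@yahoo.com\n' > "$tmpdir/file1.txt"
printf 'carol@gmail.com\n' > "$tmpdir/file2.txt"

orig_dir=$PWD
cd "$tmpdir" || exit 1

# With more than one file, each matching line is prefixed with "name:".
named=$(grep "gmail" file1.txt file2.txt)
# The shell expands * to file1.txt file2.txt before grep is started.
globbed=$(grep "gmail" *)

printf '%s\n' "$named"

cd "$orig_dir" || exit 1
rm -r "$tmpdir"
```

Because the wildcard is expanded by the shell, both commands hand grep the same two file names here and produce the same output.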
Find whole words.
When we run the following command:
grep "mail" test.txt
we get every line that contains the string mail, including lines where it is only part of a longer word such as gmail. However, we may need to match full words only.
We use the -w option with grep to match full words as shown:
grep -w "mail" test.txt
The command when executed on the test.txt file won't output anything because there is no full word mail in it.
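The difference between a substring match and a whole-word match can be sketched as follows. The file sample.txt is hypothetical and, unlike the article's test.txt, deliberately contains one line with mail as a full word.

```shell
# Hypothetical sample data; one line contains "mail" as a separate word.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
alice@gmail.com
send mail to bob
voicemail is full
EOF

# Substring search: every line contains the characters "mail".
substring_lines=$(grep -c "mail" "$tmpdir/sample.txt")
# Whole-word search: "mail" must not be surrounded by letters, digits or "_".
whole_word=$(grep -w "mail" "$tmpdir/sample.txt")

echo "substring matches on $substring_lines lines"
printf '%s\n' "$whole_word"

rm -r "$tmpdir"
```

Note that grep exits with status 1 when nothing matches, which is what happens when -w finds no full word in a file.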
Ignore case.
Notice that in the previous searches the search string had to be in the same case as the string in the file, otherwise nothing would match.
With grep we can match the string regardless of its case by using the -i option as follows:
grep -i "lubin" test.txt
The output of this command is every line that contains lubin in any combination of upper and lower case, such as Lubin.
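A minimal sketch of the effect of -i, using a hypothetical sample.txt whose names are invented for illustration:

```shell
# Hypothetical sample data with the same name in different letter cases.
tmpdir=$(mktemp -d)
printf 'Lubin Smith\nLUBIN JONES\nmartin lubin\nAlice Brown\n' > "$tmpdir/sample.txt"

# Case-sensitive: only the all-lowercase spelling matches.
case_sensitive=$(grep -c "lubin" "$tmpdir/sample.txt")
# Case-insensitive: Lubin, LUBIN and lubin all match.
case_insensitive=$(grep -ci "lubin" "$tmpdir/sample.txt")

echo "without -i: $case_sensitive line(s), with -i: $case_insensitive line(s)"

rm -r "$tmpdir"
```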
Regex matching.
A regex pattern is a string of characters used to specify a pattern matching rule.
Given a file, we can use grep and regular expressions to match a string.
To get all lines starting with 104 we could write:
grep "^104" test.txt
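The ^ character anchors the match to the beginning of a line.
Other characters are:
$: matches the end of a line.
[ ]: matches any one character from a set or range, e.g. [A-Z] matches a single uppercase letter.
*: matches zero or more instances of the preceding character, e.g. x* matches x zero or more times.
?: the preceding character is optional and matched at most once. This requires the -E option (extended regular expressions); in GNU grep's default mode it is written \? instead.
+: the preceding character is matched one or more times. Like ?, this requires the -E option, or \+ in GNU grep's default mode.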
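The operators listed above can be exercised on a small file. This is a sketch under a few assumptions: sample.txt is hypothetical, the -E option is used so that ? and + act as operators, and LC_ALL=C is set for the [A-Z] search because the meaning of a range can depend on the locale.

```shell
# Hypothetical sample data chosen to exercise each operator.
tmpdir=$(mktemp -d)
f="$tmpdir/sample.txt"
cat > "$f" <<'EOF'
104 color
105 colour
204 colouur
104 END
EOF

# ^ anchors to the start of a line: lines 1 and 4.
starts_104=$(grep -c '^104' "$f")
# $ anchors to the end of a line; [A-Z] is one uppercase letter.
ends_upper=$(LC_ALL=C grep '[A-Z]$' "$f")
# * : zero or more "u" (color, colour, colouur).
zero_or_more=$(grep -c 'colou*r' "$f")
# ? : at most one "u" (color, colour). Needs -E.
optional=$(grep -cE 'colou?r' "$f")
# + : at least one "u" (colour, colouur). Needs -E.
one_or_more=$(grep -cE 'colou+r' "$f")

echo "$starts_104 $zero_or_more $optional $one_or_more"
printf '%s\n' "$ends_upper"

rm -r "$tmpdir"
```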
List matching file names.
Assume we are dealing with multiple log files and we want only the files containing a specific IP address. We use the grep command with the -r and -l options for such a case.
The -r option searches directories recursively, including their subdirectories.
The -l option lists only the names of the files that contain a match.
grep -rl "127.0.0.1" *
The output is a list of files containing the specified IP address.
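The recursive listing can be sketched on a throwaway directory tree. The logs directory and its files are hypothetical. The sketch also adds -F, which makes grep treat the pattern as a fixed string; without it, each dot in the address is a regex metacharacter that matches any character. It assumes a grep that supports -r, such as GNU or BSD grep.

```shell
# Hypothetical log directory created only for this demonstration.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/logs/archive"
printf 'GET / from 127.0.0.1\n' > "$tmpdir/logs/access.log"
printf 'GET / from 10.0.0.7\n' > "$tmpdir/logs/error.log"
printf 'POST /login from 127.0.0.1\n' > "$tmpdir/logs/archive/old.log"

orig_dir=$PWD
cd "$tmpdir" || exit 1

# -r descends into logs/archive, -l prints file names only,
# -F matches the dots in the address literally.
# sort is added only to make the order of the names predictable.
files=$(grep -rlF "127.0.0.1" logs | sort)
printf '%s\n' "$files"

cd "$orig_dir" || exit 1
rm -r "$tmpdir"
```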
Counting matches.
We can also count the number of lines in a file that contain a string by using the -c option.
grep -c "mail" test.txt
The output will be the number of lines in test.txt that contain the string mail. A line with several matches is counted once.
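Since -c counts matching lines, counting individual occurrences needs a different approach. One common option, sketched here on a hypothetical sample.txt, is to print each match on its own line with -o and count those lines with wc. The -o option is assumed to be available, as it is in GNU and BSD grep.

```shell
# Hypothetical sample data: the first line contains "mail" three times.
tmpdir=$(mktemp -d)
printf 'mail mail mail\nno match here\ngmail\n' > "$tmpdir/sample.txt"

# -c counts lines that contain at least one match.
line_count=$(grep -c "mail" "$tmpdir/sample.txt")
# -o prints every match on its own line, so wc -l counts occurrences.
occurrences=$(grep -o "mail" "$tmpdir/sample.txt" | wc -l)

echo "matching lines: $line_count"
echo "occurrences: $occurrences"

rm -r "$tmpdir"
```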
Display line numbers.
We can also display the line number of each matching line by using the -n option.
grep -n "developer" test.txt
The command will output each matching line prefixed with its line number and a colon, with the matched string highlighted when color output is enabled.
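A small sketch of the -n output format, again on a hypothetical sample.txt. The second command shows one possible way to keep only the line numbers by cutting at the first colon.

```shell
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf 'Alice worker\nBob developer\nCarol worker\nDan developer\n' > "$tmpdir/sample.txt"

# Each matching line is printed as "number:line".
numbered=$(grep -n "developer" "$tmpdir/sample.txt")
# Keep only the part before the first colon, i.e. the line numbers.
line_numbers=$(grep -n "developer" "$tmpdir/sample.txt" | cut -d: -f1)

printf '%s\n' "$numbered"

rm -r "$tmpdir"
```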
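We can also display the lines surrounding each match by using the -C option, which takes the number of context lines to show.
grep -n -C 3 "developer" test.txt
The above command will output each line that contains the string developer together with the three lines before it and the three lines after it. To show only the lines before a match use -B 3, and to show only the lines after it use -A 3.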
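The three context options can be compared side by side. This sketch uses a hypothetical seven-line sample.txt and a context of one line instead of three, only to keep the output short.

```shell
# Hypothetical sample data with the match on line 4.
tmpdir=$(mktemp -d)
f="$tmpdir/sample.txt"
printf 'one\ntwo\nthree\nfour developer\nfive\nsix\nseven\n' > "$f"

before=$(grep -B 1 "developer" "$f")   # match plus 1 line before it
after=$(grep -A 1 "developer" "$f")    # match plus 1 line after it
both=$(grep -C 1 "developer" "$f")     # match plus 1 line on each side
# With -n, matching lines use ":" and context lines use "-" after the number.
numbered=$(grep -n -C 1 "developer" "$f")

printf '%s\n' "$numbered"

rm -r "$tmpdir"
```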
Limiting output.
Assume we are searching a very large file with grep. The output may not fit on the screen, so we can use the -m option to stop after a given number of matching lines.
To get the first four lines that match the gmail string, we could write:
grep -m4 "gmail" test.txt
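A minimal sketch of -m on a hypothetical sample.txt with six matching lines. The -m option is assumed to be available, as it is in GNU and BSD grep.

```shell
# Hypothetical sample data: six lines that all contain "gmail".
tmpdir=$(mktemp -d)
for i in 1 2 3 4 5 6; do
  echo "user$i@gmail.com"
done > "$tmpdir/sample.txt"

# grep stops reading the file after the fourth matching line.
first_four=$(grep -m 4 "gmail" "$tmpdir/sample.txt")
shown=$(printf '%s\n' "$first_four" | wc -l)
last_shown=$(printf '%s\n' "$first_four" | tail -n 1)

printf '%s\n' "$first_four"

rm -r "$tmpdir"
```

Because grep stops reading once the limit is reached, -m can also save time on very large files, not just screen space.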
The grep command finds many uses, especially for developers and system administrators.
It's a handy tool for tasks such as searching for a particular text or pattern in very large files. We can use it together with the find command to narrow down a search in a very large dataset.
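One possible way to combine the two commands is to let find choose the files and let grep search inside them. The sketch below assumes a hypothetical project directory and lists only the .log files that contain the string TODO.

```shell
# Hypothetical directory tree created only for this demonstration.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/project/docs"
printf 'TODO: fix login\n' > "$tmpdir/project/app.log"
printf 'TODO: write docs\n' > "$tmpdir/project/docs/notes.txt"
printf 'nothing to do\n' > "$tmpdir/project/docs/old.log"

orig_dir=$PWD
cd "$tmpdir" || exit 1

# find selects regular files named *.log; grep -l then prints the names
# of those that contain "TODO". notes.txt is never passed to grep.
hits=$(find project -type f -name '*.log' -exec grep -l "TODO" {} + | sort)
printf '%s\n' "$hits"

cd "$orig_dir" || exit 1
rm -r "$tmpdir"
```

Filtering by name, type, size or modification time in find first means grep only has to read the files that could be relevant.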