Linux: pattern searching
Wildcards
The command line is capable of understanding patterns of strings (which we will call here wildcards), which allows us to simultaneously run commands for multiple files/directories using only a few characters.
The * wildcard
The most frequently used wildcard is *, which represents zero or more characters. Let’s see how it works with a real example:
First create a folder named wildcards_test and move to it:
username@bash:~$ mkdir wildcards_test
username@bash:~$ cd wildcards_test
Now create five empty files:
username@bash:~/wildcards_test$ touch carol.txt blah.txt example.png firstfile.txt number2file
Now we can use the wildcard * to list only the files that begin with the letter b:
username@bash:~/wildcards_test$ ls b*
blah.txt
What if we wanted to list all the files that end with .txt?
username@bash:~/wildcards_test$ ls *.txt
carol.txt blah.txt firstfile.txt
Under the hood
What is happening under the hood is that first the command line will process the wildcard and return the files/directories that match with it. Then it will pass all those files/directories as arguments to the command that is being executed. In the above example we have run the command ls, however wildcards will work with any other command.
For instance, imagine we wanted to create a folder called images and move all files with PNG extension to it. We can do that by combining the wildcard * with the command mv:
username@bash:~/wildcards_test$ mkdir images
username@bash:~/wildcards_test$ mv *.png images/
username@bash:~/wildcards_test$ ls images/
example.png
The ? wildcard
The ? wildcard represents a single character. For example, it can be used to list all files whose second letter is a:
username@bash:~/wildcards_test$ ls ?a*
carol.txt
Or even to list all files whose extension have three characters:
username@bash:~/wildcards_test$ ls *.???
carol.txt blah.txt firstfile.txt
The [] wildcard
Finally, as opposed to * and ?, which refer to any character, the range operator [] allows us to search for a specific subset of characters.
For instance, if we wanted to list all files that begin the the letter c or f:
username@bash:~/wildcards_test$ ls [cf]*
carol.txt firstfile.txt
Or to list all files that contain a numeric character:
username@bash:~/wildcards_test$ ls *[0-9]*
number2file
Searching inside files with grep
The command grep is used to search for patterns inside files, iterating over each line of it. Before start playing with it, let’s create a directory named grep_test in our home, move from our current working directory (wildcards_test) to there and copy an example file called grep_test.txt to grep_test:
username@bash:~/wildcards_test$ mkdir ~/grep_test
username@bash:~/wildcards_test$ cd ~/grep_test
username@bash:~/grep_test$ cp ~/Share/linux_tutorial/grep_test.txt .
If you are curious about it, you can print the content of the file grep_test.txt using the cat command:
username@bash:~/grep_test$ cat grep_test.txt
n0_v0w3l_l!n3
line_contains_vowels
Now we can use the command grep to search for the lines that contain the exclamation mark (!):
username@bash:~/grep_test$ grep ! grep_test.txt
n0_v0w3l_l!n3
Or combine wildcards and grep to find the lines that contain vowels:
username@bash:~/grep_test$ grep [aeiou] grep_test.txt
line_contains_vowels
Interestingly, you can use the option -v to search for opposite patterns. For instance, to find all the lines that do not contain any vowels:
username@bash:~/grep_test$ grep -v [aeiou] grep_test.txt
n0_v0w3l_l!n3