Linux: pattern searching
Wildcards
The command line understands string patterns (which we will call wildcards here), allowing us to run a command on multiple files or directories at once using only a few characters.
The * wildcard
The most frequently used wildcard is *, which represents zero or more characters. Let’s see how it works with a real example:
First create a folder named wildcards_test and move to it:
username@bash:~$ mkdir wildcards_test
username@bash:~$ cd wildcards_test
Now create five empty files:
username@bash:~/wildcards_test$ touch carol.txt blah.txt example.png firstfile.txt number2file
Now we can use the wildcard * to list only the files that begin with the letter b:
username@bash:~/wildcards_test$ ls b*
blah.txt
What if we wanted to list all the files that end with .txt?
username@bash:~/wildcards_test$ ls *.txt
blah.txt carol.txt firstfile.txt
Under the hood
The command line first expands the wildcard into the list of files/directories that match it, then passes that list as arguments to the command being executed. The examples above used ls, but wildcards work with any other command.
For instance, imagine we wanted to create a folder called images and move all files with the .png extension into it. We can do that by combining the wildcard * with the command mv:
username@bash:~/wildcards_test$ mkdir images
username@bash:~/wildcards_test$ mv *.png images/
username@bash:~/wildcards_test$ ls images/
example.png
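To see that the expansion happens before the command runs, echo is handy, since it only prints the arguments it receives. The following is a minimal sketch, assuming a throwaway directory created with mktemp; the variable names demo_dir, expanded and literal are made up for this illustration:

```shell
# Work in a scratch directory so wildcards_test is left untouched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch carol.txt blah.txt firstfile.txt number2file

# echo never looks at the filesystem. The shell replaces *.txt with the
# matching names (sorted) before echo is started.
expanded=$(echo *.txt)
echo "$expanded"    # blah.txt carol.txt firstfile.txt

# Quoting the pattern switches expansion off, so echo gets the literal text.
literal=$(echo '*.txt')
echo "$literal"     # *.txt

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

The same reasoning explains the mv example: mv never sees *.png, only the file names the pattern expanded to.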
The ? wildcard
The ? wildcard represents a single character. For example, it can be used to list all files whose second letter is a:
username@bash:~/wildcards_test$ ls ?a*
carol.txt
Or even to list all files whose extension has exactly three characters:
username@bash:~/wildcards_test$ ls *.???
blah.txt carol.txt firstfile.txt
The [] wildcard
Finally, as opposed to * and ?, which match any character, the range operator [] lets us match only a specific set of characters.
For instance, if we wanted to list all files that begin with the letter c or f:
username@bash:~/wildcards_test$ ls [cf]*
carol.txt firstfile.txt
Or to list all files that contain a numeric character:
username@bash:~/wildcards_test$ ls *[0-9]*
number2file
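The three wildcards can also be combined in a single pattern. The sketch below is only an illustration under the same file names used above, run in a scratch directory; demo_dir, b_or_c and a_to_f are hypothetical variable names:

```shell
# Recreate the example files in a scratch directory.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch carol.txt blah.txt firstfile.txt number2file

# First character b or c, then anything, then a dot and exactly three characters.
b_or_c=$(echo [bc]*.???)
echo "$b_or_c"    # blah.txt carol.txt

# Ranges work for letters as well as digits: first character from a to f.
a_to_f=$(echo [a-f]*)
echo "$a_to_f"    # blah.txt carol.txt firstfile.txt

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Here number2file is excluded in both cases because its first character, n, is outside the listed sets.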
Searching inside files with grep
The command grep
is used to search for patterns inside files, iterating over each line of it. Before start playing with it, let’s create a directory named grep_test in our home, move from our current working directory (wildcards_test) to there and copy an example file called grep_test.txt to grep_test:
username@bash:~/wildcards_test$ mkdir ~/grep_test
username@bash:~/wildcards_test$ cd ~/grep_test
username@bash:~/grep_test$ cp ~/Share/linux_tutorial/grep_test.txt .
If you are curious about it, you can print the content of the file grep_test.txt using the cat command:
username@bash:~/grep_test$ cat grep_test.txt
n0_v0w3l_l!n3
line_contains_vowels
Now we can use the command grep to search for the lines that contain the exclamation mark (!):
username@bash:~/grep_test$ grep ! grep_test.txt
n0_v0w3l_l!n3
Or combine the [] notation with grep, which understands the same bracket syntax in its patterns, to find the lines that contain vowels:
username@bash:~/grep_test$ grep [aeiou] grep_test.txt
line_contains_vowels
You can also use the option -v to invert the match, so that grep prints only the lines that do not match the pattern. For instance, to find all the lines that do not contain any vowels:
username@bash:~/grep_test$ grep -v [aeiou] grep_test.txt
n0_v0w3l_l!n3
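One caveat worth knowing: an unquoted pattern such as [aeiou] is also a shell wildcard, so the shell may expand it before grep ever runs. The commands above work because no file in grep_test has a matching name. A minimal sketch of the safer, quoted form follows; it assumes a scratch directory and recreates the two-line file shown earlier, and the variable names and the file named u are made up for the demonstration:

```shell
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
# Recreate the two-line example file shown above.
printf '%s\n' 'n0_v0w3l_l!n3' 'line_contains_vowels' > grep_test.txt

# Single quotes stop the shell from treating [aeiou] or ! specially,
# so grep receives the pattern exactly as typed.
with_vowels=$(grep '[aeiou]' grep_test.txt)
without_vowels=$(grep -v '[aeiou]' grep_test.txt)
with_bang=$(grep '!' grep_test.txt)
echo "$with_vowels"       # line_contains_vowels
echo "$without_vowels"    # n0_v0w3l_l!n3
echo "$with_bang"         # n0_v0w3l_l!n3

# Now add a file whose one-letter name matches the wildcard [aeiou].
touch u
# Unquoted, the shell turns [aeiou] into the file name u, so grep searches
# for the letter u, which neither line contains.
unquoted=$(grep [aeiou] grep_test.txt) || true
echo "unquoted result: '$unquoted'"    # unquoted result: ''

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Quoting grep patterns as a habit avoids this kind of surprise when the working directory changes.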