Python Glob() Function To Match Path, Directory, File Names with Examples – POFTUT


Glob is a general term for techniques that match file and path names against patterns, following rules that come from the Unix shell. Linux and Unix shells support glob patterns, and their system libraries provide a glob() function as well. In this tutorial, we will look at how to use the glob() function in the Python programming language.

Import Glob Module

To use glob() and the related functions, we first need to import the glob module. Keep in mind that the glob module contains glob() along with other related functions such as iglob() and escape().

import glob

Exact String Search

We will start with a simple example. We will look at how to match an exact string or file name given as an absolute path. In this example, we will list the file /home/ismail/poftut.c . As the example below shows, the function returns a list that contains the matches. If nothing matches, the list is empty.

glob.glob("/home/ismail/poftut.c")
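
The return value can be checked with a small self-contained sketch. It assumes a throwaway temporary directory instead of /home/ismail , and the file names are only for illustration.

```python
import glob
import os
import tempfile

# Work in a throwaway directory so the example does not depend on /home/ismail.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "poftut.c")
    open(target, "w").close()  # create an empty file to search for

    found = glob.glob(target)                            # the path exists
    missing = glob.glob(os.path.join(tmp, "nothing.c"))  # no such file

print(found)    # a list holding the single matching path
print(missing)  # []
```

An exact path never raises an error when the file is absent. We simply get an empty list back.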

Wildcards

The wildcard is the most important glob operator. The wildcard, or asterisk (*), matches zero or more characters, whatever those characters are. In this example, we will match files that have the .txt extension.

glob.glob("/home/ismail/*.txt")

As we can see, the matching .txt files are returned in a Python list.
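
Here is a minimal sketch of the same wildcard match. It assumes a temporary directory with a few made-up file names. The results are sorted because glob() does not promise any particular order.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical files: two visible .txt files, one hidden .txt file, one .c file.
    for name in ("a.txt", "notes.txt", ".hidden.txt", "b.c"):
        open(os.path.join(tmp, name), "w").close()

    matches = glob.glob(os.path.join(tmp, "*.txt"))
    txt_names = sorted(os.path.basename(path) for path in matches)

print(txt_names)  # ['a.txt', 'notes.txt']
```

Note that .hidden.txt is missing from the result. As in the shell, * does not match names that start with a dot.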

Wildcards with Multilevel Directories

We can use wildcards to match across directory levels. To search the directories one level down for a given glob, we use /*/ . In this example, we search for .txt files in the directories one level below /home/ismail . The call is also known as “glob glob” because we use the module name glob and the function glob, which is provided by the glob module.

glob.glob("/home/ismail/*/*.txt")
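
The sketch below shows what /*/ does and does not reach. The directory layout is made up for illustration. For comparison it also uses the ** pattern with recursive=True , which the glob module provides for matching any number of directory levels.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one .txt file at each of three directory levels.
    os.makedirs(os.path.join(tmp, "docs", "deep"))
    for rel in ("top.txt", "docs/a.txt", "docs/deep/b.txt"):
        open(os.path.join(tmp, *rel.split("/")), "w").close()

    def relative(paths):
        # Paths relative to tmp, with forward slashes, in sorted order.
        return sorted(os.path.relpath(p, tmp).replace(os.sep, "/") for p in paths)

    # Exactly one directory level below tmp.
    one_level = relative(glob.glob(os.path.join(tmp, "*", "*.txt")))

    # Zero or more directory levels below tmp.
    all_levels = relative(
        glob.glob(os.path.join(tmp, "**", "*.txt"), recursive=True)
    )

print(one_level)   # ['docs/a.txt']
print(all_levels)  # ['docs/a.txt', 'docs/deep/b.txt', 'top.txt']
```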

Single Character Wildcard

The question mark (?) matches exactly one character. This can be useful if we do not know a single character of a given name. In this example, we will use the pattern file?.txt , which matches names such as

  • file1.txt
  • file5.txt

The name file.txt is not matched, because ? always needs one character in its place.

glob.glob("/home/ismail/file?.txt")
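
A short sketch makes the single-character rule easy to verify. The temporary directory and the file names are assumptions for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Names with zero, one and two characters between "file" and ".txt".
    for name in ("file.txt", "file1.txt", "file5.txt", "file10.txt"):
        open(os.path.join(tmp, name), "w").close()

    matches = glob.glob(os.path.join(tmp, "file?.txt"))
    single_char_names = sorted(os.path.basename(path) for path in matches)

print(single_char_names)  # ['file1.txt', 'file5.txt']
```

Both file.txt and file10.txt are left out, because ? stands for one character, no fewer and no more.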

Multiple Characters

Glob also supports sets of alphabetic and numeric characters. We use [ to start a character set and ] to end it, and any one of the characters between the square brackets will match. In this example, we will match file and folder names that start with one of e, m, p and whose extension is .tx followed by any single character.

glob.glob("/home/ismail/[emp]*.tx?")
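
The pattern above combines a character set, a wildcard and a question mark. The following sketch, with made-up file names in a temporary directory, shows which names pass all three parts.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    names = ("example.txt", "main.txt", "poftut.txt", "pic.txz",
             "readme.txt", "map.tx")
    for name in names:
        open(os.path.join(tmp, name), "w").close()

    matches = glob.glob(os.path.join(tmp, "[emp]*.tx?"))
    set_names = sorted(os.path.basename(path) for path in matches)

print(set_names)  # ['example.txt', 'main.txt', 'pic.txz', 'poftut.txt']
```

Here readme.txt fails the [emp] set, and map.tx fails the trailing ? because nothing follows .tx .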

Number Ranges

In some cases, we may want to match a range of numbers. We can use a dash (-) between the start and end digits, so 0-9 matches any single digit from 0 to 9. In this example, we will match file and folder names that contain at least one digit.

glob.glob("/home/ismail/*[0-9]*")
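
As a quick check, the sketch below runs the digit range against a few hypothetical names in a temporary directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Digits at the start, in the middle, at the end, and not at all.
    for name in ("7zip", "report2020.txt", "v1", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    matches = glob.glob(os.path.join(tmp, "*[0-9]*"))
    digit_names = sorted(os.path.basename(path) for path in matches)

print(digit_names)  # ['7zip', 'report2020.txt', 'v1']
```

Because * can match zero characters, the digit may sit at the start, the middle or the end of the name.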

Alphabet Ranges

We can also define alphabet ranges in the same way as number ranges. We use a-z for lowercase letters and A-Z for uppercase letters. What if we need to match lowercase and uppercase letters in a single statement? A range like a-Z does not work, because Z comes before a in character order. Instead we put both ranges into one set as [a-zA-Z] . In this example, we will match file and folder names that start with a letter between a and c.

glob.glob("/home/ismail/[a-c]*")
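
The two kinds of letter range can be compared with a minimal sketch. The file names are invented for the example and live in a temporary directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ("apple.txt", "cherry.txt", "delta.txt", "Zebra.txt", "9lives.txt"):
        open(os.path.join(tmp, name), "w").close()

    def names_for(pattern):
        # Base names of the matches, sorted without regard to case.
        paths = glob.glob(os.path.join(tmp, pattern))
        return sorted((os.path.basename(p) for p in paths), key=str.lower)

    a_to_c = names_for("[a-c]*")         # lowercase a, b or c only
    any_letter = names_for("[a-zA-Z]*")  # any ASCII letter, either case

print(a_to_c)      # ['apple.txt', 'cherry.txt']
print(any_letter)  # ['apple.txt', 'cherry.txt', 'delta.txt', 'Zebra.txt']
```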

Return Generator with iglob() Function

Generally, the glob() function is used to list the files that match the specified pattern. But when there are many matches, building and storing the whole list at once can be wasteful. In that case the iglob() function can be used to create an iterator that yields the same file names one at a time, either in a for loop or with the next() function.

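import glob

# iglob() returns an iterator instead of a list
gen = glob.iglob("*.txt")

# Each file name is produced only when the loop asks for it
for item in gen:
    print(item)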
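
The next() function mentioned above can be sketched as follows. The temporary directory and its three files are assumptions made so that the example runs anywhere.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "c.txt"):
        open(os.path.join(tmp, name), "w").close()

    matches = glob.iglob(os.path.join(tmp, "*.txt"))

    first = next(matches)            # pull a single match on demand
    remaining = list(matches)        # consume whatever is left
    exhausted = next(matches, None)  # a default avoids StopIteration

print(os.path.basename(first))  # one of the three names, order not guaranteed
print(len(remaining))           # 2
print(exhausted)                # None
```

Once the iterator is used up it stays empty, so we call iglob() again if we need a second pass.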

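Match Special Characters Literally with escape() Function

The escape() function escapes the special characters ? , * and [ in a string so that glob matches them literally, which is useful when a file name itself contains one of them. It does not filter any files out of the result. The characters - , _ and # have no special meaning outside square brackets, so escape() returns them unchanged. The example below therefore lists the .txt files whose names contain - , _ or # .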

import glob

chars_skip = "-_#"

for char_skip in chars_skip:
    # Build a pattern such as "*-*.txt"; escape() leaves these characters as they are
    esc_set = "*" + glob.escape(char_skip) + "*" + ".txt"
    for txt in glob.glob(esc_set):
        print(txt)
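
To see escape() change a pattern, we need a name that holds one of the glob special characters. The sketch below assumes two hypothetical files, data[1].txt and data1.txt , in a temporary directory.

```python
import glob
import os
import tempfile

literal_name = "data[1].txt"

with tempfile.TemporaryDirectory() as tmp:
    for name in (literal_name, "data1.txt"):
        open(os.path.join(tmp, name), "w").close()

    # Unescaped, [1] is a character set, so the pattern matches data1.txt.
    unescaped = [os.path.basename(p)
                 for p in glob.glob(os.path.join(tmp, literal_name))]

    # Escaped, the bracket is taken literally and data[1].txt is matched.
    escaped = [os.path.basename(p)
               for p in glob.glob(os.path.join(tmp, glob.escape(literal_name)))]

print(glob.escape(literal_name))  # data[[]1].txt
print(unescaped)                  # ['data1.txt']
print(escaped)                    # ['data[1].txt']
print(glob.escape("-_#"))         # -_#
```

The last line confirms that - , _ and # pass through escape() untouched.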
