
Master Shell Wildcards and POSIX Regex: A Practical Guide

This article explains the meaning of common special characters used as shell wildcards, character classes, and POSIX regular expressions, demonstrates locale effects on pattern matching, and compares BRE and ERE syntax with practical examples and exercises.

360 Zhihui Cloud Developer

In programming you often encounter special characters such as * ? + [] {} ^ $ \ ( ) |. What each one means depends on where it is interpreted: as a shell wildcard, in a Basic Regular Expression (BRE), in an Extended Regular Expression (ERE), or in PCRE.

This guide covers three topics: shell wildcards, how they differ from regular expressions, and the two types of POSIX regex.

Introduction

Because the shell frequently works with file names, it provides special characters to quickly specify groups of files—these are called wildcards.

* matches any number of characters, including none
? matches exactly one character
[characters] matches any one character from the set
[!characters] matches any one character not in the set
[[:class:]] matches any one character belonging to the specified POSIX character class

Examples of the first four patterns: * (all files), g* (files starting with g), b*.txt (files starting with b and ending in .txt), Data??? (files starting with Data followed by exactly three characters, seven in total), [abc]* (files starting with a, b, or c), abc[0-9][0-9] (files named abc followed by exactly two digits), [A-Z]* (files starting with a character in the range A-Z; behaviour varies with the system locale).
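The patterns above can be tried safely in a scratch directory. The following is a minimal sketch; the file names are hypothetical and chosen only so that each pattern has something to match and something to skip.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
old_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files for the demonstration.
touch g1 g2 b1.txt b2.txt b3.log Data123 Data1234 abc01 abc1 abcd

g_files=$(echo g*)             # names starting with g
b_txt=$(echo b*.txt)           # names starting with b and ending in .txt
data7=$(echo Data???)          # "Data" plus exactly three more characters
abc_first=$(echo [abc]*)       # names whose first character is a, b, or c
abc_nn=$(echo abc[0-9][0-9])   # "abc" plus exactly two digits

echo "g*            -> $g_files"
echo "b*.txt        -> $b_txt"
echo "Data???       -> $data7"
echo "[abc]*        -> $abc_first"
echo "abc[0-9][0-9] -> $abc_nn"

# Clean up the scratch directory.
cd "$old_dir" || exit 1
rm -rf "$demo_dir"
```

Note that Data1234 and abc1 are skipped: ? and [0-9] each match exactly one character, so the name length must fit the pattern.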

In a CentOS example, the pattern [A-Z] did not produce the expected result because the locale influences character ordering. Early UNIX used 7-bit ASCII (0-127), in which all uppercase letters sort before all lowercase letters. Later extensions added 8-bit characters (128-255) to support non-English languages. POSIX then introduced the locale concept to adapt sorting rules. Some locales order letters as aAbBcC…xXyYzZ, so the range A-Z also covers the lowercase letters b through z, and ls [A-Z]* matches names beginning with any letter except lowercase a.
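One way to make range patterns predictable is to force the traditional C locale for the commands that rely on them. The sketch below assumes hypothetical file names and only shows the C-locale result; what other locales produce depends on the system, the shell, and its version.

```shell
#!/bin/sh
old_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files: two start uppercase, two start lowercase.
touch Alpha Bravo charlie delta

# In the C locale, ranges follow ASCII byte order, so A-Z is uppercase only.
LC_ALL=C
export LC_ALL

upper_c=$(echo [A-Z]*)
lower_c=$(echo [a-z]*)

echo "LC_ALL=C [A-Z]* -> $upper_c"
echo "LC_ALL=C [a-z]* -> $lower_c"

cd "$old_dir" || exit 1
rm -rf "$demo_dir"
```

Running locale beforehand shows which settings are currently in effect, which helps when the same pattern behaves differently on two machines.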

Character Classes

Because different locales behave differently, POSIX defines explicit character classes:

[:alnum:] matches any letter or digit
[:alpha:] matches any letter
[:digit:] matches any digit
[:lower:] matches lowercase letters
[:upper:] matches uppercase letters

In a pattern, a class name is written inside a bracket expression, for example [[:upper:]]* for names that begin with an uppercase letter.
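The effect of the classes can be checked with a few sample files. This is a minimal sketch with hypothetical file names; each class is placed inside a bracket expression, and [!...] negates it.

```shell
#!/bin/sh
old_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files with different first characters.
touch Alpha bravo 9lives _notes

upper=$(echo [[:upper:]]*)        # first character is an uppercase letter
lower=$(echo [[:lower:]]*)        # first character is a lowercase letter
digit=$(echo [[:digit:]]*)        # first character is a digit
alpha=$(echo [[:alpha:]]*)        # first character is any letter
non_alnum=$(echo [![:alnum:]]*)   # first character is neither letter nor digit

echo "[[:upper:]]*  -> $upper"
echo "[[:lower:]]*  -> $lower"
echo "[[:digit:]]*  -> $digit"
echo "[[:alpha:]]*  -> $alpha"
echo "[![:alnum:]]* -> $non_alnum"

cd "$old_dir" || exit 1
rm -rf "$demo_dir"
```

Unlike [A-Z], these classes do not depend on how the locale orders upper and lowercase letters.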

POSIX Regular Expressions

POSIX splits regular expressions into Basic Regular Expressions (BRE) and Extended Regular Expressions (ERE). The main difference is the set of meta‑characters they support.

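BRE treats ^ $ . [ ] * \ as meta-characters, while ERE adds ( ) { } ? + |. In BRE, grouping and repetition counts are still available, but they must be written with backslashes as \( \) and \{ \}.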
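The difference shows up as soon as the same pattern is run in both modes. The sketch below uses made-up input lines: in BRE, + and | are ordinary characters, while in ERE they mean "one or more" and "or".

```shell
#!/bin/sh
# Hypothetical input lines, one per line.
sample=$(printf '%s\n' 'ab' 'aab' 'a+b' 'cat' 'dog' 'cat|dog')

# BRE: + is a literal plus sign, so only the line "a+b" matches.
bre_plus=$(printf '%s\n' "$sample" | grep 'a+b')

# ERE: + means one or more of the preceding item, so "ab" and "aab" match.
ere_plus=$(printf '%s\n' "$sample" | grep -E 'a+b')

# BRE: | is a literal bar, so only the line "cat|dog" matches.
bre_alt=$(printf '%s\n' "$sample" | grep 'cat|dog')

# ERE: | is alternation, so every line containing cat or dog matches.
ere_alt=$(printf '%s\n' "$sample" | grep -E 'cat|dog')

echo "BRE a+b:"; echo "$bre_plus"
echo "ERE a+b:"; echo "$ere_plus"
echo "BRE cat|dog:"; echo "$bre_alt"
echo "ERE cat|dog:"; echo "$ere_alt"
```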

Application Support

Programs that use BRE by default include sed and grep; programs that use ERE include egrep (equivalent to grep -E), awk, and others.
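As a rough illustration of how the tools differ, the sketch below runs one hypothetical log line through sed, awk, and grep. It assumes a sed that accepts -E for extended syntax, which common GNU and BSD versions do.

```shell
#!/bin/sh
# Hypothetical log line used as input.
line='error 404 at line 12'

# sed defaults to BRE: the group and the repetition count need backslashes.
sed_bre=$(printf '%s\n' "$line" | sed 's/\([0-9]\{3\}\)/<\1>/')

# sed -E switches to ERE: the same expression without backslashes.
sed_ere=$(printf '%s\n' "$line" | sed -E 's/([0-9]{3})/<\1>/')

# awk uses ERE: + means one or more digits; sub() replaces the first match.
awk_ere=$(printf '%s\n' "$line" | awk '{ sub(/[0-9]+/, "N"); print }')

# grep -E understands ( | ) and +; -c counts the matching lines.
grep_ere=$(printf '%s\n' "$line" | grep -E -c '(error|warn) [0-9]+')

# Plain grep reads the same pattern as literal text and finds nothing.
# "|| true" keeps the non-zero exit status from stopping a strict script.
grep_bre=$(printf '%s\n' "$line" | grep -c '(error|warn) [0-9]+' || true)

echo "sed (BRE):     $sed_bre"
echo "sed -E (ERE):  $sed_ere"
echo "awk (ERE):     $awk_ere"
echo "grep -E count: $grep_ere"
echo "grep count:    $grep_bre"
```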

Exercise

As an exercise, run the same pattern through grep (basic regex) and egrep (extended regex) and compare which lines each one matches.

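Braces are the easiest place to go wrong. In BRE, a bare { is a literal character and \{ opens a repetition count; in ERE it is the other way round, so { opens a repetition count and \{ is a literal brace. Using the wrong form does not raise an error; the pattern silently matches something else. Write regexes carefully and document which syntax they are meant for.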
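The brace rules can be verified with two made-up input lines, one containing two consecutive a characters and one containing the literal text a{2}. This is a minimal sketch of the four combinations.

```shell
#!/bin/sh
# Hypothetical input: "aa" has two a's; "a{2}" contains literal braces.
sample=$(printf '%s\n' 'aa' 'a{2}')

# BRE, bare braces: { and } are literal, so only "a{2}" matches.
bre_bare=$(printf '%s\n' "$sample" | grep 'a{2}')

# BRE, escaped braces: \{2\} means "exactly two", so only "aa" matches.
bre_escaped=$(printf '%s\n' "$sample" | grep 'a\{2\}')

# ERE, bare braces: {2} means "exactly two", so only "aa" matches.
ere_bare=$(printf '%s\n' "$sample" | grep -E 'a{2}')

# ERE, escaped braces: \{ and \} are literal, so only "a{2}" matches.
ere_escaped=$(printf '%s\n' "$sample" | grep -E 'a\{2\}')

echo "grep    'a{2}'   -> $bre_bare"
echo "grep    'a\\{2\\}' -> $bre_escaped"
echo "grep -E 'a{2}'   -> $ere_bare"
echo "grep -E 'a\\{2\\}' -> $ere_escaped"
```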
