Chapter 2

Character Matching Guide

The core of regular expressions is pattern matching. Once you understand the two types of fuzzy matching and have mastered character classes and quantifiers, you can handle most everyday matching tasks.

Two Types of Fuzzy Matching

💡 Core Concept

The power of regular expressions lies in "fuzzy matching". Unlike exact matching (which can only match one fixed string), regex can match multiple possible characters or strings of different lengths.

1. Horizontal Fuzzy Matching

The same regex can match strings of different lengths. This is achieved through quantifiers, which specify how many times the preceding element may repeat.

ab{2,5}c

Matches an "a", followed by 2 to 5 b's, followed by a "c"

Match results:
✓ abbc
✓ abbbc
✓ abbbbc
✓ abbbbbc
✗ abc (too few b's)
✗ abbbbbbc (too many b's)
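
The behaviour of the pattern ab{2,5}c can be checked with a short Python sketch. It assumes Python's built-in re module and uses re.fullmatch so that the whole string has to fit the pattern; the variable names are illustrative only.

```python
import re

# ab{2,5}c: an "a", then between 2 and 5 "b" characters, then a "c".
pattern = re.compile(r"ab{2,5}c")

candidates = ["abc", "abbc", "abbbc", "abbbbc", "abbbbbc", "abbbbbbc"]

# fullmatch requires the entire string to match, not just part of it.
results = {s: bool(pattern.fullmatch(s)) for s in candidates}

for s, ok in results.items():
    print(f"{s}: {'match' if ok else 'no match'}")
```

Only the strings with 2, 3, 4, or 5 b's are reported as matches; "abc" has too few and "abbbbbbc" has too many.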

Common Quantifiers

Quantifier Meaning
* 0 or more times
+ 1 or more times
? 0 or 1 time
{n} Exactly n times
{n,m} Between n and m times (inclusive)
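
As a rough illustration of the quantifiers in the table, the Python sketch below applies each one to the letter "b" between an "a" and a "c". The helper name matches is hypothetical, and the curly-brace forms are written out in full as {3} and {1,2}.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# *  : zero or more b's
star = [matches(r"ab*c", s) for s in ("ac", "abc", "abbbc")]
# +  : one or more b's
plus = [matches(r"ab+c", s) for s in ("ac", "abc", "abbbc")]
# ?  : zero or one b
optional = [matches(r"ab?c", s) for s in ("ac", "abc", "abbc")]
# {3}: exactly three b's
exact = [matches(r"ab{3}c", s) for s in ("abbc", "abbbc", "abbbbc")]
# {1,2}: one or two b's
ranged = [matches(r"ab{1,2}c", s) for s in ("ac", "abc", "abbc", "abbbc")]

print(star, plus, optional, exact, ranged)
```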

2. Vertical Fuzzy Matching

The same regex can match several different characters at a given position. This is achieved through character classes, which list the characters allowed at that position.

a[123]b

The second character can be 1, 2, or 3

Match results:
✓ a1b
✓ a2b
✓ a3b
✗ a4b (4 not in range)
✗ aab (missing digit)

Common Character Classes

[a-z]

Any lowercase letter

[A-Z]

Any uppercase letter

[0-9]

Any digit (equivalent to \d)

[^abc]

Any character except a, b, c
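
The character classes above can be tried out with a small Python sketch. It assumes the built-in re module; the sample strings and variable names are made up for illustration.

```python
import re

# [123] allows exactly one of the listed characters at that position.
listed = [s for s in ("a1b", "a2b", "a3b", "a4b", "aab")
          if re.fullmatch(r"a[123]b", s)]

# Ranges: a hyphen inside the brackets denotes a span of characters.
lower = re.findall(r"[a-z]+", "Hello World")   # runs of lowercase letters
upper = re.findall(r"[A-Z]", "Hello World")    # single uppercase letters
digits = re.findall(r"[0-9]", "a1b2")          # single digits

# Negation: a leading ^ inside the brackets inverts the class.
not_abc = re.findall(r"[^abc]", "abcde")

print(listed, lower, upper, digits, not_abc)
```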

Character Class Metacharacters

Metacharacter  Matches                         Equivalent        Example
\d             Digit                           [0-9]             \d{3} → "123"
\D             Non-digit                       [^0-9]            \D+ → "abc"
\w             Word character                  [a-zA-Z0-9_]      \w+ → "hello123"
\W             Non-word character              [^a-zA-Z0-9_]     \W+ → "!@#"
\s             Whitespace                      [ \t\n\r\f\v]     \s+ → " "
\S             Non-whitespace                  [^ \t\n\r\f\v]    \S+ → "abc"
.              Any character (except newline)  [^\n]             . → "a", "1", "@"
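
The metacharacters can be explored with the Python sketch below. One caveat, stated as an assumption about the engine rather than about this guide: the "Equivalent" column describes ASCII behaviour. In Python 3, \d and \w also match non-ASCII digits and letters in text patterns unless the re.ASCII flag is passed, and other engines may differ in similar ways.

```python
import re

digits = re.findall(r"\d{3}", "id 123 and 45")           # exactly three digits
words = re.findall(r"\w+", "hello123 foo_bar!", flags=re.ASCII)
non_words = re.findall(r"\W+", "a!@#b")                   # run of non-word chars
fields = re.split(r"\s+", "one  two\tthree\nfour")        # split on whitespace
dot = re.findall(r".", "a1@\n")                           # "." skips the newline

# Unicode caveat: U+0663 is the Arabic-Indic digit three.
unicode_digit = bool(re.fullmatch(r"\d", "\u0663"))
ascii_digit = bool(re.fullmatch(r"\d", "\u0663", flags=re.ASCII))

print(digits, words, non_words, fields, dot, unicode_digit, ascii_digit)
```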

Practice Exercises

Exercise 1: Extract Numbers

Extract all numbers (including integers and decimals) from the text below

Price is 19.99 yuan, quantity is 100, total 1999 yuan
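
One possible approach to this exercise, sketched in Python with the built-in re module. The pattern is only one of several that would work, and it is deliberately loose: it would also pick up a trailing period, as in "100." at the end of a sentence.

```python
import re

text = "Price is 19.99 yuan, quantity is 100, total 1999 yuan"

# \d+  : one or more digits (the integer part)
# \.?  : an optional literal dot (escaped, so it is not "any character")
# \d*  : zero or more digits (the fractional part, if any)
numbers = re.findall(r"\d+\.?\d*", text)

print(numbers)  # ['19.99', '100', '1999']
```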

Exercise 2: Match Emails

Find all email addresses from the text below

Contact email: user@example.com or admin@test.org
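
A possible solution sketch in Python. The pattern below is a simplification chosen for this exercise and is not a full email validator; real-world address syntax allows many forms it does not cover.

```python
import re

text = "Contact email: user@example.com or admin@test.org"

# [\w.-]+      : local part (letters, digits, underscore, dot, hyphen)
# @            : the literal at sign
# [\w.-]+      : domain name
# \.[a-zA-Z]{2,} : a dot followed by a top-level domain of 2+ letters
emails = re.findall(r"[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}", text)

print(emails)  # ['user@example.com', 'admin@test.org']
```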

Exercise 3: Match Dates

Match dates in YYYY-MM-DD format

Today is 2024-01-15, tomorrow is 2024-01-16
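
A possible solution sketch in Python. Note that the pattern checks only the shape of the date (four digits, two digits, two digits); it does not verify that the month or day is valid.

```python
import re

text = "Today is 2024-01-15, tomorrow is 2024-01-16"

# \d{4} year, \d{2} month, \d{2} day, separated by literal hyphens.
date_pattern = r"\d{4}-\d{2}-\d{2}"
dates = re.findall(date_pattern, text)

# Shape only: an impossible date with the right layout still matches.
impossible = re.findall(date_pattern, "2024-99-99")

print(dates)  # ['2024-01-15', '2024-01-16']
```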

🎯 Key Takeaways

✨ Two Types of Fuzzy Matching

  • Horizontal: Match different lengths (quantifiers)
  • Vertical: Match different characters (character classes)

📝 Common Metacharacters

  • \d digit, \w word character, \s whitespace
  • [abc] character class, [^abc] negated character class
  • . any character (except newline)