How to Use Regular Expressions: A Beginner's Guide

February 10, 2026 · 8 min read · Developer

Regular expressions — commonly known as regex or regexp — are one of the most powerful tools in a programmer's arsenal. They let you search, match, and manipulate text using patterns instead of literal strings. Whether you're validating email addresses, parsing log files, or doing find-and-replace operations, regex can save you hours of manual work. This guide will take you from zero to confident with regular expressions.

What Are Regular Expressions?

A regular expression is a sequence of characters that defines a search pattern. Think of it as a mini programming language specifically designed for text matching. Regex is supported in virtually every programming language (JavaScript, Python, Java, PHP, Go, Ruby) and many text editors (VS Code, Sublime Text, Vim).

At its simplest, a regex can be a literal string. The pattern hello matches the text "hello" wherever it appears. But the real power comes from special characters — called metacharacters — that let you express complex patterns concisely.

Basic Regex Syntax

Literal Characters

Most characters match themselves. The pattern cat matches "cat" in "concatenate", "catalog", or "the cat sat".
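
As a quick illustration, here is a minimal sketch using Python's built-in re module (Python is our choice for the examples; the pattern itself works the same in most engines). The word list is made up for the demo.

```python
import re

# A literal pattern matches that exact text anywhere in the string.
words = ["concatenate", "catalog", "the cat sat", "dog"]
matches = [w for w in words if re.search(r"cat", w)]

print(matches)  # ['concatenate', 'catalog', 'the cat sat']
```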

The Dot (.)

The dot matches any single character except a newline. So c.t matches "cat", "cot", "cut", "c9t", and even "c t".
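
A small sketch of the dot in Python's re module, assuming a handful of made-up candidate strings. re.fullmatch requires the whole string to match, which makes it easy to see what the dot accepts and rejects.

```python
import re

candidates = ["cat", "cot", "c9t", "c t", "ct", "c\nt"]

# "ct" fails because the dot needs exactly one character;
# "c\nt" fails because the dot does not match a newline by default.
dot_matches = [s for s in candidates if re.fullmatch(r"c.t", s)]

print(dot_matches)  # ['cat', 'cot', 'c9t', 'c t']
```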

Character Classes [ ]

Square brackets define a set of characters, and the class matches exactly one character from that set. [aeiou] matches any single lowercase vowel. [0-9] matches any digit. [a-zA-Z] matches any ASCII letter. You can negate a class by putting a caret first: [^0-9] matches any single character that is not a digit.
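
The classes above can be tried out like this in Python. It is only a sketch, and the sample text is invented for the demo.

```python
import re

text = "Order 66 shipped"

# Each class matches one character at a time; findall collects every hit.
vowels = re.findall(r"[aeiou]", text)      # lowercase vowels only
digits = re.findall(r"[0-9]", text)

# A negated class matches anything NOT listed; here we delete non-digits.
digits_only = re.sub(r"[^0-9]", "", text)

print(vowels)       # ['e', 'i', 'e']
print(digits)       # ['6', '6']
print(digits_only)  # 66
```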

Predefined Character Classes

\d — any digit (same as [0-9])
\D — any non-digit
\w — any word character (letters, digits, underscore)
\W — any non-word character
\s — any whitespace (space, tab, newline)
\S — any non-whitespace character
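
Here is a rough sketch of the shorthand classes in Python, combined with the + quantifier (covered in the next section) so each match grabs a whole run of characters. The log-style line is a made-up example.

```python
import re

line = "user_42 logged in\tat 09:15"

numbers = re.findall(r"\d+", line)  # runs of digits
words = re.findall(r"\w+", line)    # runs of letters, digits, underscore
fields = re.split(r"\s+", line)     # split on any run of whitespace

print(numbers)  # ['42', '09', '15']
print(words)    # ['user_42', 'logged', 'in', 'at', '09', '15']
print(fields)   # ['user_42', 'logged', 'in', 'at', '09:15']
```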

Quantifiers: How Many Times?

Quantifiers specify how many times the preceding element should appear:

* — zero or more times
+ — one or more times
? — zero or one time (makes it optional)
{3} — exactly 3 times
{2,5} — between 2 and 5 times
{3,} — 3 or more times

For example, \d{3}-\d{4} matches phone numbers like "555-1234", and colou?r matches both "color" and "colour".
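
To see the quantifiers in action, here is a short Python sketch. The variable names and test strings are ours, chosen only for illustration.

```python
import re

phone = re.compile(r"\d{3}-\d{4}")
colour = re.compile(r"colou?r")

print(bool(phone.fullmatch("555-1234")))  # True
print(bool(phone.fullmatch("55-1234")))   # False: only two leading digits

# The optional "u" allows both spellings, but not a doubled "u".
spellings = [w for w in ["color", "colour", "colouur"] if colour.fullmatch(w)]
print(spellings)  # ['color', 'colour']

# {2,5} is greedy: it takes up to five, and a lone "a" never matches.
runs = re.findall(r"a{2,5}", "a aa aaaaaa")
print(runs)  # ['aa', 'aaaaa']
```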

Anchors: Where in the String?

Anchors don't match characters — they match positions:

^ — start of the string (or line in multiline mode)
$ — end of the string (or line)
\b — word boundary

The pattern ^\d{5}$ matches a string that is exactly a 5-digit number. \bcat\b matches "cat" as a whole word but not "catalog".
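
A minimal Python sketch of both anchors, using re.search so the anchors themselves do the work of pinning the match. The sample strings are invented.

```python
import re

five_digits = re.compile(r"^\d{5}$")

print(bool(five_digits.search("12345")))     # True
print(bool(five_digits.search("123456")))    # False: six digits
print(bool(five_digits.search("id 12345")))  # False: extra text before

# \b only matches where a word character meets a non-word character,
# so "catalog" and "concat" are skipped.
whole_words = re.findall(r"\bcat\b", "cat catalog concat cat.")
print(whole_words)  # ['cat', 'cat']
```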

Groups and Alternation

Parentheses ( )

Parentheses create capture groups that let you extract parts of a match or apply quantifiers to a group. For example, (ha)+ matches "ha", "haha", "hahaha", etc.
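
Both uses of parentheses can be sketched in Python as follows. The date pattern and sample sentence are our own illustration, not something from a particular library.

```python
import re

# Quantifier applied to a group: one or more repetitions of "ha".
print(bool(re.fullmatch(r"(ha)+", "hahaha")))  # True
print(bool(re.fullmatch(r"(ha)+", "hah")))     # False

# Capture groups: pull the parts of a date out of surrounding text.
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Released on 2026-02-10.")
year, month, day = m.groups()
print(year, month, day)  # 2026 02 10
```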

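The Pipe |

The pipe means "or": it matches either the pattern on its left or the pattern on its right. For example, cat|dog matches "cat" or "dog". Combine it with parentheses to limit its scope: gr(a|e)y matches both "gray" and "grey".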
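
A small Python sketch of alternation, where the pipe lets a pattern match one alternative or another. The sample strings are made up for the demo.

```python
import re

# Either alternative can match at each position.
pets = re.findall(r"cat|dog", "one cat, two dogs, three birds")
print(pets)  # ['cat', 'dog']

# Parentheses limit the alternation to a single letter.
greys = [w for w in ["gray", "grey", "gruy"] if re.fullmatch(r"gr(a|e)y", w)]
print(greys)  # ['gray', 'grey']
```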

Practical Regex Examples

Email Validation (Simplified)

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

This matches most standard email formats. Note that truly RFC-compliant email validation via regex is notoriously complex.
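
One way to wrap this pattern in Python might look like the sketch below. The function name looks_like_email is hypothetical, and the check is only a format test: it says nothing about whether the address actually exists.

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def looks_like_email(value: str) -> bool:
    """Return True if value has the simplified email shape."""
    # fullmatch insists that the entire string matches the pattern.
    return EMAIL_RE.fullmatch(value) is not None


print(looks_like_email("jane.doe+news@example.com"))  # True
print(looks_like_email("no-at-sign.example.com"))     # False
print(looks_like_email("user@localhost"))             # False: no dot in domain
```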

URL Matching

https?://[^\s/$.?#].[^\s]*

Matches URLs starting with http:// or https://.
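
As a rough sketch, here is the pattern pulling URLs out of a sentence in Python. The example domains are placeholders.

```python
import re

URL_RE = re.compile(r"https?://[^\s/$.?#].[^\s]*")

text = "Docs at https://example.com/guide?page=2 and http://test.org but not ftp://old.net"
urls = URL_RE.findall(text)

print(urls)  # ['https://example.com/guide?page=2', 'http://test.org']
```

Because the pattern consumes everything up to the next whitespace, punctuation that directly follows a URL (such as a trailing comma) ends up in the match too, so you may want to trim it afterwards.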

Phone Number (US Format)

\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

Matches formats like (555) 123-4567, 555-123-4567, or 555.123.4567.
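
The sketch below runs the pattern against a few sample numbers in Python. The sample list is ours; note that the pattern also accepts ten digits with no separators at all, since every separator is optional.

```python
import re

PHONE_RE = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

samples = ["(555) 123-4567", "555-123-4567", "555.123.4567", "5551234567", "555-1234"]
matched = [s for s in samples if PHONE_RE.fullmatch(s)]

# "555-1234" is rejected because the area code is missing.
print(matched)
# ['(555) 123-4567', '555-123-4567', '555.123.4567', '5551234567']
```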

IP Address

\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

Matches IPv4 addresses (though it doesn't validate that each octet is 0-255).
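
One way to close that gap is to let the regex find candidates and then check the octet range in ordinary code. This is a sketch of that idea in Python; the helper name find_ipv4 and the log line are hypothetical.

```python
import re

IPV4_RE = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")


def find_ipv4(text: str) -> list[str]:
    """Return IPv4-shaped matches whose four octets are all 0-255."""
    candidates = IPV4_RE.findall(text)
    return [c for c in candidates if all(int(octet) <= 255 for octet in c.split("."))]


log = "ok 192.168.1.10, bad 999.10.10.10, ok 10.0.0.1"
print(IPV4_RE.findall(log))  # ['192.168.1.10', '999.10.10.10', '10.0.0.1']
print(find_ipv4(log))        # ['192.168.1.10', '10.0.0.1']
```

Splitting the job this way tends to be easier to read than encoding the 0-255 rule inside the pattern itself.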

HTML Tags

<[^>]+>

Matches HTML tags. However, for real HTML parsing, always use a proper parser — regex and HTML are a famously poor combination for complex documents.
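
For a simple, well-formed snippet, the pattern behaves as in this Python sketch. The HTML string is a made-up example.

```python
import re

html = '<p class="lead">Hello <b>world</b></p>'

tags = re.findall(r"<[^>]+>", html)
text_only = re.sub(r"<[^>]+>", "", html)  # crude tag stripping

print(tags)       # ['<p class="lead">', '<b>', '</b>', '</p>']
print(text_only)  # Hello world
```

This breaks down quickly on real pages (comments, scripts, a ">" inside an attribute value), which is where a parser such as Python's built-in html.parser module is the safer choice.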

Lookaheads and Lookbehinds

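Lookaheads and lookbehinds are advanced features that let you match based on what comes before or after your pattern, without including that surrounding text in the match. Support varies by engine, and some engines restrict lookbehinds to fixed-width patterns: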

(?=...) — positive lookahead: match only if followed by ...
(?!...) — negative lookahead: match only if NOT followed by ...
(?<=...) — positive lookbehind: match only if preceded by ...
(?<!...) — negative lookbehind: match only if NOT preceded by ...

For example, \d+(?= dollars) matches "100" in "100 dollars" but not "100" in "100 euros".
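
Here is a short Python sketch of three of the four lookarounds. The sample strings are invented. The negative lookahead uses \b around the digits so the engine cannot backtrack to a shorter number such as "10" and sneak past the check.

```python
import re

text = "100 dollars, 200 euros, 300 dollars"

# Positive lookahead: digits followed by " dollars" (not included in the match).
dollar_amounts = re.findall(r"\d+(?= dollars)", text)

# Negative lookahead: whole numbers NOT followed by " dollars".
other_amounts = re.findall(r"\b\d+\b(?! dollars)", text)

# Positive lookbehind: digits preceded by a literal dollar sign.
prices = re.findall(r"(?<=\$)\d+", "cost $45, qty 3")

print(dollar_amounts)  # ['100', '300']
print(other_amounts)   # ['200']
print(prices)          # ['45']
```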

Flags and Modifiers

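Most regex engines support flags that change how a pattern is interpreted. The letters below follow JavaScript's notation; other languages offer the same options under different names, and some have no g flag at all (Python, for instance, uses re.findall to get every match):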

g — global: find all matches, not just the first
i — case-insensitive matching
m — multiline: ^ and $ match line starts/ends
s — dotall: . also matches newlines
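
As a sketch of how these flags map onto one particular language, here is the Python equivalent: re.IGNORECASE for i, re.MULTILINE for m, re.DOTALL for s, and re.findall in place of g. The sample text is made up.

```python
import re

text = "Error: disk full\nerror: retry\nWARNING: Error again"

# search stops at the first match; findall plays the role of "global".
first = re.search(r"error", text, re.IGNORECASE).group()
all_hits = re.findall(r"error", text, re.IGNORECASE)

# With MULTILINE, ^ matches at the start of every line.
line_starts = re.findall(r"^error", text, re.IGNORECASE | re.MULTILINE)

# With DOTALL, the dot can cross the newline between the first two lines.
crosses_lines = re.search(r"disk.*retry", text, re.DOTALL) is not None
without_dotall = re.search(r"disk.*retry", text) is not None

print(first)           # Error
print(all_hits)        # ['Error', 'error', 'Error']
print(line_starts)     # ['Error', 'error']
print(crosses_lines)   # True
print(without_dotall)  # False
```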

Common Regex Mistakes

Forgetting to escape special characters: If you want to match a literal dot, use \. not .
Greedy vs. lazy matching: .* is greedy (matches as much as possible). Add ? to make it lazy: .*?
Overcomplicating patterns: Start simple and build up. Test incrementally.
Using regex when you shouldn't: For parsing structured data like JSON or HTML, use a proper parser.
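
The first two mistakes are easy to reproduce. The following Python sketch shows an unescaped dot matching too much and the difference between greedy and lazy matching; the sample strings are ours.

```python
import re

# Unescaped dot: "." matches ANY character, so "3x14" slips through.
print(bool(re.fullmatch(r"3.14", "3x14")))   # True
print(bool(re.fullmatch(r"3\.14", "3x14")))  # False
print(bool(re.fullmatch(r"3\.14", "3.14")))  # True

html = "<b>one</b> and <b>two</b>"

# Greedy .* runs to the LAST </b>; lazy .*? stops at the first one.
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
```

When the text to match comes from user input, Python's re.escape can add the backslashes for you instead of escaping each special character by hand.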

Where to Practice

The best way to learn regex is to practice. Use the Wootils Regex Tester to experiment with patterns in real time. Start with simple patterns and gradually increase complexity. Try extracting data from sample text, validating formats, or doing search-and-replace operations.

Conclusion

Regular expressions may look intimidating at first, but they follow logical rules that become second nature with practice. Start with basic character matching and quantifiers, then work your way up to groups, lookaheads, and more advanced features. Once you're comfortable with regex, you'll find yourself reaching for it constantly — it's one of those skills that pays dividends across your entire career.
