Test regular expressions live, highlight matches and preview replacements.
Regular expressions (RegEx) are powerful search patterns used in text processing and programming. They allow you to define complex string patterns to search, validate, extract or replace text. Almost every programming language — from JavaScript to Python to PHP — supports regular expressions.
RegEx is used for email validation, phone number detection, URL parsing, log analysis, data extraction from HTML, search-and-replace in code editors and much more. Our online RegEx tester uses the JavaScript RegExp engine, so you can test patterns directly in your browser.
- Use lazy quantifiers (*?, +?) when you need the shortest match.
- Use capturing groups () to extract parts of a match and use them in replacements.
- Avoid nested quantifiers: (a+)+ can cause catastrophic backtracking.

A regex (regular expression) is, formally, a description of a regular language, the class of languages that a finite automaton (NFA or DFA) can recognize. Most mainstream engines, including V8 (JavaScript), PCRE (PHP), Perl, .NET Regex and Java's java.util.regex, are backtracking engines that match the pattern's tokens sequentially against the input. RE2 and the engines modeled on it (Go's regexp, Rust's regex) are the exception: they simulate an automaton and never backtrack, at the price of dropping backreferences and lookarounds. The JavaScript engine implements the ECMA-262 RegExp syntax, which omits some PCRE features: no possessive quantifiers a++, no atomic groups (?>abc), no relative backreferences. Patterns ported between PHP and JS stumble exactly here.
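A quick way to see where a ported pattern fails is to try compiling it. The sketch below assumes a hypothetical helper named `compilesInJs` (not part of any library) and a runtime with ES2018 lookbehind support.

```javascript
// Hypothetical helper: true if the JavaScript engine accepts the pattern source.
function compilesInJs(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

const portability = {
  possessive: compilesInJs("a++"),     // PCRE possessive quantifier
  atomic: compilesInJs("(?>abc)"),     // PCRE atomic group
  lookbehind: compilesInJs("(?<=a)b"), // part of ECMA-262 since ES2018
};

console.log(portability);
```

The first two entries should come back `false` because the engine throws a SyntaxError at construction time, which makes this kind of check usable as an early warning when importing PCRE patterns.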
When matching, a backtracking engine systematically tries every alternative. With a.*b on the input aXXXb, the greedy quantifier .* first consumes everything to the end and then backtracks character by character until the b matches. That is the Achilles heel: certain patterns like (a+)+$ on aaaaaaaaaaaaaaaab produce exponentially many backtracking paths, known as catastrophic backtracking, which has caused CPU lock-ups in production (CVE-2019-12041 is one catalogued case, and the well-known Cloudflare outage of 2019 was traced to a backtracking regex). Making the quantifier lazy, as in (a+?)+, does not help, because the number of paths stays the same. Mitigations: unnest the pattern (a+$ accepts the same input), use atomic groups or possessive quantifiers (in PCRE), or switch to an automaton-based engine such as RE2 or Hyperscan, which does not backtrack.
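The effect can be observed directly in the browser console or in Node. This is a minimal sketch, assuming a global `performance.now()` (browsers and recent Node versions); `timeTest` is a hypothetical helper, and the input lengths are kept small on purpose so the nested pattern still returns quickly.

```javascript
// Greedy backtracking: .* runs to the end, then gives characters back until b fits.
const greedy = "aXXXb".match(/a.*b/);
console.log(greedy[0]); // "aXXXb"

// Hypothetical helper: time a single match attempt in milliseconds.
function timeTest(regex, input) {
  const start = performance.now();
  const matched = regex.test(input);
  return { matched, ms: performance.now() - start };
}

const nested = /(a+)+$/; // nested quantifier: exponential on a failing input
const flat = /a+$/;      // accepts the same strings without the nesting

for (const n of [12, 15, 18]) {
  const input = "a".repeat(n) + "b"; // the trailing b forces the match to fail
  console.log(n, timeTest(nested, input), timeTest(flat, input));
}
```

Absolute timings depend on the machine and engine version, so the interesting part is how the nested pattern's time grows with `n` while the flat one stays flat.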
Capturing groups (abc) remember the matched substring for backreferences \1 or the replacement expression $1. Non-capturing groups (?:abc) group the same way but store nothing, so the engine has no capture state to keep. Lookarounds (?=...) (positive lookahead) and (?<=...) (positive lookbehind) are zero-width assertions: they check that a position matches a pattern without consuming characters. ECMA-262 supports lookbehind since ES2018; before that it was a JS-specific pain point. Named groups (?<name>abc) are also available since ES2018 and expose match.groups.name.
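The group and lookaround forms described above can be tried side by side. The log line and variable names in this sketch are made up for illustration, and an ES2018-capable engine is assumed.

```javascript
const line = "2024-06-09 ERROR disk=93%"; // illustrative input

// Numbered capturing groups are available as m[1], m[2], m[3].
const numbered = line.match(/^(\d{4})-(\d{2})-(\d{2})/);

// Named groups (ES2018) are exposed on match.groups.
const named = line.match(/^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);

// $1..$3 in the replacement string refer to the capturing groups.
const reordered = "2024-06-09".replace(/(\d{4})-(\d{2})-(\d{2})/, "$3.$2.$1");

// A backreference \1 must repeat exactly what group 1 matched.
const doubled = /\b(\w+) \1\b/.test("this is is a test");

// Lookbehind and lookahead assert context without consuming it:
// only the digits between "disk=" and "%" end up in the match.
const usage = line.match(/(?<=disk=)\d+(?=%)/);

console.log(numbered[1], named.groups.month, reordered, doubled, usage[0]);
// 2024 06 09.06.2024 true 93
```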
These tokens cover most of the patterns you find in REST APIs, validators and log parsers.
- \d digit, \w word character [A-Za-z0-9_], \s whitespace; uppercase negates (\D, \W, \S).
- * 0+, + 1+, ? 0/1, {3} exactly 3, {2,5} 2 to 5. A trailing ? makes them lazy: .*?.
- ^ start of input, $ end of input, \b word boundary. In multiline mode ^ and $ also match at the start and end of every line.
- (abc) capturing, (?:abc) non-capturing, (?<name>abc) named, a|b alternation.
- (?=...) positive lookahead, (?!...) negative lookahead, (?<=...) positive lookbehind, (?<!...) negative lookbehind.

These expressions cover common validation and parsing tasks. Always test: email validation via regex is famously fragile, so use a dedicated library in production.
- ^[A-Z][a-zA-Z0-9]*$ matches PascalCase identifiers (FooBar, HTTPClient), not foo or 123Bar.
- ^\d{4}-\d{2}-\d{2}$ roughly validates an ISO date (2024-06-09); full validation requires checking the month and day ranges.
- \b(?:\d{1,3}\.){3}\d{1,3}\b finds IPv4 addresses in text. Stricter would be (?:25[0-5]|2[0-4]\d|[01]?\d?\d) per octet.
- (?<=\?)[^&]+(?=&|$) extracts the first query parameter (the key=value pair) from a URL without the leading ?.
- ^(?=.*[A-Z])(?=.*\d)(?=.*[^A-Za-z0-9]).{12,}$ only accepts passwords with at least 12 characters, one uppercase letter, one digit, one special character.

First boundary: backtracking. Patterns like (a+)+b or (a|aa)*b need exponential time on input without b, the classic ReDoS (Regular Expression Denial of Service) vector. Test against worst-case input before going to production, or prefer an RE2-style linear-time engine (Go's regexp, Rust's regex). Second: Unicode. \w in JavaScript matches only ASCII word characters, so German umlauts drop out, and the u flag alone does not widen it. What /pattern/u does change is that surrogate pairs are handled as single code points and property escapes such as \p{L} become available. PCRE (in PHP) needs the u modifier explicitly for UTF-8 input. Third: . by default does not match newline. With the DOTALL/s flag it does, which is useful for multi-line logs. Fourth: greedy vs. lazy. <.*> on <a>text</a> matches the entire <a>text</a> because .* is greedy. With <.*?> you get just <a>. Fifth: parsing HTML with regex is an anti-pattern, because the language is not regular. Use a real parser for HTML/XML, JSON.parse for JSON.
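The recipes listed above can be exercised with a few inputs to see where they are strict and where they are only shape checks. The constant names and sample strings below are illustrative assumptions, not part of the tool.

```javascript
const pascalCase = /^[A-Z][a-zA-Z0-9]*$/;
const isoDateShape = /^\d{4}-\d{2}-\d{2}$/;
const ipv4Loose = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;
const firstQueryPair = /(?<=\?)[^&]+(?=&|$)/;
const strongPassword = /^(?=.*[A-Z])(?=.*\d)(?=.*[^A-Za-z0-9]).{12,}$/;

console.log(pascalCase.test("HTTPClient"), pascalCase.test("foo")); // true false

// Shape check only: month 13 and day 45 still pass.
console.log(isoDateShape.test("2024-06-09"), isoDateShape.test("2024-13-45")); // true true

console.log("up 10.0.0.1, down 192.168.1.20".match(ipv4Loose));
// [ '10.0.0.1', '192.168.1.20' ]

// The match is the whole first key=value pair, not only the value.
console.log("https://example.com/search?q=regex&page=2".match(firstQueryPair)[0]); // q=regex

console.log(strongPassword.test("Correct-Horse-42"), strongPassword.test("short1!A")); // true false
```

For real URL handling, the built-in `URL` and `URLSearchParams` classes are usually a safer choice than a lookbehind pattern.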
- Greedy quantifiers (*, +, ?, {n,m}) try to match as many characters as possible and only backtrack if the rest of the pattern would not match otherwise. Lazy variants (*?, +?, ??, {n,m}?) match as little as possible and only expand if needed. <.*> gives you the largest match, <.*?> the smallest.
- In JavaScript, use the i flag: /foo/i matches foo, FOO, Foo. In PCRE and many other engines the same works via the inline modifier (?i): (?i)foo. To make only parts case-insensitive use (?i:foo)bar in PCRE.
- . by default does not match \n. Enable DOTALL mode with the s flag (/pattern/s) or the inline modifier (?s). Or match any character explicitly: [\s\S]* or [\d\D]*.
- A backslash escapes a metacharacter: \. matches a literal dot, \? a literal question mark, \( a literal opening paren. Inside code strings you have to escape the backslash again: in JavaScript "\\.", in PHP strings "\\." or better the single-quoted form '\.'.
- ^(a+)+$ on aaaaaaaaaaaaaaaab already forces tens of thousands of backtracking steps for only 17 characters, and the work roughly doubles with every additional a, so slightly longer inputs run for seconds to minutes. Avoid this by unnesting patterns, using possessive quantifiers or atomic groups where the engine has them, or an RE2-style engine like Go's regexp or Rust's regex.
- PCRE inline options such as (?J) do not exist in JS.
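The answers above on greedy versus lazy matching, flags and escaping can be checked with a short sketch. The sample strings are illustrative, and the s flag assumes an ES2018-capable engine.

```javascript
const html = "<a>text</a>";
const greedyTag = html.match(/<.*>/)[0]; // "<a>text</a>": largest match
const lazyTag = html.match(/<.*?>/)[0];  // "<a>": smallest match

const caseInsensitive = /foo/i.test("FOO"); // true

const log = "BEGIN\nline one\nEND";
const withoutS = /BEGIN.*END/.test(log);      // false: . stops at \n
const withS = /BEGIN.*END/s.test(log);        // true: s flag enables DOTALL
const portable = /BEGIN[\s\S]*END/.test(log); // true: works without the s flag

// In a string literal the backslash itself must be escaped.
const literalDot = new RegExp("^v1\\.0$");
console.log(literalDot.test("v1.0"), literalDot.test("v1x0")); // true false

console.log(greedyTag, lazyTag, caseInsensitive, withoutS, withS, portable);
```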