Regex Master
September 27, 2024 | Regex Master Team | 11 min read

Java Regular Expressions: Pattern and Matcher Advanced Usage


Master Java's Pattern and Matcher classes, and learn advanced regex techniques and best practices.

Java provides regular expression support through the Pattern and Matcher classes. This guide moves from the basic compile-and-match flow to advanced features such as named capture groups, lookahead and lookbehind, and replacement callbacks, then closes with best practices, performance notes, and common pitfalls.

Java Regex Basics

Core Classes Introduction

Java's regex functionality is provided by the java.util.regex package, containing two main classes:

  • Pattern: Represents a compiled regular expression pattern
  • Matcher: Uses the pattern to perform matching operations on input strings

Basic Usage Flow

import java.util.regex.Pattern;
import java.util.regex.Matcher;

// 1. Compile regular expression
Pattern pattern = Pattern.compile("\\d+");

// 2. Create Matcher object
Matcher matcher = pattern.matcher("Hello 123 World");

// 3. Perform matching operations
if (matcher.find()) {
    System.out.println("Match found: " + matcher.group());
}

Pattern Class Details

1. Creating Pattern Objects

Basic Compilation

// Simple pattern
Pattern pattern = Pattern.compile("abc");

// Using flags
Pattern caseInsensitive = Pattern.compile("abc", Pattern.CASE_INSENSITIVE);

// Multi-line mode
Pattern multiLine = Pattern.compile("^test$", Pattern.MULTILINE);

// Combine multiple flags
Pattern combined = Pattern.compile(
    "test",
    Pattern.CASE_INSENSITIVE | Pattern.MULTILINE
);

Pattern Flags Explained

// CASE_INSENSITIVE - Ignore case
Pattern p1 = Pattern.compile("hello", Pattern.CASE_INSENSITIVE);
// Matches: hello, HELLO, Hello, etc.

// MULTILINE - Multi-line mode
Pattern p2 = Pattern.compile("^test$", Pattern.MULTILINE);
// Can match at the beginning and end of each line

// DOTALL - Dot matches newline
Pattern p3 = Pattern.compile(".*", Pattern.DOTALL);
// . can match newline characters

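// UNICODE_CASE - Unicode-aware case folding (used together with CASE_INSENSITIVE)
Pattern p4 = Pattern.compile("äbc", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
// Matches: äbc, ÄBC (without UNICODE_CASE, only ASCII letters are case-folded)

// CANON_EQ - Canonical equivalence
Pattern p5 = Pattern.compile("a\u030A", Pattern.CANON_EQ);
// Matches "å" whether it is precomposed (U+00E5) or written as "a" plus a combining ring (U+030A)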
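To make the effect of these flags concrete, the sketch below counts matches on the same input with and without MULTILINE and CASE_INSENSITIVE, and checks DOTALL on a two-line string. The class name FlagDemo and the helper countMatches are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagDemo {
    // Hypothetical helper: counts how many times the pattern is found in the input.
    static int countMatches(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String lines = "test\nTEST\ntest";

        // No flags: ^ and $ anchor only to the whole input, so nothing matches.
        System.out.println(countMatches(Pattern.compile("^test$"), lines)); // 0

        // MULTILINE: ^ and $ also anchor at each line, so lines 1 and 3 match.
        System.out.println(countMatches(
            Pattern.compile("^test$", Pattern.MULTILINE), lines)); // 2

        // MULTILINE | CASE_INSENSITIVE: "TEST" on line 2 matches as well.
        System.out.println(countMatches(
            Pattern.compile("^test$", Pattern.MULTILINE | Pattern.CASE_INSENSITIVE),
            lines)); // 3

        // DOTALL: the dot also matches the line terminator between "a" and "b".
        System.out.println(Pattern.compile("a.b").matcher("a\nb").matches()); // false
        System.out.println(
            Pattern.compile("a.b", Pattern.DOTALL).matcher("a\nb").matches()); // true
    }
}
```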

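One-off Matching with Pattern.matches()

// Java does not ship ready-made patterns. Pattern.matches() is a static
// convenience method that compiles the regex and matches the entire input.
// It recompiles on every call, so prefer a compiled Pattern for repeated use.

// Check for integer
boolean isInteger = Pattern.matches("-?\\d+", "123");

// Check for email
boolean isEmail = Pattern.matches(
    "[\\w.-]+@[\\w.-]+\\.[a-z]{2,}",
    "user@example.com"
);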

// Check for phone number (simple version)
boolean isPhone = Pattern.matches("\\d{11}", "13812345678");

2. Pattern Common Methods

split() - Split String

String text = "apple,banana;orange|grape";
Pattern pattern = Pattern.compile("[,;|]");

String[] fruits = pattern.split(text);
// Output: ["apple", "banana", "orange", "grape"]

// Limit split count
String[] parts = pattern.split(text, 2);
// Output: ["apple", "banana;orange|grape"]

// Negative limit: split as many times as possible and keep trailing empty strings
String[] all = pattern.split(text, -1);
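The limit argument matters most when the input ends with delimiters. The following sketch, using a made-up class SplitLimitDemo and a comma-only pattern chosen for the illustration, prints the result for limits 0, 2, and -1 on the same input.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLimitDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    // Hypothetical helper: renders the split result so empty strings stay visible.
    static String show(String input, int limit) {
        return Arrays.toString(COMMA.split(input, limit));
    }

    public static void main(String[] args) {
        String input = "a,b,,";

        // limit 0 (same as split(input)): trailing empty strings are dropped.
        System.out.println(show(input, 0));   // [a, b]

        // limit 2: at most two parts; the remainder is left unsplit.
        System.out.println(show(input, 2));   // [a, b,,]

        // limit -1: no cap on the number of parts, trailing empty strings kept.
        System.out.println(show(input, -1));  // [a, b, , ]
    }
}
```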

quote() - Escape Special Characters

String special = "a.b*c+d?e";
String escaped = Pattern.quote(special);
System.out.println(escaped);
// Output: \Qa.b*c+d?e\E

// Create literal matching pattern
Pattern literal = Pattern.compile(Pattern.quote("1.2"));
// Only matches literal string "1.2", not any number
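A typical use of quote() is searching for text that comes from a user or a file and may contain regex metacharacters. The sketch below compares an unquoted and a quoted search for "1.2"; the class LiteralSearch and its helper names are assumptions for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralSearch {
    // Hypothetical helper: counts occurrences of the pattern in the text.
    static int count(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Treats the search term as plain text, whatever characters it contains.
    static int countLiteral(String text, String literal) {
        return count(Pattern.compile(Pattern.quote(literal)), text);
    }

    public static void main(String[] args) {
        String text = "v1.2 v1x2 v1.2";

        // Unquoted: the dot matches any character, so "1x2" is counted too.
        System.out.println(count(Pattern.compile("1.2"), text)); // 3

        // Quoted: only the literal "1.2" is counted.
        System.out.println(countLiteral(text, "1.2")); // 2
    }
}
```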

Matcher Class Details

1. Creating Matcher Objects

Pattern pattern = Pattern.compile("\\d+");
String text = "Order #123, #456, #789";

Matcher matcher = pattern.matcher(text);

2. Matching Methods

matches() - Full Match

Pattern pattern = Pattern.compile("\\d+");
Matcher matcher = pattern.matcher("123");

if (matcher.matches()) {
    System.out.println("Entire string is digits");
}

// No match case
matcher = pattern.matcher("abc123");
System.out.println(matcher.matches());  // false

lookingAt() - Match from Beginning

Pattern pattern = Pattern.compile("\\d+");
Matcher matcher = pattern.matcher("123abc");

if (matcher.lookingAt()) {
    System.out.println("Matched from beginning");
}

// Difference with matches()
System.out.println(pattern.matcher("123abc").matches());      // false
System.out.println(pattern.matcher("123abc").lookingAt());    // true

find() - Search for Match

Pattern pattern = Pattern.compile("\\d+");
String text = "a1b2c3d4";
Matcher matcher = pattern.matcher(text);

// Find all matches
while (matcher.find()) {
    System.out.printf("Found: %s, Position: %d%n",
        matcher.group(), matcher.start());
}

// Output:
// Found: 1, Position: 1
// Found: 2, Position: 3
// Found: 3, Position: 5
// Found: 4, Position: 7
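The three matching methods differ only in how much of the input must match: matches() requires the whole input, lookingAt() requires a match starting at the beginning, and find() accepts a match anywhere. A small side-by-side sketch, with the class name MatchModes invented for the purpose:

```java
import java.util.regex.Pattern;

class MatchModes {
    // Runs all three methods on fresh matchers so that one call cannot
    // influence the state seen by the next.
    static String describe(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        return "matches=" + pattern.matcher(input).matches()
            + " lookingAt=" + pattern.matcher(input).lookingAt()
            + " find=" + pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(describe("\\d+", "123"));
        // matches=true lookingAt=true find=true
        System.out.println(describe("\\d+", "123abc"));
        // matches=false lookingAt=true find=true
        System.out.println(describe("\\d+", "abc123"));
        // matches=false lookingAt=false find=true
    }
}
```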

3. Getting Match Information

group() - Get Match Content

Pattern pattern = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
Matcher matcher = pattern.matcher("Date: 2024-01-25");

if (matcher.find()) {
    // Full match
    System.out.println(matcher.group());        // 2024-01-25

    // Capture groups
    System.out.println(matcher.group(1));       // 2024
    System.out.println(matcher.group(2));       // 01
    System.out.println(matcher.group(3));       // 25

    // Group count
    System.out.println(matcher.groupCount());   // 3
}

start() and end() - Get Position

Pattern pattern = Pattern.compile("\\d+");
Matcher matcher = pattern.matcher("abc123def");

if (matcher.find()) {
    System.out.println("Start: " + matcher.start());  // 3
    System.out.println("End: " + matcher.end());    // 6
    System.out.println("Length: " + matcher.group().length());  // 3
}

start(int) and end(int) - Get Group Position

Pattern pattern = Pattern.compile("(\\w+)@(\\w+\\.\\w+)");
Matcher matcher = pattern.matcher("user@example.com");

if (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    System.out.println("Username position: " + matcher.start(1) + "-" + matcher.end(1));
    System.out.println("Domain position: " + matcher.start(2) + "-" + matcher.end(2));
}
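One detail worth knowing about group positions: when an optional group does not take part in a match, start(int) and end(int) return -1 and group(int) returns null. The sketch below illustrates this with a hypothetical pattern for sizes such as "12px"; the class name GroupPositions and the pattern are assumptions for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupPositions {
    // Group 1: the number. Group 2: an optional "px" suffix.
    private static final Pattern SIZE = Pattern.compile("(\\d+)(px)?");

    // Returns "start-end" for the group in the first match, or a short
    // label when there is no match or the group did not participate.
    static String span(String input, int group) {
        Matcher matcher = SIZE.matcher(input);
        if (!matcher.find()) {
            return "no match";
        }
        if (matcher.start(group) == -1) {
            return "unmatched";
        }
        return matcher.start(group) + "-" + matcher.end(group);
    }

    public static void main(String[] args) {
        System.out.println(span("width: 12px", 0)); // 7-11
        System.out.println(span("width: 12px", 1)); // 7-9
        System.out.println(span("width: 12px", 2)); // 9-11
        System.out.println(span("width: 12", 2));   // unmatched
    }
}
```

The end index is exclusive, so input.substring(start, end) yields the same text as group().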

4. Replacement Methods

replaceAll() - Replace All Matches

Pattern pattern = Pattern.compile("\\d+");
String text = "Price: 100, 200, 300";

String result = pattern.matcher(text).replaceAll("[number]");
System.out.println(result);
// Output: Price: [number], [number], [number]

// Using callback function (Java 9+)
String result2 = pattern.matcher(text).replaceAll(match -> {
    int num = Integer.parseInt(match.group());
    return String.valueOf(num * 0.9);
});
System.out.println(result2);
// Output: Price: 90.0, 180.0, 270.0

replaceFirst() - Replace First Match

Pattern pattern = Pattern.compile("\\d+");
String text = "Price: 100, 200, 300";

String result = pattern.matcher(text).replaceFirst("[number]");
System.out.println(result);
// Output: Price: [number], 200, 300
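In the string forms of replaceAll() and replaceFirst(), the replacement is not plain text: $1, $2, and so on refer to capture groups, and a backslash escapes the next character. The following sketch shows both sides of that, reordering groups on purpose and using Matcher.quoteReplacement() when the replacement should be taken literally. The class name ReplacementDemo and its method names are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementDemo {
    private static final Pattern DATE =
        Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Uses group references in the replacement to reorder year, month, day.
    static String toDayFirst(String text) {
        return DATE.matcher(text).replaceAll("$3/$2/$1");
    }

    // Inserts the given text literally. Without quoteReplacement, a '$' in
    // the literal would be parsed as the start of a group reference.
    static String replaceWithLiteral(String text, String literal) {
        return DATE.matcher(text).replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("Due 2024-01-25"));
        // Due 25/01/2024
        System.out.println(replaceWithLiteral("Due 2024-01-25", "$DATE"));
        // Due $DATE
    }
}
```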

appendReplacement() and appendTail() - Accumulate Replacement

Pattern pattern = Pattern.compile("\\d+");
String text = "Count: 100, Price: 200";
Matcher matcher = pattern.matcher(text);

StringBuffer sb = new StringBuffer();

while (matcher.find()) {
    int num = Integer.parseInt(matcher.group());
    matcher.appendReplacement(sb, String.valueOf(num * 0.9));
}
matcher.appendTail(sb);

System.out.println(sb.toString());
// Output: Count: 90.0, Price: 180.0

Advanced Features

1. Named Capture Groups (Java 7+)

Pattern pattern = Pattern.compile(
    "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})"
);
Matcher matcher = pattern.matcher("2024-01-25");

if (matcher.find()) {
    System.out.println(matcher.group("year"));   // 2024
    System.out.println(matcher.group("month"));  // 01
    System.out.println(matcher.group("day"));     // 25
}
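Named groups can also be referenced outside of group(String): as ${name} in a replacement string and as \k<name> inside the pattern itself. A minimal sketch of both uses follows; the class NamedGroupDemo, its method names, and the repeated-word pattern are assumptions made for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    private static final Pattern ISO_DATE = Pattern.compile(
        "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // \k<word> must match the same text that the group "word" captured.
    private static final Pattern REPEATED_WORD = Pattern.compile(
        "\\b(?<word>\\w+)\\s+\\k<word>\\b");

    // ${name} in the replacement string inserts the text of a named group.
    static String toMonthFirst(String text) {
        return ISO_DATE.matcher(text).replaceAll("${month}/${day}/${year}");
    }

    // Returns the first word that appears twice in a row, if any.
    static Optional<String> firstRepeatedWord(String text) {
        Matcher matcher = REPEATED_WORD.matcher(text);
        return matcher.find()
            ? Optional.of(matcher.group("word"))
            : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(toMonthFirst("Released 2024-01-25"));
        // Released 01/25/2024
        System.out.println(firstRepeatedWord("this is is a test").orElse("none"));
        // is
    }
}
```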

2. Lookahead and Lookbehind

Positive Lookahead

// Match numbers followed by "元"
Pattern pattern = Pattern.compile("\\d+(?=元)");
Matcher matcher = pattern.matcher("Price: 100元, 200元");

while (matcher.find()) {
    System.out.println(matcher.group());
}
// Output: 100, 200

Negative Lookahead

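// Match numbers not followed by "元"
// The lookahead also rejects a following digit. With plain (?!元) the engine
// would backtrack and report "20" from "200元".
Pattern pattern = Pattern.compile("\\d+(?![\\d元])");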
Matcher matcher = pattern.matcher("Count: 100, Price: 200元");

while (matcher.find()) {
    System.out.println(matcher.group());
}
// Output: 100

Positive Lookbehind

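// Match numbers preceded by "Price: " (the space is part of the lookbehind,
// because the lookbehind must match the text immediately before the digits)
Pattern pattern = Pattern.compile("(?<=Price: )\\d+");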
Matcher matcher = pattern.matcher("Price: 100, Count: 200");

while (matcher.find()) {
    System.out.println(matcher.group());
}
// Output: 100
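The fourth lookaround form, negative lookbehind, is written (?<!...) and asserts that the text immediately before the match does not match the given expression. The sketch below finds numbers that are not preceded by "#". The class name and sample text are invented for the example, and the lookbehind excludes a preceding digit as well, so that a match cannot start in the middle of a number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegativeLookbehindDemo {
    // (?<![#\d]) : the previous character is neither '#' nor a digit.
    // With (?<!#) alone, "23" inside "#123" would match, because the
    // character before '2' is '1', not '#'.
    private static final Pattern NOT_AFTER_HASH =
        Pattern.compile("(?<![#\\d])\\d+");

    static List<String> numbersNotAfterHash(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = NOT_AFTER_HASH.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(numbersNotAfterHash("Order #123 costs 45"));
        // [45]
    }
}
```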

3. Boundary Matching

String text = "hello world hello";

// \b - Word boundary
Pattern wordBoundary = Pattern.compile("\\bhello\\b");
Matcher matcher1 = wordBoundary.matcher(text);
while (matcher1.find()) {
    System.out.println(matcher1.group());
}
// Output: hello, hello (two standalone words)

// \B - Non-word boundary
Pattern nonWordBoundary = Pattern.compile("\\Bhello\\B");
Matcher matcher2 = nonWordBoundary.matcher(text);
System.out.println(matcher2.find());  // false

4. Quantifiers and Greediness

String text = "<div>content1</div><div>content2</div>";

// Greedy match (default)
Pattern greedy = Pattern.compile("<div>.*</div>");
Matcher matcher1 = greedy.matcher(text);
if (matcher1.find()) {
    System.out.println("Greedy: " + matcher1.group());
    // Output: <div>content1</div><div>content2</div>
}

// Non-greedy match
Pattern lazy = Pattern.compile("<div>.*?</div>");
Matcher matcher2 = lazy.matcher(text);
if (matcher2.find()) {
    System.out.println("Non-greedy: " + matcher2.group());
    // Output: <div>content1</div>
}
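The difference becomes clearer when every match is collected and the inner content is captured in a group. The following sketch runs the greedy and the non-greedy variant over the same input; the class name LazyExtract and the boolean switch are assumptions for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyExtract {
    private static final Pattern GREEDY = Pattern.compile("<div>(.*)</div>");
    private static final Pattern LAZY = Pattern.compile("<div>(.*?)</div>");

    // Collects group 1 (the text between the tags) for every match.
    static List<String> contents(String html, boolean lazy) {
        Matcher matcher = (lazy ? LAZY : GREEDY).matcher(html);
        List<String> result = new ArrayList<>();
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<div>content1</div><div>content2</div>";

        // Greedy: one match that swallows the inner closing and opening tags.
        System.out.println(contents(html, false));
        // [content1</div><div>content2]

        // Non-greedy: one match per element.
        System.out.println(contents(html, true));
        // [content1, content2]
    }
}
```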

Practical Examples

Example 1: Validate Email Address

import java.util.regex.Pattern;

public class EmailValidator {
    private static final Pattern EMAIL_PATTERN = Pattern.compile(
        "^[\\w.-]+@[\\w.-]+\\.[a-zA-Z]{2,}$"
    );

    public static boolean isValid(String email) {
        if (email == null) {
            return false;
        }
        return EMAIL_PATTERN.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("user@example.com"));       // true
        System.out.println(isValid("invalid.email"));          // false
        System.out.println(isValid("user@domain"));            // false
    }
}

Example 2: Extract Web Links

import java.util.regex.*;
import java.util.ArrayList;
import java.util.List;

public class LinkExtractor {
    private static final Pattern LINK_PATTERN = Pattern.compile(
        "href=[\"']([^\"']+)[\"']"
    );

    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK_PATTERN.matcher(html);

        while (matcher.find()) {
            links.add(matcher.group(1));
        }

        return links;
    }

    public static void main(String[] args) {
        // Text blocks (""") require Java 15 or later
        String html = """
            <a href="https://example.com">Link 1</a>
            <a href="http://site.org/page">Link 2</a>
            <a href="/relative/path">Link 3</a>
            """;

        List<String> links = extractLinks(html);
        links.forEach(System.out::println);
    }
}

Example 3: Log Analysis

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogAnalyzer {
    private static final Pattern LOG_PATTERN = Pattern.compile(
        "(\\d{4}-\\d{2}-\\d{2}) (\\d{2}:\\d{2}:\\d{2}) \\[(\\w+)\\] (.+)"
    );

    public static void analyzeLog(String log) {
        Matcher matcher = LOG_PATTERN.matcher(log);

        if (matcher.matches()) {
            String date = matcher.group(1);
            String time = matcher.group(2);
            String level = matcher.group(3);
            String message = matcher.group(4);

            System.out.printf("Time: %s %s%n", date, time);
            System.out.printf("Level: %s%n", level);
            System.out.printf("Message: %s%n", message);
        }
    }

    public static void main(String[] args) {
        String log = "2024-01-25 10:30:45 [INFO] User login successful";
        analyzeLog(log);
    }
}

Example 4: Batch Replacement

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BatchReplacer {
    public static String replaceWithPosition(String text, String search) {
        Pattern pattern = Pattern.compile(search);
        Matcher matcher = pattern.matcher(text);
        StringBuffer result = new StringBuffer();

        while (matcher.find()) {
            String replacement = String.format("[%s@%d]",
                matcher.group(), matcher.start());
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);

        return result.toString();
    }

    public static void main(String[] args) {
        String text = "apple banana apple";
        System.out.println(replaceWithPosition(text, "apple"));
        // Output: [apple@0] banana [apple@13]
    }
}

Example 5: HTML Cleanup

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlCleaner {
    private static final Pattern TAG_PATTERN = Pattern.compile("<[^>]+>");

    public static String stripHtml(String html) {
        return TAG_PATTERN.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p>Hello <b>World</b>!</p>";
        System.out.println(stripHtml(html));
        // Output: Hello World!
    }
}

Best Practices

1. Pre-compile Patterns

// Good practice: Pre-compile
public class EmailValidator {
    private static final Pattern EMAIL_PATTERN = Pattern.compile(
        "^[\\w.-]+@[\\w.-]+\\.[a-zA-Z]{2,}$"
    );

    public static boolean isValid(String email) {
        return EMAIL_PATTERN.matcher(email).matches();
    }
}

// Bad practice: Compile every time
public static boolean isValid(String email) {
    Pattern pattern = Pattern.compile("^[\\w.-]+@[\\w.-]+\\.[a-zA-Z]{2,}$");
    return pattern.matcher(email).matches();
}

2. Use Static Constants

public class RegexPatterns {
    public static final Pattern EMAIL = Pattern.compile(
        "^[\\w.-]+@[\\w.-]+\\.[a-zA-Z]{2,}$"
    );
    public static final Pattern PHONE = Pattern.compile("^\\d{11}$");
    public static final Pattern DATE = Pattern.compile(
        "^(\\d{4})-(\\d{2})-(\\d{2})$"
    );
}

// Use
if (RegexPatterns.EMAIL.matcher(email).matches()) {
    // ...
}

3. Handle Null and Empty Values

public static boolean isValid(String input) {
    if (input == null || input.isEmpty()) {
        return false;
    }
    return PATTERN.matcher(input).matches();
}

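4. Use StringBuilder Instead of StringBuffer (Java 9+)

// Java 9+: appendReplacement() and appendTail() accept a StringBuilder.
// Earlier versions only provide the StringBuffer overloads.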
StringBuilder sb = new StringBuilder();
while (matcher.find()) {
    matcher.appendReplacement(sb, replacement);
}
matcher.appendTail(sb);
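As a complete, runnable version of the fragment above, the following sketch doubles every number in a string using the StringBuilder overloads. It assumes Java 9 or later; the class name BuilderReplace and the doubling rule are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BuilderReplace {
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    static String doubleNumbers(String text) {
        Matcher matcher = NUMBER.matcher(text);
        StringBuilder sb = new StringBuilder();
        while (matcher.find()) {
            int doubled = Integer.parseInt(matcher.group()) * 2;
            // Copies the text before the match, then the replacement.
            matcher.appendReplacement(sb, Integer.toString(doubled));
        }
        // Copies whatever follows the last match.
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("Count: 100, Price: 200"));
        // Count: 200, Price: 400
    }
}
```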

5. Exception Handling

try {
    Pattern pattern = Pattern.compile("[" + input + "]");
    // Use pattern
} catch (PatternSyntaxException e) {
    System.err.println("Invalid regex: " + e.getMessage());
}
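When a pattern comes from user input, one option is to wrap compilation in a helper that reports the problem and returns an empty result instead of propagating the exception. This is only one possible design; the class SafeCompile and the method tryCompile are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SafeCompile {
    // Returns the compiled pattern, or Optional.empty() if the regex is invalid.
    static Optional<Pattern> tryCompile(String regex) {
        try {
            return Optional.of(Pattern.compile(regex));
        } catch (PatternSyntaxException e) {
            // getDescription() explains the error; getIndex() is the
            // approximate position in the regex, or -1 if unknown.
            System.err.println("Invalid regex at index " + e.getIndex()
                + ": " + e.getDescription());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("\\d+").isPresent());  // true
        System.out.println(tryCompile("(abc").isPresent());  // false (unclosed group)
    }
}
```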

Performance Optimization

1. Avoid Repeated Compilation

// Good: Compile only once
Pattern pattern = Pattern.compile("\\d+");
for (String text : texts) {
    pattern.matcher(text).find();
}

// Bad: Compile every time
for (String text : texts) {
    Pattern.compile("\\d+").matcher(text).find();
}

2. Use More Specific Patterns

// Good: Specific pattern
Pattern good = Pattern.compile("\\d{3}-\\d{4}-\\d{4}");

// Bad: Broad pattern
Pattern bad = Pattern.compile(".+");

3. Use Non-Greedy Quantifiers

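// Good: Non-greedy, stops at the first closing tag
Pattern good = Pattern.compile("<div>.*?</div>");

// Bad: Greedy, runs to the end of the input and then backtracks to the last closing tag
Pattern bad = Pattern.compile("<div>.*</div>");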

Common Pitfalls

1. Forgetting to Escape Backslashes

// Error: No escape
Pattern p = Pattern.compile("\d+");  // Does not compile: \d is not a valid Java string escape

// Correct: Escape backslash
Pattern p = Pattern.compile("\\d+");

2. Calling group() Without a Successful Match

Matcher matcher = pattern.matcher(text);
// Error: No match has been attempted yet
System.out.println(matcher.group());  // IllegalStateException

// Correct: Check
if (matcher.find()) {
    System.out.println(matcher.group());
}
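The exception is thrown even when the input does contain a match, because group() reads the result of the previous match operation and none has run yet. A short sketch that shows both the failure and a guarded alternative follows; the class GroupGuard and its method names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupGuard {
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Guarded access: group() is only called after find() has succeeded.
    static String firstNumberOr(String text, String fallback) {
        Matcher matcher = NUMBER.matcher(text);
        return matcher.find() ? matcher.group() : fallback;
    }

    // Returns true if group() throws when called before any match attempt.
    static boolean groupThrowsBeforeFind(String text) {
        Matcher matcher = NUMBER.matcher(text);
        try {
            matcher.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(groupThrowsBeforeFind("a42"));     // true
        System.out.println(firstNumberOr("a42", "none"));     // 42
        System.out.println(firstNumberOr("abc", "none"));     // none
    }
}
```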

3. Forgetting to Reuse Pattern

// Error: Create new Pattern every time
for (int i = 0; i < 1000; i++) {
    Pattern p = Pattern.compile("\\d+");
    p.matcher(text).find();
}

// Correct: Reuse Pattern
Pattern p = Pattern.compile("\\d+");
for (int i = 0; i < 1000; i++) {
    p.matcher(text).find();
}

Summary

Java's Pattern and Matcher classes provide powerful and flexible regex support:

  • Pattern: Compiles regular expressions and provides static helpers such as matches() and quote()
  • Matcher: Performs matching operations, provides rich query and replacement features
  • Advanced features: Named capture groups, lookahead/lookbehind, boundary matching, etc.
  • Best practices: Pre-compile, use constants, handle exceptions properly

Mastering these techniques allows you to efficiently handle various text processing tasks, from simple validation to complex parsing.

Use our online Regex Tester to practice Java regex patterns!


About the Author

The Regex Master Team consists of experienced developers and technical writers dedicated to simplifying regular expressions for everyone. We ensure all patterns are rigorously tested and verified to provide accurate, production-ready solutions.
