Groups and Capturing

Learn regular expressions by seeing patterns, test cases, and match results side-by-side

Overview

Extract and organize matched data using capturing groups

01 Basic Capture

Parentheses () create capturing groups that extract matched portions.

Pattern breakdown:

  • (\d{3}) - captures exactly 3 digits (Group 1)
  • - - literal hyphen
  • (\d{4}) - captures exactly 4 digits (Group 2)

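This extracts seven-digit phone numbers and splits each one into its three-digit prefix (Group 1) and four-digit line number (Group 2). The pattern is not anchored, so it also matches inside longer digit runs: in 5555-12345 it finds 555-1234.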

Source: Python Regex HOWTO - Grouping

Pattern: (\d{3})-(\d{4})
Test Input
Call me at 555-1234
Another number: 867-5309
Invalid: 12-345
Match Results
Line 1: "Call me at 555-1234"
Match 1: "555-1234"
Groups: 1: "555", 2: "1234"
Line 2: "Another number: 867-5309"
Match 1: "867-5309"
Groups: 1: "867", 2: "5309"
Line 3: "Invalid: 12-345" - No match
All matches found: 2
Matches: [('555', '1234'), ('867', '5309')]
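The results above can be reproduced with Python's re module. This is a minimal sketch: the names PHONE, lines, and matches are chosen for illustration and do not come from the original page.

```python
import re

# Pattern from this section: two capturing groups separated by a literal hyphen.
PHONE = re.compile(r"(\d{3})-(\d{4})")

lines = [
    "Call me at 555-1234",
    "Another number: 867-5309",
    "Invalid: 12-345",
]

for line in lines:
    m = PHONE.search(line)
    if m:
        # group(0) is the whole match; group(1) and group(2) are the captures.
        print(m.group(0), m.group(1), m.group(2))
    else:
        print("No match")

# With two capturing groups, findall returns one tuple of groups per match.
matches = PHONE.findall("\n".join(lines))
print(matches)
```

Because the pattern contains more than one capturing group, findall returns tuples of groups and not the full matched text, which is why the "Matches" line shows pairs.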

02 Named Groups

Named groups (?P<name>...) let you reference captures by name as well as by number.

Pattern breakdown:

  • (?P<year>\d{4}) - 4 digits named "year"
  • - - literal hyphen
  • (?P<month>\d{2}) - 2 digits named "month"
  • - - literal hyphen
  • (?P<day>\d{2}) - 2 digits named "day"

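Code that looks up a capture by name, such as "year", stays readable and keeps working if groups are later added or reordered, which is not true of code that relies on group numbers.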

Source: Python re Syntax - Named Groups

Pattern: (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})
Test Input
Today is 2024-01-15
Another date: 2023-12-25
Invalid: 24-1-5
Match Results
Line 1: "Today is 2024-01-15"
Match 1: "2024-01-15"
Groups: 1: "2024", 2: "01", 3: "15"
Named: year: "2024", month: "01", day: "15"
Line 2: "Another date: 2023-12-25"
Match 1: "2023-12-25"
Groups: 1: "2023", 2: "12", 3: "25"
Named: year: "2023", month: "12", day: "25"
Line 3: "Invalid: 24-1-5" - No match
All matches found: 2
Matches: [('2024', '01', '15'), ('2023', '12', '25')]
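As a rough illustration of how the named captures above can be read in Python, the sketch below uses the same pattern. The variable names (DATE, m, by_name, by_number) are illustrative assumptions.

```python
import re

# Pattern from this section: three named groups separated by literal hyphens.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

m = DATE.search("Today is 2024-01-15")

# Named groups are still numbered, so both access styles work.
by_name = m.groupdict()   # mapping of group name to captured text
by_number = m.groups()    # tuple of captures in group order
year = m.group("year")    # same value as m.group(1)

print(by_name)
print(by_number)
print(year)

# No run of four digits followed by a hyphen exists here, so search returns None.
no_match = DATE.search("Invalid: 24-1-5")
print(no_match)
```

groupdict() is convenient when the captures feed directly into other code, for example as keyword arguments, because the keys are the group names.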

03 Non-Capturing

Non-capturing groups (?:...) group patterns without creating a capture.

Pattern breakdown:

  • (?:Mr|Ms|Mrs) - match title but don't capture it
  • \. - literal period
  • (space) - literal space between the title and the name
  • (\w+) - capture the last name (Group 1)

Use non-capturing groups when you need grouping for alternation or quantifiers but don't need to extract that part.

Source: Regular-Expressions.info - Non-Capturing Groups

Pattern: (?:Mr|Ms|Mrs)\. (\w+)
Test Input
Mr. Smith
Ms. Johnson
Mrs. Williams
Dr. Brown
Match Results
Line 1: "Mr. Smith"
Match 1: "Mr. Smith"
Groups: 1: "Smith"
Line 2: "Ms. Johnson"
Match 1: "Ms. Johnson"
Groups: 1: "Johnson"
Line 3: "Mrs. Williams"
Match 1: "Mrs. Williams"
Groups: 1: "Williams"
Line 4: "Dr. Brown" - No match
All matches found: 3
Matches: ['Smith', 'Johnson', 'Williams']
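To see what the non-capturing group changes, the sketch below runs the pattern from this section next to a variant that captures the title. The capturing variant and all variable names are assumptions added for comparison and are not part of the original page.

```python
import re

text = "Mr. Smith\nMs. Johnson\nMrs. Williams\nDr. Brown"

# Pattern from this section: the title is grouped for alternation but not captured.
TITLE = re.compile(r"(?:Mr|Ms|Mrs)\. (\w+)")

# Hypothetical variant for comparison: the title is captured as Group 1.
CAPTURING = re.compile(r"(Mr|Ms|Mrs)\. (\w+)")

# One capturing group: findall returns a flat list of strings.
names = TITLE.findall(text)
print(names)

# Two capturing groups: findall returns a list of tuples.
pairs = CAPTURING.findall(text)
print(pairs)

# The compiled pattern reports how many capturing groups it contains.
print(TITLE.groups, CAPTURING.groups)
```

With only one capturing group, findall returns plain strings, which matches the "Matches" line above. Capturing the title as well shifts the last name to Group 2 and changes the result to tuples.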