 

Regex match 8-digit phone with optional hyphens between each digit

I want to extract 8-digit numbers from a paragraph. A number may contain an optional hyphen between any two adjacent digits and MUST start with 6 or 7, so the following should all match:

71234567
6-1234567
7-123-4567
61-23-45-67
7-1-2-3-4-5-6-7
...

I'd like to extract only the digits, so when matching 7-1-2-3-4-5-6-7, it returns only 71234567.

I tried to hardcode it like this:

[\b\D]([67]-?\d-?\d-?\d-?\d-?\d-?\d-?\d)[\b\D]

and then remove the hyphens manually afterwards, but it doesn't work.

GusSL asked Dec 21 '25 18:12



1 Answer

You can't omit chars from a matched substring. You need to postprocess your matches.

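Also, note that [\b\D] matches a backspace char or a non-digit char: inside a character class, \b means a backspace, so [\b] does not match a word boundary. Besides, a character class always consumes one char, so your pattern requires an actual char on both sides of the number.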
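Here is a small sketch of that point, using only the standard re module. The sample string is made up for illustration. It shows that [\b] matches a backspace char, and gives one likely reason the pattern from the question finds nothing here: the bracket expressions on both ends need a real char to consume, which is missing at the start and at the end of the string.

```python
import re

# Inside a character class, \b is the backspace character (U+0008).
backspace_match = re.fullmatch(r'[\b]', '\x08')
print(backspace_match is not None)  # True

# Outside a character class, \b is a zero-width word boundary.
words = re.findall(r'\bcat\b', 'cat concat cat.')
print(words)  # ['cat', 'cat']

# The pattern from the question: [\b\D] must consume one char on each side.
pattern = r'[\b\D]([67]-?\d-?\d-?\d-?\d-?\d-?\d-?\d)[\b\D]'
sample = '71234567 6-1234567'  # illustrative input, not from the question
found = re.findall(pattern, sample)
print(found)  # []
```

The first number has no char before it and the second one has no char after it, so neither is matched.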

You can use

numbers = [x.replace('-', '') for x in re.findall(r'\b[67](?:-?\d){7}\b', data)]
# or, if the number can be glued to a letter or underscore
numbers = [x.replace('-', '') for x in re.findall(r'(?<!\d)[67](?:-?\d){7}(?!\d)', data)]

See the regex demo. Details:

  • \b - a word boundary
  • (?<!\d) - a negative lookbehind that fails the match if there is a digit immediately to the left of the current location
  • [67] - 6 or 7
  • (?:-?\d){7} - seven occurrences of an optional - followed by a digit
  • (?!\d) - a negative lookahead that fails the match if there is a digit immediately to the right of the current location.
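To see where the two variants differ, here is a minimal comparison on a made-up string in which every number is glued to a letter or an underscore. The variable names are only for illustration.

```python
import re

# Illustrative input: each number touches a letter or an underscore.
data = 'tel71234567 id_6-1234567 x7-123-4567y'

word_boundary = r'\b[67](?:-?\d){7}\b'
digit_guard = r'(?<!\d)[67](?:-?\d){7}(?!\d)'

# \b fails here because letters, digits and _ are all word chars,
# so there is no word boundary between them.
with_boundary = [x.replace('-', '') for x in re.findall(word_boundary, data)]
print(with_boundary)  # []

# The lookarounds only forbid an adjacent digit, so these still match.
with_guard = [x.replace('-', '') for x in re.findall(digit_guard, data)]
print(with_guard)  # ['71234567', '61234567', '71234567']
```

Which one to pick depends on whether a number attached to a word should count as a phone number in your data.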

See the Python demo:

import re
data = '71234567 6-1234567 7-123-4567  61-23-45-67  7-1-2-3-4-5-6-7'
print([x.replace('-', '') for x in re.findall(r'\b[67](?:-?\d){7}\b', data)])
# => ['71234567', '61234567', '71234567', '61234567', '71234567']
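If you need this in several places, the same idea can be wrapped in a helper. This is just a sketch: the names PHONE_RE and extract_phone_numbers are hypothetical, and the test string is invented to show a few inputs that should be rejected (wrong first digit, nine digits, two hyphens in a row).

```python
import re

# Hypothetical helper built on the lookaround variant of the pattern.
PHONE_RE = re.compile(r'(?<!\d)[67](?:-?\d){7}(?!\d)')

def extract_phone_numbers(text):
    """Return all 8-digit numbers starting with 6 or 7, hyphens removed."""
    return [m.replace('-', '') for m in PHONE_RE.findall(text)]

samples = ('ok: 7-1-2-3-4-5-6-7; bad start: 51234567; '
           'too long: 712345678; double hyphen: 7--1234567')
result = extract_phone_numbers(samples)
print(result)  # ['71234567']
```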
Wiktor Stribiżew answered Dec 24 '25 10:12
