Regex match 8-digit phone with optional hyphens between each digit

Question

I want to extract 8-digit numbers from a paragraph. A number may contain a single optional hyphen between any two adjacent digits, and it MUST start with 6 or 7, so the following should all match:

71234567
6-1234567
7-123-4567
61-23-45-67
7-1-2-3-4-5-6-7
...

I'd like to extract only the digits, so when matching 7-1-2-3-4-5-6-7, it returns only 71234567.

I tried to hardcode it like this:

[\b\D]([67]-?\d-?\d-?\d-?\d-?\d-?\d-?\d)[\b\D]

and then remove the hyphens manually afterwards, but it doesn't work.

Wiktor Stribiżew · Accepted Answer

You can't omit chars from a matched substring. You need to postprocess your matches.

Also, note that [\b\D] matches a backspace char or a non-digit char. [\b] does not match a word boundary.
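To make that concrete, here is a small sketch of how the pattern from the question behaves. The sample strings are made up for illustration. Because [\b\D] has to consume a real character on each side, a number at the very start or end of the string is missed, and so is a number separated from the previous match by a single space.

```python
import re

# Inside a character class, \b is the backspace character (\x08), not a boundary.
backspace_hits = re.findall(r'[\b]', 'a\x08b')
print(backspace_hits)  # the single backspace character is found

# The pattern from the question: it must consume one char before and after the number.
original = r'[\b\D]([67]-?\d-?\d-?\d-?\d-?\d-?\d-?\d)[\b\D]'

at_edges = re.findall(original, '71234567')              # no char before or after
padded = re.findall(original, ' 71234567 ')              # a space on both sides
neighbours = re.findall(original, ' 71234567 61234567 ')  # the shared space is consumed

print(at_edges)    # []
print(padded)      # ['71234567']
print(neighbours)  # ['71234567'], the second number is skipped
```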

You can use

numbers = [x.replace('-', '') for x in re.findall(r'\b[67](?:-?\d){7}\b', data)]
# or, if the number can be glued to a letter or underscore
numbers = [x.replace('-', '') for x in re.findall(r'(?<!\d)[67](?:-?\d){7}(?!\d)', data)]

See the regex demo. Details:

\b - a word boundary
(?<!\d) - a negative lookbehind that fails the match if there is a digit immediately to the left of the current location
[67] - 6 or 7
(?:-?\d){7} - seven occurrences of an optional - followed by a digit
(?!\d) - a negative lookahead that fails the match if there is a digit immediately to the right of the current location.
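The difference between the two variants shows up when a number touches a letter or an underscore. A quick comparison, assuming a made-up sample string (the names WORD_BOUNDARY and DIGIT_GUARD are only for this sketch):

```python
import re

WORD_BOUNDARY = re.compile(r'\b[67](?:-?\d){7}\b')
DIGIT_GUARD = re.compile(r'(?<!\d)[67](?:-?\d){7}(?!\d)')

# Numbers glued to letters/underscore, a 9-digit run, and one clean number.
sample = 'tel71234567 id_6-1234567 7-123-4567x 712345678 61-23-45-67'

with_boundary = [x.replace('-', '') for x in WORD_BOUNDARY.findall(sample)]
with_lookarounds = [x.replace('-', '') for x in DIGIT_GUARD.findall(sample)]

print(with_boundary)     # ['61234567']
print(with_lookarounds)  # ['71234567', '61234567', '71234567', '61234567']
```

Both variants reject the 9-digit run 712345678. Only the lookaround variant accepts numbers that sit next to a letter or underscore, so pick whichever matches your data.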

See the Python demo:

import re
data = '71234567 6-1234567 7-123-4567  61-23-45-67  7-1-2-3-4-5-6-7'
print([x.replace('-', '') for x in re.findall(r'\b[67](?:-?\d){7}\b', data)])
# => ['71234567', '61234567', '71234567', '61234567', '71234567']
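If you need this in more than one place, the match-then-strip step can be wrapped in a small helper. This is only a sketch: extract_numbers and PHONE_RE are hypothetical names, and it uses the lookaround variant. The second call checks a few inputs that should not match (wrong first digit, 9 digits, a double hyphen, only 7 digits).

```python
import re

# Hypothetical helper built on the lookaround variant of the pattern.
PHONE_RE = re.compile(r'(?<!\d)[67](?:-?\d){7}(?!\d)')

def extract_numbers(text):
    """Return every 8-digit number starting with 6 or 7, with hyphens removed."""
    return [m.replace('-', '') for m in PHONE_RE.findall(text)]

print(extract_numbers('call 7-123-4567 or 61-23-45-67'))
# => ['71234567', '61234567']
print(extract_numbers('81234567 712345678 7--1234567 7123456'))
# => []
```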

Tags:

regex

python-3.x

GusSL
