Python regex: excluding a specific number from matches

I want to use a regular expression to extract every number with more than 2 digits from a string, while excluding “2016”. This is what I have:

import re

string = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 "

print(re.findall(r'\d{3,}', string))

Output:

['856', '2016', '112']

I tried changing it to each of the following to exclude “2016”, but all three attempts failed.

print(re.findall(r'\d{3,}/^(!2016)/', string))
print(re.findall(r"\d{3,}/?! 2016/", string))
print(re.findall(r"\d{3,}!' 2016'", string))

What is the right thing to do? Thank you.

The question was later extended; see Wiktor Stribiżew’s final comment for the update.

Solution

You can use

import re
s = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 20161 12016 120162"
print(re.findall(r'(?<!\d)(?!2016(?!\d))\d{3,}', s))

See the Python demo and the regex demo.

Details

  • (?<!\d) – no digit is allowed immediately to the left of the current position
  • (?!2016(?!\d)) – the text immediately to the right of the current position must not be 2016, unless that 2016 is itself followed by another digit (so the standalone number 2016 is rejected, while 20161 is still allowed)
  • \d{3,} – 3 or more digits.
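To see what each lookaround contributes, it can help to remove them one at a time and compare the results. The following is only an illustrative sketch: the variable names (plain, no_lookbehind, no_inner, full) are made up for this comparison and are not part of the original answer.

```python
import re

s = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 20161 12016 120162"

# No lookarounds: every run of 3+ digits, including the standalone 2016.
plain = re.findall(r'\d{3,}', s)

# Lookahead only: a match cannot start at "2016", but without the
# lookbehind it can start one digit later and return the tail "016".
no_lookbehind = re.findall(r'(?!2016(?!\d))\d{3,}', s)

# Lookahead without the inner (?!\d): anything that merely starts
# with 2016 is rejected, so 20161 is lost as well.
no_inner = re.findall(r'(?<!\d)(?!2016)\d{3,}', s)

# Full pattern from the answer: only the standalone 2016 is excluded.
full = re.findall(r'(?<!\d)(?!2016(?!\d))\d{3,}', s)

print(plain)          # ['856', '2016', '112', '20161', '12016', '120162']
print(no_lookbehind)  # ['856', '016', '112', '20161', '12016', '120162']
print(no_inner)       # ['856', '112', '12016', '120162']
print(full)           # ['856', '112', '20161', '12016', '120162']
```

The lookbehind stops the regex engine from matching the tail of a rejected number, and the inner lookahead keeps longer numbers that only begin with 2016.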

An alternative solution that filters in Python code instead of in the regex:

import re
s = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 20161 12016 120162"
print([x for x in re.findall(r'\d{3,}', s) if x != "2016"])

Here, we extract every block of 3 or more digits (re.findall(r'\d{3,}', s)) and then filter out those equal to 2016.
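If more than one value has to be excluded, the same extract-then-filter idea could be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name extract_numbers and its parameters exclude and min_digits are hypothetical.

```python
import re

def extract_numbers(text, exclude=("2016",), min_digits=3):
    """Return digit runs with at least `min_digits` digits, skipping excluded values.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    excluded = set(exclude)
    # Build e.g. r'\d{3,}' from the requested minimum length.
    pattern = r'\d{%d,}' % min_digits
    return [number for number in re.findall(pattern, text) if number not in excluded]

s = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 20161 12016 120162"

result = extract_numbers(s)
print(result)  # ['856', '112', '20161', '12016', '120162']

# Excluding several values at once.
print(extract_numbers(s, exclude=("2016", "112")))  # ['856', '20161', '12016', '120162']
```

Because \d{3,} is greedy, each match is a whole run of digits, so comparing it with the excluded strings only removes exact matches such as 2016 and leaves 20161 or 12016 alone.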
