Python regular expressions: matching numbers while excluding a specific one
I want to use a regular expression to extract every number with more than 2 digits from a string, while excluding “2016”. This is what I have:
import re
string = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 "
print(re.findall(r'\d{3,}', string))
Output:
['856', '2016', '112']
I tried changing it to each of the following to exclude “2016”, but all three failed.
print(re.findall(r'\d{3,}/^(!2016)/', string))
print(re.findall(r"\d{3,}/?! 2016/", string))
print(re.findall(r"\d{3,}!' 2016'", string))
What is the right way to do this? Thank you.
The question has since been extended; see Wiktor Stribiżew’s final comment on the update.
Solution
You can use:
import re
s = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 20161 12016 120162"
print(re.findall(r'(?<!\d)(?!2016(?!\d))\d{3,}', s))
Details
- (?<!\d) – no digit is allowed immediately to the left of the current position
- (?!2016(?!\d)) – the current position must not be immediately followed by 2016 that is itself not followed by another digit
- \d{3,} – 3 or more digits.
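To see what each lookaround contributes, the following minimal sketch compares the full pattern with two weakened variants. The sample string and the variable names are made up for illustration and are not part of the original answer.

```python
import re

# Illustrative sample: one standalone 2016 and one longer number that starts with 2016.
sample = "Year 2016, ID 20161"

# Full pattern: skips only a standalone 2016.
full = re.findall(r'(?<!\d)(?!2016(?!\d))\d{3,}', sample)

# Without the lookbehind, a match may start inside 2016 and returns its tail.
no_lookbehind = re.findall(r'(?!2016(?!\d))\d{3,}', sample)

# Without the inner lookahead, any number starting with 2016 is rejected too.
no_inner_lookahead = re.findall(r'(?<!\d)(?!2016)\d{3,}', sample)

print(full)                # ['20161']
print(no_lookbehind)       # ['016', '20161']
print(no_inner_lookahead)  # []
```

The lookbehind keeps a match from starting in the middle of a digit run, and the inner lookahead limits the exclusion to 2016 as a whole number.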
An alternative solution that filters the matches in Python code:
import re
s = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 20161 12016 120162"
print([x for x in re.findall(r'\d{3,}', s) if x != "2016"])
Here, re.findall(r'\d{3,}', s) extracts every run of 3 or more digits, and the list comprehension then filters out the matches equal to 2016.
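If more than one value may need to be excluded, the same filtering idea can be wrapped in a small function. This is only a sketch: the function name filter_numbers and its parameters are hypothetical and do not come from the original answer.

```python
import re

def filter_numbers(text, excluded=frozenset({"2016"}), min_digits=3):
    """Return digit runs of at least min_digits digits that are not in excluded.

    Hypothetical helper for illustration; excluded holds the numbers as strings.
    """
    # Greedy \d{n,} always consumes a whole run of digits, so no lookarounds are needed.
    runs = re.findall(r'\d{%d,}' % min_digits, text)
    return [x for x in runs if x not in excluded]

s = "Employee ID DF856, Year 2016, Department Finance, Team 2, Location 112 20161 12016 120162"
result = filter_numbers(s)
print(result)  # ['856', '112', '20161', '12016', '120162']
```

Filtering after extraction keeps the pattern simple, at the cost of building the full match list first.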