Python regex: replace all digits unless they are part of a substring


I want to remove all digits unless they are part of one of the special substrings. In the example below, the special substrings for which digit removal should be skipped are 1s, 2s, s4, and 3s. I guess I need to use a negative lookahead:

import re
s = "a61s8sa92s3s3as4s4af3s"
pattern = r"(?!1s|2s|s4|3s)[0-9\.]"
re.sub(pattern, ' ', s)

As I understand it, the pattern above says:

  • Match any digit or decimal point (the [...] character class at the end).
  • Only do so if the text at that position does not match the pattern inside the negative lookahead (?!...).
  • The excluded patterns are 1s, 2s, s4, or 3s (| = OR).

Everything makes sense until you try it. The example s above returns a 1s sa 2s3s as s af3s, which suggests that every exclusion pattern works unless the digit is at the end of the special substring, in which case it still gets matched?!

I believe this operation should return a 1s sa 2s3s as4s4af3s. How do I fix my pattern?
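
One way to see why s4 is not protected: the lookahead is tested at the position of the digit itself, so at the 4 in s4 it looks at 4s, which is not among the alternatives. The following is only an illustrative sketch; the names lookahead and report are not from the original post.

```python
import re

s = "a61s8sa92s3s3as4s4af3s"
lookahead = re.compile(r"(?!1s|2s|s4|3s)[0-9.]")

# For each digit, record the two characters the lookahead is tested
# against and whether the pattern would match (i.e. replace) that digit.
report = []
for i, ch in enumerate(s):
    if ch.isdigit():
        replaced = lookahead.match(s, i) is not None
        report.append((i, s[i:i + 2], replaced))

for pos, seen, replaced in report:
    print(pos, seen, "replaced" if replaced else "kept")
```

The digits in 1s, 2s, and 3s are kept because the substring starts at the digit, while the 4 in s4 is replaced because the lookahead cannot see the s before it.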

Solution

You can use

import re
s = "a61s8sa92s3s3as4s4af3s"
pattern = r"(1s|2s|s4|3s)|[\d.]"
print(re.sub(pattern, lambda x: x.group(1) or ' ', s))
# => a 1s sa 2s3s as4s4af3s


Details:

  • (1s|2s|s4|3s) – Group 1: 1s, 2s, s4, or 3s
  • | – or
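  • [\d.] – A digit or a dot.

The regex engine scans from left to right and tries Group 1 first, so a special substring is consumed as a whole before its digit can be matched on its own. If Group 1 matches, the match is replaced with the Group 1 value (the substring is kept as is); otherwise, the match is replaced with a space.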
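
If the list of protected substrings changes, the same idea can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name remove_digits_except and its parameters are hypothetical, and it assumes keep is a non-empty list of strings.

```python
import re

def remove_digits_except(text, keep, replacement=" "):
    # Assumption: keep is non-empty. Longer substrings go first so that a
    # longer protected substring wins over a shorter one that is its prefix.
    ordered = sorted(keep, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in ordered)
    pattern = re.compile(rf"({alternatives})|[\d.]")
    # If group 1 matched, put it back unchanged; otherwise use the replacement.
    return pattern.sub(lambda m: m.group(1) or replacement, text)

result = remove_digits_except("a61s8sa92s3s3as4s4af3s", ["1s", "2s", "s4", "3s"])
print(result)
```

re.escape is used so that substrings containing regex metacharacters (such as a dot) are matched literally.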
