Python: Extract numbers from strings such as dates, fractions, percentages, etc.
I want to recognize all types of numbers in a string.
Example:
a = 'I 0.34 -345 3/4 3% want to get -0.34 2018-09 all numbers'
Result:
['I', '_num', '_num', '_num', '_num', 'want', 'to', 'get', '_num', '_num', 'all', 'numbers']
This is for an NLP project, and I don't know whether there is a better way to get this result.
I could list every number format and write a regular expression for each, but that is not concise. Does anyone have a better idea?
Solution
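A list comprehension keeps this concise: split the string on spaces and replace every token that contains a digit.
import re

a = 'I 0.34 -345 3/4 3% want to get -0.34 2018-09 all numbers'
pattern = re.compile(r'\d')  # matches any single digit
# Split on runs of spaces; a token with at least one digit becomes '_num'.
result = ['_num' if pattern.search(word) else word for word in re.split(' +', a)]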
Splitting on ' +' handles runs of several spaces between words. If the input only ever has single spaces between words,
you can split on a single space without a regular expression:
pattern = re.compile(r'\d')
result = ['_num' if pattern.search(word) else word for word in a.split(' ')]
Result:
['I', '_num', '_num', '_num', '_num', 'want', 'to', 'get', '_num', '_num', 'all', 'numbers']
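Note that searching for any digit also masks tokens that merely contain one, such as 'b12'. If that matters, a stricter pattern can be matched against the whole token. The following is only a sketch under assumptions: the name mask_numbers, the NUMBER_RE pattern, and the set of separators it accepts ('.', ',', '/', ':', '-') are illustrative choices and are not part of the answer above.

```python
import re

# Hypothetical stricter pattern (an assumption, not from the original answer):
# optional sign, digits, then any number of (separator + digits) groups,
# then an optional percent sign.
NUMBER_RE = re.compile(r'[+-]?\d+(?:[.,/:-]\d+)*%?')

def mask_numbers(text, placeholder='_num'):
    """Replace each whitespace-separated token that is entirely numeric."""
    # str.split() with no argument splits on any run of whitespace.
    return [placeholder if NUMBER_RE.fullmatch(tok) else tok
            for tok in text.split()]

a = 'I 0.34 -345 3/4 3% want to get -0.34 2018-09 all numbers'
print(mask_numbers(a))
# ['I', '_num', '_num', '_num', '_num', 'want', 'to', 'get', '_num', '_num', 'all', 'numbers']
print(mask_numbers('room b12 costs 40%'))
# ['room', 'b12', 'costs', '_num']
```

This version is stricter by design, so it leaves tokens with trailing punctuation (for example '3.' at the end of a sentence) or a leading decimal point (for example '.5') unmasked. Whether that trade-off is acceptable depends on the text being processed.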