Python – Find multiple strings between characters

I have a long string with data like this:

category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;

I want to create a list from it that looks like this:

new_list = [33,54,60]

Basically, I just need the values between category: and ; in the string, keeping their original order.

I have something that looks like it’s working, but I assume there are cases where it will raise an exception. I’m new to Python and don’t really know what options are available.

This is my current version:

s = "category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA; "
c = s.count("category")
z = 0
number_list = []
for x in range(z,c):
    val = s.split('category:')[x+1]
    number = val.split(' ; ')[0]
    print (number)
    number_list.append(number.strip())

print ("All Values:", number_list)

Solution

Simply construct a regular expression:

import re

rgx = re.compile(r'category:\s*(\d+)\s*; ')
number_list = rgx.findall('category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA; ')

This gives:

>>> rgx.findall('category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA; ')
['33', '54', '60']
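
To make the pattern easier to read, here is a sketch of the same regular expression written with re.VERBOSE so each part can carry a comment. The variable names s and matches are only illustrative. In verbose mode, whitespace inside the pattern is ignored, so the literal space after the semicolon is written as [ ].

```python
import re

# Same pattern as above, spelled out with re.VERBOSE.
# Whitespace in a verbose pattern is ignored, so the literal
# space after ';' has to be written as a character class: [ ]
rgx = re.compile(r"""
    category:   # the literal label
    \s*         # optional whitespace
    (\d+)       # capture group: one or more digits
    \s*         # optional whitespace
    ;[ ]        # a semicolon followed by one space
""", re.VERBOSE)

s = "category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA; "

# findall returns only the captured group for each match, in order.
matches = rgx.findall(s)
print(matches)
```

Because the pattern has exactly one capture group, findall returns a list of the captured digit strings rather than the full matches.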

If you want the results as int values, you can use map:

import re

rgx = re.compile(r'category:\s*(\d+)\s*; ')
number_list = list(map(int, rgx.findall('category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA; ')))

This produces:

>>> number_list
[33, 54, 60]
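
As a possible variation, the conversion can be wrapped in a small function that uses a list comprehension instead of map. This is only a sketch: the names CATEGORY_RE and extract_categories are made up for illustration, and the pattern here drops the trailing space after the semicolon, on the assumption that input without that space should still match.

```python
import re

# Hypothetical names, not from the original answer.
# Unlike the pattern above, this one does not require a space after ';'.
CATEGORY_RE = re.compile(r'category:\s*(\d+)\s*;')

def extract_categories(text):
    """Return every category number in text as an int, in order of appearance."""
    return [int(m) for m in CATEGORY_RE.findall(text)]

s = "category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;"
print(extract_categories(s))

# With no matches, findall returns an empty list, so no exception is raised.
print(extract_categories("no categories here"))
```

Unlike the split-based loop in the question, this does not fail when the string contains no category entries; it returns an empty list.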
