Python – reshape the list of strings as rows

I have a pandas DataFrame like this:

import pandas

df = pandas.DataFrame({
        'Grouping': ["A", "B", "C"],
        'Elements': ['[\"A1\"]', '[\"B1\", \"B2\", \"B3\"]', '[\"C1\", \"C2\"]']
    }).set_index('Grouping')

So it looks like this:

            Elements
Grouping
===============================
A           ["A1"]
B           ["B1", "B2", "B3"]
C           ["C1", "C2"]

That is, each cell holds a list encoded as a string. What is a clean way to reshape it into a tidy dataset like this:

            Elements
Grouping
====================
A           A1
B           B1
B           B2
B           B3
C           C1
C           C2

Is there a way to do this without resorting to for loops? The best I can come up with is:

df1 = pandas.DataFrame()
for index, row in df.iterrows():
    df_temp = pandas.DataFrame({'Elements': row['Elements'].replace("[\"", "").replace("\"]", "").split('\", \"')})
    df_temp['Grouping'] = index
    df1 = pandas.concat([df1, df_temp])
df1.set_index('Grouping', inplace=True)

But it’s ugly.

Solution

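You can use .str.extractall(), which returns one row per regex match. The pattern "(.+?)" captures each quoted item non-greedily; the extra "match" index level that extractall() adds is then dropped, and the capture column 0 is renamed: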

df.Elements.str.extractall(r'"(.+?)"').reset_index(level="match", drop=True).rename({0: "Elements"}, axis=1)

Result:

         Elements
Grouping         
A              A1
B              B1
B              B2
B              B3
C              C1
C              C2
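
An alternative worth considering, not part of the original answer: since the strings in this example happen to be valid JSON arrays, they can be parsed into real lists and then expanded with Series.explode(). This is only a sketch under that assumption. It needs a pandas version that has explode() (0.25 or later), and json.loads will raise an error on cells that are not valid JSON. The name exploded is chosen here for illustration.

```python
import json

import pandas as pd

df = pd.DataFrame({
    "Grouping": ["A", "B", "C"],
    "Elements": ['["A1"]', '["B1", "B2", "B3"]', '["C1", "C2"]'],
}).set_index("Grouping")

# Parse each JSON-encoded string into a Python list,
# then emit one row per list item; the index value is repeated.
exploded = df["Elements"].apply(json.loads).explode().to_frame()

print(exploded)
```

Compared with the regex approach, this relies on the cells being well-formed JSON rather than on the quoting pattern, so it may be the safer choice when items can themselves contain escaped quotes.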
