Python – Write a list to a pandas dataframe to csv, read a dataframe from csv and convert to a list again without strings

Write a list to a pandas dataframe to csv, read a dataframe from csv and convert to a list again without strings… here is a solution to the problem.

Write a list to a pandas dataframe to csv, read a dataframe from csv and convert to a list again without strings

Initially I had a list of lists, each containing string tuples (from some calculations). I want to save them for later use so I don’t have to do all the calculations again and just read the csv.

 L = [l1,l2,...]
 l1 = [('a','b'), ('c','d'),...]
 l2 = [('e','f'), ('g','h'),...] ...

I converted it to a pandas data frame:

 import pandas as pd
 df = pd. DataFrame(L)
 df.to_csv('MyLists.csv', sep="; ")

So each list l is saved as a row in csv.
After a while I want to use the list saved in csv again.
So I imported pandas again and did :

readdf = pd.read_csv('MyLists.csv', delimiter = ";" )
newList = readdf.values.tolist()

The problem is that now each tuple itself is a string, i.e. each list in the newList looks like this:

l1 = ['('a','b')', '('c', 'd')',...]

When I look at csv with a text editor, it looks correct, kind of like :

('a','b');('c','d'); ... 

I tried to read it directly :

import csv

newList = []
with open('MyLists.csv') as f:    
    reader = csv.reader(f, delimiter="; ")
    for row in reader:
        newList.append(row)

But the problem is the same.
So how can I get rid of the excess “‘”?

Solution

I think you need to convert string to tuples because the data in csv is string:

import ast

l1 = [('a','b'), ('c','d')]
l2 = [('e','f'), ('g','h')]
L = [l1,l2]

df = pd. DataFrame(L)
print (df)
        0       1
0  (a, b)  (c, d)
1  (e, f)  (g, h)

df.to_csv('MyLists.csv', sep="; ")

readdf = pd.read_csv('MyLists.csv', delimiter = ";" , index_col=0)
newList = readdf.applymap(ast.literal_eval).values.tolist()
print (newList)
[[('a', 'b'), ('c', 'd')], [('e', 'f'), ('g', 'h')]]

But I think it’s better to save data using pickle – use to_ pickle/read_pickle :

df.to_pickle('MyLists.pkl')

Related Problems and Solutions