Python – How can I use Python to combine multiple csvs into one file serially?

How can I use Python to combine multiple csvs into one file serially?… here is a solution to the problem.

How can I use Python to combine multiple csvs into one file serially?

I’m trying to merge multiple CSV files into one.

The CSV file looks like this

 Energy_and_Power_Day1.csv,     
 Energy_and_Power_Day2.csv, 
 Energy_and_Power_Day3.csv,      
  ....................., 
 Energy_and_Power_Day31.csv

I used a small python script to concatenate multiple CSV files. The script is doing its job, but it is not connecting files serially.
It should take Energy_and_Power_Day1.csv then Energy_and_Power_Day2.csv then Energy_and_Power_Day3.csv like this: But not so, but random rather than continuous .
Here is my code

import pandas as pd
import csv
import glob
import os

os.chdir("/home/mayukh/Downloads/Northam_bill_data")
results = pd. DataFrame([])
filelist = glob.glob("Energy_and_Power_Day*.csv")
#dfList=[]
for filename in filelist:
  print(filename)  
  namedf = pd.read_csv(filename, skiprows=0, index_col=0)
  results = results.append(namedf)

results.to_csv('Combinefile.csv')

The script gives the output from print(filename) and combines these csv files in this way.

Energy_and_Power_Day1.csv
Energy_and_Power_Day16.csv
Energy_and_Power_Day23.csv
Energy_and_Power_Day22.csv
Energy_and_Power_Day11.csv
Energy_and_Power_Day21.csv
Energy_and_Power_Day31.csv
Energy_and_Power_Day17.csv
Energy_and_Power_Day25.csv
Energy_and_Power_Day28.csv
Energy_and_Power_Day9.csv
Energy_and_Power_Day19.csv
Energy_and_Power_Day7.csv
Energy_and_Power_Day15.csv
Energy_and_Power_Day20.csv
Energy_and_Power_Day24.csv
Energy_and_Power_Day4.csv
Energy_and_Power_Day6.csv
Energy_and_Power_Day14.csv
Energy_and_Power_Day13.csv
Energy_and_Power_Day27.csv
Energy_and_Power_Day3.csv
Energy_and_Power_Day18.csv
Energy_and_Power_Day8.csv
Energy_and_Power_Day30.csv
Energy_and_Power_Day12.csv
Energy_and_Power_Day29.csv
Energy_and_Power_Day10.csv
Energy_and_Power_Day5.csv
Energy_and_Power_Day2.csv
Energy_and_Power_Day26.csv

My question is how or how to combine these CSV files consecutively?

Solution

This is not “random” (it depends on how the underlying file system organizes these files – @tripleee)。

You can sort the file names before opening them. Use list.sort with the key parameter. After this, you can use list understanding and pass a list of dataframes to pd.concat. It should be more efficient than DataFrame.append.

import re

filelist = glob.glob("Energy_and_Power_Day*.csv")
filelist.sort(key=lambda x: int(re.search('\d+', x).group()))

df = pd.concat([
        pd.read_csv(f, skiprows=0, index_col=0) for f in filelist
     ],
     axis=0
)

Related Problems and Solutions