Python – Using credentials to fetch a CSV from an FTP URL

I'm new to Python and APIs. My data source provides an FTP URL where files are dumped every day, and I want to fetch those files for engineering and analysis. How do I specify a username and password when fetching a CSV?

import pandas as pd
data = pd.read_csv('http://site-ftp.site.com/test/cat/filename.csv')

How do I add credentials to this?
PS: the URL is fake and only there as an example.

Solution

For older versions of pandas, you can download the CSV into memory with requests.get(), passing the username and password through its auth argument (HTTP Basic authentication). Wrapping the response text in StringIO makes it behave like a file, so pd.read_csv() can read it without the data being written to disk first.

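import requests
import pandas as pd
from io import StringIO
from requests.auth import HTTPBasicAuth

# Download the CSV using HTTP Basic authentication
response = requests.get("http://site-ftp.site.com/test/cat/filename.csv", auth=HTTPBasicAuth('user', 'password'))
response.raise_for_status()  # fail early on a 401 or any other HTTP error
# Wrap the text so pandas can read it like a file
data = pd.read_csv(StringIO(response.text))

print(data)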
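
The snippet above assumes the server speaks HTTP. If the files really sit on an FTP server, requests will not help, since it only handles http and https URLs. One option is the standard library's ftplib, roughly as sketched below. The helper names download_to_buffer and read_csv_over_ftp are made up for illustration, and the host, path and login values you pass in are placeholders for your own.

```python
from ftplib import FTP
from io import BytesIO


def download_to_buffer(ftp, remote_path):
    """Fetch remote_path over an already-open FTP connection into memory."""
    buffer = BytesIO()
    # RETR streams the file in binary chunks; each chunk is appended to the buffer
    ftp.retrbinary(f"RETR {remote_path}", buffer.write)
    buffer.seek(0)  # rewind so a reader starts at the first byte
    return buffer


def read_csv_over_ftp(host, remote_path, user, password):
    """Log in, download the CSV into memory, and parse it with pandas."""
    import pandas as pd  # imported here so download_to_buffer works without pandas

    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return pd.read_csv(download_to_buffer(ftp, remote_path))
```

Calling read_csv_over_ftp('site-ftp.site.com', 'test/cat/filename.csv', 'user', 'password') should then return a DataFrame, assuming the server accepts a plain (unencrypted) FTP login. As in the requests version, nothing is written to disk.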

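With pandas 0.19.2 and later, you can pass a URL directly to pd.read_csv(). Note that a plain URL like the one below carries no credentials, so on its own it only works for files that need no login. For example: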

data = pd.read_csv('http://site-ftp.site.com/test/cat/filename.csv')
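
To send a login along with a direct URL, one possibility is to embed the username and password in an ftp:// URL. The sketch below builds such a URL and percent-encodes the credentials so that characters like "@" or ":" in a password do not break it. The function name ftp_url_with_credentials and the sample password are made up for illustration. It also assumes that your pandas version accepts ftp:// URLs and hands them to Python's urllib, which is worth checking before relying on it.

```python
from urllib.parse import quote


def ftp_url_with_credentials(host, path, user, password):
    """Build an ftp:// URL that carries percent-encoded login details."""
    # safe="" also escapes "/", "@" and ":" so they are not read as URL delimiters
    user_part = quote(user, safe="")
    password_part = quote(password, safe="")
    return f"ftp://{user_part}:{password_part}@{host}/{path.lstrip('/')}"


url = ftp_url_with_credentials(
    "site-ftp.site.com", "test/cat/filename.csv", "user", "p@ss:word"
)
print(url)  # ftp://user:p%40ss%3Aword@site-ftp.site.com/test/cat/filename.csv
```

The resulting string can then be tried as data = pd.read_csv(url). Keep in mind that a password inside a URL can end up in logs and tracebacks, so passing credentials separately is often the safer choice.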
