Python – BeautifulSoup: Opens local and http html files

I can't get both to work at the same time, only one or the other:

link = open(url)
soup = BeautifulSoup(link.read(), "html.parser")

^ For local files

link = urlopen(url).read()
soup = BeautifulSoup(link, "html.parser")

^ For http:// (Internet) links

How do I simply make both work?

Solution

What format is the path to your local file?
You can simply check whether your input string is a URL:

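if url.startswith('http'):
    # Remote page: fetch the markup over HTTP(S)
    link = urlopen(url).read()
else:
    # Local file: read the contents so both branches yield the markup itself
    with open(url) as f:
        link = f.read()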
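
A slightly more defensive version of the check above, as a sketch only. It assumes Python 3 (`urllib.request`) and uses a hypothetical helper name, `read_html`; neither comes from the original answer. It inspects the URL scheme instead of the string prefix, so a local file whose name happens to begin with "http" is still treated as a file.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def read_html(source):
    """Return the raw bytes of an HTML document from a URL or a local path.

    `read_html` is a hypothetical helper name used for illustration.
    """
    # Only http:// and https:// count as remote; anything else is a path.
    if urlparse(source).scheme in ("http", "https"):
        with urlopen(source) as response:
            return response.read()
    # Binary mode keeps the bytes untouched, matching what urlopen returns.
    with open(source, "rb") as f:
        return f.read()
```

The returned bytes can then be handed to the parser in the same way for either kind of input, for example `soup = BeautifulSoup(read_html(url), "html.parser")`.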

Otherwise, simply convert the local file path to a file URI (file://...), and you should be able to open it with urlopen just like a regular URL.
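
The file URI approach could look roughly like this, assuming Python 3 with `pathlib` and `urllib.request`. The helper names `to_url` and `fetch` are hypothetical and not part of the original answer.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def to_url(source):
    """Return `source` unchanged if it is already a URL, else a file:// URI."""
    if urlparse(source).scheme in ("http", "https", "file"):
        return source
    # as_uri() requires an absolute path, so resolve() the path first.
    return Path(source).resolve().as_uri()


def fetch(source):
    """Read the raw bytes behind a URL or a local path through urlopen."""
    with urlopen(to_url(source)) as response:
        return response.read()
```

With this in place there is a single code path: `soup = BeautifulSoup(fetch(url), "html.parser")` should work whether `url` is a web address or a local file path.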
