How do I import “HdfsClient” in Python 3?

I’m new to Python and I’m trying to connect to a Hadoop HDFS system. I found the following reference code and tried to run it, but it raises an error when importing the package.

from pyarrow import HdfsClient

# Using libhdfs
hdfs = HdfsClient('192.168.0.119', '50070', 'cloudera', driver='libhdfs')

Error: ImportError: cannot import name ‘HdfsClient’

I also tried installing it with pip, but that failed:

Could not find a version that satisfies the requirement HdfsClient (from versions: )
No matching distribution found for HdfsClient

Then I tried conda, but it failed as well:

Collecting package metadata: done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  • hdfsclient

Current channels:

To search for alternate channels that may provide the conda package
you’re looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

Actually, I’m trying to connect to HUE using:

IP address -> 192.168.0.119
Port -> 50070
Username -> cloudera
Password -> Cloudera

But it didn’t work. Can anyone suggest a better way to connect, or explain how to import “HdfsClient” in Python 3?

Solution

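HdfsClient is not a separate package, which is why pip and conda cannot find it. It was a class inside pyarrow that has since been deprecated and can no longer be imported from the top-level package, hence the ImportError. Use pyarrow.hdfs.connect instead.
Also run pip freeze to check that pyarrow is installed in your Python environment.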
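
The same check can be done from inside Python instead of reading the pip freeze output by eye. This is a minimal sketch using the standard library's importlib.metadata (available from Python 3.8); the helper name installed_version is hypothetical and not part of any library.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # pyarrow is the package that ships the HDFS bindings.
    print("pyarrow:", installed_version("pyarrow"))
```

If this prints None for pyarrow, install pyarrow itself (for example with pip install pyarrow) rather than searching for a package named HdfsClient.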
For example, using pyarrow.hdfs.connect:

from pyarrow import hdfs

# Pass the port as an integer and keep the returned filesystem object
fs = hdfs.connect('192.168.0.119', 50070, 'cloudera', driver='libhdfs')
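
Later pyarrow releases in turn deprecate pyarrow.hdfs.connect in favor of pyarrow.fs.HadoopFileSystem, so which call works depends on the installed version. The sketch below is one way to cope with both, under stated assumptions: the helpers normalize_port and connect_hdfs are hypothetical names, and a working libhdfs and Hadoop client configuration on the machine is assumed. Also note that 50070 is usually the NameNode web UI port in Hadoop 2.x, while libhdfs-based clients normally talk to the NameNode RPC port (often 8020); which one applies to this cluster needs to be verified.

```python
def normalize_port(port):
    """Return `port` as an int, accepting strings such as '50070'."""
    value = int(port)
    if not 0 < value < 65536:
        raise ValueError(f"port out of range: {value}")
    return value


def connect_hdfs(host, port, user):
    """Hypothetical helper: connect with whichever pyarrow HDFS API is importable."""
    port = normalize_port(port)
    try:
        # Newer API; assumed to be present in recent pyarrow builds with HDFS support.
        from pyarrow.fs import HadoopFileSystem
    except ImportError:
        HadoopFileSystem = None

    if HadoopFileSystem is not None:
        return HadoopFileSystem(host, port, user=user)

    # Legacy API used in the answer above; only present in older pyarrow releases.
    from pyarrow import hdfs
    return hdfs.connect(host, port, user=user)


# Usage (requires a reachable cluster, so it is left commented out):
# fs = connect_hdfs('192.168.0.119', 8020, 'cloudera')
```

Keeping the import separate from the connection call means a failed connection is not mistaken for a missing API.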
