How do I import “HdfsClient” in Python 3?
I’m new to Python and I’m trying to connect to Hadoop HDFS. I was given the following reference code and tried to run it, but the import fails with an error.
from pyarrow import HdfsClient
# Using libhdfs
hdfs = HdfsClient('192.168.0.119', '50070', 'cloudera', driver='libhdfs')
Error: ImportError: cannot import name ‘HdfsClient’
I also tried installing it with “pip”, but got:
Could not find a version that satisfies the requirement HdfsClient (from versions: )
No matching distribution found for HdfsClient
Then I tried “conda”, with the same result:
Collecting package metadata: done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- hdfsclient
Current channels:
- https://repo.anaconda.com/pkgs/main/win-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/free/win-64
- https://repo.anaconda.com/pkgs/free/noarch
- https://repo.anaconda.com/pkgs/r/win-64
- https://repo.anaconda.com/pkgs/r/noarch
- https://repo.anaconda.com/pkgs/msys2/win-64
- https://repo.anaconda.com/pkgs/msys2/noarch
To search for alternate channels that may provide the conda package
you’re looking for, navigate to https://anaconda.org
and use the search bar at the top of the page.
Actually, I’m trying to connect to HUE using:
IP address -> 192.168.0.119
Port -> 50070
Username -> cloudera
Password -> Cloudera
But it didn’t work. Can anyone suggest a better way to connect, or explain how to import “HdfsClient” in Python 3?
Solution
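HdfsClient is a class inside the pyarrow package, not a separate package, which is why pip and conda cannot find a distribution with that name. The class is deprecated and is absent from the pyarrow version you have installed, hence the ImportError. You may want to use pyarrow.hdfs.connect instead.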
Also run pip freeze to check that the relevant libraries, such as pyarrow, are installed in your Python environment.
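The same check that pip freeze performs can be done from inside Python, which also confirms that you are inspecting the interpreter you actually run your script with. This is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical and introduced here for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# pyarrow is the distribution that provides the HDFS bindings;
# "HdfsClient" is listed only to show that no such distribution exists.
for name in ("pyarrow", "HdfsClient"):
    print(name, "->", installed_version(name))
```

If pyarrow prints None, install it with pip install pyarrow (or conda install pyarrow) before retrying the import.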
Example:
from pyarrow import hdfs
# The port is passed as an integer here, not as a string as in the question.
fs = hdfs.connect('192.168.0.119', 50070, 'cloudera', driver='libhdfs')
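Note that in more recent pyarrow releases pyarrow.hdfs.connect is itself deprecated in favor of pyarrow.fs.HadoopFileSystem. The sketch below shows how the same connection might look with the newer API. The helper names (hdfs_connection_args, connect_to_hdfs, list_directory) are hypothetical, and port 8020 is an assumption: libhdfs talks to the NameNode RPC port, which is commonly 8020, whereas 50070 is usually the NameNode web UI port. Check your cluster's configuration for the actual value.

```python
def hdfs_connection_args(host, port, user):
    """Build keyword arguments for the connection (hypothetical helper)."""
    # The port must be an integer; the question passed it as the string '50070'.
    return {"host": host, "port": int(port), "user": user}


def connect_to_hdfs(host, port, user):
    """Open an HDFS connection through libhdfs using the newer pyarrow.fs API."""
    # Imported lazily so that the helper above can be used without pyarrow.
    from pyarrow import fs
    return fs.HadoopFileSystem(**hdfs_connection_args(host, port, user))


def list_directory(hdfs, path="/"):
    """Return the paths found directly under `path`."""
    from pyarrow import fs
    return [info.path for info in hdfs.get_file_info(fs.FileSelector(path))]


# Usage (requires a reachable cluster and a local Hadoop client installation):
# hdfs = connect_to_hdfs("192.168.0.119", 8020, "cloudera")  # 8020 is an assumption
# print(list_directory(hdfs, "/"))
```

Either API relies on libhdfs, which typically needs a local Hadoop installation and environment variables such as HADOOP_HOME and CLASSPATH to be set before the connection can be opened.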