Java – How to convert a Hadoop Path object to a Java File object

Is there a way to convert an existing, valid Hadoop Path object into a usable Java File object? Is there a clean way to do this, or do I need to rework my code drastically to make it fit? The obvious approach, which looks like a simple piece of code, doesn’t work:

void func(Path p) {
  if (p.isAbsolute()) {
     File f = new File(p.toUri());
  }
}

This doesn’t work because Path::toUri() returns a URI with the “hdfs” scheme, while Java’s File(URI uri) constructor only accepts URIs with the “file” scheme.
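
The failure can be reproduced without Hadoop, since the check lives in java.io.File itself. The following is a minimal sketch; the class name FileSchemeDemo, the helper canBecomeFile, and the namenode host and port are made up for illustration.

```java
import java.io.File;
import java.net.URI;

public final class FileSchemeDemo {
    /** Returns true if new File(uri) accepts the URI, false if it rejects it. */
    public static boolean canBecomeFile(URI uri) {
        try {
            new File(uri);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when the URI does not meet File's preconditions,
            // for example when its scheme is not "file".
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical URIs, used only to show the two outcomes.
        System.out.println(canBecomeFile(URI.create("file:///tmp/data.txt")));              // true
        System.out.println(canBecomeFile(URI.create("hdfs://namenode:8020/tmp/data.txt"))); // false
    }
}
```

So any conversion has to either confirm that the path is local or fetch the data to local disk first.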

Is there a way to get Path and File to work together?

Okay, how about a specific, limited example?

Path[] paths = DistributedCache.getLocalCacheFiles(job);

DistributedCache is supposed to provide localized copies of the files, but it returns Path objects. I assume DistributedCache copies the files onto the local disk. In this limited case, where HDFS is hopefully out of the equation, is there a reliable way to convert a Path to a File?

Solution

I ran into the same issue recently. There is indeed a way to get a File from a Path, but it requires copying the file to a temporary local file first. Obviously this is unsuitable for many tasks, but if time and disk space are not a concern and you just need to work with files stored in Hadoop, you can do the following:

import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

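public final class PathToFileConverter {
    // Copies the file behind some_path to a local temp file and returns that copy.
    public static File makeFileFromPath(Path some_path, Configuration conf) throws IOException {
        // Resolve the file system (HDFS, local, ...) from the path's URI scheme.
        FileSystem fs = FileSystem.get(some_path.toUri(), conf);
        // createTempFile rejects prefixes shorter than three characters, so pad the name.
        File temp_data_file = File.createTempFile("tmp-" + some_path.getName(), "");
        // Best-effort cleanup when the JVM exits normally.
        temp_data_file.deleteOnExit();
        fs.copyToLocalFile(some_path, new Path(temp_data_file.getAbsolutePath()));
        return temp_data_file;
    }
}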
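
For the DistributedCache case in the question, where the paths are expected to be local already, the copy may be avoidable. The sketch below illustrates that idea under stated assumptions and is not part of the approach above. It works on a java.net.URI (with Hadoop you would presumably pass path.toUri()), treats a missing scheme or the “file” scheme as local, and returns an empty Optional for anything else so the caller can fall back to makeFileFromPath. The names LocalPathHelper and toLocalFile are hypothetical.

```java
import java.io.File;
import java.net.URI;
import java.util.Optional;

public final class LocalPathHelper {
    /**
     * Returns a File for URIs that look local (no scheme, or the "file" scheme),
     * and an empty Optional for everything else, e.g. "hdfs".
     */
    public static Optional<File> toLocalFile(URI uri) {
        String scheme = uri.getScheme();
        boolean looksLocal = scheme == null || "file".equalsIgnoreCase(scheme);
        String path = uri.getPath();
        if (!looksLocal || path == null || path.isEmpty()) {
            return Optional.empty();
        }
        // Use only the path component, so the scheme never reaches File.
        return Optional.of(new File(path));
    }

    public static void main(String[] args) {
        // Hypothetical inputs: a scheme-less local path and a remote HDFS path.
        System.out.println(toLocalFile(URI.create("/tmp/cache/part-00000")).isPresent());
        System.out.println(toLocalFile(URI.create("hdfs://namenode:8020/data/part-00000")).isPresent());
    }
}
```

This only inspects the URI. It does not check that the file actually exists on the local disk, so a call to File.exists() is still worthwhile before using the result.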
