Embedded Pig in Java: java.io.IOException: Cannot run program “cygpath”

I’m trying to run a basic embedded Pig program from Java.

I’m accessing a Hadoop cluster from a remote machine.

Hadoop version: 2.0.0-cdh4.3.0
Pig version: 0.11.0-cdh4.3.0

The code looks like this:

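import java.io.IOException;
import java.util.Properties;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

// Point the embedded Pig client at the remote cluster.
Properties lProperties = new Properties();
lProperties.setProperty("fs.defaultFS", "<server>:<hdfsport>");
lProperties.setProperty("yarn.resourcemanager.address", "<server>:<port>");
try {
    PigServer pigServer = new PigServer(ExecType.MAPREDUCE, lProperties);
    pigServer.registerQuery("A = load '/input_data/pig_input.txt' as (key,name);");
    pigServer.registerQuery("B = foreach A generate $0 as id;");
    pigServer.store("B", "test_output");
} catch (IOException e) {
    e.printStackTrace();
}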

I can run Pig commands and the Pig script on their own over SSH (using PuTTY).
However, when I run the Java code above from the Eclipse IDE, I get the following error:

java.io.IOException: Cannot run program "cygpath": CreateProcess error=2, The system cannot find the file specified

Do I have to install Cygwin to run embedded Pig from Java?

Solution

Yes. Pig 0.11 depends on Cygwin when it runs on Windows, which is why it tries to launch cygpath. As of Pig 0.12, Cygwin is no longer required, but you may still need to install some basic utilities such as sed and gzip.

See also PIG-2793, "Make Pig Work on Windows without Cygwin".
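
If you have to stay on Pig 0.11 for now, a small pre-flight check can make the failure easier to diagnose before PigServer is created. The following is only a sketch using the standard library: the class name CygpathCheck and its two helper methods are hypothetical, and it assumes that Cygwin exposes the tool as cygpath.exe in a directory listed in PATH.

```java
import java.io.File;
import java.util.Locale;

public class CygpathCheck {

    // True when the reported OS name looks like Windows.
    static boolean isWindows(String osName) {
        return osName != null
                && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // True when a regular file with the given name exists in one of
    // the directories listed in pathValue (a PATH-style string).
    static boolean isOnPath(String executable, String pathValue) {
        if (pathValue == null) {
            return false;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (!dir.isEmpty() && new File(dir, executable).isFile()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        String path = System.getenv("PATH");
        // Assumption: the Cygwin binary is named cygpath.exe.
        if (isWindows(osName) && !isOnPath("cygpath.exe", path)) {
            System.err.println("cygpath.exe was not found on PATH; "
                    + "Pig 0.11 is likely to fail on this machine.");
        } else {
            System.out.println("No missing cygpath detected.");
        }
    }
}
```

Running this from the same Eclipse launch configuration as the Pig code matters, because Eclipse may start the JVM with a different PATH than your command prompt.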