Friday, May 17, 2013

Fix PigUnit issue on Windows

PigUnit is a nice and extremely easy way to test your Pig scripts.
However, it doesn't run on Windows at all. When you run your first PigUnit test there, you will get the following exception:


Exception in thread "main" java.io.IOException: Cannot run program "chmod": CreateProcess error=2, The system cannot find the file specified :

This means that Cygwin is not installed correctly: the chmod program cannot be found. To fix it, download and install Cygwin, then edit the PATH variable and add the path to the Cygwin directory.
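
Before re-running the test, it can help to confirm that chmod is really visible through PATH. The following is only a minimal sketch: PathCheck and findOnPath are hypothetical names, not part of Pig, Hadoop or Cygwin, and the check simply looks for a file called chmod.exe or chmod in each PATH entry.

```java
import java.io.File;

/** Hypothetical helper (not part of Pig or Hadoop): looks for a program on the PATH. */
final class PathCheck {

    /**
     * Returns the first file found in the directories of pathValue that matches
     * one of the given file names, or null if none of the directories has it.
     */
    static File findOnPath(String pathValue, String... fileNames) {
        if (pathValue == null || pathValue.isEmpty()) {
            return null;
        }
        // PATH entries are separated by ';' on Windows and ':' elsewhere.
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : fileNames) {
                File candidate = new File(dir, name);
                if (candidate.isFile()) {
                    return candidate;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File chmod = findOnPath(System.getenv("PATH"), "chmod.exe", "chmod");
        System.out.println(chmod == null
                ? "chmod not found on PATH; check the Cygwin entry"
                : "chmod found: " + chmod.getAbsolutePath());
    }
}
```

If this reports that chmod was not found, the PATH entry probably points at the wrong Cygwin folder or the IDE was not restarted after PATH was changed.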

Try to run again. The next possible error will be:

ERROR mapReduceLayer.Launcher: Backend error message during job submission java.io.IOException: Failed to set permissions of path: \tmp\hadoop-MyUsername\mapred\staging\MyUsername1049214732.staging to 0700

This means that your temporary directory is not set correctly (or is not set at all). To be honest, I first tried to set up the temporary directory with the following code:

        pigServer.getPigContext().getProperties().setProperty("pig.temp.dir", "D:/TMP");
        pigServer.getPigContext().getProperties().setProperty("hadoop.tmp.dir", "D:/TMP");

Unfortunately, it doesn't work. The solution is to set a system property. There are a lot of ways to do it, and one of them is to tune the Java run configuration you use to run your test. Just add:

 -Djava.io.tmpdir=D:\TMP\
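
Another way is to set the same property from code before the test starts. This is only a sketch under assumptions: TempDirSetup and useTempDir are hypothetical names, and creating the directory when it is missing is my addition, not something Pig or Hadoop requires.

```java
import java.io.File;

/** Hypothetical helper: points java.io.tmpdir at a directory and makes sure it exists. */
final class TempDirSetup {

    /** Creates the directory if needed, then stores its absolute path in java.io.tmpdir. */
    static File useTempDir(String path) {
        File dir = new File(path);
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IllegalStateException("Cannot create temp directory: " + dir);
        }
        System.setProperty("java.io.tmpdir", dir.getAbsolutePath());
        return dir;
    }
}
```

A call such as TempDirSetup.useTempDir("D:/TMP") would have to happen before any Pig or Hadoop class reads the property.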

OK, that's much better, but we are not finished yet. There is another error:
java.io.IOException: Failed to set permissions of path: file:/tmp/hadoop-iwonabb/mapred/staging/iwonabb-1931875024/.staging to 0700 
at org.apache.hadoop.fs.RawLocalFileSystem.checkReturnValue(RawLocalFileSystem.java:526) 

That's because of a bug in the Hadoop code.

There are several ways to fix this bug (it has been present in Hadoop for years... :(). One of them is to apply a patch, or to fix the code and recompile. But for me that was not the best way (in particular, the fix would apply only to one specific version, and it is difficult to maintain across several clusters, dev machines and so on).
So, I decided to change the code at runtime with Javassist.

The solution is very simple and self-describing:

package example.pig;

import javassist.CannotCompileException;
import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;
import javassist.NotFoundException;

public class FixHadoopOnWindows {

    /**
     * Fixes the following Hadoop problems on Windows:
     * 1) mapReduceLayer.Launcher: Backend error message during job submission java.io.IOException: Failed to set permissions of path: \tmp\hadoop-MyUsername\mapred\staging\
     * 2) java.io.IOException: Failed to set permissions of path: bla-bla-bla\.staging to 0700
     */
    public static void runFix() throws NotFoundException, CannotCompileException {
        if (isWindows()) { // run the fix only on Windows
            setUpSystemVariables();
            fixCheckReturnValueMethod();
        }
    }

    // Set up a correct temporary directory on Windows (adjust the path for your machine).
    private static void setUpSystemVariables() {
        System.getProperties().setProperty("java.io.tmpdir", "D:/TMP/");
    }

    /**
     * org.apache.hadoop.fs.FileUtil#checkReturnValue doesn't work on Windows at all,
     * so let's replace the method body with an empty one using Javassist.
     * Note: toClass() defines the patched class in the class loader, so this has to run
     * before FileUtil is loaded, and it cannot be applied twice in the same class loader.
     */
    private static void fixCheckReturnValueMethod() throws NotFoundException, CannotCompileException {
        ClassPool cp = new ClassPool(true); // true: also search the system class path
        CtClass ctClass = cp.get("org.apache.hadoop.fs.FileUtil");
        CtMethod ctMethod = ctClass.getDeclaredMethod("checkReturnValue");
        ctMethod.setBody("{ }");
        ctClass.toClass();
    }

    private static boolean isWindows() {
        String os = System.getProperty("os.name");
        return os.startsWith("Windows");
    }

    // Utility class, not meant to be instantiated.
    private FixHadoopOnWindows() { }
}

To apply it, just call FixHadoopOnWindows.runFix() from your test before running it. I'd recommend doing it right after creating the PigTest instance.
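
One thing to watch for: if the call sits in a setup method that runs before every test method, the fix is attempted several times in the same JVM, and Javassist's toClass() cannot define the same class twice in one class loader. A run-once guard may help here. This is only a sketch: RunOnce and Action are hypothetical names, and the fix is passed in as a parameter so the snippet compiles without Pig, Hadoop or Javassist on the class path.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard: runs an action at most once per instance. */
final class RunOnce {

    /** Like Runnable, but allowed to throw checked exceptions. */
    interface Action {
        void run() throws Exception;
    }

    private final AtomicBoolean done = new AtomicBoolean(false);

    /**
     * Runs the action on the first call only and returns true if it ran.
     * If the action throws, it is still marked as done and is not retried.
     */
    boolean run(Action action) throws Exception {
        if (!done.compareAndSet(false, true)) {
            return false; // already attempted through this guard
        }
        action.run();
        return true;
    }
}
```

In a test class this could be used as a static field, for example private static final RunOnce FIX = new RunOnce(); and then FIX.run(FixHadoopOnWindows::runFix); inside the setup method.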

4 comments:

  1. Nice fix!! Worked like a charm from inside a JUnit test. All I had to do was add FixHadoopOnWindows.runFix() in the @SetUp()

  2. when running maven test : Tests in error:
    testStudentsPigScript(com.ram.hjk.debug_pig.AppTest): The system cannot find the path specified

    Thank you.

  3. pigTest.assertOutput("D", new String[] { "(2,No)", "(3,Ha!)",);
    unable to find the path "D" .
