Friday, 17 May 2013

Fix PigUnit issue on Windows

PigUnit is a nice and extremely easy way to test your Pig scripts. Read more here.
However, it doesn't run on Windows out of the box. When you run your first PigUnit test, you will get the following exception:


Exception in thread "main" java.io.IOException: Cannot run program "chmod": CreateProcess error=2, The system cannot find the file specified :

This means that Cygwin is not installed correctly, so Hadoop cannot find the chmod command. To fix it, download and install Cygwin, then edit the PATH variable and add the path to the Cygwin directory that contains chmod.exe (normally its bin subdirectory).
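To check whether the PATH change took effect, a small lookup like the one below can be run from the same environment as the test. It is only a sketch: the class name ChmodOnPathCheck and the method findChmod are made up for illustration, and it simply looks for a file named chmod.exe or chmod in each PATH entry.

```java
import java.io.File;

final class ChmodOnPathCheck {

    /** Returns the first chmod/chmod.exe found in the given PATH value, or null if there is none. */
    public static File findChmod(String pathValue) {
        if (pathValue == null) {
            return null;
        }
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : new String[] {"chmod.exe", "chmod"}) {
                File candidate = new File(dir, name);
                if (candidate.isFile()) {
                    return candidate;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File chmod = findChmod(System.getenv("PATH"));
        System.out.println(chmod != null
                ? "chmod found: " + chmod
                : "chmod is not on PATH; add the Cygwin directory that contains it");
    }
}
```

If it reports that chmod is missing, restart the IDE or console after editing PATH, since a running process keeps the old value.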

Try to run again. The next possible error will be:

ERROR mapReduceLayer.Launcher: Backend error message during job submission java.io.IOException: Failed to set permissions of path: \tmp\hadoop-MyUsername\mapred\staging\MyUsername1049214732.staging to 0700

This means your temporary directory is not set correctly (or is not set at all). To be honest, I first tried to set the temporary directory with the following code:

        pigServer.getPigContext().getProperties().setProperty("pig.temp.dir", "D:/TMP");
        pigServer.getPigContext().getProperties().setProperty("hadoop.tmp.dir", "D:/TMP");

Unfortunately, that doesn't work. The solution is to set a system property. There are many ways to do that; one of them is to adjust the Java run configuration for your test by adding:

 -Djava.io.tmpdir=D:\TMP\
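To confirm that the JVM running the test really picked up the flag, a quick check along these lines may help. It is a sketch only: the class name TmpDirCheck and the method problemWith are hypothetical, and it only verifies that the directory exists and is writable.

```java
import java.io.File;

final class TmpDirCheck {

    /** Returns null if the directory is usable, otherwise a short description of the problem. */
    public static String problemWith(String tmpDir) {
        if (tmpDir == null || tmpDir.isEmpty()) {
            return "java.io.tmpdir is not set";
        }
        File dir = new File(tmpDir);
        if (!dir.isDirectory()) {
            return "not an existing directory: " + dir;
        }
        if (!dir.canWrite()) {
            return "directory is not writable: " + dir;
        }
        return null;
    }

    public static void main(String[] args) {
        String tmpDir = System.getProperty("java.io.tmpdir");
        String problem = problemWith(tmpDir);
        System.out.println(problem == null ? "java.io.tmpdir = " + tmpDir : problem);
    }
}
```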
OK, that's much better, but we are not finished yet. The next error is:
java.io.IOException: Failed to set permissions of path: file:/tmp/hadoop-iwonabb/mapred/staging/iwonabb-1931875024/.staging to 0700 
at org.apache.hadoop.fs.RawLocalFileSystem.checkReturnValue(RawLocalFileSystem.java:526) 

That is because of a bug in the Hadoop code.

There are several ways to fix this bug (it has been present in Hadoop for years... :(). One of them is to use this patch, or to fix the code and recompile. But for me that was not the best way: in particular, it fixes only one specific version, and it is difficult to maintain across several clusters, dev machines and so on.
So I decided to change the code at runtime with Javassist.

So the solution is very simple and self-describing:

To apply it, just call it from your test before running it. I'd recommend doing it right after the PigTest instance is created.
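A minimal sketch of what such a runtime patch might look like is shown here; it is not a definitive implementation. The names FixHadoopOnWindows and runFix() follow the first comment below, the patched class and method are assumed from the stack trace above (other Hadoop versions may keep the check in a different class), and the isWindows() guard is added for illustration.

```java
import javassist.CannotCompileException;
import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;
import javassist.NotFoundException;

final class FixHadoopOnWindows {

    // Assumption: taken from the stack trace above; adjust to match your own trace.
    private static final String TARGET_CLASS = "org.apache.hadoop.fs.RawLocalFileSystem";
    private static final String TARGET_METHOD = "checkReturnValue";

    private FixHadoopOnWindows() {
    }

    /** Replaces the permission check with an empty method. Call before the target class is loaded. */
    public static void runFix() throws NotFoundException, CannotCompileException {
        if (!isWindows()) {
            return; // nothing to patch on other platforms
        }
        ClassPool pool = ClassPool.getDefault();
        CtClass ctClass = pool.get(TARGET_CLASS);
        CtMethod method = ctClass.getDeclaredMethod(TARGET_METHOD);
        // A null body makes the method do nothing (returning null or zero if it is not void)
        method.setBody(null);
        // Define the patched class in the current class loader
        ctClass.toClass();
    }

    private static boolean isWindows() {
        return System.getProperty("os.name").toLowerCase().startsWith("windows");
    }
}
```

Javassist's toClass() fails if the target class has already been loaded, so FixHadoopOnWindows.runFix() has to run before the script is executed, for example in the JUnit setup method.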

4 comments:

  1. Nice fix!! Worked like a charm from inside a JUnit test. All I had to do was add FixHadoopOnWindows.runFix() in the @SetUp()

  2. when running maven test : Tests in error:
    testStudentsPigScript(com.ram.hjk.debug_pig.AppTest): The system cannot find the path specified

    Thank you.

  3. pigTest.assertOutput("D", new String[] { "(2,No)", "(3,Ha!)",);
    unable to find the path "D" .
