Recently, I switched to Spark from the traditional MapReduce paradigm and needed to implement some kind of unit/integration testing... and of course, it needed to work under Windows 7.
I've written a very simple test: run the ETL in-memory, without touching Hadoop at all (in the future, I'd like to read the input from the local filesystem):
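Something along these lines (a minimal sketch: the toy word count stands in for the real ETL, and the name InMemoryEtlTest is purely illustrative):

import org.apache.spark.{SparkConf, SparkContext}

object InMemoryEtlTest {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark in-process, so no cluster is needed
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("etl-test"))
    try {
      // input is built in memory instead of being read from HDFS
      val counts = sc.parallelize(Seq("a b", "b c", "c"))
        .flatMap(_.split("\\s+"))
        .map(word => (word, 1))
        .reduceByKey(_ + _)
        .collectAsMap()
      assert(counts("b") == 2, s"unexpected counts: $counts")
    } finally {
      sc.stop()
    }
  }
}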
Bang! I got an exception from org.apache.hadoop.util.Shell (the infamous "Could not locate executable null\bin\winutils.exe in the Hadoop binaries" one). I swear, I didn't use Hadoop in my code!
Unfortunately, the Hadoop configuration is initialized together with the SparkContext :( there's no way to omit it...
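You can see it for yourself: even a context that never reads from HDFS drags a full Hadoop Configuration along. A tiny probe (the object name is illustrative, and fs.defaultFS assumes Hadoop 2-style config keys):

import org.apache.spark.{SparkConf, SparkContext}

object HadoopConfigProbe {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("probe"))
    // a live org.apache.hadoop.conf.Configuration is always there,
    // whether you asked for it or not
    println(sc.hadoopConfiguration.get("fs.defaultFS"))
    sc.stop()
  }
}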
I was advised to install HDP (Hortonworks Data Platform) on Windows, but I hate this idea...
I tried the most stupid idea: just provide winutils.exe... I hoped it was only an environment check, and that the Hadoop functionality wouldn't actually be used as long as I didn't touch it.
So, I downloaded winutils.exe from MSDN (MSDN is still helpful, even for a Hadooper), put it into the newly created directory d:\winutil\bin, and then added

System.setProperty("hadoop.home.dir", "d:\\winutil\\")

at the beginning of my unit test.