Let's think... Hadoop is very hard-drive intensive. Memory... it's good to have enough of it, but it isn't critical; I believe 512 MB will be enough. CPU... it depends on your code, but usually it's not the critical factor for MapReduce in general.
So, with Raspberry Pi you get (just for $35!):
- RAM 512 MB
- CPU ARM11 700 MHz
- SD card with Linux, 4-16 GB (you will need to buy it separately)
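To actually squeeze Hadoop into 512 MB, you'd also want to cap its JVM heap sizes. A minimal sketch, assuming the classic Hadoop 1.x layout (hadoop-env.sh plus mapred-site.xml); the exact numbers are guesses for a 512 MB node, not tested settings:

```shell
# hadoop-env.sh -- heap sizes below are assumptions for a 512 MB node
export HADOOP_HEAPSIZE=128   # MB per Hadoop daemon JVM (default is 1000)

# mapred-site.xml would similarly cap each task JVM, e.g.:
#   <property>
#     <name>mapred.child.java.opts</name>
#     <value>-Xmx128m</value>
#   </property>
```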
Some time ago there was a nice article about a Raspberry Pi supercomputer: 64 Raspberry Pi computers were connected into one cluster (via Ethernet); each had a 16 GB SD card, which means 1 TB of storage for the whole cluster (!), and it cost about $4000.
One concern: access speed to the SD card. It isn't good enough, so you will need to buy an external SSD. I assume each Raspberry Pi would need its own SSD (32-64 GB should be enough). So this solution will be more expensive than $4000, but still cheaper than whole PCs or cloud instances.
Let's try to calculate: 64 Raspberry Pi × $35 = $2240; 64 × 64 GB SSDs = 4 TB, costing about $4500; so the whole solution will cost $6500-$7000 for 64 physical nodes :)
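The arithmetic above can be sketched in a few lines; the prices ($35 per Pi, $4500 for 64 SSDs) are this post's assumptions, not current quotes:

```python
# Rough cost sketch for a 64-node Raspberry Pi Hadoop cluster.
# Prices are the assumptions from the text above, not real quotes.
NODES = 64
PI_PRICE = 35                 # USD per Raspberry Pi
SSD_GB = 64                   # SSD size per node
SSD_PRICE = 4500 / NODES      # ~$70 per 64 GB SSD, from the $4500 total

pi_total = NODES * PI_PRICE           # 64 * 35 = 2240
ssd_total = NODES * SSD_PRICE         # ~4500
storage_tb = NODES * SSD_GB / 1024    # raw storage across the cluster

print(f"Pis: ${pi_total}, SSDs: ${ssd_total:.0f}, "
      f"total: ${pi_total + ssd_total:.0f}, storage: {storage_tb} TB")
# -> Pis: $2240, SSDs: $4500, total: $6740, storage: 4.0 TB
```

So the $6500-$7000 estimate is really about $6740, before cabling, switches, and power.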
So, does it make sense to build a Hadoop-oriented cluster? I believe so; what do you think?
At least, it will be a great experiment!
PS. Maybe someone wants to donate money for this experiment? Kickstarter sounds reasonable here.