Testing the functionality of a MapReduce job with a unit test (or integration test) is, without a doubt, a very common need. This approach fits TDD perfectly; moreover, it makes it possible to develop MapReduce jobs faster, because there is no need to redeploy the jar on a cluster each time, and debugging is easy.
The first line of defence is MRUnit, a great framework for unit testing that is independent of input/output formats and can run and test map and reduce functions separately. Unfortunately, this framework has several significant drawbacks: for example, there is no access to MR counters, and only one Mapper is allowed in an MR test.
Local execution mode can be used to overcome the MRUnit limitations or to create an integration test for a MapReduce job. Let's assume there is a runnable MapReduce tool with several input sources (mappers) and a reducer:
A nice integration test (or unit test, call it whatever you like) for this Hadoop MapReduce tool is listed below: