Changing Number Of Reducers
In an earlier blog post we saw how to change the number of mappers in a MapReduce execution. In this post, we will see how to change the number of reducers.
Let’s say your MapReduce program requires 100 mappers. Now imagine the output from all 100 mappers is sent to a single reducer. That one reducer becomes a bottleneck for the entire MapReduce execution because it has to wait for all 100 mappers to complete, copy the data from all 100 mappers, merge their output, and only then move on to the actual reduce execution. This is not ideal, so it is wise to distribute the workload on the reduce side as well, just like we did for mappers, by increasing the number of reducers.
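To make the effect of adding reducers concrete, here is a small plain-Java sketch (no Hadoop dependency) of how map output keys get spread across reducers. It uses the same formula as Hadoop's default HashPartitioner, but applies it to String.hashCode() rather than the hash of a Hadoop Text key, so the exact assignments will differ from a real job. The class name and the stock symbols are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class ReducerPartitionSketch {

    // Same formula as Hadoop's default HashPartitioner: clear the sign bit,
    // then take the remainder by the number of reduce tasks.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many distinct keys would be routed to each reducer.
    static int[] keysPerReducer(List<String> keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical map output keys (stock symbols).
        List<String> symbols = Arrays.asList(
                "AAPL", "GOOG", "MSFT", "IBM", "ORCL", "INTC", "CSCO", "HPQ");

        // With 1 reducer every key lands in partition 0;
        // with 5 reducers the keys are split across partitions 0..4.
        for (int reducers : new int[] {1, 5}) {
            System.out.println(reducers + " reducer(s): "
                    + Arrays.toString(keysPerReducer(symbols, reducers)));
        }
    }
}
```

With one reducer, the single counter holds every key. With five, the same keys are divided among five counters, which is the load distribution the paragraph above describes. How even the split is depends on the keys and their hash codes.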
Ways To Change Number Of Reducers
Update the driver program and call setNumReduceTasks on the job object with the desired number of reducers.
job.setNumReduceTasks(5);
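There is also a better way to change the number of reducers: the mapred.reduce.tasks property. This is the better option because if you decide to increase or decrease the number of reducers later, you can do so without changing the MapReduce program. Note that -D options are only picked up when the driver parses Hadoop's generic options, for example by running through ToolRunner. In Hadoop 2 and later, mapred.reduce.tasks is a deprecated name for mapreduce.job.reduces; the old name still works.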
-D mapred.reduce.tasks=10
Usage
hadoop jar /hirw-starterkit/mapreduce/stocks/MaxClosePrice-1.0.jar com.hirw.maxcloseprice.MaxClosePrice -D mapred.reduce.tasks=10 /user/hirw/input/stocks output/mapreduce/stocks
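The post does not show the MaxClosePrice driver itself, so the following is only a sketch of a driver that would honor -D mapred.reduce.tasks=10, assuming the Hadoop 2.x org.apache.hadoop.mapreduce API. The class name ReducerCountDriver is hypothetical. No mapper or reducer class is set, so as written it runs Hadoop's default identity mapper and reducer; a real job would set its own classes and output types.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver; not the MaxClosePrice class from the post.
public class ReducerCountDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already parsed the generic options, so getConf()
        // contains any -D key=value pairs given on the command line.
        Job job = Job.getInstance(getConf(), "reducer count sketch");
        job.setJarByClass(ReducerCountDriver.class);

        // Deliberately no job.setNumReduceTasks(...) here: a hard-coded call
        // at this point would overwrite the value supplied with -D.

        // args now holds only the remaining arguments: input and output paths.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.out.println("Reducers requested: " + job.getNumReduceTasks());
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new ReducerCountDriver(), args));
    }
}
```

The generic options must come before the input and output paths on the command line, as in the usage line above. A driver that ignores ToolRunner and builds its Configuration by hand will treat -D and its value as ordinary arguments.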