Tuesday 29 March 2022

What is a Map-Only Job in Hadoop?

In some situations, you do not need a reducer to aggregate or process the results of the map phase.

 

For example, suppose I just want to print all the log messages that contain the string 'malicious login attempt'. Each log line can be checked on its own, so no reduce phase is required; the output of the map phase is enough to get all the matching log messages.
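As a rough sketch of the per-record check a mapper would perform for this example, the plain Java snippet below keeps only the lines that contain the search string. The class and method names are made up for illustration, and it deliberately runs without Hadoop so it is easy to try locally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: it mimics what a map-only job
// does for each input record (inspect it, then emit it or drop it).
final class MaliciousLoginFilter {
    static final String NEEDLE = "malicious login attempt";

    // Returns true when a log line should be emitted by the map phase.
    static boolean matches(String logLine) {
        return logLine != null && logLine.contains(NEEDLE);
    }

    // Checks every line independently. Nothing is aggregated across lines,
    // which is why no reducer is needed.
    static List<String> matchingLines(List<String> logLines) {
        List<String> out = new ArrayList<>();
        for (String line : logLines) {
            if (matches(line)) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample log lines invented for this sketch.
        List<String> sample = Arrays.asList(
                "INFO user=alice login ok",
                "WARN malicious login attempt from 10.0.0.7",
                "INFO heartbeat");
        for (String line : matchingLines(sample)) {
            System.out.println(line);
        }
    }
}
```

In a real job, the body of `matches` would sit inside the mapper's `map` method, and a matching line would be written to the job output instead of being added to a list.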

 

How to turn off the reducer?

job.setNumReduceTasks(0);
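To show where this call fits, here is a minimal driver sketch, assuming the Hadoop 2.x `org.apache.hadoop.mapreduce` API is on the classpath. The class name `MapOnlyDriver`, the job name, and the use of command-line arguments for the input and output paths are choices made for this example. It uses the base `Mapper` class, which passes every record through unchanged, so the only interesting line is the one that sets the number of reducers to zero.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class for illustration only.
public class MapOnlyDriver {
    public static void main(String[] args) throws Exception {
        // args[0] = input path, args[1] = output path (must not exist yet).
        Job job = Job.getInstance(new Configuration(), "map-only example");
        job.setJarByClass(MapOnlyDriver.class);

        // The base Mapper is an identity mapper: it writes each
        // (key, value) pair it receives straight to the output.
        job.setMapperClass(Mapper.class);

        // Zero reducers turns this into a map-only job.
        job.setNumReduceTasks(0);

        // With the default text input format, keys are byte offsets
        // and values are the lines of the input files.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With zero reducers there is no shuffle or sort step; each map task writes its output directly to the output directory, in files named like part-m-00000.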

 

Some other examples of map-only jobs

Example 1: Delete the documents that are older than 5 years from Hadoop.

 

Example 2: Get all the employees who live in Bangalore.

SELECT * FROM employees WHERE city = 'Bangalore';
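A mapper that does the same filtering could look roughly like the sketch below. It assumes the employee records are stored as comma-separated text lines and that the city is in the third column; both the file layout and the class name `BangaloreEmployeeMapper` are assumptions made for this example, not something Hadoop defines.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: emits only the employee rows whose city is Bangalore.
public class BangaloreEmployeeMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    // Assumed column layout: id,name,city,... (city at index 2).
    private static final int CITY_COLUMN = 2;
    private static final String CITY = "Bangalore";

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Keep trailing empty fields so short rows are handled predictably.
        String[] fields = line.toString().split(",", -1);

        // Equivalent of WHERE city = 'Bangalore': emit the whole row on a match.
        if (fields.length > CITY_COLUMN && CITY.equals(fields[CITY_COLUMN].trim())) {
            // A NullWritable key means only the row itself is written out.
            context.write(NullWritable.get(), line);
        }
    }
}
```

Each row is accepted or rejected on its own, so this mapper can be used in a job where the number of reduce tasks is set to 0.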

 

Reference

https://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/mapreduce/Job.html#setNumReduceTasks(int)
