Thursday 31 March 2022

What is the sorting phase in MapReduce?

Sorting is a phase that runs on the reducer machine: all the <key, value> pairs the reducer receives are sorted and grouped by key before the reducer starts processing them.

What is the advantage of the sorting phase?

Because the sorting phase sorts and groups the reducer's input by key, all values for a key arrive together, which makes it easy for the reducer to perform aggregate operations such as sums or counts.
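To illustrate why sorted input helps, here is a minimal Python sketch (plain Python, not Hadoop code) of a reducer-style aggregation that sums values per key. Because equal keys are adjacent, it only needs a running total for the current key. The function name `sum_sorted_pairs` and the sample data are assumptions made for this illustration.

```python
from typing import Iterable, Iterator, Tuple


def sum_sorted_pairs(pairs: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """Sum values per key, assuming the pairs arrive sorted by key.

    Hypothetical helper for illustration; not part of any MapReduce API.
    """
    current_key = None
    total = 0
    for key, value in pairs:
        if current_key is not None and key != current_key:
            # The key changed, so the previous key is complete: emit it.
            yield current_key, total
            total = 0
        current_key = key
        total += value
    if current_key is not None:
        # Emit the last key once the input is exhausted.
        yield current_key, total


# Sample input, already sorted by key (assumed data).
sorted_pairs = [("day", 2), ("good", 2), ("good", 3), ("the", 1)]
for key, total in sum_sorted_pairs(sorted_pairs):
    print(key, total)
```

If the input were not sorted, the same key could reappear later in the stream, and the function would have to keep a total for every key in memory at once.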

 

Before Sorting

(day, 2)
(good, 2)
(there, 1)
(the, 1)
(there, 2)
(good, 3)

 

After sorting, the data above is transformed as follows.

(day, [2])
(good, [2, 3])
(the, [1])
(there, [1, 2])
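The sort-and-group step can be sketched in a few lines of Python. This is only an illustration of the idea using the pairs from the "Before Sorting" listing, not how Hadoop implements it. The variable names are assumptions.

```python
from itertools import groupby
from operator import itemgetter

# The (key, value) pairs from the "Before Sorting" listing.
pairs = [("day", 2), ("good", 2), ("there", 1),
         ("the", 1), ("there", 2), ("good", 3)]

# Sort by key only. Python's sort is stable, so the values of one key
# keep their arrival order.
sorted_pairs = sorted(pairs, key=itemgetter(0))

# groupby only merges adjacent equal keys, which is why sorting comes first.
grouped = [(key, [value for _, value in group])
           for key, group in groupby(sorted_pairs, key=itemgetter(0))]

for key, values in grouped:
    print(key, values)
```

Each resulting (key, list of values) entry corresponds to one call of the reduce function.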

 

What is the order of the mapper, partitioner, shuffle, sort and reducer phases?

Mapper -> Partitioner -> Shuffle -> Sort -> Reducer
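The order above can be traced with a small single-process simulation of a word count. This is a hedged sketch in plain Python, not Hadoop code: the function names (`mapper`, `partitioner`, `reducer`, `run_job`), the CRC32-based partitioning, the reducer count, and the input lines are all assumptions chosen for illustration.

```python
import zlib
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

NUM_REDUCERS = 2  # assumed number of reducers


def mapper(line):
    """Map: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.lower().split()]


def partitioner(key, num_reducers=NUM_REDUCERS):
    """Partition: pick a reducer from a deterministic hash of the key."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Reduce: aggregate all values that belong to one key."""
    return key, sum(values)


def run_job(lines):
    # 1. Mapper: turn every input line into (key, value) pairs.
    mapped = [pair for line in lines for pair in mapper(line)]

    # 2. Partitioner and 3. Shuffle: route each pair to its reducer's input.
    reducer_inputs = defaultdict(list)
    for key, value in mapped:
        reducer_inputs[partitioner(key)].append((key, value))

    results = {}
    for partition in sorted(reducer_inputs):
        # 4. Sort: order this reducer's input by key so equal keys are adjacent.
        ordered = sorted(reducer_inputs[partition], key=itemgetter(0))
        # 5. Reducer: called once per key with all of that key's values.
        for key, group in groupby(ordered, key=itemgetter(0)):
            word, count = reducer(key, [value for _, value in group])
            results[word] = count
    return results


counts = run_job(["good day", "the day there", "good good there"])
print(counts)
```

Because the partitioner sends every occurrence of a key to the same reducer, each reducer can sort and aggregate its own input independently of the others.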

 


Is the sorting phase done on the reducer machine?

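Yes. The reducer merges the map outputs it fetches and groups them by key before the reduce function runs.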

 
