### Forum

This topic contains 0 replies, has 1 voice, and was last updated by  sumansinharoy 8 months, 2 weeks ago.

Viewing 1 post (of 1 total)

sumansinharoy
Participant

How many types of sorting are there?
There are three types of sorting in a MapReduce (MR) job:
A. Partial Sorting
B. Total Sorting
C. Secondary Sorting

Answer ==> Secondary sorting is a sort applied to the mapper’s output (list[k2,v2]) to optimize the reduce phase. The sort is based on a composite key, so the values associated with each key are ordered ascending or descending. The sorted list of key/value pairs (list[k3,v3]) is then passed to the reducer as input.

Composite Key = (key + value)
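
As a rough illustration of the composite key idea, the ordering can be simulated in plain Python. This is only a sketch and not Hadoop code: the names `mapper_output` and `composite_sort_key` are hypothetical, and the four sample pairs are made up for the illustration. It assumes the key (year) is sorted ascending and the value (temperature) descending.

```python
# Hypothetical (key, value) pairs emitted by a mapper: (year, temperature).
mapper_output = [(2001, 30), (2002, 35), (2001, 42), (2002, 33)]

def composite_sort_key(pair):
    """Build the composite key: key ascending, then value descending."""
    key, value = pair
    # Negating the numeric value reverses its order inside each key.
    return (key, -value)

sorted_pairs = sorted(mapper_output, key=composite_sort_key)
print(sorted_pairs)
# [(2001, 42), (2001, 30), (2002, 35), (2002, 33)]
```

In a real MR job the same ordering would normally come from a custom key type and comparator rather than from a single in-memory sort.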

This is another way to find the max or min value for each key: it avoids iterating over every value of a key in the reduce phase to find the desired value.
For example, suppose we want to find the maximum temperature for each year from input files containing recorded temperatures for several years. In the reduce phase, we only pick the first value for each key, because the values (temperatures) are already sorted in descending order.

Hence, the complexity of the reducer job is now n*1, or n, instead of n*m, where n is the number of keys and m is the number of values per key.

Year Temp
==== ====
2001 30
2002 35
2003 40
2002 41
2001 42
2002 33
2001 34
2004 45
2001 31
2003 39
2004 47

Required output :
Year Max Temp
==== ========
2001 42
2002 41
2003 40
2004 47
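
The whole example can be checked with a small Python simulation of the sort and reduce steps, using the sample input above. This is a sketch under assumptions, not an actual MR job: the function names `secondary_sort` and `reduce_first_value` are hypothetical, and grouping by year is done with `itertools.groupby` in place of the framework's grouping.

```python
from itertools import groupby

# Sample input from the table above: (year, temperature).
records = [
    (2001, 30), (2002, 35), (2003, 40), (2002, 41), (2001, 42), (2002, 33),
    (2001, 34), (2004, 45), (2001, 31), (2003, 39), (2004, 47),
]

def secondary_sort(pairs):
    """Sort by year ascending and temperature descending (composite key)."""
    return sorted(pairs, key=lambda p: (p[0], -p[1]))

def reduce_first_value(sorted_pairs):
    """Keep only the first value of each key group, which is the maximum."""
    result = {}
    values_read = 0
    for year, group in groupby(sorted_pairs, key=lambda p: p[0]):
        _, max_temp = next(group)  # first value in the group is the largest
        values_read += 1           # the reducer logic reads one value per key
        result[year] = max_temp
    return result, values_read

max_by_year, values_read = reduce_first_value(secondary_sort(records))
for year, temp in max_by_year.items():
    print(year, temp)
print("values read by reducer logic:", values_read, "of", len(records))
```

Here the reducer logic reads 4 values (one per year) out of 11 records, and the maxima come out as 42, 41, 40 and 47 for 2001 to 2004. Note that `groupby` still walks past the remaining pairs of each group to reach the next key; only the reducer's own work drops to one value per key.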
