Sorting saves time for the reducer by making it easy to tell when a new reduce task should start. Put simply, the reducer starts a new reduce task whenever the next key in the sorted input data differs from the previous one.
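To make the point about sorted input concrete, here is a minimal sketch in plain Python (not Hadoop code). The names `reduce_sorted` and `sum_counts` are made up for illustration. It assumes the pairs are already sorted by key, so a single pass that compares each key with the previous one is enough to find the group boundaries:

```python
def reduce_sorted(pairs, reduce_fn):
    """Apply reduce_fn once per run of equal keys.

    Assumes `pairs` is an iterable of (key, value) tuples already
    sorted by key, so a change of key marks the end of a group.
    """
    results = []
    current_key = None
    current_values = []
    have_key = False  # becomes True once the first pair has been seen

    for key, value in pairs:
        if have_key and key != current_key:
            # The key changed: finish the previous group, start a new one.
            results.append(reduce_fn(current_key, current_values))
            current_values = []
        current_key = key
        have_key = True
        current_values.append(value)

    if have_key:
        # Flush the last group.
        results.append(reduce_fn(current_key, current_values))
    return results


def sum_counts(key, values):
    # Hypothetical reduce function: word-count style summation.
    return (key, sum(values))


pairs = sorted([("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)])
print(reduce_sorted(pairs, sum_counts))  # [('a', 2), ('b', 2), ('c', 1)]
```

Without the sort, the same keys could be scattered through the input, and the reducer would have to buffer or index every key it has seen instead of just remembering the previous one.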
Partitioning, which you mentioned in one of the answers, is a different process. It determines which reducer each (key, value) pair output by the map phase is sent to. Note that a reducer is not the same as a reduce task: a reducer can run multiple reduce tasks. Note also that shuffling and sorting are performed locally, by each reducer, on its own input data, whereas partitioning is not local.
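As a rough illustration of the difference, the sketch below separates the two steps in plain Python. It is not Hadoop's actual implementation. The function names `partition` and `route` are hypothetical, and `zlib.crc32` is used only as a stand-in for a deterministic hash of the key. Partitioning decides the destination of every map output pair, while the sort happens separately inside each reducer's own bucket:

```python
import zlib


def partition(key, num_reducers):
    # Assumed hash-based scheme: the same key always maps to the same
    # reducer index. crc32 is a stand-in for a stable key hash.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def route(map_output, num_reducers):
    # Partitioning: looks at every (key, value) pair from the map phase
    # and assigns it to one reducer's bucket.
    buckets = [[] for _ in range(num_reducers)]
    for key, value in map_output:
        buckets[partition(key, num_reducers)].append((key, value))
    # Sorting: done independently per reducer, only on its own input.
    return [sorted(bucket) for bucket in buckets]


map_output = [("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)]
for index, bucket in enumerate(route(map_output, 2)):
    print(index, bucket)
```

Because all pairs with the same key land in the same bucket, each reducer can sort and group its input without knowing anything about the others.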