1. What are side data distribution techniques?
Side data is the small, static, read-only extra data that a MapReduce job needs in order to do its work. The main challenge is making that data available on every node where a map task will run. Hadoop provides two side data distribution techniques: passing the data in the job configuration, and using the distributed cache.
2. What is shuffling in MapReduce?
Shuffling is the process by which intermediate data from the mappers is transferred to zero, one, or more reducers.
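As a rough illustration of how intermediate pairs are routed during the shuffle, the sketch below groups mapper output by reducer and then by key. It is plain Java and does not use the Hadoop API. The class name ShuffleSketch and its methods are hypothetical names chosen for this example. The partition formula mirrors the one used by Hadoop's default HashPartitioner. It assumes at least one reducer, since a job with zero reducers is map-only and has no shuffle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of the shuffle; not part of Hadoop.
public class ShuffleSketch {

    // Same formula as Hadoop's default HashPartitioner:
    // mask the sign bit, then take the modulus of the reducer count.
    public static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Groups mapper output by target reducer, then by key in sorted order,
    // so each reducer sees every value emitted for the keys it owns.
    public static Map<Integer, Map<String, List<Integer>>> shuffle(
            List<Map.Entry<String, Integer>> mapperOutput, int numReducers) {
        Map<Integer, Map<String, List<Integer>>> byReducer = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapperOutput) {
            int reducer = partitionFor(pair.getKey(), numReducers);
            byReducer
                .computeIfAbsent(reducer, r -> new TreeMap<>())
                .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                .add(pair.getValue());
        }
        return byReducer;
    }

    public static void main(String[] args) {
        // Word-count style pairs as they might leave the mappers.
        List<Map.Entry<String, Integer>> mapperOutput = List.of(
            Map.entry("a", 1), Map.entry("b", 1), Map.entry("a", 1));
        // prints {0={b=[1]}, 1={a=[1, 1]}}
        System.out.println(shuffle(mapperOutput, 2));
    }
}
```

With two reducers, both values for key "a" land on the same reducer, which is the property the shuffle guarantees: all values for a given key are delivered to exactly one reducer.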
3. Can we change the file cached by the distributed cache?
A file in the distributed cache should not be modified while a job is running. To update it, replace the file with a new one, point the job at the new location, and restart the MapReduce job.
4. What if the JobTracker machine is down?
The JobTracker is a single point of failure for the Hadoop MapReduce service. If it goes down, all running jobs are halted.