Forum

What is spill buffer data?

This topic contains 0 replies, has 1 voice, and was last updated by  sumansinharoy 8 months, 2 weeks ago.

    sumansinharoy
    Participant

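    Spilling means writing map output from the in-memory sort buffer to the local disk so that the buffer can be emptied. The buffer size is set by mapreduce.task.io.sort.mb (100 MB by default), and a spill starts when the buffer is 80% full, a threshold controlled by mapreduce.map.sort.spill.percent. The spill runs in a background thread while the mapper keeps writing into the remaining space. If the buffer fills up before the spill finishes, the mapper blocks until space is freed, so buffered data is not overwritten.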
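
To make the threshold concrete, here is a small arithmetic sketch. The function name is made up for illustration and is not part of Hadoop; the default values assume the stock settings of 100 MB for mapreduce.task.io.sort.mb and 0.80 for mapreduce.map.sort.spill.percent.

```python
def spill_threshold_bytes(sort_mb: int = 100, spill_percent: float = 0.80) -> int:
    """Bytes of buffered map output at which a spill would start.

    sort_mb       -> stands in for mapreduce.task.io.sort.mb
    spill_percent -> stands in for mapreduce.map.sort.spill.percent
    (hypothetical helper, for illustration only)
    """
    buffer_bytes = sort_mb * 1024 * 1024
    return round(buffer_bytes * spill_percent)


if __name__ == "__main__":
    # With the assumed defaults: 80% of 100 MB
    print(spill_threshold_bytes())          # 83886080 bytes, i.e. 80 MB
    # Doubling the buffer doubles the threshold
    print(spill_threshold_bytes(200, 0.80))
```

So with the defaults a spill should begin once roughly 80 MB of the buffer is occupied, leaving about 20 MB for the mapper to keep writing into while the spill runs.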

    A spill happens at least once per map task: when the mapper finishes, whatever is left in the buffer is flushed, because the map output must be partitioned, sorted, and saved to local disk for the reducers to fetch. If several spill files were produced, they are merged into a single sorted output file for that map task. The reducers then process this data and write the final output to HDFS.
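
As a rough way to reason about how many spill files a map task might produce, the sketch below uses a deliberately simplified model: one spill each time the buffered output reaches the threshold, plus the final flush when the mapper finishes. The function name is hypothetical, and the model ignores the per-record metadata that the real buffer also stores, so actual spill counts can be higher.

```python
import math


def estimate_spills(map_output_mb: float,
                    sort_mb: int = 100,
                    spill_percent: float = 0.80) -> int:
    """Rough number of spill files for one map task (simplified model).

    Hypothetical helper: assumes every spill writes exactly one
    threshold's worth of data and ignores record metadata overhead.
    """
    threshold_mb = sort_mb * spill_percent
    # One spill per threshold reached; the leftover data is flushed when
    # the mapper finishes, so there is always at least one spill.
    return max(1, math.ceil(map_output_mb / threshold_mb))


if __name__ == "__main__":
    print(estimate_spills(50))    # fits under the threshold: only the final flush
    print(estimate_spills(200))   # two threshold spills plus the final flush
```

Under this model, a mapper emitting 50 MB spills once, while one emitting 200 MB spills about three times; raising mapreduce.task.io.sort.mb is the usual way to reduce that count, memory permitting.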
