This topic contains 0 replies, has 1 voice, and was last updated by  sumansinharoy 6 months, 3 weeks ago.

Viewing 1 post (of 1 total)
  • Author
    Posts
  • #3091 Reply

    sumansinharoy
    Participant

    A logical split (input split) is a manageable chunk, the unit of processing, of an input file read from HDFS. Each split is assigned to one map task, so together the splits guarantee that the MR job processes the entire input file. How a file is split depends on its input format (several such formats exist). The default is TextInputFormat, in which each record is one line, delimited by the newline character ('\n').
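
To make the "one split, one map task" idea concrete, here is a minimal sketch in Python (not Hadoop code). The function name `plan_splits` and the 200 MB file are made up for illustration, and the split size is simply taken as given. The real FileInputFormat has extra details, such as letting the last split run slightly larger, which this sketch ignores.

```python
def plan_splits(file_size, split_size):
    """Return (offset, length) byte ranges that together cover the whole file.

    Hypothetical helper for illustration only; sizes are in bytes.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    splits = []
    offset = 0
    while offset < file_size:
        # The last split may be shorter than split_size.
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


MB = 1024 * 1024

# Assumed example: a 200 MB file cut into 64 MB splits.
splits = plan_splits(200 * MB, 64 * MB)
for task_id, (offset, length) in enumerate(splits):
    # Each split would be handed to its own map task.
    print(f"map task {task_id}: offset={offset // MB} MB, length={length // MB} MB")
```

With these assumed numbers the file yields three full 64 MB splits plus a final 8 MB split, so four map tasks in total.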

    Also, the splitting of input files depends on the following properties:

    dfs.block.size
    mapred.max.split.size
    mapred.min.split.size

    The split size is calculated as: max(mapred.min.split.size, min(mapred.max.split.size, dfs.block.size))

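    For example (sizes given in MB for readability; the properties themselves are set in bytes), if
    dfs.block.size = 64
    mapred.min.split.size = 1
    mapred.max.split.size = 256

    then the split size is max(1, min(256, 64)) = 64.

    But if
    dfs.block.size = 64
    mapred.min.split.size = 128
    mapred.max.split.size = 256

    then the split size is max(128, min(256, 64)) = 128.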
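
The formula and the two cases above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not Hadoop's own code; the function name `compute_split_size` is chosen for illustration, and the values are in MB as in the examples.

```python
def compute_split_size(min_size, max_size, block_size):
    """Apply max(min_size, min(max_size, block_size)).

    Illustrative helper; all three arguments must use the same unit.
    """
    return max(min_size, min(max_size, block_size))


# Case 1: minimum below the block size, so the block size wins.
print(compute_split_size(min_size=1, max_size=256, block_size=64))    # 64

# Case 2: minimum above the block size, so the minimum wins.
print(compute_split_size(min_size=128, max_size=256, block_size=64))  # 128
```

In other words, raising the minimum above the block size makes splits larger than a block, while lowering the maximum below the block size makes them smaller.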
Reply To: What is logical split?