This topic contains 2 replies, has 3 voices, and was last updated by  naveenkumar_mce 1 year, 11 months ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • #1192 Reply

    Let’s say you have 80 TB of data to store and want to run MapReduce over it. Datanode configuration:
    · 8 GB RAM
    · 100 MB/s read-write speed
    Total no. of nodes = 20
    Let’s assume the replication factor is 4 and the block size is 64 MB.
    By simple calculation, the disk size needed per datanode is: Total amount of data * Replication factor / Total no. of nodes = 80 * 4 / 20 = 16 TB
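The storage formula above can be sketched in a few lines of Python. This is only an illustration: the function name `disk_per_node_tb` is made up for this example, and it assumes blocks are spread evenly across the datanodes and ignores any space needed for intermediate MapReduce output.

```python
def disk_per_node_tb(data_tb: float, replication_factor: int, num_nodes: int) -> float:
    """Raw disk each datanode must provide, in TB.

    Assumes replicas are distributed evenly over all nodes (an idealization).
    """
    return data_tb * replication_factor / num_nodes


# Values from the post: 80 TB of data, replication factor 4, 20 nodes.
print(disk_per_node_tb(80, 4, 20))  # 16.0
```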

    Now let’s say you need to run a MapReduce program on this 80 TB of data. Reading 80 TB at a speed of 100 MB/s using only 1 node will take: Total data / Read-write speed = 80 * 1024 * 1024 / 100 = 838860.8 seconds = 13981.01 minutes (about 233 hours). With 20 datanodes reading in parallel, you will be able to finish this job in 13981.01 / 20 = 699.05 minutes (about 11.65 hours).
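As a quick check of the read-time arithmetic, here is a minimal Python sketch. The helper name `scan_time_seconds` is hypothetical, and the model assumes perfectly parallel reads with no scheduling or network overhead. Note that 838860.8 seconds works out to about 13981 minutes, which is roughly 233 hours.

```python
MB_PER_TB = 1024 * 1024  # 1 TB = 1024 * 1024 MB, as used in the post


def scan_time_seconds(data_tb: float, read_mb_per_s: float, num_nodes: int = 1) -> float:
    """Time to read the data once, assuming every node reads an equal share in parallel."""
    return data_tb * MB_PER_TB / read_mb_per_s / num_nodes


single = scan_time_seconds(80, 100)        # one node
cluster = scan_time_seconds(80, 100, 20)   # 20 nodes

print(f"1 node:   {single / 60:.2f} min = {single / 3600:.2f} h")    # 13981.01 min = 233.02 h
print(f"20 nodes: {cluster / 60:.2f} min = {cluster / 3600:.2f} h")  # 699.05 min = 11.65 h
```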

    ——————-Task for you——————- Q. What will the replication factor be for this MapReduce job (same 80 TB of data) if the disk size per datanode is 20 TB and you have 40 nodes?

    Answer: Replication Factor = 10

    #1208 Reply

    narmada
    Participant

    How does the replication factor come out to 10? Can you give an explanation?

    #1398 Reply

    naveenkumar_mce
    Participant

    Disk size per datanode = Total amount of data * Replication factor / Total no. of nodes, e.g. 80 * 4 / 20 = 16 TB
    For the task, let X be the replication factor: 20 = (80 * X) / 40
    X = (20 * 40) / 80 = 10 (replication factor)

    So 10 is the replication factor.
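The same rearrangement can be written as a small Python sketch. The function name `replication_factor` is invented for this example, and it assumes the disk size of 20 is in TB and that the data set is still the 80 TB from the original post.

```python
def replication_factor(disk_per_node_tb: float, num_nodes: int, data_tb: float) -> float:
    """Solve disk_per_node = data * X / nodes for X."""
    return disk_per_node_tb * num_nodes / data_tb


# Task values: 20 TB per datanode, 40 nodes, 80 TB of data (assumed unchanged).
print(replication_factor(20, 40, 80))  # 10.0
```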

Reply To: Dec 16 th batch – assignment for Day -1