Data Redundancy - Intro to Hadoop and MapReduce

  • 0:00 - 0:02
    The problem with things right now is that if one
  • 0:02 - 0:05
    of our nodes fails, we're left with missing data for the
  • 0:05 - 0:08
    file. If the node goes away, for example, we've got
  • 0:08 - 0:12
    a 64-megabyte hole in the middle of mydata.txt, and
  • 0:12 - 0:15
    of course similar problems with any other files that happen
  • 0:15 - 0:17
    to be stored on that node. To solve this problem, Hadoop
  • 0:17 - 0:22
    replicates each block three times, as it's stored in HDFS.
  • 0:22 - 0:25
    So block one doesn't just live here, it may also be
  • 0:25 - 0:29
    here and here. Block two is here, here and here,
  • 0:29 - 0:34
    and block three is here, here, and here. Hadoop just
  • 0:34 - 0:37
    picks three nodes at random and puts one copy of
  • 0:37 - 0:39
    the block on each of the three. Well actually, it's not
  • 0:39 - 0:42
    totally random, but that's close enough for us right now.
  • 0:42 - 0:45
    So now if a single node fails, it's not a problem
  • 0:45 - 0:47
    because we have two other copies of the block on
  • 0:47 - 0:50
    other nodes. And the NameNode is smart enough to see that
  • 0:50 - 0:53
    these blocks are now under-replicated and it will arrange
  • 0:53 - 0:56
    to have those blocks re-replicated on the cluster. So we're
  • 0:56 - 0:59
    back to having three copies of them. So we've taken
  • 0:59 - 1:01
    care of the problem if one of our data nodes
  • 1:01 - 1:05
    fails. But there's another obvious single point of failure here.
  • 1:05 - 1:09
    What happens if the NameNode has a hardware problem? Might
  • 1:09 - 1:12
    the data be inaccessible? Or is the data in HDFS
  • 1:12 - 1:16
    lost forever? Or is everything fine and there's no problem?
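
The replication and repair steps described in this segment can be sketched as a small simulation. This is only an illustration under stated assumptions, not how HDFS is implemented: the node names, block names, and helper functions are hypothetical, and placement here is purely random, whereas the video notes that the real choice is not totally random.

```python
import random

REPLICATION = 3  # number of copies per block, as described in the video


def place_blocks(block_ids, nodes, rng, replication=REPLICATION):
    """Map each block to `replication` distinct nodes chosen at random."""
    return {b: set(rng.sample(nodes, replication)) for b in block_ids}


def fail_node(placement, node):
    """Drop a failed node from every block's replica set."""
    for replicas in placement.values():
        replicas.discard(node)


def under_replicated(placement, replication=REPLICATION):
    """Return the blocks that have fewer live copies than wanted."""
    return sorted(b for b, r in placement.items() if len(r) < replication)


def re_replicate(placement, live_nodes, rng, replication=REPLICATION):
    """Copy each under-replicated block to live nodes that lack it."""
    for block in under_replicated(placement, replication):
        candidates = [n for n in live_nodes if n not in placement[block]]
        missing = replication - len(placement[block])
        placement[block].update(rng.sample(candidates, missing))


# Hypothetical five-node cluster holding a three-block file.
rng = random.Random(0)
nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_blocks(["blk_1", "blk_2", "blk_3"], nodes, rng)

# One node fails; every block it held still has two copies elsewhere.
failed = "node1"
fail_node(placement, failed)
live = [n for n in nodes if n != failed]
print("under-replicated:", under_replicated(placement))

# The bookkeeping role played by the NameNode: restore three copies.
re_replicate(placement, live, rng)
print("after repair:", under_replicated(placement))  # prints "after repair: []"
```

Because each block starts with three copies on distinct nodes, losing one node leaves at least two copies of every block, and the repair step only needs one more live node per affected block. The sketch keeps all bookkeeping in a single dictionary, which mirrors the question the segment ends on: if that single record is lost, the copies on the data nodes remain but nothing knows where they are.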
Title:
Data Redundancy - Intro to Hadoop and MapReduce
Video Language:
English
Team:
Udacity
Project:
ud617 - Intro to Hadoop and Mapreduce
Duration:
01:17
Udacity Robot edited English subtitles for 03-03 Data Redundancy
Cogi-Admin edited English subtitles for 03-03 Data Redundancy