1
00:00:00,360 --> 00:00:02,890
So, what's really great about Pig is that it abstracts

2
00:00:02,890 --> 00:00:07,450
away from the user the actual implementation of how the

3
00:00:07,450 --> 00:00:10,610
calculation is done. So, you can write in a more

4
00:00:10,610 --> 00:00:15,070
abstract language what kinds of operations should be performed on

5
00:00:15,070 --> 00:00:17,730
your data. You can say, for example, join this data

6
00:00:17,730 --> 00:00:19,980
set to this data set, or filter this data

7
00:00:19,980 --> 00:00:22,470
set, or reduce this data. But you're not telling it

8
00:00:22,470 --> 00:00:25,930
how to do the analysis. And then Pig can decide

9
00:00:25,930 --> 00:00:29,300
the fastest way to do it for you. So instead of having to write

10
00:00:29,300 --> 00:00:31,600
very detailed jobs describing exactly what should

11
00:00:31,600 --> 00:00:34,310
be done, you can just specify in an

12
00:00:34,310 --> 00:00:37,630
abstract sense the calculation that needs to be done, and you can let the

13
00:00:37,630 --> 00:00:41,550
computer do the hard work of figuring out the fastest way to do the calculation.
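
To make the "what, not how" idea concrete, here is a minimal Python sketch. It is not Pig and does not use any Pig API. The sample data, the field names, and the helpers `filter_rows`, `join_rows`, `count_by`, and `run` are all hypothetical, invented for illustration. It shows one logical request (join users to clicks, keep adults, count clicks per name) carried out by two different execution strategies that give the same answer. Choosing between such strategies is the kind of decision the speaker says can be left to the system.

```python
from collections import defaultdict

# Hypothetical sample data; names and fields are illustrative only.
users = [
    {"user_id": 1, "name": "ana", "age": 34},
    {"user_id": 2, "name": "bo", "age": 15},
    {"user_id": 3, "name": "cy", "age": 52},
]
clicks = [
    {"user_id": 1, "url": "/a"},
    {"user_id": 1, "url": "/b"},
    {"user_id": 2, "url": "/a"},
    {"user_id": 3, "url": "/c"},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def join_rows(left, right, key):
    """Inner-join two lists of dicts on a shared key using a hash index."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index[l[key]]]


def count_by(rows, key):
    """Reduce step: count how many rows share each value of key."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[key]] += 1
    return dict(counts)


def is_adult(row):
    return row["age"] >= 18


def run(filter_first):
    """Answer the same logical request with one of two strategies."""
    if filter_first:
        # Strategy A: shrink the users table before joining.
        joined = join_rows(filter_rows(users, is_adult), clicks, "user_id")
    else:
        # Strategy B: join everything, then discard non-adults.
        joined = filter_rows(join_rows(users, clicks, "user_id"), is_adult)
    return count_by(joined, "name")
```

Calling `run(True)` and `run(False)` returns the same counts. Only the amount of intermediate work differs, which is why the choice of strategy can be handed to an optimizer instead of being written out by hand.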