1. In this case, it's easy--it's this one, because it seems to describe the data almost perfectly
2. whereas the red one has a substantial error, as does the green curve, even more so.
3. Let's look at more data. Which line would you pick?
4. The one in the middle, the red one, the blue one, or perhaps any of those?
5. Perhaps they are all equally good. It turns out green is the right answer. Let's look at why.
6. The blue one suffers no loss for the three points over here
7. but a fairly substantial loss for the other three data points.
8. Let's call this distance c.
9. So, the blue curve would have an error of 3c².
10. The factor of three is there because we have three of those points,
11. and the c is squared because we're using the quadratic distance.
12. Similarly, the red one has the same problem: 3c² for the red one.
13. Now, how about the green one? For the green one, we have errors for all six points.
14. But now, the amount of the error itself is half as big as c. So, we're going to write 6(c/2)².
15. And when you work this out, you'll find that this is 6/4 c², which is the same as 3/2 c².
16. So, the total error for the green one, the quadratic error,
17. is half as big as it is for the blue one, and that may come as a surprise.
18. It has to do with the fact that in a quadratic version of the error,
19. large errors count much, much more than small errors.
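To check the arithmetic, here is a minimal sketch in Python. It assumes c = 1 (the lecture leaves c symbolic), and the residual lists are simply the distances described above: three points at distance c and three at zero for blue and red, and all six at c/2 for green. The name `squared_error` is a hypothetical helper, not something from the lecture.

```python
def squared_error(residuals):
    """Sum of squared residuals, i.e. the quadratic error used above."""
    return sum(r * r for r in residuals)


c = 1.0  # assumed value; the lecture treats c as a symbol

# Distances of the six data points from each candidate line, as described
# in the lecture. Signs are omitted because squaring removes them.
residuals = {
    "blue": [0.0, 0.0, 0.0, c, c, c],
    "red": [c, c, c, 0.0, 0.0, 0.0],
    "green": [c / 2] * 6,
}

errors = {name: squared_error(r) for name, r in residuals.items()}
best = min(errors, key=errors.get)

for name, err in errors.items():
    print(f"{name}: {err}")
print("best:", best)
```

With c = 1, blue and red each come out at 3 and green at 1.5, which matches 3c² versus 3/2 c². Halving each distance cuts its squared cost to a quarter, so six small errors still add up to less than three large ones.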
20. So, green is the best regression line we can find for this data
21. over here among the choices I've given you.