  1. And remember that when we started
  2. with all the data, the entropy of the parent was 1.0.
  3. Now let's use information gain to decide which variable to use when splitting.
  4. We'll start by doing an information gain calculation for the grade.
  5. So here's my parent node, with all four examples in it.
  6. Now I split based on the grade.
  7. So the first question for
  8. you is how many examples are going to go in the steep branch?
  9. Write your answer in the box.
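
As a side note, here's a minimal sketch of the information gain calculation we're about to walk through. It assumes the usual definitions: entropy is the negative sum of p * log2(p) over the classes, and information gain is the parent's entropy minus the weighted average entropy of the children. The function names and the class labels "A" and "B" are placeholders made up for illustration, and the two splits shown are hypothetical ones, not the grade split from the quiz.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())


def information_gain(parent, branches):
    """Parent entropy minus the size-weighted average entropy of the branches."""
    total = len(parent)
    weighted_children = sum(len(branch) / total * entropy(branch)
                            for branch in branches)
    return entropy(parent) - weighted_children


# Four examples split evenly between two placeholder classes,
# which is the only way four examples give a parent entropy of 1.0.
parent = ["A", "A", "B", "B"]
print(entropy(parent))  # 1.0

# Hypothetical split that separates the classes perfectly.
print(information_gain(parent, [["A", "A"], ["B", "B"]]))  # 1.0

# Hypothetical split that leaves both branches as mixed as the parent.
print(information_gain(parent, [["A", "B"], ["A", "B"]]))  # 0.0
```

So a split that leaves each branch pure gets the full 1.0 of gain, and a split that leaves each branch just as mixed as the parent gets none. Most real splits land somewhere in between.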