On this split, the information gain is actually zero.
We don't learn anything by splitting these particular training examples
based on the bumpiness of the train.
So this is clearly not where we want to start splitting our
sample if we want to build a decision tree.
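
To make the zero-gain case concrete, here is a minimal sketch in Python. The labels and the two groups are made-up illustrative data, not the actual examples from this lecture, and the helper names `entropy` and `information_gain` are hypothetical. It assumes the usual definition: information gain is the parent's entropy minus the size-weighted average entropy of the child groups.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical labels (assumption): two "yes" and two "no" examples overall.
parent = ["yes", "yes", "no", "no"]

# Splitting on a hypothetical "bumpiness" feature leaves each group with the
# same 50/50 class mix as the parent, so the split tells us nothing.
bumpy = ["yes", "no"]
smooth = ["yes", "no"]
uninformative_gain = information_gain(parent, [bumpy, smooth])

# For contrast, a split that separates the classes perfectly.
informative_gain = information_gain(parent, [["yes", "yes"], ["no", "no"]])

print(uninformative_gain, informative_gain)
```

When every child group has the same class proportions as the parent, the weighted child entropy equals the parent entropy and the gain is zero, which is why a split like this is a poor place to start the tree.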