English subtitles

← How Failures Come to Be - Software Debugging

Showing Revision 4 created 05/25/2016 by Udacity Robot.

  1. We can see a program as a succession of program states.
  2. Each program state consists of several variables with values.
  3. As a program executes, it processes these states and transforms them into new states,
  4. for instance, by reading variables and writing variables. This is the normal mode of operation.
  5. Now, however, since in the beginning, we have a normal input and in the end we have a failure,
  6. there must be a defect somewhere in our program that actually causes the problem.
  7. So let me assume that this statement we're executing here actually has a defect.
  8. What happens is that now, when executed, it introduces
  9. an error in the program state which we call an infection.
  10. This infection is now propagated, possibly to other states,
  11. and eventually becomes visible to the user as a failure.
  12. What we get in here is actually an entire cause-effect chain.
  13. You see, this failure, which is itself an infection, is caused by earlier infections.
  14. If we are at a state where the infection has no further origin, that is, the input state is sane
  15. and the output state is infected, and we know the statement that was executed at this precise moment,
  16. which caused this transition from a sane state to an infected state,
  17. then this is the statement which caused the infection; this is the statement which has the defect.
  18. When we're debugging now, we need to identify this cause-effect chain
  19. and not only do we need to identify it, we also need to break it.
  20. If we can break this cause-effect chain from defect to failure, then we're done with debugging.
  21. So all of this looks very simple. However, in real life, it's much more complicated than that.
  22. To start with, not every defect automatically causes a failure.
  23. It may well be that the defect causes an infection which later simply is not
  24. propagated, much as can happen with a real-life infection.
  25. So the infection is not propagated and never becomes visible to the user.
  26. The defect may not cause a failure at all, or the statement with the defect may not even be executed,
  27. or it may cause an infection, and later a failure, only under very specific circumstances.
  28. This is the problem of testing. You can execute a program again and again,
  29. never see a failure, and still have a defect in there. However, if a program fails,
  30. that is, if we actually see a failure, then we can always trace it back to the defect that causes it.
  31. So if there's a failure, we can always fix it by following back the cause-effect chain.
  32. But then the next problem is: these states are huge.
  33. So over here we have 1, 2, 3, 4, ... 12 variables. Cute.
  34. In reality, we have tens of thousands of such variables, and not only do we have tens of thousands of such variables,
  35. we also have tens of thousands of steps between defect and failure.
  36. So tracing back the cause-effect chain can be much,
  37. much more complicated than it is in this simple picture.
  38. The longer the cause-effect chain, that is, the longer the time we have to cover
  39. and the more states we have to cover, the harder it is to debug.
  40. And also, the larger the state, the more we have to search for an infection.
  41. Again, this makes debugging harder and harder.
  42. It's like finding a needle in a haystack except that the haystack
  43. sometimes is larger than any haystack you'll ever find on earth.
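
The chain from defect to infection to failure can be illustrated with a small sketch. This is not code from the lecture: the function names (`run`, `first_infection`, `fails`), the deliberately planted defect, and the idea of comparing against a known-correct run are all assumptions made for illustration. In real debugging, no correct reference run is usually available.

```python
def run(values, defective):
    """Execute a tiny program and record every program state it passes through."""
    states = []
    # Hypothetical defect: the running total starts at 1 instead of 0.
    total = 1 if defective else 0
    states.append({"total": total})
    for v in values:
        total += v
        states.append({"total": total})
    # The output is coarse, so an infected total does not always change it.
    output = total // 10
    states.append({"total": total, "output": output})
    return states


def first_infection(values):
    """Index of the first state that differs from the correct run, or None."""
    good = run(values, defective=False)
    bad = run(values, defective=True)
    for step, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            return step
    return None


def fails(values):
    """True if the infection propagates all the way to the visible output."""
    good = run(values, defective=False)
    bad = run(values, defective=True)
    return good[-1]["output"] != bad[-1]["output"]


print(first_infection([1, 2, 3]), fails([1, 2, 3]))
print(first_infection([4, 5]), fails([4, 5]))
```

For both inputs the state is infected from the very first step, but only the second input turns the infection into a visible failure. With the first input the defect is present and executed, yet testing would not reveal it.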