
Lecture 7: Debugging and Profiling (2020)

  • 0:00 - 0:04
    So welcome back. Today we are gonna
    cover debugging and profiling.
  • 0:05 - 0:09
    Before I get into it we're gonna make another
    reminder to fill in the survey.
  • 0:10 - 0:15
    Just one of the main things we want to get
    from you is questions, because the last day
  • 0:15 - 0:18
    is gonna be questions from
    you guys: about things that
  • 0:18 - 0:22
    we haven't covered, or things you want
    us to talk about in more depth.
  • 0:23 - 0:27
    The more questions we get, the more interesting
    we can make that section,
  • 0:27 - 0:29
    so please go on and fill in the survey.
  • 0:29 - 0:36
    So today's lecture is gonna be a lot of topics.
    All the topics revolve around the concept of
  • 0:36 - 0:40
    what do you do when you have
    a program that has some bugs.
  • 0:40 - 0:43
    Which is most of the time, like when you
    are programming, you're kind of thinking
  • 0:43 - 0:47
    about how you implement something and there's
    like a half life of fixing all the issues that
  • 0:48 - 0:52
    that program has. And even if your program behaves
    like you want, it might be that it's
  • 0:52 - 0:56
    really slow, or it's taking a lot
    of resources in the process.
  • 0:56 - 1:01
    So today we're gonna see a lot of different
    approaches of dealing with these problems.
  • 1:01 - 1:05
    So first, the first section is on debugging.
  • 1:06 - 1:08
    Debugging can be done in many different ways,
  • 1:08 - 1:10
    there are all kinds of...
  • 1:10 - 1:14
    The simplest approach that, pretty much, all
  • 1:14 - 1:17
    CS students will go through, is just:
    you have some code, and it's not behaving
  • 1:17 - 1:20
    like you want, so you probe the code by adding
  • 1:20 - 1:23
    print statements. This is called
    "printf debugging" and
  • 1:23 - 1:24
    it works pretty well.
  • 1:24 - 1:27
    Like, I have to be honest,
  • 1:27 - 1:33
    I use it a lot of the time because of how simple
    it is to set up and how quick the feedback can be.
  • 1:34 - 1:39
    One of the issues with printf debugging
    is that you can get a lot of output
  • 1:39 - 1:41
    and maybe you don't want
  • 1:41 - 1:43
    to get as much output as you're getting.
  • 1:44 - 1:49
    There has... people have thought of slightly more
    complex ways of doing printf debugging and
  • 1:54 - 1:58
    one of these ways is what is usually
    referred to as "logging".
  • 1:58 - 2:05
    So the advantage of doing logging versus doing printf
    debugging is that, when you're creating logs,
  • 2:05 - 2:10
    you're not necessarily creating the logs because
    there's a specific issue you want to fix;
  • 2:10 - 2:12
    it's mostly because you have built a
  • 2:12 - 2:17
    more complex software system and you
    want to log when some events happen.
  • 2:17 - 2:22
    One of the core advantages of using
    a logging library is that
  • 2:22 - 2:27
    you can define severity levels,
    and you can filter based on those.
  • 2:27 - 2:32
    Let's see an example of how we
    can do something like that.
  • 2:32 - 2:36
    Yeah, everything fits here. This
    is a really silly example:
  • 2:36 - 2:38
    We're just gonna
  • 2:38 - 2:41
    sample random numbers and, depending
    on the value of the number,
  • 2:41 - 2:45
    which we can interpret as a kind
    of "how wrong things are going".
  • 2:45 - 2:49
    We're going to log the value
    of the number and then
  • 2:49 - 2:52
    we can see what is going on.
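The script itself isn't reproduced in the transcript, but a minimal sketch of this kind of demo (the thresholds and messages are illustrative) could look like this:

```python
# Draw random numbers and log each one at a severity that reflects
# "how wrong things are going" (thresholds here are just illustrative).
import logging
import random

logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger(__name__)

for _ in range(10):
    x = random.random()
    if x < 0.5:
        log.info("sampled %.3f, all good", x)
    elif x < 0.8:
        log.warning("sampled %.3f, getting high", x)
    elif x < 0.95:
        log.error("sampled %.3f, something went wrong", x)
    else:
        log.critical("sampled %.3f, this is really bad", x)

# Filtering down to errors only, as shown later in the demo, is one line:
# logging.getLogger().setLevel(logging.ERROR)
```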
  • 2:53 - 2:59
    I need to disable these formatters...
  • 3:00 - 3:04
    And if we were just to execute the code as it is,
  • 3:04 - 3:07
    we just get the output and we just
    keep getting more and more output.
  • 3:07 - 3:14
    But you have to kind of stare at it and make
    sense of what is going on, and we don't know
  • 3:14 - 3:20
    what the relative timing between prints is, and we don't really
    know whether this is just an informational message
  • 3:20 - 3:23
    or a message that something went wrong.
  • 3:24 - 3:26
    If we just go in,
  • 3:27 - 3:30
    and undo, not that one...
  • 3:34 - 3:37
    That one, we can set that formatter.
  • 3:39 - 3:42
    Now the output looks something more like this
  • 3:42 - 3:45
    So for example, if you have several different
    modules that you are programming with,
  • 3:45 - 3:47
    you can identify them with like different levels.
  • 3:47 - 3:50
    Here we have debug levels,
  • 3:50 - 3:52
    we have critical
  • 3:52 - 3:58
    info, different levels. And it might be handy because
    here we might only care about the error messages.
  • 3:58 - 4:01
    Like those are like, the... We have been
  • 4:01 - 4:04
    working on our code, so far so good,
    and suddenly we get some error.
  • 4:04 - 4:07
    We can log that to identify where it's happening.
  • 4:07 - 4:12
    But maybe there's a lot of information
    messages, but we can deal with that
  • 4:13 - 4:17
    by just changing the level to error level.
  • 4:17 - 4:18
    And
  • 4:19 - 4:23
    now if we were to run this again,
    we are only going to get those
  • 4:24 - 4:28
    errors in the output, and we can just look through
    those to make sense of what is going on.
  • 4:29 - 4:33
    Another really useful tool when
    you're dealing with logs is
  • 4:34 - 4:37
    As you kind of look at this,
  • 4:37 - 4:43
    it has become easier because now we have this critical
    and error levels that we can quickly identify.
  • 4:43 - 4:47
    But since humans are fairly visual creatures,
  • 4:49 - 4:53
    one thing that you can do is use
    colors from your terminal to
  • 4:54 - 4:57
    identify these things. So now,
    changing the formatter,
  • 4:57 - 5:03
    what I've done is slightly change
    how the output is formatted.
  • 5:04 - 5:09
    When I do that, now whenever I get a warning
    message, it's color coded by yellow;
  • 5:09 - 5:11
    whenever I get like an error,
  • 5:11 - 5:16
    faded red; and when it's critical, I have a
    bold red indicating something went wrong.
  • 5:16 - 5:23
    And here it's a really short output, but when you start
    having thousands and thousands of lines of log,
  • 5:23 - 5:26
    which is not unrealistic and happens
    every single day in a lot of apps,
  • 5:27 - 5:32
    quickly browsing through them and identifying
    where the error or the red patches are
  • 5:33 - 5:35
    can be really useful.
  • 5:36 - 5:41
    A quick aside is, you might be curious about
    how the terminal is displaying these colors.
  • 5:42 - 5:45
    At the end of the day, the terminal
    is only outputting characters.
  • 5:47 - 5:49
    Like, how is this program or how
    are other programs, like LS,
  • 5:50 - 5:56
    that have all these fancy colors, how are they telling the
    terminal that it should use these different colors?
  • 5:56 - 5:59
    This is nothing extremely fancy,
  • 5:59 - 6:03
    what these tools are doing, is
    something along these lines.
  • 6:04 - 6:05
    Here we have...
  • 6:05 - 6:08
    I can clear the rest of the output,
    so we can focus on this.
  • 6:09 - 6:14
    There's some special characters,
    some escape characters here,
  • 6:14 - 6:20
    then we have some text and then we have some other
    special characters. And if we execute this line
  • 6:20 - 6:22
    we get a red "This is red".
  • 6:22 - 6:27
    And you might have picked up on the
    fact that we have a "255;0;0" here,
  • 6:27 - 6:31
    this is just telling the terminal the RGB values of
    the color we want.
  • 6:31 - 6:38
    And you can pretty much do this in any piece of code that
    you have, and that way you can color-code the output.
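The same trick from Python rather than the shell, as a minimal sketch (the escape sequence is the standard 24-bit color one mentioned above):

```python
# ESC[38;2;R;G;Bm sets a truecolor foreground; ESC[0m resets the terminal.
RED = "\033[38;2;255;0;0m"
RESET = "\033[0m"
print(f"{RED}This is red{RESET}")
```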
  • 6:38 - 6:43
    Your terminal is fairly fancy and supports
    a lot of different colors in the output.
  • 6:43 - 6:45
    This is not even all of them, this
    is like a sixteenth of them.
  • 6:46 - 6:49
    I think it can be fairly useful
    to know about that.
  • 6:52 - 6:56
    Another thing is maybe you don't
    enjoy or you don't think
  • 6:56 - 6:59
    logs are really fit for you.
  • 6:59 - 7:02
    The thing is a lot of other systems that
    you might start using will use logs.
  • 7:03 - 7:05
    As you start building larger and larger systems,
  • 7:05 - 7:10
    you might rely on other dependencies. Common
    dependencies might be web servers or
  • 7:10 - 7:12
    databases, it's a really common one.
  • 7:12 - 7:18
    And those will be logging their errors
    or exceptions in their own logs.
  • 7:18 - 7:21
    Of course, you will get some client-side error,
  • 7:21 - 7:25
    but those sometimes are not informative enough
    for you to figure out what is going on.
  • 7:26 - 7:34
    In most UNIX systems, the logs are usually
    placed under a folder called "/var/log"
  • 7:34 - 7:38
    and if we list it, we can see there's
    a bunch of logs in here.
  • 7:43 - 7:48
    So we have like the shutdown monitor
    log, or some weekly logs.
  • 7:50 - 7:56
    Things related to the Wi-Fi, for
    example. And if we output the
  • 7:58 - 8:01
    System log, which contains a lot
    of information about the system,
  • 8:01 - 8:04
    we can get information about what's going on.
  • 8:04 - 8:07
    Similarly, there are tools that will let you
  • 8:07 - 8:13
    more sanely go through this output.
    But here, looking at the system log,
  • 8:13 - 8:16
    I can look at this, and say:
  • 8:16 - 8:20
    oh there's some service that is
    exiting with some abnormal code
  • 8:20 - 8:25
    and based on that information, I can go
    and try to figure out what's going on,
  • 8:26 - 8:28
    like what's going wrong.
  • 8:29 - 8:32
    One thing to know when you're
    working with logs is that
  • 8:32 - 8:36
    more traditionally, every piece of software had its own
  • 8:36 - 8:43
    log, but it has been increasingly more popular to have
    a unified system log where everything is placed.
  • 8:43 - 8:49
    Pretty much any application can log into the system
    log, but instead of being in a plain text format,
  • 8:49 - 8:52
    it will be compressed in some special format.
  • 8:52 - 8:56
    An example of this is what we covered
    in the data wrangling lecture.
  • 8:57 - 9:00
    In the data wrangling lecture we
    were using the "journalctl",
  • 9:00 - 9:04
    which accesses the system log and
    outputs all of it.
  • 9:04 - 9:07
    Here on a Mac, the command is "log show",
  • 9:07 - 9:10
    which will display a lot of information.
  • 9:10 - 9:16
    I'm gonna just display the last ten seconds,
    because logs are really, really verbose and
  • 9:17 - 9:24
    just displaying the last 10 seconds is still
    gonna output a fairly large amount of lines.
  • 9:24 - 9:28
    So if we go back through what's going on,
  • 9:28 - 9:33
    we here see that a lot of Apple things
    are going on, since this is a macbook.
  • 9:34 - 9:38
    Maybe we could find errors about
    like some system issue here.
  • 9:39 - 9:47
    Again they're fairly verbose, so you might want
    to practice your data wrangling techniques here,
  • 9:47 - 9:50
    like 10 seconds equals about 500
    lines of logs, so you can kind of
  • 9:51 - 9:55
    get an idea of how many lines
    per second you're getting.
  • 9:56 - 10:01
    They're not only useful for figuring
    out some other programs' output,
  • 10:01 - 10:06
    they're also useful for you, if you want to
    log there instead of into your own file.
  • 10:06 - 10:11
    So using the "logger" command,
    on both Linux and macOS,
  • 10:12 - 10:13
    You can say okay
  • 10:13 - 10:19
    I'm gonna log this "Hello Logs"
    into this system log.
  • 10:19 - 10:22
    We execute the command and then
  • 10:23 - 10:28
    we can check by going through
    the last minute of logs,
  • 10:28 - 10:32
    since it's gonna be fairly recent,
    and grepping for that "Hello"
  • 10:32 - 10:38
    we find our entry. Fairly recent entry, that
    we just created that said "Hello Logs".
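If you want to write to the system log from code rather than through the logger command, Python's standard syslog module (Unix only) does the equivalent; a minimal sketch:

```python
import syslog

# Roughly equivalent to running: logger "Hello Logs"
syslog.syslog(syslog.LOG_INFO, "Hello Logs")
```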
  • 10:39 - 10:47
    As you become more and more familiar with
    these tools, you will find yourself using
  • 10:49 - 10:51
    the logs more and more often, since
  • 10:52 - 10:56
    even if you have some bug that you haven't detected,
    and the program has been running for a while,
  • 10:56 - 11:02
    maybe the information is already in the log and can
    tell you enough to figure out what is going on.
  • 11:03 - 11:08
    However, printf debugging is not everything.
    So now I'm going to be covering debuggers.
  • 11:08 - 11:10
    But first any questions on logs so far?
  • 11:12 - 11:15
    So what kind of things can you
    figure out from the logs?
  • 11:15 - 11:19
    like this Hello Logs says that you did
    something with Hello at that time?
  • 11:19 - 11:25
    Yeah, like say, for example, I can
    write a bash script that detects...
  • 11:25 - 11:29
    Well, that checks every time what
    Wi-Fi network I'm connected to.
  • 11:29 - 11:34
    And every time it detects that it has changed,
    it makes an entry in the logs and says
  • 11:34 - 11:37
    Oh now it looks like we have
    changed Wi-Fi networks.
  • 11:37 - 11:41
    and then you might go back and parse
    through the logs and take like, okay
  • 11:42 - 11:48
    When did my computer change from one Wi-Fi network to
    another. And this is just kind of a simple example
  • 11:48 - 11:50
    But there are many, many ways,
  • 11:51 - 11:54
    many types of information that
    you could be logging here.
  • 11:54 - 11:59
    More commonly, you will probably want to
    check if your computer, for example, is
  • 11:59 - 12:03
    entering sleep for some unknown reason.
  • 12:03 - 12:05
    Like it's on hibernation mode.
  • 12:05 - 12:09
    There's probably some information in the
    logs about who asked that to happen,
  • 12:09 - 12:10
    or why that's happening.
  • 12:12 - 12:15
    Any other questions? Okay.
  • 12:15 - 12:17
    So when printf debugging is not enough,
  • 12:18 - 12:22
    the best alternative after that is using...
  • 12:23 - 12:25
    [Exit that]
  • 12:28 - 12:30
    So, it's using a debugger.
  • 12:31 - 12:38
    So a debugger is a tool that will wrap around
    your code and will let you run your code,
  • 12:38 - 12:40
    but it will kind of keep control over it.
  • 12:40 - 12:42
    So it will let you step
  • 12:42 - 12:47
    through the code and execute
    it and set breakpoints.
  • 12:47 - 12:50
    You probably have seen debuggers
    in some way, if you have
  • 12:50 - 12:56
    ever used something like an IDE, because IDEs have this
    kind of fancy: set a breakpoint here, execute, ...
  • 12:56 - 12:59
    But at the end of the day what
    these tools are using is just
  • 12:59 - 13:05
    these command line debuggers and they're just
    presenting them in a really fancy format.
  • 13:05 - 13:10
    Here we have a completely broken bubble
    sort, a simple sorting algorithm.
  • 13:10 - 13:12
    Don't worry about the details,
  • 13:12 - 13:15
    but we just want to sort this
    array that we have here.
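The file isn't reproduced in the transcript, but a broken bubble sort of the kind being debugged here might look roughly like this (the two bugs that the demo uncovers are marked):

```python
# bubble.py -- a deliberately broken bubble sort (illustrative reconstruction)
def bubble_sort(arr):
    n = len(arr)
    for _ in range(n):
        for j in range(n):              # bug 1: should be range(n - 1)
            if arr[j] > arr[j + 1]:     # IndexError when j == n - 1
                arr[j] = arr[j + 1]     # bug 2: overwrites arr[j] before saving it,
                arr[j + 1] = arr[j]     #        so values are lost instead of swapped
    return arr

print(bubble_sort([4, 2, 1, 8, 7, 6]))
# Run it under the debugger with: python -m pdb bubble.py  (or ipdb for colors)
```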
  • 13:17 - 13:19
    We can try doing that by just doing
  • 13:21 - 13:23
    Python bubble.py
  • 13:24 - 13:28
    And when we do that... Oh there's some
    index error, list index out of range.
  • 13:28 - 13:31
    We could start adding prints
  • 13:31 - 13:34
    but if we have a really long string,
    we can get a lot of information.
  • 13:34 - 13:38
    So how about we go up to the
    moment that we crashed?
  • 13:38 - 13:41
    We can go to that moment and examine what the
  • 13:41 - 13:43
    current state of the program was.
  • 13:44 - 13:49
    So for doing that I'm gonna run the
    program using the Python debugger.
  • 13:49 - 13:54
    Here I'm using technically the ipython debugger,
    just because it has nice syntax coloring
  • 13:54 - 13:59
    so it's probably easier for
    both of us to understand
  • 13:59 - 14:01
    what's going on in the output.
  • 14:01 - 14:05
    But they're pretty much identical anyway.
  • 14:05 - 14:09
    So we execute this, and now we are given a prompt
  • 14:09 - 14:13
    where we're being told that we are here,
    at the very first line of our program.
  • 14:13 - 14:15
    And we can...
  • 14:16 - 14:20
    "L" stands for "List", so as
    with many of these tools
  • 14:21 - 14:24
    there's kind of like a language
    of operations that you can do,
  • 14:24 - 14:28
    and they are often mnemonic, as
    was the case with Vim or tmux.
  • 14:29 - 14:33
    So here, "L" is for "Listing" the code,
    and we can see the entire code.
  • 14:35 - 14:39
    "S" is for "Step" and will let us kind of one
  • 14:39 - 14:42
    line at a time, go through the execution.
  • 14:42 - 14:47
    The thing is we're only triggering
    the error some time later.
  • 14:47 - 14:49
    So
  • 14:49 - 14:55
    we can restart the program and instead of
    trying to step until we get to the issue,
  • 14:55 - 15:01
    we can just ask for the program to continue
    which is the "C" command and
  • 15:01 - 15:04
    hey, we reached the issue.
  • 15:05 - 15:08
    We got to this line where everything crashed,
  • 15:08 - 15:11
    we're getting this list index out of range.
  • 15:11 - 15:14
    And now that we are here we can say, huh?
  • 15:14 - 15:18
    Okay, first, let's print the value of the array.
  • 15:18 - 15:22
    This is the value of the current array
  • 15:23 - 15:27
    So we have six items. Okay. What
    is the value of "J" here?
  • 15:27 - 15:32
    So we look at the value of "J". "J" is 5
    here, which will be the last element, but
  • 15:32 - 15:37
    "J" plus 1 is going to be 6, so that's
    triggering the out of bounds error.
  • 15:38 - 15:40
    So what we have to do is
  • 15:41 - 15:48
    this "N" has to be "N minus one" instead.
    We have identified that the error lies there.
  • 15:48 - 15:51
    So we can quit, which is "Q".
  • 15:52 - 15:55
    Again, because it's a post-mortem debugger.
  • 15:56 - 16:00
    We go back to the code and say okay,
  • 16:03 - 16:06
    we need to change this to "N minus one".
  • 16:07 - 16:11
    That will prevent the list index out of range and
  • 16:11 - 16:14
    if we run this again without the debugger,
  • 16:15 - 16:19
    okay, no errors now. But this
    is not our sorted list.
  • 16:19 - 16:21
    This is sorted, but it's not our list.
  • 16:21 - 16:23
    We are missing entries from our list,
  • 16:23 - 16:27
    so there is some behavioral issue
    that we're reaching here.
  • 16:28 - 16:32
    Again, we could start using printf
    debugging but kind of a hunch now
  • 16:32 - 16:38
    is that probably the way we're swapping entries
    in the bubble sort program is wrong.
  • 16:38 - 16:46
    We can use the debugger for this. We can go through
    to the moment we're doing a swap and
  • 16:46 - 16:48
    check how the swap is being performed.
  • 16:49 - 16:51
    So a quick overview,
  • 16:51 - 16:57
    we have two for loops and
    in the most nested loop,
  • 16:57 - 17:03
    we are checking if one element of the array is larger than the next one.
    The thing is if we just try to execute until this line,
  • 17:04 - 17:07
    it's only going to trigger
    whenever we make a swap.
  • 17:07 - 17:12
    So what we can do is we can set
    a breakpoint in the sixth line.
  • 17:12 - 17:16
    We can create a breakpoint in this line and then
  • 17:16 - 17:21
    the program will execute and the moment we try to swap
    variables is when the program is going to stop.
  • 17:21 - 17:23
    So we create a breakpoint there
  • 17:23 - 17:27
    and then we continue the execution
    of the program. The program halts
  • 17:27 - 17:31
    and says hey, I have executed
    and I have reached this line.
  • 17:31 - 17:32
    Now
  • 17:32 - 17:39
    I can use "locals()", which is a Python function
    that returns a dictionary with all the local variables,
  • 17:39 - 17:41
    to quickly see the entire context.
  • 17:43 - 17:48
    The string, the array is fine and has
    six elements, again, just like at the beginning, and
  • 17:49 - 17:51
    I step, go to the next line.
  • 17:52 - 17:53
    Oh,
  • 17:53 - 17:57
    and I identify the issue: I'm swapping one
    item at a time, instead of simultaneously,
  • 17:57 - 18:02
    so that's what's triggering the fact that
    we're losing variables as we go through.
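With both fixes applied, the n - 1 bound and a simultaneous tuple-assignment swap, the corrected version looks like this:

```python
def bubble_sort(arr):
    n = len(arr)
    for _ in range(n):
        for j in range(n - 1):
            if arr[j] > arr[j + 1]:
                # swap both elements at once so neither value is lost
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

print(bubble_sort([4, 2, 1, 8, 7, 6]))   # [1, 2, 4, 6, 7, 8]
```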
  • 18:03 - 18:07
    That's kind of a very simple example, but
  • 18:07 - 18:09
    debuggers are really powerful.
  • 18:09 - 18:13
    Most programming languages will
    give you some sort of debugger,
  • 18:14 - 18:20
    and when you go to more low level debugging
    you might run into tools like...
  • 18:20 - 18:22
    You might want to use something like
  • 18:25 - 18:27
    GDB.
  • 18:32 - 18:34
    And GDB has one nice property:
  • 18:34 - 18:38
    GDB works really well with C/C++
    and all these C-like languages.
  • 18:38 - 18:43
    But GDB actually lets you work with pretty
    much any binary that you can execute.
  • 18:43 - 18:48
    So for example here we have sleep, which is just
    a program that's going to sleep for 20 seconds.
  • 18:49 - 18:55
    It's loaded and then we can do run, and then we
    can interrupt it by sending an interrupt signal.
  • 18:55 - 19:02
    And GDB is displaying for us, here, very low-level
    information about what's going on in the program.
  • 19:02 - 19:07
    So we're getting the stack trace, we're seeing
    we are in this nanosleep function,
  • 19:07 - 19:12
    we can see the values of all the hardware
    registers in your machine. So
  • 19:12 - 19:17
    you can get a lot of low-level
    detail using these tools.
  • 19:19 - 19:23
    I think that's all I want to cover for debuggers.
  • 19:23 - 19:26
    Any questions related to that?
  • 19:34 - 19:39
    Another interesting tool when you're trying to
    debug is that sometimes you want to debug as if
  • 19:39 - 19:42
    your program is a black box.
  • 19:42 - 19:46
    So you maybe know what the internals
    of the program are, but at the same time
  • 19:46 - 19:52
    your computer knows whenever your program
    is trying to do some operations.
  • 19:52 - 19:55
    So, in UNIX systems,
  • 19:55 - 19:58
    there's this notion of like user
    level code and kernel level code.
  • 19:58 - 20:03
    And when you try to do some operations like reading
    a file or like reading the network connection
  • 20:03 - 20:06
    you will have to do something
    called system calls.
  • 20:06 - 20:13
    You can get a program and go through
    those operations and ask
  • 20:14 - 20:18
    what operations did this software do?
  • 20:18 - 20:21
    So for example, if you have
    like a Python function
  • 20:21 - 20:27
    that is only supposed to do a mathematical operation
    and you run it through this program,
  • 20:27 - 20:28
    and it's actually reading files,
  • 20:28 - 20:32
    Why is it reading files? It shouldn't
    be reading files. So, let's see.
  • 20:35 - 20:37
    This is "strace".
  • 20:37 - 20:39
    So for example, we can do it something like this.
  • 20:39 - 20:41
    So here we're gonna run "ls -l",
  • 20:42 - 20:48
    And then we're ignoring the output of ls, but
    we are not ignoring the output of strace.
  • 20:48 - 20:50
    So if we execute that...
  • 20:52 - 20:55
    We're gonna get a lot of output.
  • 20:55 - 20:59
    These are all the different system calls
  • 21:01 - 21:02
    That this
  • 21:02 - 21:08
    ls has executed. You will see a bunch
    of open calls, you will see fstat.
  • 21:08 - 21:14
    And for example, since it has to list all the properties
    of the files that are in this folder, we can
  • 21:15 - 21:20
    check for the lstat call. So the lstat call will
    check for the properties of the files and
  • 21:21 - 21:27
    we can see that, effectively, all the files
    and folders that are in this directory
  • 21:28 - 21:32
    have been accessed through
    a system call, by ls.
  • 21:34 - 21:43
    Interestingly, sometimes you actually
    don't need to run your code to
  • 21:44 - 21:47
    figure out that there is something
    wrong with your code.
  • 21:48 - 21:52
    So far we have seen enough ways of identifying
    issues by running the code,
  • 21:52 - 21:54
    but what if you...
  • 21:54 - 21:59
    you can look at a piece of code like this, like
    the one I have shown right now in this screen,
  • 21:59 - 22:01
    and identify an issue.
  • 22:01 - 22:02
    So for example here,
  • 22:02 - 22:07
    we have some really silly piece of code. It
    defines a function, prints a few variables,
  • 22:08 - 22:12
    multiplies some variables, it sleeps for
    a while and then we try to print BAZ.
  • 22:12 - 22:15
    And you could try to look at
    this and say, hey, BAZ has
  • 22:16 - 22:21
    never been defined anywhere. This is a new
    variable. You probably meant to say BAR
  • 22:21 - 22:23
    but you just mistyped it.
  • 22:23 - 22:26
    Thing is, if we try to run this program,
  • 22:29 - 22:37
    it's gonna take 60 seconds, because like we have to wait until
    this time.sleep function finishes. Here, sleep is just for
  • 22:38 - 22:42
    motivating the example but in general you may
    be loading a data set that takes really long
  • 22:42 - 22:45
    because you have to copy everything into memory.
  • 22:45 - 22:49
    And the thing is, there are programs
    that will take source code as input,
  • 22:49 - 22:55
    will process it and will say, oh probably this is
    wrong about this piece of code. So in Python,
  • 22:56 - 23:01
    or in general, these are called
    static analysis tools.
  • 23:01 - 23:03
    In Python we have for example pyflakes.
  • 23:03 - 23:07
    If we get this piece of code
    and run it through pyflakes,
  • 23:07 - 23:10
    pyflakes is gonna give us a couple of issues.
  • 23:10 - 23:16
    First one is the one.... The second one is the one
    we identified: there's an undefined name called baz.
  • 23:16 - 23:18
    You probably should be doing
    something about that.
  • 23:18 - 23:23
    And the other one is like
    oh, you're redefining the
  • 23:23 - 23:27
    foo variable name in that line.
  • 23:28 - 23:31
    So here we have a FOO function
    and then we are kind of
  • 23:31 - 23:35
    shadowing that function by
    using a loop variable here.
  • 23:35 - 23:38
    So now that FOO function that we
    defined is not accessible anymore
  • 23:38 - 23:42
    and then if we try to call it afterwards,
    we will get into errors.
  • 23:44 - 23:46
    There are other types of
  • 23:46 - 23:53
    static analysis tools. mypy is a different one. mypy
    is gonna report the same two errors, but it's also
  • 23:54 - 24:00
    going to complain about type checking. So it's gonna
    say, oh here you're multiplying an int by a float and
  • 24:01 - 24:06
    if you care about the type checking of your
    code, you should not be mixing those up.
  • 24:07 - 24:12
    it can be kind of inconvenient, having to run
    this, look at the line, going back to your
  • 24:13 - 24:17
    Vim or your editor, and figuring
    out what the error corresponds to.
  • 24:18 - 24:21
    There are already solutions for that. One
  • 24:22 - 24:27
    way is that you can integrate most
    editors with these tools and here..
  • 24:28 - 24:34
    You can see there is like some red highlighting in
    the margin, and it shows the error on the last line here.
  • 24:34 - 24:36
    So, undefined name 'baz'.
  • 24:36 - 24:39
    So as I'm editing this piece of Python code,
  • 24:39 - 24:43
    my editor is gonna give me feedback
    about what's going wrong with this.
  • 24:44 - 24:48
    Or like here we have another one saying
    redefinition of unused foo.
  • 24:50 - 24:52
    And
  • 24:53 - 24:56
    even, there are some stylistic complaints.
  • 24:56 - 24:58
    So, oh, expected two empty lines.
  • 24:58 - 25:04
    So like in Python, you should have two
    empty lines around a function definition.
  • 25:06 - 25:07
    There are...
  • 25:07 - 25:09
    there is a resource on the lecture notes
  • 25:09 - 25:13
    about pretty much static analyzers for a
    lot of different programming languages.
  • 25:14 - 25:18
    There are even static analyzers for English.
  • 25:19 - 25:24
    So I have my notes
  • 25:25 - 25:30
    for the class here, and if I run it through this
    static analyzer for English, which is "write-good".
  • 25:30 - 25:33
    It's going to complain about
    some stylistic properties.
  • 25:33 - 25:33
    So like, oh,
  • 25:33 - 25:37
    I'm using "very", which is a weasel
    word and I shouldn't be using it.
  • 25:37 - 25:43
    Or "quickly" can weaken meaning, and you can have
    this for spell checking, or for a lot of different
  • 25:44 - 25:48
    types of stylistic analysis.
  • 25:49 - 25:52
    Any questions so far?
  • 25:58 - 25:59
    Oh,
  • 25:59 - 26:01
    I forgot to mention...
  • 26:02 - 26:07
    Depending on the task that you're performing,
    there will be different types of debuggers.
  • 26:07 - 26:10
    For example, if you're doing web development,
  • 26:10 - 26:14
    both Firefox and Chrome
  • 26:14 - 26:21
    have a really really good set of tools
    for doing debugging for websites.
  • 26:21 - 26:24
    So here we go and say inspect element,
  • 26:24 - 26:26
    we can get the... do you know?
    how to make this larger...
  • 26:28 - 26:29
    We're getting
  • 26:29 - 26:33
    the entire source code for
    the web page for the class.
  • 26:36 - 26:38
    Oh, yeah, here we go.
  • 26:39 - 26:41
    Is that better?
  • 26:41 - 26:47
    And we can actually go and change properties about
    the course. So we can say... we can edit the title.
  • 26:47 - 26:51
    Say, this is not a class on
    debugging and profiling.
  • 26:52 - 26:54
    And now the code for the website has changed.
  • 26:54 - 26:56
    This is one of the reasons
    why you should never trust
  • 26:56 - 27:01
    any screenshots of websites, because
    they can be completely modified.
  • 27:01 - 27:05
    And you can also modify this style.
    Like, here I have things
  • 27:06 - 27:08
    using the
  • 27:08 - 27:10
    the dark mode preference,
  • 27:10 - 27:12
    but we can alter that.
  • 27:12 - 27:17
    Because at the end of the day, the
    browser is rendering this for us.
  • 27:18 - 27:22
    We can check the cookies, but there's
    like a lot of different operations.
  • 27:22 - 27:28
    There's also a built-in debugger for JavaScript,
    so you can step through JavaScript code.
  • 27:28 - 27:34
    So kind of the takeaway is, depending on what you are
    doing, you will probably want to search for what tools
  • 27:34 - 27:37
    programmers have built for them.
  • 27:45 - 27:48
    Now I'm gonna switch gears and
  • 27:48 - 27:52
    stop talking about debugging, which is kind
    of finding issues with the code, right?
  • 27:52 - 27:54
    kind of more about the behavior,
    and then start talking
  • 27:54 - 27:57
    about like how you can use profiling.
  • 27:57 - 27:59
    And profiling is how to optimize the code.
  • 28:01 - 28:06
    It might be because you want to optimize
    the CPU, the memory, the network, ...
  • 28:06 - 28:10
    There are many different reasons that
    you want to be optimizing it.
  • 28:10 - 28:14
    As it was the case with debugging,
    the kind of first-order approach
  • 28:14 - 28:17
    that a lot of people have
    experience with already is
  • 28:17 - 28:22
    oh, let's use just printf profiling,
    so to say, like we can just take...
  • 28:23 - 28:26
    Let me make this larger. We can
  • 28:26 - 28:28
    take the current time here,
  • 28:28 - 28:35
    then we can check, we can do some execution
    and then we can take the time again and
  • 28:35 - 28:37
    subtract it from the original time.
  • 28:37 - 28:39
    And by doing this you can kind of narrow down
  • 28:40 - 28:46
    and fence some different parts of your code and try to figure
    out what is the time taken between those two parts.
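A minimal sketch of this kind of ad-hoc timing (the sleep just stands in for whatever block you are fencing off):

```python
import time

start = time.time()
time.sleep(0.5)          # the block of code being measured
print(f"block took {time.time() - start:.4f} s")   # ~0.5 s plus a little overhead
```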
  • 28:47 - 28:53
    And that's good. But sometimes the results can be
    interesting. So here, we're sleeping for
  • 28:54 - 29:00
    0.5 seconds and the output is saying,
    oh it's 0.5 plus some extra time,
  • 29:00 - 29:06
    which is kind of interesting. And if we keep running it,
    we see there's like some small error and the thing is
  • 29:06 - 29:12
    here, what we're actually measuring is what
    is usually referred to as the "real time".
  • 29:12 - 29:14
    Real time is as if you get
  • 29:14 - 29:16
    like a
  • 29:16 - 29:19
    clock, and you start it when your program starts,
    and you stop it when your program ends.
  • 29:20 - 29:23
    But the thing is, in your computer it is
    not only your program that is running.
  • 29:23 - 29:27
    There are many other programs running
    at the same time and those might
  • 29:28 - 29:35
    be the ones that are taking the CPU.
    So, to try to make sense of that,
  • 29:36 - 29:39
    A lot of... you'll see a lot of programs
  • 29:41 - 29:43
    using the terminology that is
  • 29:44 - 29:47
    real time, user time and system time.
  • 29:47 - 29:51
    Real time is what I explained, which is kind of
    the entire length of time from start to finish.
  • 29:52 - 30:00
    Then there is the user time, which is the amount of time
    your program spent on the CPU executing user-level code.
  • 30:00 - 30:06
    So as I was mentioning, in UNIX, you can be running
    user level code or kernel level code.
  • 30:07 - 30:13
    System time is kind of the opposite: it's
    the amount of time that your program spent on the CPU
  • 30:14 - 30:18
    executing kernel mode instructions.
    So let's show this with an example.
  • 30:19 - 30:22
    Here I'm going to use "time", which is a command,
  • 30:22 - 30:28
    a shell command that's gonna get these three metrics
    for the following command, and then I'm just
  • 30:28 - 30:31
    grabbing a URL from
  • 30:31 - 30:37
    a website that is hosted in Spain. So that's gonna take
    some extra time to go over there and then go back.
  • 30:37 - 30:39
    If we see, here, if we were to just...
  • 30:40 - 30:44
    We have two prints, between the beginning
    and the end of the program.
  • 30:44 - 30:49
    We could think that this program is taking like
    600 milliseconds to execute, but actually
  • 30:50 - 30:57
    most of that time was spent just waiting for the
    response on the other side of the network and
  • 30:57 - 31:05
    we actually only spent 16 milliseconds at the user level
    and like 9 milliseconds at the system level, 25 milliseconds in total, actually
  • 31:05 - 31:08
    executing CURL code. Everything
    else was just waiting.
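The numbers above come from the shell's time builtin, but the same real-versus-CPU distinction can be sketched in Python by comparing a wall clock against process CPU time:

```python
import time

wall_start = time.perf_counter()   # wall-clock ("real") time
cpu_start = time.process_time()    # CPU time charged to this process (user + system)

time.sleep(1)   # waiting (like waiting on the network) burns real time, not CPU time

print(f"real ~{time.perf_counter() - wall_start:.2f}s, "
      f"cpu ~{time.process_time() - cpu_start:.2f}s")   # roughly 1s vs ~0s
```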
  • 31:12 - 31:14
    Any questions related to timing?
  • 31:20 - 31:22
    Ok, so
  • 31:22 - 31:24
    timing can be
  • 31:24 - 31:29
    can become tricky, it's also kind of a black box solution.
    Or if you start adding print statements,
  • 31:30 - 31:36
    it's kind of cumbersome to add print statements with timings everywhere.
    So programmers have figured out better tools.
  • 31:36 - 31:39
    These are usually referred to as "profilers".
  • 31:40 - 31:44
    One quick note that I'm gonna make, is that
  • 31:45 - 31:47
    profilers, like usually when people
  • 31:47 - 31:49
    refer to profilers they usually talk about
  • 31:49 - 31:55
    CPU profilers, because they are the most common; they identify
    where time is being spent on the CPU.
  • 31:57 - 31:59
    Profilers usually come in kind of two flavors:
  • 31:59 - 32:02
    there's tracing profilers and sampling profilers.
  • 32:02 - 32:06
    and it's kind of good to know the difference
    because the output might be different.
  • 32:08 - 32:10
    Tracing profilers kind of instrument your code.
  • 32:11 - 32:16
    So they kind of execute with your code and every
    time your code enters a function call,
  • 32:16 - 32:20
    they kind of take a note of it. It's like, oh we're entering
    this function call at this moment in time and
  • 32:22 - 32:25
    they keep going and, once they
    finish, they can report
  • 32:25 - 32:28
    oh, you spent this much time executing
    in this function and
  • 32:29 - 32:34
    this much time in this other function. So on, so forth,
    which is the example that we're gonna see now.
  • 32:35 - 32:38
    Another type of tools are tracing,
    sorry, sampling profilers.
  • 32:38 - 32:45
    The issue with tracing profilers is they add a lot of overhead.
    Like you might be running your code, and having this kind of
  • 32:46 - 32:49
    profiler next to you making all these counts
  • 32:49 - 32:54
    will hinder the performance of your program, so
    you might get counts that are slightly off.
  • 32:55 - 32:59
    A sampling profiler, what it's gonna do
    is gonna execute your program and every
  • 33:00 - 33:05
    100 milliseconds, 10 milliseconds, like some defined period,
    it's gonna stop your program. It's gonna halt it,
  • 33:06 - 33:12
    it's gonna look at the stack trace and say, oh, you're
    right now in this point in the hierarchy, and
  • 33:13 - 33:16
    identify which function is gonna
    be executing at that point.
  • 33:16 - 33:20
    The idea is that as long as you
    execute this for long enough,
  • 33:20 - 33:24
    you're gonna get enough statistics to know
    where most of the time is being spent.
  • 33:26 - 33:29
    So, let's see an example of a tracing profiler.
  • 33:29 - 33:32
    So here we have a piece of
    code that is just like a
  • 33:33 - 33:36
    really simple re-implementation of grep
  • 33:36 - 33:38
    done in Python.
  • 33:38 - 33:44
    What we want to check is what is the bottleneck of this
    program? Like we're just opening a bunch of files,
  • 33:45 - 33:50
    trying to match this pattern, and then
    printing whenever we find a match.
  • 33:50 - 33:52
    And maybe it's the regex, maybe it's the print...
  • 33:52 - 33:54
    We don't really know.
  • 33:54 - 33:59
    So to do this in Python, we have the "cProfile".
  • 33:59 - 34:00
    And
  • 34:01 - 34:07
    here I'm just calling this module and saying I want
    to sort this by the total amount of time, that
  • 34:07 - 34:13
    we're gonna see briefly. I'm calling the
    program we just saw in the editor.
  • 34:13 - 34:19
    I'm gonna execute this a thousand times
    and then (these are the grep
  • 34:19 - 34:22
    arguments here) I want to match this regex
  • 34:23 - 34:27
    to all the Python files in here.
    And this is gonna output some...
  • 34:31 - 34:34
    This is gonna produce some output,
    then we're gonna look at it. First,
  • 34:34 - 34:39
    is all the output from the greps,
    but at the very end, we're getting
  • 34:39 - 34:43
    output from the profiler itself. If we go up
  • 34:44 - 34:47
    we can see that, hey,
  • 34:48 - 34:55
    by sorting we can see the total number of calls. So we
    did 8000 calls, because we executed this 1000 times and
  • 34:57 - 35:03
    this is the total amount of time we spent in this function
    (cumulative time). And here we can start to identify
  • 35:04 - 35:06
    where the bottleneck is.
  • 35:06 - 35:11
    So here, this built-in method io.open is saying that
    we're spending a lot of the time just waiting for
  • 35:12 - 35:14
    reading from the disk or...
  • 35:14 - 35:16
    There, we can check, hey,
  • 35:16 - 35:20
    a lot of time is also being spent
    trying to match the regex.
  • 35:20 - 35:23
    Which is something that you will expect.
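The exact command isn't reproduced in the transcript; the demo runs it from the shell as roughly `python -m cProfile -s tottime grep.py ...`, and the same profiler can also be driven from inside a script, for example:

```python
# A sketch of invoking cProfile programmatically on a regex-heavy workload
# (grep_like and the sample text are stand-ins, not the demo's actual script).
import cProfile
import re

def grep_like(pattern, text):
    return [line for line in text.splitlines() if re.search(pattern, line)]

sample = "\n".join(f"def f{i}(): pass" for i in range(10_000))
cProfile.run("grep_like(r'^def', sample)", sort="tottime")
```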
  • 35:23 - 35:26
    One of the caveats of using this
  • 35:26 - 35:30
    tracing profiler is that, as you can see, here
  • 35:30 - 35:35
    we're seeing our function but we're also seeing
    a lot of functions that correspond to built-ins.
  • 35:35 - 35:36
    So like,
  • 35:36 - 35:42
    functions that are third party functions from the libraries.
    And as you start building more and more complex code,
  • 35:42 - 35:44
    This is gonna be much harder.
  • 35:44 - 35:45
    So
  • 35:46 - 35:50
    here is another piece of Python code that,
  • 35:52 - 35:54
    don't read through it, what it's doing is just
  • 35:54 - 35:58
    grabbing the course website and
    then it's printing all the...
  • 35:58 - 36:02
    It's parsing it, and then it's printing
    all the hyperlinks that it has found.
  • 36:02 - 36:04
    So there are like these two operations:
  • 36:04 - 36:08
    going there, grabbing a website, and
    then parsing it, printing the links.
  • 36:08 - 36:10
    And we might want to get a sense of
  • 36:10 - 36:16
    how those two operations compare to each
    other. If we just try to execute the
  • 36:17 - 36:19
    cProfiler here and
  • 36:19 - 36:25
    we're gonna do the same, this is not gonna print anything.
    I'm using a tool we haven't seen so far,
  • 36:25 - 36:26
    but I think it's pretty nice.
  • 36:26 - 36:33
    It's "tac", which is the opposite of "cat", and it is going
    to reverse the output so I don't have to go up and look.
  • 36:33 - 36:35
    So we do this and...
  • 36:36 - 36:39
    Hey, we get some interesting output.
  • 36:40 - 36:46
    we're spending a bunch of time in this built-in method
    socket_getaddr_info and like in _imp_create_dynamic and
  • 36:47 - 36:49
    method_connect and posix_stat...
  • 36:49 - 36:56
    nothing in my code is directly calling these functions so I
    don't really know what is the split between the operation of
  • 36:56 - 37:04
    making a web request and parsing the output of
    that web request. So, for that, we can use
  • 37:05 - 37:08
    a different type of profiler which is
  • 37:10 - 37:14
    a line profiler. And the line profiler is
    just going to present the same results
  • 37:14 - 37:21
    but in a more human-readable way, which is just, for this
    line of code, this is the amount of time things took.
  • 37:25 - 37:31
    So that it knows what it has to look at, we have to add a
    decorator to the Python function; we do that.
  • 37:35 - 37:37
    And as we do that,
  • 37:37 - 37:40
    we now get slightly cropped output,
  • 37:40 - 37:46
    but the main idea, we can look at the percentage of time and
    we can see that making this request, get operation, took
  • 37:46 - 37:53
    88% of the time, whereas parsing the
    response took only 10.9% of the time.
  • 37:54 - 38:01
    This can be really informative and a lot of different programming
    languages will support this type of a line profiling.
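The demo's script isn't reproduced here; a sketch of how line_profiler is typically wired up (assuming requests and BeautifulSoup for the fetch-and-parse steps) looks like this:

```python
# Install with `pip install line_profiler`; run with: kernprof -l -v fetch_links.py
import requests
from bs4 import BeautifulSoup

@profile  # noqa: F821 -- injected into builtins by kernprof, not imported
def get_links(url):
    resp = requests.get(url)                        # network request (dominates the time)
    soup = BeautifulSoup(resp.text, "html.parser")  # parsing the HTML
    return [a.get("href") for a in soup.find_all("a")]

if __name__ == "__main__":
    for link in get_links("https://missing.csail.mit.edu"):
        print(link)
```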
  • 38:05 - 38:07
    Sometimes, you might not care about CPU.
  • 38:07 - 38:15
    Maybe you care about the memory or like some other resource.
    Similarly, there are memory profilers: in Python
  • 38:15 - 38:22
    there is "memory_profiler", for C you will have
    "Valgrind". So here is a fairly simple example,
  • 38:22 - 38:29
    we just create this list with a million elements. That's
    going to consume like megabytes of space and
  • 38:29 - 38:34
    we do the same, creating another
    one with 20 million elements.
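The file isn't shown in the transcript, but a sketch of this demo with memory_profiler (run with `python -m memory_profiler mem.py` to get the per-line report) would be along these lines:

```python
from memory_profiler import profile

@profile
def my_func():
    a = [1] * (10 ** 6)        # roughly a few MB
    b = [2] * (2 * 10 ** 7)    # roughly 150+ MB more
    del b                      # released here, so total usage drops back down
    return a

if __name__ == "__main__":
    my_func()
```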
  • 38:35 - 38:38
    To check what the memory allocation was,
  • 38:39 - 38:44
    how it's gonna happen, what the consumption is,
    we can run it through a memory profiler and
  • 38:45 - 38:47
    we execute it,
  • 38:47 - 38:51
    and it's telling us the total memory
    usage and the increments.
  • 38:51 - 38:58
    And we can see that we have some overhead, because
    this is an interpreted language and when we create
  • 38:58 - 39:01
    this million,
  • 39:04 - 39:07
    this list with a million entries, we're gonna
    need this many megabytes of information.
  • 39:08 - 39:15
    Then we were getting another 150 megabytes. Then, we're freeing
    this entry and that's decreasing the total amount.
  • 39:15 - 39:19
    We are not getting a negative increment because
    of a bug, probably in the profiler.
  • 39:20 - 39:27
    But if you know that your program is taking a huge amount of
    memory and you don't know why, maybe because you're copying
  • 39:27 - 39:30
    objects where you should be
    doing things in place, then
  • 39:31 - 39:33
    using a memory profiler can be really useful.
  • 39:33 - 39:38
    And in fact there's an exercise that will
    kind of work you through that, comparing
  • 39:38 - 39:40
    an in-place version of quicksort with like a
  • 39:40 - 39:44
    non-in-place one that keeps making new copies.
    And if you use the memory profiler
  • 39:44 - 39:48
    you can get a really good comparison
    between the two of them
  • 39:51 - 39:53
    Any questions so far, with profiling?
  • 39:53 - 39:58
    Is the memory profiler running the
    program in order to get that?
  • 39:58 - 40:03
    Yeah... you might be able to figure
    out like just looking at the code.
  • 40:03 - 40:06
    for this code at least.
    But as you get more and more complex programs,
  • 40:06 - 40:11
    what this is doing
    is running through the program
  • 40:11 - 40:17
    and for every line, at the very beginning,
    it's looking at the heap and saying
  • 40:17 - 40:19
    "What are the objects that I have allocated now?"
  • 40:19 - 40:23
    "I have seven megabytes of objects",
    and then goes to the next line,
  • 40:23 - 40:28
    looks again, "Oh now I have 50,
    so I have now added 43 there".
  • 40:29 - 40:35
    Again, you could do this yourself by asking for those
    operations in your code, every single line.
  • 40:35 - 40:40
    But that's not how you should be doing things since people
    have already written these tools for you to use.
  • 40:43 - 40:46
    As it was the case with...
  • 40:51 - 40:58
    So as in the case with strace, you can
    do something similar in profiling.
  • 40:58 - 41:03
    You might not care about the specific
    lines of code that you have,
  • 41:03 - 41:08
    but maybe you want to check for outside events.
    Like, you maybe want to check how many
  • 41:09 - 41:14
    CPU cycles your computer program is using,
    or how many page faults it's creating.
  • 41:14 - 41:19
    Maybe you have like bad cache locality
    and that's being manifested somehow.
  • 41:19 - 41:23
    So for that, there is the "perf" command.
  • 41:23 - 41:27
    The perf command is gonna do this, where it
    is gonna run your program and it's gonna
  • 41:29 - 41:33
    keep track of all these statistics and report them back
    to you. And this can be really helpful if you are
  • 41:34 - 41:36
    working at a lower level. So
  • 41:37 - 41:43
    we execute this command, I'm gonna
    explain briefly what it's doing.
  • 41:49 - 41:52
    And this stress program is just
  • 41:52 - 41:55
    running on the CPU; it's
    just a program to
  • 41:55 - 42:00
    hog one CPU and test that you can
    hog the CPU. And now if we Ctrl-C,
  • 42:01 - 42:03
    we can go back and
  • 42:03 - 42:09
    we get some information about the number of
    page faults that we have or the number of
  • 42:10 - 42:12
    CPU cycles that we utilize, and other
  • 42:12 - 42:14
    useful
  • 42:14 - 42:19
    metrics from our code. For some programs you can
  • 42:21 - 42:25
    look at which functions
    were being used.
  • 42:26 - 42:30
    So we can record what this program is doing,
  • 42:31 - 42:35
    which we don't know about because it's
    a program someone else has written.
  • 42:35 - 42:37
    And
  • 42:38 - 42:42
    we can report what it was doing by looking
    at the stack trace and we can say oh,
  • 42:42 - 42:44
    It's spending a bunch of time in this
  • 42:45 - 42:47
    __random_r
  • 42:47 - 42:53
    standard library function. And it's mainly because the way of hogging
    a CPU is by just creating more and more pseudo-random numbers.
  • 42:54 - 42:56
    There are some other
  • 42:56 - 42:58
    functions that have not been mapped, because they
  • 42:58 - 43:01
    belong to the program, but if
    you know about your program
  • 43:01 - 43:05
    you can display this information
    using more perf flags.
  • 43:05 - 43:10
    There are really good tutorials online
    about how to use this tool.
  • 43:12 - 43:14
    Oh
  • 43:14 - 43:17
    One more thing regarding
    profilers is, so far,
  • 43:17 - 43:20
    we have seen that these profilers
    are really good at
  • 43:21 - 43:25
    aggregating all this information and giving
    you a lot of these numbers so you can
  • 43:26 - 43:30
    optimize your code or you can reason
    about what is happening, but
  • 43:31 - 43:32
    the thing is
  • 43:32 - 43:36
    humans are not really good at making
    sense of lots of numbers and since
  • 43:36 - 43:39
    humans are more visual creatures, it's much
  • 43:40 - 43:43
    easier to kind of have some
    sort of visualization.
  • 43:43 - 43:49
    Again, programmers have already thought about
    this and have come up with solutions.
  • 43:49 - 43:56
    A couple of popular ones exist; one is a
    flame graph. A flame graph is built from a
  • 43:57 - 44:00
    sampling profiler. So this is just running
    your code and taking samples
  • 44:00 - 44:03
    And then on the y-axis here
  • 44:03 - 44:11
    we have the depth of the stack so we know that the bash function
    called this other function, and this called this other function,
  • 44:11 - 44:14
    so on, so forth. And on the x-axis it's
  • 44:15 - 44:18
    not time, it's not the timestamps.
  • 44:18 - 44:23
    Like it's not that this function ran before that one, it's just time
    taken. Because, again, this is a sampling profiler:
  • 44:23 - 44:29
    we're just getting small glimpses of what was going
    on in the program. But we know that, for example,
  • 44:29 - 44:33
    this main program took the most time because the
  • 44:34 - 44:36
    x-axis is proportional to that.
  • 44:36 - 44:43
    They are interactive and they can be really useful
    to identify the hot spots in your program.
  • 44:45 - 44:51
    Another way of displaying information, and there is also
    an exercise on how to do this, is using a call graph.
  • 44:51 - 44:58
    So a call graph is going to be displaying information, and it's gonna
    create a graph of which function called which other function.
  • 44:59 - 45:01
    And then you get information about, like,
  • 45:01 - 45:06
    oh, we know that "__main__" called this
    "Person" function ten times and
  • 45:06 - 45:09
    it took this much time. And as you have
  • 45:09 - 45:13
    larger and larger programs, looking at one of
    these call graphs can be useful to identify
  • 45:14 - 45:20
    what piece of your code is calling this really
    expensive IO operation, for example.
  • 45:25 - 45:30
    With that I'm gonna cover the last
    part of the lecture, which is that
  • 45:30 - 45:37
    sometimes, you might not even know what exact
    resource is constrained in your program.
  • 45:37 - 45:39
    Like how do I know how much CPU
  • 45:39 - 45:44
    my program is using, and I can quickly
    look in there, or how much memory.
  • 45:44 - 45:47
    So there are a bunch of really
  • 45:47 - 45:50
    nifty tools for doing that. One of them is
  • 45:50 - 45:53
    htop. So htop is an
  • 45:54 - 46:00
    interactive command-line tool and here it's
    displaying all the CPUs this machine has,
  • 46:00 - 46:08
    which is 12. It's displaying the amount of memory, it says I'm
    consuming almost a gigabyte of the 32 gigabytes my machine has.
  • 46:08 - 46:12
    And then I'm getting all the different processes.
  • 46:12 - 46:13
    So for example we have
  • 46:13 - 46:20
    zsh, mysql and other processes that are running in this
    machine, and I can sort through the amount of CPU
  • 46:20 - 46:24
    they're consuming or through the
    priority they're running at.
  • 46:26 - 46:28
    We can check this, for example. Here
  • 46:28 - 46:30
    we have the stress command again
  • 46:30 - 46:31
    and we're going to
  • 46:31 - 46:37
    run it to take over four CPUs and check
    that we can see that in HTOP.
  • 46:37 - 46:43
    So we did spot those four CPU
    jobs, and now I have seen that
  • 46:44 - 46:46
    besides the ones we had before,
    now I have this...
  • 46:50 - 46:56
    Like this "stress -c" command running
    and taking a bunch of our CPU.
  • 46:57 - 47:03
    Even though you could use a profiler to get similar information to
    this, the way htop displays it in a live, interactive
  • 47:03 - 47:07
    fashion can be much quicker
    and much easier to parse.
  • 47:08 - 47:10
    In the notes, there's a
  • 47:10 - 47:15
    really long list of different tools for evaluating
    different parts of your system.
  • 47:15 - 47:17
    So that might be tools for analyzing the
  • 47:17 - 47:20
    network performance, or for looking at the
  • 47:20 - 47:25
    number of IO operations, so you know
    whether you're saturating the
  • 47:26 - 47:28
    reads from your disks,
  • 47:29 - 47:31
    you can also look at what is the space usage.
  • 47:32 - 47:34
    Which, I think, here...
  • 47:39 - 47:45
    So NCDU... There's a tool called "du"
    which stands for "disk usage" and
  • 47:45 - 47:49
    we have the "-h" flag for
    "human readable output".
  • 47:52 - 47:59
    We can run it on the videos folder and get output about
    the size of all the files in this folder.
  • 48:08 - 48:10
    Yeah, there we go.
  • 48:10 - 48:15
    There are also interactive versions,
    like HTOP was an interactive version.
  • 48:15 - 48:21
    So NCDU is an interactive version that will let me navigate
    through the folders and I can see quickly that
  • 48:21 - 48:26
    oh, we have... This is one of the
    folders for the video lectures,
  • 48:26 - 48:29
    and we can see there are these four files
  • 48:30 - 48:37
    that have like almost 9 GB each and I could
    quickly delete them through this interface.
  • 48:38 - 48:44
    Another neat tool is "lsof", which
    stands for "list open files".
  • 48:44 - 48:48
    Another pattern that you
    may encounter is you know
  • 48:48 - 48:55
    some process is using a file, but you don't know exactly which process
    is using that file. Or, similarly, some process is listening on
  • 48:55 - 48:59
    a port, but again, how do you
    find out which one it is?
  • 48:59 - 49:01
    So, to give an example:
  • 49:01 - 49:04
    We just run a Python HTTP server on port
  • 49:05 - 49:07
    444
  • 49:07 - 49:11
    Running there. Maybe we don't know that
    that's running, but then we can
  • 49:13 - 49:15
    use...
  • 49:17 - 49:19
    we can use LSOF.
  • 49:23 - 49:29
    Yeah, we can use LSOF, and the thing is LSOF
    is gonna print a lot of information.
  • 49:30 - 49:33
    You need sudo permissions because
  • 49:34 - 49:39
    this is gonna ask who owns all of these open files.
  • 49:40 - 49:44
    Since we only care about the one
    that is listening on port 444
  • 49:45 - 49:46
    we can ask
  • 49:46 - 49:48
    grep for that.
  • 49:48 - 49:56
    And we can see, oh, there's like this Python process, with
    this identifier, that is using the port and then we can
  • 49:57 - 49:58
    kill it,
  • 49:58 - 50:01
    and that terminates that process.
  • 50:02 - 50:07
    Again, there's a lot of different
    tools. There's even tools for
  • 50:08 - 50:11
    doing what is called benchmarking.
  • 50:12 - 50:19
    So in the shell tools and scripting lecture, I said
    like for some tasks "fd" is much faster than "find"
  • 50:19 - 50:22
    But like how will you check that?
  • 50:22 - 50:30
    I can test that with "hyperfine" and I have here
    two commands: one with "fd" that is just
  • 50:30 - 50:34
    searching for JPEG files and
    the same one with "find".
  • 50:35 - 50:41
    If I execute them, it's gonna benchmark these
    scripts and give me some output about
  • 50:42 - 50:44
    how much faster "fd" is
  • 50:45 - 50:47
    compared to "find".
  • 50:48 - 50:52
    So I think that kind of concludes...
    yeah, like 23 times faster for this task.
  • 50:53 - 50:56
    So that kind of concludes the whole overview.
  • 50:57 - 51:00
    I know that there's like a lot of different
    topics and there's like a lot of
  • 51:01 - 51:05
    perspectives on doing these things, but
    again I want to reinforce the idea
  • 51:05 - 51:08
    that you don't need to be a master
    of all these topics but more...
  • 51:09 - 51:11
    To be aware that all these things exist.
  • 51:11 - 51:18
    So if you run into these issues you don't reinvent the wheel,
    and you reuse all that other programmers have done.
  • 51:18 - 51:24
    Given that, I'm happy to take any questions related
    to this last section or anything in the lecture.
  • 51:26 - 51:30
    Is there any way to sort of think about
    how long a program should take?
  • 51:30 - 51:33
    You know, if it's taking a while to run
  • 51:33 - 51:43
    you know, should you be worried? Or depending on your process, let me wait
    another ten minutes before I start looking at why it's taking so long.
  • 51:43 - 51:45
    Okay, so the...
  • 51:46 - 51:49
    The task of knowing how long a program
  • 51:49 - 51:54
    should run is pretty infeasible to figure out.
    It will depend on the type of program.
  • 51:54 - 52:02
    It depends on whether you're making HTTP requests or you're
    reading data... one thing that you can do is if you have
  • 52:02 - 52:03
    for example,
  • 52:03 - 52:11
    if you know you have to read two gigabytes from memory,
    like from disk, and load that into memory, you can make
  • 52:12 - 52:17
    a back-of-the-envelope calculation. So like that shouldn't
    take longer than like X seconds because this is
  • 52:17 - 52:20
    how things are set up. Or if you are
  • 52:21 - 52:27
    reading some files from the network and you know kind of what the
    network link is and they are taking say five times longer than
  • 52:27 - 52:29
    what you would expect then you could
  • 52:30 - 52:31
    try to do that.
  • 52:31 - 52:38
    Otherwise, if you don't really know. Say you're trying to do some
    mathematical operation in your code and you're not really sure
  • 52:38 - 52:44
    about how long that will take you can use something
    like logging and try to kind of print intermediate
  • 52:45 - 52:50
    stages to get a sense of like, oh I need
    to do a thousand operations of this and
  • 52:52 - 52:54
    three iterations
  • 52:54 - 53:01
    took ten seconds. Then this is gonna take
    much longer than I can handle in my case.
  • 53:01 - 53:05
    So I think there are ways; it
    will again depend on the task,
  • 53:05 - 53:09
    but definitely, given all the tools we've
    seen really, we probably have like
  • 53:10 - 53:13
    a couple of really good ways
    to start tackling that.
  • 53:15 - 53:17
    Any other questions?
  • 53:17 - 53:19
    You can also do things like
  • 53:19 - 53:21
    run HTOP and see if anything is running.
  • 53:22 - 53:26
    Like if your CPU is at 0%, something
    is probably wrong.
  • 53:31 - 53:33
    Okay.
  • 53:33 - 53:38
    There's a lot of exercises for all the topics
    that we have covered in today's class,
  • 53:38 - 53:41
    so feel free to do the ones
    that are more interesting.
  • 53:42 - 53:45
    We're gonna be holding office hours again today.
  • 53:45 - 53:49
    Just a reminder, office hours. You can come
    and ask questions about any lecture.
  • 53:49 - 53:54
    Like we're not gonna expect you to kind of
    do the exercises in a couple of minutes.
  • 53:54 - 53:58
    They take a really long while to get through
    them, but we're gonna be there
  • 53:59 - 54:04
    to answer any questions from previous classes, or even not related
    to exercises. Like if you want to know more about how you
  • 54:05 - 54:10
    would use TMUX in a way to kind of quickly switch
    between panes, anything that comes to your mind.