Lecture 11: Q&A (2020)

0:00 - 0:07

I guess we should do an intro to this as well,
0:07 - 0:10

so this is just sort of a
0:10 - 0:15

free-form Q&A lecture where you, as in
the two people sitting here, but also
0:15 - 0:20

everyone at home who did not come here
in person get to ask questions and we
0:20 - 0:23

have a bunch of questions people asked
in advance but you can also ask
0:23 - 0:27

additional questions during, for the two
of you who are here, you can do it either
0:27 - 0:34

by raising your hand or you can submit it on
the forum and be anonymous, it's up to you
0:34 - 0:36

regardless though, what we're gonna
do is just go through some of the
0:36 - 0:40

questions that have been asked and try to
give as helpful answers as we can
0:40 - 0:44

although they are unprepared on our side and
0:44 - 0:46

yeah that's the plan I guess we go
0:46 - 0:49

from popular to least popular
0:49 - 0:50

fire away
0:50 - 0:52

all right so for our first question any
0:52 - 0:56

recommendations on learning operating
system related topics like processes,
0:56 - 1:00

virtual memory, interrupts,
memory management, etc
1:00 - 1:02

so I think this
1:02 - 1:07

is an interesting question because these
are really low level concepts that often
1:07 - 1:11

do not matter, unless you have to
deal with this in some capacity,
1:11 - 1:13

right so
1:13 - 1:18

one instance where this matters is you're
writing really low level code like
1:18 - 1:20

you're implementing a kernel or something
like that, or you want to
1:20 - 1:23

just hack on the Linux kernel.
1:23 - 1:25

It's rare otherwise that you need to work with
1:25 - 1:28

especially like virtual memory and
interrupts and stuff yourself
1:28 - 1:32

processes, I think are a more general concept
that we've talked a little bit about in
1:32 - 1:37

this class as well and tools like
htop, pgrep, kill, and signals and
1:37 - 1:38

that sort of stuff
1:38 - 1:39

in terms of learning it
1:39 - 1:45

maybe one of the best ways, is to try to
take either an introductory class on the
1:45 - 1:51

topic, so for example MIT has a class
called 6.828, which is where
1:51 - 1:55

you essentially build and develop your
own operating system based on some code
1:55 - 1:59

that you're given, and all of those labs
are publicly available and all the
1:59 - 2:02

resources for the class are publicly available,
and so that is a good way to
2:02 - 2:04

really learn them is by doing them yourself.
2:04 - 2:05

There are also various
2:05 - 2:11

tutorials online that basically guide
you through how do you write a kernel
2:11 - 2:15

from scratch. Not necessarily a very
elaborate one, not one you would want
2:15 - 2:21

to run any real software on, but just to
teach you the basics and so that would
2:21 - 2:22

be another thing to look up.
2:22 - 2:24

Like how do I write a kernel in and then your
2:24 - 2:28

language of choice. You will probably not
find one that lets you do it in Python
2:28 - 2:34

but in like C, C++, Rust, there
are a bunch of topics like this
2:34 - 2:37

one other note on operating systems
2:37 - 2:40

so like Jon mentioned MIT has a 6.828 class but
2:40 - 2:43

if you're looking for a more high-level
overview, not necessarily programming
2:43 - 2:46

an operating system, but just learning about
the concepts another good resource
2:46 - 2:51

is a book called "Modern Operating
Systems" by Andy Tanenbaum
2:51 - 2:58

there's also actually a book called "The Design and Implementation of the FreeBSD
Operating System" which is really good,
2:58 - 3:03

It doesn't go through Linux, but it goes
through FreeBSD and the BSD kernel is
3:03 - 3:07

arguably better organized than the Linux
one and better documented and so it
3:07 - 3:12

might be a gentler introduction to some of those
topics than trying to understand Linux
3:12 - 3:15

You want to check it as answered?
3:15 - 3:17

- Yes + Nice
3:17 - 3:17

Answered
3:17 - 3:19

For our next question,
3:19 - 3:24

What are some of the tools you'd
prioritize learning first?
3:24 - 3:30

- Maybe we can all go through and
give our opinion on this? + Yeah
3:30 - 3:32

Tools to prioritize learning first?
3:32 - 3:36

I think learning your editor well,
just serves you in all capacities
3:37 - 3:41

like being efficient at editing files,
is just like a majority of
3:41 - 3:45

what you're going to spend your time doing.
And in general, just using your
3:45 - 3:49

keyboard more and your mouse less. It means
that you get to spend more of your
3:49 - 3:54

time doing useful things and
less of your time moving
3:54 - 3:56

I think that would be my top priority,
4:05 - 4:07

so I would say that for what
4:07 - 4:10

tool to prioritize will depend
on what exactly you're doing
4:10 - 4:16

I think the core idea is you should try
to find the types of tasks that you are
4:16 - 4:18

doing repetitively and so
4:18 - 4:24

if you are doing some sort of like
machine learning workload and
4:24 - 4:27

you find yourself using Jupyter notebooks,
like the one we presented
4:27 - 4:33

yesterday, a lot. Then again, using
a mouse for that might not be
4:33 - 4:36

the best idea and you want to familiarize yourself
with the keyboard shortcuts
4:36 - 4:41

and pretty much with anything you will
end up figuring out that there are some
4:41 - 4:46

repetitive tasks, and you're running a
computer, and just trying to figure out
4:46 - 4:48

oh there's probably a better way to do this
4:48 - 4:51

be it a terminal, be it an editor
4:51 - 4:56

And it might be really interesting to
learn to use some of the topics that
4:56 - 5:01

we have covered, but if they're not
extremely useful on an everyday
5:01 - 5:05

basis then it might not be worth prioritizing them
5:07 - 5:07

Out of the topics
5:08 - 5:12

covered in this class, in my opinion, two
of the most useful things are version
5:12 - 5:15

control and text editors, and I think they're
a little bit different from each
5:15 - 5:19

other, in the sense that text editors I
think are really useful to learn well,
5:19 - 5:22

but it was probably the case that before
we started using Vim and all its fancy
5:22 - 5:25

keyboard shortcuts you had some other
text editor you were using before and
5:25 - 5:30

you could edit text just fine maybe a little
bit inefficiently, whereas I think
5:30 - 5:33

version control is another really useful
skill and that's one where if you don't
5:33 - 5:37

really know the tool properly, it can actually
lead to some problems like loss
5:37 - 5:39

of data or just inability to collaborate
properly with people. So I
5:39 - 5:43

think version control is one of the first
things that's worth learning well.
5:43 - 5:47

Yeah, I agree with that. I think
learning a tool like Git is just
5:47 - 5:50

gonna save you so much heartache down the line.
5:50 - 5:51

It, also, to add on to that,
5:52 - 5:57

it really helps you collaborate with others,
and Anish touched a little bit on GitHub
5:57 - 6:01

in the last lecture, and just learning
to use that tool well in order
6:01 - 6:05

to work on larger software projects
that other people are working on is
6:05 - 6:06

an invaluable skill.
6:10 - 6:11

For our next question,
6:11 - 6:13

"When do I use Python versus a
6:13 - 6:16

Bash script versus some other language?"
6:16 - 6:20

This is tough, because I think this comes
6:20 - 6:22

down to what Jose was saying earlier too,
6:22 - 6:24

that it really depends on
what you're trying to do.
6:24 - 6:27

For me, I think for Bash scripts in particular,
6:27 - 6:29

Bash scripts are for
6:29 - 6:33

automating running a bunch of commands.
You don't want to write any
6:33 - 6:35

other, like, business logic in Bash.
6:35 - 6:39

Like, it is just for, 'I want to run these
6:39 - 6:44

commands, in this order... maybe with
arguments?' But - but, like, even that,
6:44 - 6:48

it's unclear that you want a Bash script
once you start taking arguments.
6:48 - 6:53

Similarly, like, once you start doing any
kind of, like, text processing, or
6:53 - 6:55

configuration, all that,
6:55 - 6:59

reach for a language that is... a more, a more serious
6:59 - 7:01

programming language than Bash is.
7:01 - 7:03

Bash is really for short, one-off
7:03 - 7:10

scripts, or ones that have a very well-defined
use case, on the terminal, in
7:10 - 7:13

the shell, probably.
7:13 - 7:16

For a slightly more concrete guideline,
you might say, 'Write a
7:16 - 7:19

Bash script if it's less than a hundred
lines of code or so', but once it gets
7:19 - 7:22

beyond that point, Bash is kind of
unwieldy, and it's probably worth
7:22 - 7:25

switching to a more serious programming
language, like Python.
7:25 - 7:27

And, to add to that,
7:27 - 7:32

I would say, like, I found myself writing,
sometimes, scripts in Python, because
7:32 - 7:37

if I have already solved some subproblem
that covers part of the problem in Python,
7:37 - 7:41

I find it much easier to compose the
previous solution that I found out in
7:41 - 7:46

Python than just try to reuse Bash code,
that I don't find as reusable as Python.
7:46 - 7:50

And in the same way it's kind of nice that
a lot of people have written something
7:50 - 7:53

like Python libraries or like Ruby libraries
to do a lot of these things,
7:53 - 7:58

whereas, in Bash, it's kind of hard
to have, like, code reuse.
7:58 - 8:02

And, in fact,
8:02 - 8:08

I think to add to that, usually, if you
find a library, in some language that
8:08 - 8:12

helps with the task you're trying to
do, use that language for the job.
8:12 - 8:16

And in Bash, there are no libraries. There
are only the programs on your computer.
8:16 - 8:19

So you probably don't want to use
it, unless like there's a program
8:19 - 8:24

you can just invoke. I do think another
thing worth remembering about Bash is:
8:24 - 8:26

Bash is really hard to get right.
8:26 - 8:31

It's very easy to get it right for the particular
use case you're trying to solve right now,
8:31 - 8:32

but, like, things like,
8:32 - 8:36

"What if one of the filenames has a space in it?"
8:36 - 8:39

It has caused so many bugs, and so
8:39 - 8:43

many problems in Bash scripts. And, if you
use a - a real programming language, then
8:43 - 8:47

those problems just go away.
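
To make the filename-with-a-space problem concrete, here is a small Python sketch. The filename is made up for illustration. It uses shlex.split to imitate, roughly, how a shell splits an unquoted string into words, and contrasts that with keeping each argument as its own list element, which is the shape Python's subprocess module accepts.

```python
import shlex

# Hypothetical filename containing a space (not from the lecture).
filename = "my notes.txt"

# Build the command as one string and let it be split on whitespace,
# which is roughly what an unquoted $filename does in Bash.
naive_words = shlex.split("wc -l " + filename)

# Quoting the name keeps it together as a single argument.
quoted_words = shlex.split("wc -l " + shlex.quote(filename))

# Keeping arguments in a list from the start sidesteps the issue entirely;
# this is the form subprocess.run() takes.
list_words = ["wc", "-l", filename]

print(naive_words)   # the filename is broken into two arguments
print(quoted_words)  # the filename survives as one argument
print(list_words)
```

In the first case the program would be asked to open two files, "my" and "notes.txt", neither of which exists. With a list of arguments there is no splitting step to get wrong.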
8:47 - 8:50

Yes! Checked it.
8:51 - 8:52

For our next question,
8:52 - 8:56

what is the difference between sourcing
a script, and executing that script?
8:57 - 9:03

Ooh. So, this, actually, we got in office
hours a - a while back, as well, which is,
9:03 - 9:07

'Aren't they the same? Like, aren't they
both just running the Bash script?'
9:07 - 9:08

And, it is true
9:08 - 9:12

both of these will end up executing the
lines of code that are in the script.
9:12 - 9:17

The ways in which they differ is that
sourcing a script is telling your
9:17 - 9:23

current Bash
session, to execute that program,
9:23 - 9:29

whereas the other one is, 'Start up a new instance
of Bash, and run the program there, instead.'
9:29 - 9:35

And, this matters for things like... Imagine that
"script.sh" tries to change directories.
9:35 - 9:38

If you are running the script,
as in the second invocation,
9:38 - 9:43

"./script.sh", then the new
process is going to change
9:43 - 9:47

directories. But, by the time that script
exits, and returns to your shell,
9:47 - 9:52

your shell still remains in the same place. However,
if you do "cd" in a script, and you "source" it,
9:52 - 9:55

your current instance of Bash is the
one that ends up running it, and
9:55 - 9:58

so, it ends up "cd"-ing where you are.
9:58 - 10:01

This is also why, if you define functions,
10:01 - 10:05

for example, that you may want to
execute in your shell session,
10:05 - 10:07

you need to source the script, not run it,
10:07 - 10:10

because if you run it, that function
will be defined in the
10:10 - 10:12

instance of Bash,
10:12 - 10:17

in the Bash process that gets launched, but it
will not be defined in your current shell.
10:17 - 10:23

I think those are two of the biggest
differences between the two.
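
The same distinction applies to any parent and child process, so here is a rough Python analogy rather than actual Bash. Changing directory in a child process (like running ./script.sh) leaves the parent where it was, while changing directory in the current process (like sourcing the script) sticks. The helper names are invented for this sketch.

```python
import os
import subprocess
import sys
import tempfile


def cd_in_child(target):
    """Analogous to ./script.sh: a new process does the cd, then exits."""
    child_code = "import os, sys; os.chdir(sys.argv[1]); print(os.getcwd())"
    result = subprocess.run(
        [sys.executable, "-c", child_code, target],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()  # where the child ended up


def cd_in_current_process(target):
    """Analogous to `source script.sh`: the current process does the cd."""
    os.chdir(target)
    return os.getcwd()


start = os.getcwd()
target = os.path.realpath(tempfile.mkdtemp())

child_dir = cd_in_child(target)
after_child = os.getcwd()  # unchanged: the child's cd ended with the child

after_source = cd_in_current_process(target)  # this changes our own cwd

print("child ended in:        ", child_dir)
print("parent after child:    ", after_child)
print("parent after 'source': ", after_source)

os.chdir(start)  # put the working directory back
```

The same reasoning covers functions and variables: anything defined in the child process disappears when the child exits.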
10:29 - 10:30

Next question...
10:30 - 10:35

"What are the places where various packages and tools
are stored and how does referencing them work?
10:35 - 10:39

What even is /bin or /lib?"
10:39 - 10:45

So, as we covered in the first lecture,
there is this PATH environment variable,
10:45 - 10:50

which is like a colon-separated
string of all the places
10:50 - 10:55

where your shell is gonna look for binaries.
And, if you just do something like
10:55 - 10:58

"echo $PATH", you're gonna get this list;
10:58 - 11:02

all these places are gonna
be consulted, in order.
11:02 - 11:04

It's gonna go through all of them, and, in fact,
11:04 - 11:07

- There is already... Did we cover which? + Yeah
11:07 - 11:10

So, if you run "which", and a specific command,
11:10 - 11:14

the shell is actually gonna tell
you where it's finding this (command).
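
As a rough sketch of what that lookup amounts to, the following Python walks a PATH-style string in order and returns the first directory that contains an executable with the given name. The function name find_in_path, the throwaway directories, and the tool name are all invented for this example; for real use, the standard library's shutil.which does this job.

```python
import os
import stat
import tempfile


def find_in_path(command, path_string):
    """Return the first executable called `command` found along path_string."""
    for directory in path_string.split(os.pathsep):  # ':' on Linux and macOS
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Two throwaway directories that both contain a tool called "mytool".
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
for directory in (first_dir, second_dir):
    tool = os.path.join(directory, "mytool")
    with open(tool, "w") as f:
        f.write("#!/bin/sh\necho hello\n")
    os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

# Earlier entries win, just like in the shell.
search_path = os.pathsep.join([first_dir, second_dir])
print(find_in_path("mytool", search_path))
print(find_in_path("no-such-tool", search_path))
```

Swapping the order of the two directories in search_path changes which copy is found, which is why the order of entries in PATH matters.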
11:14 - 11:15

Beyond that,
11:15 - 11:20

there are some conventions where a lot
of programs will install their binaries
11:20 - 11:24

in places like /usr/bin (or at
least they will include symlinks)
11:24 - 11:26

in /usr/bin so you can find them
11:26 - 11:28

There's also a /usr/local/bin
11:28 - 11:34

There are special directories. For example,
/usr/sbin is only for programs meant to be run as root (for example via sudo), and
11:34 - 11:38

some of these conventions are slightly
different between different distros so
11:38 - 11:48

I know like some distros for example install
the user libraries under /opt for example
11:51 - 11:55

Yeah I think one thing just
to talk a little bit more
11:56 - 12:01

about /bin and then Anish maybe you can
do the other folders so when it comes to
12:01 - 12:03

/bin the convention
12:03 - 12:10

There are conventions, and the conventions are
usually /bin are for essential system utilities
12:10 - 12:13

/usr/bin are for user programs and
12:13 - 12:17

/usr/local/bin are for user
compiled programs, sort of
12:17 - 12:22

so things that you installed that you intend
the user to run, are in /usr/bin
12:22 - 12:27

things that a user has compiled themselves and stuck
on your system, probably goes in /usr/local/bin
12:27 - 12:30

but again, this varies a lot from machine
to machine, and distro to distro
12:30 - 12:34

On Arch Linux, for example, /bin
is a symlink to /usr/bin
12:34 - 12:40

They're the same and as Jose mentioned, there's
also /sbin which is for programs that are
12:40 - 12:44

intended to only be run as root, that
also varies from distro to distro
12:44 - 12:47

whether you even have that directory, and
on many systems like /usr/local/bin
12:47 - 12:51

might not even be in your PATH, or
might not even exist on your system
12:51 - 12:56

On BSD on the other hand /usr/local/bin
is often used a lot more heavily
12:57 - 12:57

yeah so
12:57 - 13:01

What we were talking about so far, these
are all ways that files and folders are
13:01 - 13:05

organized on Linux things or Linux or
BSD things vary a little bit between
13:05 - 13:07

that and macOS or other platforms
13:07 - 13:09

I think for the specific locations,
13:09 - 13:11

if you want to know exactly what it's
used for, you can look it up
13:11 - 13:17

But some general patterns to keep in mind are: anything
with /bin in it has binary executable programs in it,
13:17 - 13:20

anything with /lib in it has
libraries in it so things that
13:20 - 13:25

programs can link against, and then some
other things that are useful to know are
13:25 - 13:29

there's a /etc on many systems, which
has configuration files in it and
13:29 - 13:34

then there's /home, which underneath that directory
contains each user's home directory
13:34 - 13:39

so like on a Linux box my username,
if it's anish, will
13:39 - 13:41

correspond to a home directory /home/anish
13:42 - 13:43

Yeah I guess there are
13:43 - 13:48

a couple of others like /tmp is usually
a temporary directory that gets
13:48 - 13:51

erased when you reboot not always but sometimes,
you should check on your system
13:52 - 13:59

There's a /var which often holds like
files that change over time so
13:59 - 14:06

these are usually going to be things
like lock files for package managers
14:06 - 14:12

they're gonna be things like log files,
files to keep track of process IDs
14:12 - 14:16

then there's /dev which shows devices so
14:16 - 14:21

usually so these are special files that
correspond to devices on your system we
14:21 - 14:27

talked about /sys, Anish mentioned /etc
14:29 - 14:36

/opt is a common one for just like third-party
software that basically it's usually for
14:36 - 14:41

companies that ported their software to Linux
but they don't actually understand what
14:41 - 14:45

running software on Linux is like, and
so they just have a directory with all
14:45 - 14:51

their stuff in it and when those get installed
they usually get installed into /opt
14:51 - 14:56

I think those are the ones off the top of my head
14:56 - 14:58

yeah
14:58 - 15:02

And we will list these in our lecture notes
which we will produce after this lecture
15:02 - 15:04

Next question
15:04 - 15:07

Should I apt-get install a Python whatever
15:07 - 15:11

package or pip install that package
15:11 - 15:14

so this is a good question that I think at
15:14 - 15:17

a higher level this question is asking
should I use my system's package manager
15:17 - 15:21

to install things or should I use some other
package manager. Like in this case
15:21 - 15:25

one that's more specific to a particular
language. And the answer here is also
15:25 - 15:29

kind of it depends, sometimes it's nice
to manage things using a system package
15:29 - 15:32

manager so everything can be installed
and upgraded in a single place but
15:32 - 15:35

I think oftentimes whatever is available
in the system repositories the things
15:35 - 15:38

you can get via a tool like
apt-get or something similar
15:38 - 15:41

might be slightly out of date compared to
the more language specific repository
15:41 - 15:45

so for example a lot of the Python packages
I use I really want the most
15:45 - 15:48

up-to-date version and so
I use pip to install them
15:49 - 15:51

Then, to extend on that, it is
15:51 - 15:58

sometimes the case the system packages
might require some other
15:58 - 16:02

dependencies that you might not have realized
about, and it also might be
16:02 - 16:07

the case or like for some systems,
at least for like Alpine Linux they
16:07 - 16:11

don't have wheels for like a lot of the
Python packages so it will just take
16:11 - 16:15

longer to compile them, it will take more
space because they have to compile them
16:15 - 16:21

from scratch. Whereas if you just go
to pip, pip has binaries for a lot of
16:21 - 16:23

different platforms and that will probably work
16:23 - 16:29

You also should be aware that pip might not do
the exact same thing in different computers
16:29 - 16:34

So, for example, if you are in a kind of laptop
or like a desktop that is running like
16:34 - 16:39

a x86 or x86_64 you probably have binaries,
but if you're running something
16:39 - 16:43

like Raspberry Pi or some other kind of
embedded device. These are running on a
16:43 - 16:48

different kind of hardware architecture
and you might not have binaries
16:48 - 16:52

I think that's also good to take into account,
in that case it might be worthwhile to
16:52 - 16:59

use the system packages just because it
will take much less time to get them
16:59 - 17:02

than to just to compile from scratch
the entire Python installation
17:02 - 17:07

Apart from that, I don't think I can think of any exceptions
where I would actually use the system packages
17:07 - 17:09

instead of the Python provided ones
17:19 - 17:21

So, one other thing to keep in mind is that
17:21 - 17:26

sometimes you will have more than one
program on your computer and you might
17:26 - 17:30

be developing more than one program on
your computer and for some reason not
17:30 - 17:34

all programs are always built with the latest
version of things, sometimes they
17:34 - 17:39

are a little bit behind, and when you
install something system-wide you can
17:39 - 17:45

only... depends on your exact system,
but often you just have one version
17:45 - 17:50

what pip lets you do, especially combined
with something like Python's virtualenv,
17:50 - 17:55

and similar concepts exist for other
languages, where you can sort of say
17:55 - 18:00

I want to (NPM does the same thing as well
with its node modules, for example) where
18:00 - 18:06

I'm gonna compile the dependencies of
this package in sort of a subdirectory
18:06 - 18:10

of its own, and all of the versions that it
requires are going to be built in there
18:10 - 18:14

and you can do this separately for separate
projects so if they have
18:14 - 18:17

different dependencies or the same dependencies
with different versions
18:17 - 18:21

they are still sort of kept separate. And that
is one thing that's hard to achieve
18:21 - 18:23

with system packages
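
A minimal sketch of the per-project isolation described here, using Python's built-in venv module. The lecture names virtualenv; venv is used here as a standard-library stand-in. The project names are hypothetical, and with_pip=False is chosen only to keep the example fast.

```python
import os
import tempfile
import venv

workspace = tempfile.mkdtemp()

# One environment per (hypothetical) project, each in its own subdirectory.
env_dirs = {}
for project in ("project-a", "project-b"):
    env_dir = os.path.join(workspace, project, ".venv")
    venv.create(env_dir, with_pip=False)  # with_pip=True would also bootstrap pip
    env_dirs[project] = env_dir

for project, env_dir in env_dirs.items():
    # pyvenv.cfg marks the directory as a self-contained environment;
    # packages installed into it stay out of the other project's environment.
    print(project, "->", os.path.join(env_dir, "pyvenv.cfg"))
```

Each project can then install whatever versions it needs into its own environment without touching the other project or the system-wide packages.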
18:27 - 18:28

Next question
18:28 - 18:33

What are the easiest and best profiling tools
to use to improve performance of my code?
18:34 - 18:39

This is a topic we could talk
about for a very long time
18:39 - 18:43

The easiest and best is to print stuff using time
18:43 - 18:48

Like, I'm not joking, very often
the easiest thing is in your code
18:49 - 18:54

At the top you figure out what the current
time is, and then you do sort of
18:54 - 18:58

a binary search over your program of add
a print statement that prints how much
18:58 - 19:03

time has elapsed since the start of your
program and then you do that until you
19:03 - 19:06

find the segment of code that took the
longest. And then you go into that
19:06 - 19:10

function and then you do the same thing
again and you keep doing this until you
19:10 - 19:14

find roughly where the time was spent. It's
not foolproof, but it is really easy
19:14 - 19:17

and it gives you good information quickly
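
Here is a small Python sketch of that print-the-elapsed-time approach. The checkpoint helper and the two stand-in workloads are invented for illustration; time.perf_counter is the standard-library clock used for the measurement.

```python
import time

start = time.perf_counter()  # taken once, at the top of the program
elapsed_at = {}              # label -> seconds since start


def checkpoint(label):
    """Print and record how much time has passed since the program started."""
    elapsed_at[label] = time.perf_counter() - start
    print(f"{label}: {elapsed_at[label]:.3f}s since start")


def fast_step():
    return sum(range(1_000))


def slow_step():
    time.sleep(0.2)  # stand-in for the code that is actually slow


checkpoint("begin")
fast_step()
checkpoint("after fast_step")
slow_step()
checkpoint("after slow_step")

# The biggest jump between two consecutive checkpoints points at the slow
# segment; the next move is to add checkpoints inside that function.
```

Here the jump between the last two checkpoints is the large one, so slow_step is where the next round of checkpoints would go.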
19:17 - 19:25

if you do need more advanced information
Valgrind has a tool called Cachegrind?
19:25 - 19:29

Callgrind? Cachegrind? One of the two.
19:29 - 19:33

and this tool lets you run your program and
19:33 - 19:39

measure how long everything takes and
all of the call stacks, like which
19:39 - 19:43

function called which function, and what
you end up with is a really neat
19:43 - 19:47

annotation of your entire program source
with the heat of every line basically
19:47 - 19:52

how much time was spent there. It does
slow down your program by like an order
19:52 - 19:56

of magnitude or more, and it doesn't really
support threads but it is really
19:56 - 20:01

useful if you can use it. If you can't,
then tools like perf or similar tools
20:01 - 20:05

for other languages that do usually some
kind of sampling profiling like we
20:05 - 20:10

talked about in the profiler lecture, can
give you pretty useful data quickly,
20:10 - 20:15

but it's a lot of data around
this, but they're a little bit
20:15 - 20:19

biased in what kind of things they usually
highlight as a problem and it
20:19 - 20:23

can sometimes be hard to extract meaningful
information about what should
20:23 - 20:28

I change in response to them. Whereas the
sort of print approach very quickly
20:28 - 20:32

gives you like this section
of code is bad or slow
20:32 - 20:35

I think would be my answer
20:35 - 20:40

Flamegraphs are great, they're a good way
to visualize some of this information
20:41 - 20:46

Yeah I just have one thing to add,
oftentimes programming languages
20:46 - 20:49

have language specific tools for profiling
so figure out what's the
20:49 - 20:52

right tool to use for your language like if
you're doing JavaScript in the web browser
20:52 - 20:55

the web browser has a really nice tool for
doing profiling you should just use that
20:55 - 21:00

or if you are using Go, for example, Go has a built-in
profiler that is really good; you should just use that
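
Python is another case: its standard library ships a profiler called cProfile. A minimal sketch follows; the busy_work function is a made-up workload, and the choice to sort by cumulative time and show five entries is arbitrary.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Made-up workload so the profiler has something to measure."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
busy_work(200_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report lists each function with its call count and the time spent in it, which is usually enough to see where to look next.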
21:02 - 21:04

A last thing to add to that
21:04 - 21:10

Sometimes you might find that doing this binary
search over time that you're kind of
21:10 - 21:14

finding where the time is going, but this
time is sometimes happening because
21:14 - 21:18

you're waiting on the network, or you're
waiting for some file, and in that case
21:18 - 21:23

you want to make sure that the time
that is, if I want to write
21:23 - 21:27

like 1 gigabyte file or like read 1
gigabyte file and put it into memory
21:27 - 21:32

you want to check that the actual time
there, is the minimum amount of time
21:32 - 21:36

you actually have to wait. If it's ten times
longer, you should try to use some
21:36 - 21:39

other tools that we covered in the debugging
and profiling section to see
21:39 - 21:46

why you're not utilizing all your
resources because that might...
21:51 - 21:56

Because that might be a lot of what's happening,
like for example, in my research
21:56 - 21:59

in machine learning workloads, a lot of
time is loading data and you have to
21:59 - 22:03

make sure well like the time it takes to
load data is actually the minimum amount
22:03 - 22:08

of time you want to have that happening
22:08 - 22:13

And to build on that, there are actually
specialized tools for doing things like
22:13 - 22:17

analyzing wait times. Very often when
you're waiting for something what's
22:17 - 22:21

really happening is you're issuing a
system call, and that system call takes
22:21 - 22:24

some amount of time to respond. Like you do
a really large write, or a really large read
22:24 - 22:28

or you do many of them, and one thing
that can be really handy here is
22:28 - 22:32

to try to get information out of the
kernel about where your program is
22:32 - 22:37

spending its time. And so there's (it's
not new), but there's a relatively
22:37 - 22:43

newly available thing called BPF or eBPF.
Which is essentially kernel tracing
22:43 - 22:49

and you can do some really cool things with
it, and that includes tracing user programs.
22:49 - 22:52

It can be a little bit awkward to
get started with, there's a tool
22:52 - 22:56

called bpftrace that I would recommend
you look into, if you need to do like
22:56 - 23:00

this kind of low-level performance debugging.
But it is really good for this
23:00 - 23:05

kind of stuff. You can get things like
histograms over how much time was spent
23:05 - 23:07

in particular system calls
23:07 - 23:10

It's a great tool
23:12 - 23:15

What browser plugins do you use?
23:17 - 23:20

I try to use as few as I can get away with using
23:20 - 23:26

because I don't like things being in
my browser, but there are a couple of
23:26 - 23:30

ones that are sort of staples.
The first one is uBlock Origin.
23:30 - 23:37

So uBlock Origin is one of many ad blockers but
it's a little bit more than an ad blocker.
23:37 - 23:43

It is (a what do they call it?) a
network filtering tool so it lets
23:43 - 23:47

you do more things than just block ads.
It also lets you like block connections
23:47 - 23:51

to certain domains, block connections
for certain types of resources
23:51 - 23:56

So I have mine set up in what they call
the Advanced Mode, where basically
23:56 - 24:02

you can disable basically all network requests.
But it's not just Network requests,
24:02 - 24:07

It's also like I have disabled all inline
scripts on every page and all
24:07 - 24:12

third-party images and resources, and then
you can sort of create a whitelist
24:12 - 24:16

for every page so it gives you really
low-level tools around how to
24:16 - 24:20

how to improve the security of your browsing.
But you can also set it in not the
24:20 - 24:24

advanced mode, and then it does much of
the same as a regular ad blocker would
24:24 - 24:28

do, although in a fairly efficient way
if you're looking at an ad blocker it's
24:28 - 24:32

probably the one to use and it
works on like every browser
24:32 - 24:34

That would be my top pick I think,
24:39 - 24:44

I think probably the one I
use like the most actively
24:44 - 24:50

is one called Stylus. It lets you modify
the CSS or like the stylesheets
24:50 - 24:55

that webpages have. And it's pretty
neat, because sometimes you're
24:55 - 24:59

looking at a website and you want
to hide some part of the website
24:59 - 25:04

you don't care about. Like maybe an ad, maybe
some sidebar you're not finding useful
25:04 - 25:06

The thing is, at the end of
the day these things are
25:06 - 25:10

displaying in your browser, and you
have control of what code is
25:10 - 25:13

executing and similar to what Jon was
saying, like you can customize this
25:13 - 25:18

to no end, and what I have for a lot of
web pages is like hide this part, or
25:18 - 25:23

also trying to make like dark modes for
them like you can change pretty much the
25:23 - 25:27

color for every single website. And what
is actually pretty neat is that there's
25:27 - 25:31

like a repository online of people that
have contributed these stylesheets
25:31 - 25:35

for the websites. So someone probably
has (done) one for GitHub
25:35 - 25:39

Like I want dark GitHub and someone has
already contributed one that makes
25:39 - 25:45

that much more pleasing to browse. Apart
from that, one that it's not really
25:45 - 25:49

fancy, but I have found incredibly helpful
is one that just takes a screenshot of an
25:49 - 25:53

entire website. And it will
scroll for you and make a
25:53 - 25:58

compound image of the entire website and that's
really great for when you're trying to
25:58 - 26:00

print a website and it is just terrible.
26:00 - 26:01

(It's built into Firefox)
26:01 - 26:03

oh interesting
26:03 - 26:06

oh now that you mention built into Firefox,
another one that I really like about
26:06 - 26:09

Firefox is the multi account containers
26:09 - 26:11

(Oh yeah, it's fantastic)
26:11 - 26:12

Which kind of lets you
26:12 - 26:17

By default a lot of web browsers, like
for example Chrome, have this
26:17 - 26:21

notion of like there's a session that you
have, where you have all your cookies
26:21 - 26:25

and they are kind of all shared from the
different websites in the sense of
26:25 - 26:31

you keep opening new tabs and unless you go into
incognito you kind of have the same profile
26:31 - 26:34

And that profile is the same for
all websites, there is this
26:34 - 26:36

Is it an extension or is it built in?
26:36 - 26:41

(it's a mix, it's complicated)
26:41 - 26:46

So I think you actually have to say you want
to install it or enable it and again
26:46 - 26:50

the name is Multi Account Containers and
these let you tell Firefox to have
26:50 - 26:54

separate isolated sessions. So
for example, you want to say
26:54 - 26:59

I have separate sessions for whenever I
visit Google or whenever I visit Amazon
26:59 - 27:02

and that can be pretty neat, because then you can
27:02 - 27:08

At a browser level it's ensuring that no information
sharing is happening between the two of them
27:08 - 27:12

And it's much more convenient than
having to open an incognito window
27:12 - 27:14

where it's gonna clean the stuff all the time
27:14 - 27:17

(One thing to mention is Stylus vs Stylish)
27:18 - 27:20

Oh yeah, I forgot about that
27:20 - 27:25

One important thing is the browser extension
for side loading CSS Stylesheets
27:25 - 27:32

is called Stylus and that's different
from the older one that was
27:32 - 27:37

called Stylish, because that one got
bought at some point by some shady
27:37 - 27:41

company, that started abusing it not only to have
27:41 - 27:46

that functionality, but also to read your
entire browser history and send that
27:46 - 27:48

back to their servers so they could data mine it.
27:48 - 27:54

So, then people just built this open-source alternative
that is called Stylus, and that's the one
27:54 - 27:59

we recommend. That said, I think the repository
for styles is the same for the
27:59 - 28:04

two of them, but I would have
to double check that.
28:04 - 28:06

Do you have any browser plugins Anish?
28:06 - 28:09

Yes, so I also have some recommendations
for browser plugins
28:09 - 28:14

I also use uBlock Origin and I also use Stylus,
28:14 - 28:19

but one other one that I'd recommend is
integration with a password manager
28:19 - 28:22

So this is a topic that we have in
the lecture notes for the security
28:22 - 28:25

lecture, but we didn't really get to talk
about in detail. But basically password
28:25 - 28:28

managers do a really good job of increasing
your security when working
28:28 - 28:32

with online accounts, and having browser
integration with your password manager
28:32 - 28:34

can save you a lot of time like you
can open up a website then it can
28:34 - 28:37

autofill your login information for you
so you don't have to go and copy and paste it
28:37 - 28:40

back and forth between a separate program
if it's not integrated with your
28:40 - 28:43

web browser, and it can also, this integration,
can save you from certain
28:43 - 28:48

attacks that would otherwise be possible if
you were doing this manual copy pasting.
28:48 - 28:51

For example, phishing attacks. So
you find a website that looks very
28:51 - 28:54

similar to Facebook and you go to log in
with your facebook login credentials and
28:54 - 28:57

you go to your password manager and copy
paste the correct credentials into this
28:57 - 29:00

funny web site and now all of a sudden
it has your password but if you have
29:00 - 29:03

browser integration then the extension
can automatically check
29:03 - 29:07

like, am I on F A C E B O O K.com, or
is it some other domain
29:07 - 29:11

that maybe looks similar and it will not enter
the login information if it's the wrong domain
29:11 - 29:16

so browser extension for
password managing is good
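
The domain check described above can be sketched in Python roughly as follows. This is only an illustration of the idea, not how any particular password manager extension is implemented; the `saved_logins` table, the `credentials_for` function, and the example URLs are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical vault: credentials are stored per exact hostname.
saved_logins = {
    "www.facebook.com": ("alice@example.com", "correct horse battery staple"),
}

def credentials_for(url):
    """Return saved credentials only if the page's hostname matches exactly."""
    host = urlsplit(url).hostname
    return saved_logins.get(host)

# The real site matches, so autofill would proceed.
real = credentials_for("https://www.facebook.com/login")
# A look-alike domain does not match, so nothing is filled in.
fake = credentials_for("https://www.faceb00k.com/login")
```

A human copy-pasting from a separate program has to spot the difference between the two hostnames by eye, whereas an exact string comparison like this one cannot be fooled by a visually similar name.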
29:16 - 29:18

Yeah I agree
29:19 - 29:21

Next question
29:21 - 29:24

What are other useful data wrangling tools?
29:24 - 29:32

So in yesterday's lecture, I mentioned curl, so
curl is a fantastic tool for just making web
29:32 - 29:36

requests and dumping them to your terminal.
You can also use it for things
29:36 - 29:41

like uploading files which is really handy.
29:41 - 29:48

In the exercises of that lecture we also talked about
jq and pup which are command line tools that let you
29:48 - 29:53

basically write queries over JSON
and HTML documents respectively
29:53 - 30:00

that can be really handy. Other
data wrangling tools?
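
To give a feel for what "a query over a JSON document" means, here is a small Python sketch using only the standard library. The JSON data is made up for illustration; the list comprehension expresses roughly what a jq filter such as `.[] | select(.stars > 100) | .name` would.

```python
import json

# Made-up JSON document standing in for an API response.
raw = """
[
  {"name": "alpha", "stars": 250},
  {"name": "beta", "stars": 12},
  {"name": "gamma", "stars": 980}
]
"""

repos = json.loads(raw)

# Keep the names of entries with more than 100 stars.
popular = [repo["name"] for repo in repos if repo["stars"] > 100]
print(popular)
```

For one-off questions at the terminal a dedicated query tool is usually less typing; a script like this tends to pay off once the query grows beyond a line or two.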
30:00 - 30:04

Ah Perl, the Perl programming language is
30:04 - 30:08

often referred to as a write only
programming language because it's
30:08 - 30:13

impossible to read even if you wrote it.
But it is fantastic at doing just like
30:13 - 30:22

straight up text processing, like nothing
beats it there, so maybe worth learning
30:22 - 30:24

some very rudimentary Perl just
to write some of those scripts
30:24 - 30:29

It's easier often than writing some like hacked-up
combination of grep and awk and sed,
30:29 - 30:36

and it will be much faster to just hack something
up than writing it up in Python, for example
30:36 - 30:44

but apart from that, other data wrangling
30:44 - 30:47

No, not off the top of my head really
30:47 - 30:54

column -t, if you pipe any white space separated
30:54 - 30:59

input into column -t it will align all
the white space of the columns so that
30:59 - 31:06

you get nicely aligned columns that's, and
head and tail but we talked about those
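
As a rough illustration of what that alignment does, here is a minimal Python sketch. It is not the implementation of `column`, just the same idea under simplifying assumptions: fields are split on any whitespace and columns are separated by two spaces. The function name `align_columns` and the sample input are made up.

```python
def align_columns(text):
    """Pad whitespace-separated fields so that each column lines up."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    ncols = max(len(row) for row in rows)
    # The width of each column is the width of its longest field.
    widths = [
        max(len(row[i]) for row in rows if i < len(row))
        for i in range(ncols)
    ]
    return "\n".join(
        "  ".join(field.ljust(widths[i]) for i, field in enumerate(row)).rstrip()
        for row in rows
    )

sample = "name size\nnotes.txt 12\na.py 4096"
print(align_columns(sample))
```

Each field is left-justified to the width of the longest entry in its column, which is why ragged input comes out as a readable table.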
31:09 - 31:14

I think a couple of additions to that,
that I find myself using commonly
31:14 - 31:20

one is vim. Vim can be pretty useful
for like data wrangling on its own
31:20 - 31:22

Sometimes you might find that the operation
that you're trying to do is
31:22 - 31:28

hard to put down in terms of piping
different operators but if you
31:28 - 31:33

can just open the file and just record
31:33 - 31:37

a couple of quick vim macros to do what you
want it to do, it might be like much,
31:37 - 31:42

much easier. That's one, and then the other
one, if you're dealing with tabular
31:42 - 31:46

data and you want to do more complex operations
like sorting by one column,
31:46 - 31:51

then grouping and then computing some sort
of statistic, I think a lot of that
31:51 - 31:56

workload I ended up just using Python
and pandas because it's built for that
31:56 - 32:00

And one of the pretty neat features that
I find myself also using is that it
32:00 - 32:04

will export to many different formats.
So this intermediate state
32:04 - 32:09

has its own kind of pandas dataframe
object but it can
32:09 - 32:14

export to HTML, LaTeX, a lot of different
like table formats so if your end
32:14 - 32:20

product is some sort of summary table, then pandas
I think it's a fantastic choice for that
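
A minimal sketch of that kind of workflow, assuming pandas is installed. The data and column names are invented for illustration; the calls used (`sort_values`, `groupby`, `agg`, `to_html`, `to_csv`) are standard DataFrame methods.

```python
import pandas as pd

# Made-up benchmark results; the column names are illustrative.
df = pd.DataFrame(
    {
        "server": ["a", "b", "a", "b", "a"],
        "latency_ms": [120, 95, 110, 105, 130],
    }
)

# Sort by one column, group by another, then compute a statistic per group.
summary = (
    df.sort_values("latency_ms")
    .groupby("server")["latency_ms"]
    .agg(["count", "mean"])
)

# The resulting DataFrame can be exported as a summary table.
html_table = summary.to_html()
csv_table = summary.to_csv()
print(summary)
```

The intermediate `summary` object is itself a DataFrame, so the same result can be written out in several table formats without recomputing anything.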
32:21 - 32:25

I would second the vim and also
Python I think those are
32:25 - 32:29

two of my most used data wrangling tools.
For the vim one, last year we had a demo
32:29 - 32:32

in the series in the lecture notes, but
we didn't cover it in class we had a
32:32 - 32:38

demo of turning an XML file into a JSON version
of that same data using only vim macros
32:38 - 32:40

And I think that's actually the
way I would do it in practice
32:40 - 32:43

I don't want to go find a tool that does
this conversion it is actually simple
32:43 - 32:45

to encode as a vim macro,
then I just do it that way
32:45 - 32:49

And then also Python especially in an interactive
tool like a Jupyter notebook
32:49 - 32:51

is a really great way of doing data wrangling
32:51 - 32:53

A third tool I'd mention which
I don't remember if we
32:53 - 32:55

covered in the data wrangling
lecture or elsewhere
32:55 - 32:59

is a tool called pandoc which can do transformations
between different text
32:59 - 33:03

document formats so you can convert from
plaintext to HTML or HTML to markdown
33:03 - 33:07

or LaTeX to HTML or many other formats
it actually it supports a large
33:07 - 33:10

list of input formats and a
large list of output formats
33:10 - 33:16

I think there's one last one which I mentioned briefly
in the lecture on data wrangling which is
33:16 - 33:20

the R programming language, it's
an awful (I think it's an awful)
33:20 - 33:25

language to program in. And I would never
use it in the middle of a data wrangling
33:25 - 33:31

pipeline, but at the end, in order to like produce
pretty plots and statistics R is great
33:31 - 33:36

Because R is built for doing
statistics and plotting
33:36 - 33:41

there's a library for R called
ggplot which is just amazing
33:41 - 33:47

ggplot2, I guess, technically. It's
great, it produces very
33:47 - 33:51

nice visualizations and it lets you do,
it does very easily do things like
33:51 - 33:58

If you have a data set that has like multiple
facets like it's not just X and Y
33:58 - 34:03

it's like X Y Z and some other variable,
and then you want to plot like the
34:03 - 34:08

throughput grouped by all of those parameters
at the same time and produce
34:08 - 34:12

a visualization. R very easily lets you
do this and I haven't seen anywhere
34:12 - 34:15

that lets you do that as easily
34:17 - 34:18

Next question,
34:18 - 34:21

What's the difference between
Docker and a virtual machine
34:23 - 34:28

What's the easiest way to explain this? So docker
34:28 - 34:31

starts something called containers and
docker is not the only program that
34:31 - 34:37

starts containers. There are many others
and usually they rely on some feature of
34:37 - 34:40

the underlying kernel in the case of
docker they use something called LXC
34:40 - 34:48

which are Linux containers and the basic
premise there is if you want to start
34:48 - 34:53

what looks like a virtual machine that
is running roughly the same operating
34:53 - 34:57

system as you are already running on your
computer then you don't really need
34:57 - 35:05

to run another instance of the kernel
really that other virtual machine can
35:05 - 35:10

share a kernel. And you can just use the
kernels built in isolation mechanisms to
35:10 - 35:14

spin up a program that thinks it's
running on its own hardware but in
35:14 - 35:19

reality it's sharing the kernel and so this
means that containers can often run
35:19 - 35:23

with much lower overhead than a full virtual
machine will do but you should
35:23 - 35:26

keep in mind that it also has somewhat weaker
isolation because you are sharing
35:26 - 35:31

a kernel between the two if you spin up
a virtual machine the only thing that's
35:31 - 35:36

shared is sort of the hardware and to
some extent the hypervisor, whereas
35:36 - 35:41

with a docker container you're sharing
the full kernel and that is a
35:41 - 35:45

different threat model that you
might have to keep in mind
35:47 - 35:52

One other small note there: as Jon pointed
out, to use containers something
35:52 - 35:56

like Docker you need the underlying operating
system to be roughly the same
35:56 - 36:00

as whatever the program that's running
on top of the container expects and so
36:00 - 36:04

if you're using macOS for example, the
way you use docker is you run Linux
36:04 - 36:08

inside a virtual machine and then you can
run Docker on top of Linux so maybe
36:08 - 36:12

if you're going for containers in order
to get better performance you're trading
36:12 - 36:15

isolation for performance if you're running
on Mac OS that may not work out
36:15 - 36:17

exactly as expected
36:17 - 36:21

And one last note is that there
is a slight difference, so
36:21 - 36:26

with Docker and containers,
one of the gotchas you have
36:26 - 36:29

to be familiar with is that containers
are more similar to virtual
36:29 - 36:33

machines in the sense that they will
persist all the storage that you
36:33 - 36:36

have where Docker by default won't have that.
36:36 - 36:38

Like Docker is supposed to be running
36:38 - 36:42

So the main idea is like I want
to run some software and
36:42 - 36:46

I get the image and it runs and if you
want to have any kind of persistent
36:46 - 36:50

storage that links to the host system
you have to kind of manually specify
36:50 - 36:56

that, whereas a virtual machine is using
some virtual disk that is being provided
36:56 - 37:03

Next question
37:03 - 37:05

What are the advantages of each operating system
37:05 - 37:09

and how can we choose between them?
For example, choosing the best Linux
37:09 - 37:11

distribution for our purposes
37:14 - 37:17

I will say that for many, many tasks the
37:17 - 37:20

specific Linux distribution that you're
running is not that important
37:20 - 37:24

the thing is, it's just what kind of
37:24 - 37:28

knowing that there are different types
or like groups of distributions,
37:28 - 37:32

So for example, there are some distributions
that have really frequent updates
37:32 - 37:39

but they kind of break more easily. So for
example Arch Linux has a rolling update
37:39 - 37:44

way of pushing updates, where things might
break but they're fine with the things
37:44 - 37:48

being that way. Where maybe where you
have some really important web server
37:48 - 37:51

that is hosting all your business
analytics you want that thing
37:51 - 37:56

to have like a much more steady way of
updates. So that's for example why you
37:56 - 37:58

will see distributions like Debian being
37:58 - 38:03

much more conservative about what they push, or
even for example Ubuntu makes a difference
38:03 - 38:07

between the Long Term Releases
that they only update every
38:07 - 38:12

two years and the more periodic
releases of one there is a
38:12 - 38:17

it's like two a year that they make.
So, kind of knowing that there's the
38:17 - 38:21

difference apart from that some distributions
have different ways
38:21 - 38:27

of providing the binaries
to you and the way they
38:27 - 38:34

have the repositories so I think a lot of Red
Hat Linux don't want non free drivers in
38:34 - 38:37

their official repositories where I
think Ubuntu is fine with some of
38:37 - 38:42

them, apart from that I think like just
a lot of what is core to most Linux
38:42 - 38:47

distros is kind of shared between them
and there's a lot of learning in the
38:47 - 38:51

common ground. So you don't have
to worry about the specifics
38:52 - 38:56

Keeping with the theme of this class being somewhat
opinionated, I'm gonna go ahead and say
38:56 - 39:00

that if you're using Linux especially for
the first time choose something like
39:00 - 39:04

Ubuntu or Debian. So Ubuntu is a
Debian based distribution but maybe is a
39:04 - 39:07

little bit more friendly, Debian is a little
bit more minimalist. I use Debian
39:07 - 39:10

on all my servers, for example. And I use
Debian desktop on my desktop computers
39:10 - 39:15

that run Linux if you're going for maybe
trying to learn more things and you want
39:15 - 39:19

a distribution that trades stability for
having more up-to-date software maybe
39:19 - 39:22

at the expense of you having to fix a
broken distribution every once in a
39:22 - 39:27

while then maybe you can consider something
like Arch Linux or Gentoo
39:27 - 39:33

or Slackware. Oh man, I'd say that like
if you're installing Linux and just like
39:33 - 39:35

want to get work done Debian is a great choice
39:36 - 39:38

Yeah I think I agree with that.
39:38 - 39:41

The other observation is like
you could install BSD
39:41 - 39:47

BSD has gotten, has come a long way from
where it was. There's still a bunch of
39:47 - 39:51

software you can't really get for BSD but
it gives you a very well-documented
39:51 - 39:56

experience and one thing that's different
about BSD compared to Linux is
39:56 - 40:03

that in BSD, when you install BSD, you
get a full operating system, mostly
40:03 - 40:08

So many of the programs are maintained by
the same team that maintains the kernel
40:08 - 40:11

and everything is sort of upgraded together,
which is a little different
40:11 - 40:13

than how things work in the Linux world. It does
40:13 - 40:17

mean that things often move a little bit
slower. I would not use it for things
40:17 - 40:22

like gaming either, because driver support
is meh. But it is an interesting
40:22 - 40:31

environment to look at. And then for things
like Mac OS and Windows I think
40:31 - 40:36

If you are a programmer, I don't know why
you are using Windows unless you are
40:36 - 40:42

building things for Windows; or you want
to be able to do gaming and stuff
40:42 - 40:47

but in that case, maybe try dual booting,
even though that's a pain too
40:47 - 40:52

Mac OS is a good sort of middle point
between the two where you get a system
40:52 - 40:58

that is like relatively nicely polished
for you. But you still have access to
40:58 - 41:01

some of the lower-level bits
at least to a certain extent.
41:01 - 41:07

it's also really easy to dual boot Mac OS and Windows
it is not quite the case with like Mac OS and
41:07 - 41:10

Linux or Linux and Windows
41:14 - 41:16

Alright, for the rest of the
questions so these are
41:16 - 41:19

all 0 upvote questions so maybe we can go
through them quickly in the last five
41:19 - 41:23

or so minutes of class. So the next
one is Vim versus Emacs? Vim!
41:23 - 41:31

Easy answer, but a more serious answer is like I think
all three of us use vim as our primary editor
41:31 - 41:35

I use Emacs for some research specific
stuff which requires Emacs but
41:35 - 41:39

at a higher level both editors have interesting
ideas behind them and if you
41:39 - 41:43

have the time it's worth exploring both
to see which fits you better and also
41:43 - 41:47

you can use Emacs and run it in a vim
emulation mode. I actually know a
41:47 - 41:49

good number of people who do that so
they get access to some of the cool
41:49 - 41:53

Emacs functionality and some of the cool
philosophy behind that like Emacs is
41:53 - 41:55

programmable through Lisp which is kind of cool.
41:55 - 41:59

Much better than vimscript, but people like
vim's modal editing, so there's an
41:59 - 42:04

emacs plugin called evil mode which gives
you vim modal editing within Emacs so
42:04 - 42:08

it's not necessarily a binary choice you
can kind of combine both tools if you
42:08 - 42:11

want to. And it's worth exploring
both if you have the time.
42:11 - 42:13

Next question
42:13 - 42:16

Any tips or tricks for machine
learning applications?
42:19 - 42:22

I think, like knowing how
42:22 - 42:25

a lot of these tools, mainly the data wrangling
42:25 - 42:30

a lot of the shell tools, it's really
important because it seems a lot
42:30 - 42:34

of what you're doing as a machine learning
researcher is trying different things
42:34 - 42:39

but I think one core aspect of doing that,
and like a lot of scientific work is being
42:39 - 42:45

able to have reproducible results
and logging them in a sensible way
42:45 - 42:48

So for example, instead of trying to come
up with really hacky solutions of how
42:48 - 42:51

you name your folders to make
sense of the experiments
42:51 - 42:53

Maybe it's just worth having for example
42:53 - 42:56

what I do is have like a JSON
file that describes the
42:56 - 43:00

entire experiment I know like all the parameters
that are within and then I can
43:00 - 43:05

really quickly, using the tools that
we have covered, query for all the
43:05 - 43:10

experiments that have some specific
purpose or use some data set
43:10 - 43:15

Things like that. Apart from that, the other
side of this is, if you are running
43:15 - 43:20

kind of things for training machine
learning applications and you
43:20 - 43:24

are not already using some sort of
cluster, like university or your
43:24 - 43:28

company is providing and you're just kind
of manually sshing, like a lot of
43:28 - 43:31

labs do, because that's kind of the easy way
43:31 - 43:37

It's worth automating a lot of that job
because it might not seem like it but
43:37 - 43:41

manually doing a lot of these operations
takes away a lot of your time and also
43:41 - 43:45

kind of your mental energy
for running these things
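
The one-JSON-file-per-experiment idea mentioned above could look something like the following Python sketch. The helper names (`save_experiment`, `find_experiments`), the parameter names, and the directory layout are assumptions made for illustration, not the speaker's actual setup.

```python
import json
import tempfile
from pathlib import Path

def save_experiment(root, name, params):
    """Write one JSON file describing all parameters of an experiment."""
    path = Path(root) / f"{name}.json"
    path.write_text(json.dumps(params, indent=2, sort_keys=True))
    return path

def find_experiments(root, **criteria):
    """Return names of experiments whose parameters match every criterion."""
    matches = []
    for path in sorted(Path(root).glob("*.json")):
        params = json.loads(path.read_text())
        if all(params.get(key) == value for key, value in criteria.items()):
            matches.append(path.stem)
    return matches

# Hypothetical experiments; parameter names are illustrative only.
root = tempfile.mkdtemp()
save_experiment(root, "run1", {"dataset": "mnist", "lr": 0.01, "seed": 1})
save_experiment(root, "run2", {"dataset": "cifar10", "lr": 0.01, "seed": 1})
save_experiment(root, "run3", {"dataset": "mnist", "lr": 0.1, "seed": 2})

mnist_runs = find_experiments(root, dataset="mnist")
print(mnist_runs)
```

Because the description lives in a structured file rather than in a folder name, the same files can also be queried from the shell with the JSON tools covered in the data wrangling lecture.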
43:49 - 43:52

Any more vim tips?
43:52 - 43:57

I have one. So in the vim lecture we tried
not to link you to too many different
43:57 - 44:00

vim plugins because we didn't want that
lecture to be overwhelming but I think
44:00 - 44:03

it's actually worth exploring vim plugins
because there are lots and lots
44:03 - 44:07

of really cool ones out there.
One resource you can use is the
44:07 - 44:11

different instructors dotfiles like a lot
of us, I think I use like two dozen
44:11 - 44:14

vim plugins and I find a lot of them quite
helpful and I use them every day
44:14 - 44:18

we all use slightly different subsets of
them. So go look at what we use or look
44:18 - 44:22

at some of the other resources we've linked
to and you might find some stuff useful
44:23 - 44:27

A thing to add to that is, I don't think
we went into a lot of detail in the
44:27 - 44:32

lecture, correct me if I'm wrong. It's
getting familiar with the leader key
44:32 - 44:35

Which is kind of a special key
that a lot of programs will
44:35 - 44:39

especially plugins, that will link to
and for a lot of the common operations
44:39 - 44:45

vim has short ways of doing it, but you
can just figure out like quicker
44:45 - 44:50

versions for doing them. So for example, like
I know that you can do like colon WQ
44:50 - 44:56

to save and exit or that you
can do like capital ZZ but I
44:56 - 44:59

just actually just do leader (which for
me is the space) and then W. And I have
44:59 - 45:04

done that for a lot of kind of
common operations that I keep doing all
45:04 - 45:08

the time. Because just saving one keystroke
for an extremely common operation
45:08 - 45:11

is just saving thousands a month
45:11 - 45:13

Yeah just to expand a little bit
45:13 - 45:17

on what the leader key is so in vim you
can bind some keys I can do like ctrl J
45:17 - 45:20

does something like holding one key and
then pressing another I can bind that to
45:20 - 45:24

something or I can bind a single keystroke
to something. What the leader
45:24 - 45:26

key lets you do, is bind
45:26 - 45:28

So you can assign any key
to be the leader key and
45:28 - 45:33

then you can assign leader followed by
some other key to some action so for
45:33 - 45:37

example like Jose's leader key is space
and they can combine space and then
45:37 - 45:42

releasing space followed by some other
key to an arbitrary vim command so it
45:42 - 45:46

just gives you yet another way of binding
like a whole set of key combinations.
45:46 - 45:50

Leader key plus kind of any key on
the keyboard to some functionality
45:50 - 45:54

I forget whether
we covered macros in the vim
45:54 - 45:59

lecture, but vim macros are worth
learning they're not that complicated
45:59 - 46:03

but knowing that they're there and knowing
how to use them is going to save
46:03 - 46:10

you so much time. The other one is something
called marks. So in vim you can
46:10 - 46:13

press m and then any letter on your keyboard
to make a mark in that file and
46:13 - 46:18

then you can press apostrophe and the
same letter to jump back to the same
46:18 - 46:22

place. This is really useful if you're
like moving back and forth
46:22 - 46:25

between two different parts of your code
for example. You can mark one as A and
46:25 - 46:30

one as B and you can then jump between
them with tick A and tick B.
46:30 - 46:35

There's also Ctrl+O which jumps to the previous
place you were in the file no matter
46:35 - 46:41

what caused you to move. So for example
if I am on some line and then I jump
46:41 - 46:45

to B and then I jump to A, Ctrl+O will
take me back to B and then back to the
46:45 - 46:49

place I originally was. This can also be
handy for things like if you're doing a
46:49 - 46:53

search then the place that you
started the search is a part of
46:53 - 46:56

that stack. So I can do a search I can
then like step through the results
46:56 - 47:01

and like change them and then Ctrl+O
all the way back up to the search
47:01 - 47:06

Ctrl+O also lets you move across files so
if I go from one file to somewhere else in
47:06 - 47:10

a different file and somewhere else in the
first file Ctrl+O will move me back
47:10 - 47:15

through that stack and then there's
Ctrl+I to move forward in that
47:15 - 47:21

stack and so it's not as though you
pop it and it goes away forever
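
The back-and-forward behaviour described here can be pictured with a toy model in Python. This is a deliberate simplification for illustration, not how Vim actually stores its jump list (the real one has more rules); the class and method names are made up.

```python
class JumpList:
    """Toy model: a list of visited positions plus an index into it."""

    def __init__(self, start):
        self.positions = [start]
        self.index = 0

    def jump(self, position):
        # Simplification: record the new position at the end of the list.
        self.positions.append(position)
        self.index = len(self.positions) - 1

    def back(self):
        # Like Ctrl+O: step to the previous position, if there is one.
        if self.index > 0:
            self.index -= 1
        return self.positions[self.index]

    def forward(self):
        # Like Ctrl+I: step forward again; going back did not delete anything.
        if self.index < len(self.positions) - 1:
            self.index += 1
        return self.positions[self.index]

jumps = JumpList("line 10")
jumps.jump("mark B")
jumps.jump("mark A")
```

Moving back only moves the index, which is why stepping forward again is always possible in this model.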
47:21 - 47:27

The command colon earlier is really handy.
So, colon earlier gives you an earlier
47:27 - 47:33

version of the same file and it does
this based on time not based on actions
47:33 - 47:37

so for example if you press a bunch of like
undo and redo and make some changes
47:37 - 47:43

and stuff, earlier will take a literally
earlier as in time version of your file
47:43 - 47:47

and restore it to your buffer. This can
sometimes be good if you like undid and
47:47 - 47:51

then rewrote something and then realize
you actually wanted the version that was
47:51 - 47:55

there before you started undoing; earlier
lets you do this. And there's a plug-in
47:55 - 48:02

called undo tree or something like
that There are several of these,
48:02 - 48:06

that let you actually explore the full
tree of undo history that vim keeps
48:06 - 48:09

because it doesn't just keep a linear history
it actually keeps the full tree
48:09 - 48:13

and letting you explore that might in
some cases save you from having to
48:13 - 48:16

re-type stuff you typed in the past or
stuff you just forgot exactly what you
48:16 - 48:21

had there that used to work and no longer
works. And there is one final one I
48:21 - 48:27

want to mention which is, we mentioned
how in vim you have verbs and nouns
48:27 - 48:33

right, so you have verbs like delete or yank
and then you have nouns like next of
48:33 - 48:37

this character or percent to swap brackets
and that sort of stuff the
48:37 - 48:45

search command is a noun so you can do
things like D slash and then a string
48:45 - 48:50

and it will delete up to the next match
of that pattern this is extremely useful
48:50 - 48:54

and I use it all the time
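
Going back to the point about undo history being a tree rather than a line: the difference can be sketched with a small data structure in Python. This is a toy model under simplifying assumptions (whole-buffer snapshots, no timestamps), not Vim's internal representation; the class names are invented.

```python
class UndoNode:
    """One state of the text, linked to its parent and to later edits."""

    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent
        self.children = []

class UndoTree:
    def __init__(self, text=""):
        self.root = UndoNode(text)
        self.current = self.root

    def edit(self, new_text):
        # Every edit adds a child, so editing after an undo starts a branch
        # instead of overwriting the state that was undone.
        node = UndoNode(new_text, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def undo(self):
        if self.current.parent is not None:
            self.current = self.current.parent

    def all_states(self):
        """Return every text state ever reached, in depth-first order."""
        states, stack = [], [self.root]
        while stack:
            node = stack.pop()
            states.append(node.text)
            stack.extend(reversed(node.children))
        return states

tree = UndoTree("")
tree.edit("hello")
tree.edit("hello world")   # first attempt
tree.undo()                # back to "hello"
tree.edit("hello there")   # rewrite; the first attempt stays in the tree
print(tree.all_states())
```

With a purely linear undo stack the "hello world" state would be gone after the rewrite; in the tree it survives as a sibling branch that can still be visited.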
48:58 - 49:04

One other neat addition on the undo stuff
that I find incredibly valuable in
49:04 - 49:08

an everyday basis is that like one of
the built-in functionalities of vim
49:08 - 49:14

is that you can specify an undo directory
and if you have a specified an
49:14 - 49:18

undo directory by default vim, if you
don't have this enabled, whenever you
49:18 - 49:23

enter a file your undo history is
clean, there's nothing in there
49:23 - 49:26

and as you make changes and then
undo them you kind of create this
49:26 - 49:33

history but as soon as you exit the
file that's lost. Sorry, as soon
49:33 - 49:37

as you exit vim, that's lost. However
if you have an undodir, vim is
49:37 - 49:42

gonna persist all those changes into
this directory so no matter how many
49:42 - 49:46

times you enter and leave that history
is persisted and it's incredibly
49:46 - 49:48

helpful because even like
49:48 - 49:50

it can be very helpful for
some files that you modify
49:50 - 49:55

often because then you can kind of keep
the flow. But it's also sometimes really
49:55 - 50:00

helpful if you modify your bashrc and
something broke like five days later and
50:00 - 50:03

then you open it in vim again. Like, what actually
did I change? If you don't
50:03 - 50:07

have say like version control, then
you can just check the undos and
50:07 - 50:11

that's actually what happened. And
the last one, it's also really
50:11 - 50:15

worth familiarizing yourself with registers
and what different special
50:15 - 50:20

registers vim uses. So for example if
you want to copy/paste really that's
50:20 - 50:26

going into a specific register and if you
want to for example use the OS copy
50:26 - 50:30

like the OS clipboard, you should
be copying or yanking
50:30 - 50:36

copying and pasting from a different register
and there's a lot of them and yeah
50:36 - 50:41

I think that you should explore, there's
a lot of things to know about registers
50:42 - 50:45

The next question is asking about two-factor
authentication and I'll just give
50:45 - 50:48

a very quick answer to this one in the interest
of time. So it's worth using two
50:48 - 50:52

factor auth for anything security sensitive
so I use it for my GitHub
50:52 - 50:57

account and for my email and stuff like
that. And there's a bunch of different
50:57 - 51:01

types of two-factor auth. From SMS based
two-factor auth where you get special
51:01 - 51:05

like a number texted to you when you try
to log in you have to type that number
51:05 - 51:09

and to other tools like universal two
factor, this is like those YubiKeys
51:09 - 51:11

that you plug into your computer, you have
to tap it every time you login
51:11 - 51:18

so not all, (yeah Jon is holding a
Yubikey), not all two-factor auth is
51:18 - 51:22

created equal and you really want to be
using something like U2F rather than SMS
51:22 - 51:25

based two-factor auth. There's something
based on one-time pass codes that you
51:25 - 51:29

have to type in we don't have time to get
into the details of why some methods
51:29 - 51:32

are better than others but at a high
level use U2F and the Internet has
51:32 - 51:38

plenty of explanations for why other
methods are not a great idea
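
For the one-time passcode variant mentioned above, the codes are commonly derived with the TOTP scheme from RFC 6238, which builds on HOTP from RFC 4226. The Python sketch below shows that derivation using only the standard library. It assumes this is the scheme meant here, and it is an illustration of how the numbers come about, not a recommendation of this method over U2F; the secret is the published RFC test value, not a real one.

```python
import hashlib
import hmac
import struct
import time

def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low nibble picks the offset
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)

def totp(secret, now=None, step=30, digits=6):
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)

# Shared secret from the RFC test vectors; real secrets come from the service.
secret = b"12345678901234567890"
print(totp(secret))
```

The code depends only on a shared secret and the current time, so anyone who gets hold of a code (for example through a phishing page) can use it while it is still valid.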
51:38 - 51:42

Last question, any comments on differences
between web browsers?
51:48 - 51:50

Yes
51:55 - 52:00

Differences between web browsers, there
are fewer and fewer differences between
52:00 - 52:06

web browsers these days. At this point
almost all web browsers are Chrome
52:06 - 52:10

Either because you're using Chrome or
because you're using a browser that's
52:10 - 52:16

using the same browser engine as Chrome.
It's a little bit sad, one might say, but
52:16 - 52:21

I think these days whether you choose
52:21 - 52:24

Chrome is a great browser for security reasons
52:24 - 52:28

if you want to have something
that's more customizable or
52:28 - 52:39

you don't want to be tied to Google then
use Firefox, don't use Safari it's a
52:39 - 52:46

worse version of Chrome. The new Internet
Explorer, Edge, is pretty decent and also
52:46 - 52:51

uses the same browser engine as
Chrome and that's probably fine
52:51 - 52:55

although avoid it if you can because it
has some like legacy modes you don't
52:55 - 52:58

want to deal with. I think that's
52:58 - 53:03

Oh, there's a cool new browser called Flow
53:03 - 53:06

that you can't use for anything useful
yet but they're actually writing
53:06 - 53:09

their own browser engine and that's really neat
53:09 - 53:15

Firefox also has this project called Servo which is
they're reimplementing their browser engine
53:15 - 53:20

in Rust in order to write it to be like
super concurrent and what they've done
53:20 - 53:25

is they've started to take modules
from that version and port them
53:25 - 53:29

over to Gecko or integrate them with Gecko
which is the main browser engine
53:29 - 53:32

for Firefox just to get those
speed ups there as well
53:32 - 53:37

and that's a neat thing
you can be watching out for
53:39 - 53:42

That is all the questions, hey we did it. Nice
53:42 - 53:51

I guess thanks for taking the missing semester
class and let's do it again next year

Title:: Lecture 11: Q&A (2020)
Video Language:: English
Duration:: 53:53

	Hamir Mahal edited English subtitles for Lecture 11: Q&A (2020)
	Amara Bot edited English subtitles for Lecture 11: Q&A (2020)
