WEBVTT

00:00:00.794 --> 00:00:01.666
- [Narrator] In the last video we were

00:00:01.666 --> 00:00:03.543
looking at this particular function.

00:00:03.543 --> 00:00:04.977
It's a very non linear function.

00:00:04.977 --> 00:00:06.960
And we were picturing
it as a transformation

00:00:06.960 --> 00:00:10.641
that takes every point x,
y in space to the point

00:00:10.641 --> 00:00:13.401
x plus sign y, y plus sign of x.

00:00:13.401 --> 00:00:16.634
And moreover, we zoomed
in on a specific point.

00:00:16.634 --> 00:00:17.731
And let me actually write down what

00:00:17.731 --> 00:00:20.958
point we zoomed in on, it was (-2,1).

00:00:20.958 --> 00:00:25.125
That's something we're gonna
want to record here (-2,1).

00:00:25.978 --> 00:00:27.948
And I added couple extra
grid lines around it

00:00:27.948 --> 00:00:30.531
just so we can see in detail
what the transformation

00:00:30.531 --> 00:00:33.803
does to points that are in the
neighborhood of that point.

00:00:33.803 --> 00:00:35.732
And over here, this
square shows the zoomed

00:00:35.732 --> 00:00:37.773
in version of that neighborhood.

00:00:37.773 --> 00:00:39.853
And what we saw is that even though the

00:00:39.853 --> 00:00:41.716
function as a whole, as a transformation,

00:00:41.716 --> 00:00:44.565
looks rather complicated,
around that one point,

00:00:44.565 --> 00:00:47.426
it looks like a linear function.

00:00:47.426 --> 00:00:49.706
It's locally linear so
what I'll show you here

00:00:49.706 --> 00:00:52.907
is what matrix is gonna
tell you the linear

00:00:52.907 --> 00:00:55.131
function that this looks like.

00:00:55.131 --> 00:00:57.730
And this is gonna be kind
of two by two matrix.

00:00:57.730 --> 00:01:00.347
I'll make a lot of room
for ourselves here.

00:01:00.347 --> 00:01:02.788
It'll be a two by two matrix and the way

00:01:02.788 --> 00:01:05.658
to think about it is to first go back

00:01:05.658 --> 00:01:07.823
to our original setup
before the transformation.

00:01:07.823 --> 00:01:10.334
And think of just a
tiny step to the right.

00:01:10.334 --> 00:01:13.576
What I'm gonna think of
as a little, partial x.

00:01:13.576 --> 00:01:16.118
A tiny step in the x direction.

00:01:16.118 --> 00:01:18.301
And what that turns into
after the transformation

00:01:18.301 --> 00:01:21.733
is gonna be some tiny
step in the output space.

00:01:21.733 --> 00:01:23.457
And here let me actually kind of draw on

00:01:23.457 --> 00:01:25.743
what that tiny step turned into.

00:01:25.743 --> 00:01:27.150
It's no longer purely in the x direction.

00:01:27.150 --> 00:01:28.486
It has some rightward component.

00:01:28.486 --> 00:01:30.357
But now also some downward component.

00:01:30.357 --> 00:01:32.620
And to be able to represent
this in a nice way,

00:01:32.620 --> 00:01:35.366
what I'm gonna do is
instead of writing the

00:01:35.366 --> 00:01:37.137
entire function as something with

00:01:37.137 --> 00:01:39.081
a vector valued output, I'm gonna go ahead

00:01:39.081 --> 00:01:43.248
and represent this as a two
separate scalar value functions.

00:01:44.116 --> 00:01:48.283
I'm gonna write the scalar
value functions f1 of x, y.

00:01:50.073 --> 00:01:52.964
So I'm just giving a
name to x plus sign y.

00:01:52.964 --> 00:01:56.083
And f2 of x, y, again all I'm doing is

00:01:56.083 --> 00:02:00.348
giving a name to the functions
we already have written down.

00:02:00.348 --> 00:02:02.296
When I look at this vector, the

00:02:02.296 --> 00:02:04.334
consequence of taking a tiny d, x step

00:02:04.334 --> 00:02:06.812
in the input space that corresponds to

00:02:06.812 --> 00:02:09.423
some two d movement in the output space.

00:02:09.423 --> 00:02:11.189
And the x component of that movement.

00:02:11.189 --> 00:02:12.896
Right if I was gonna draw this out

00:02:12.896 --> 00:02:15.976
and say hey, what's the x
component of that movement.

00:02:15.976 --> 00:02:18.071
That's something we think of as a little

00:02:18.071 --> 00:02:22.238
partial change in f1, the
x component of our output.

00:02:23.179 --> 00:02:24.560
And if we divide this, if we take you know

00:02:24.560 --> 00:02:26.913
partial f1 divided by the size of that

00:02:26.913 --> 00:02:28.717
initial tiny change, it basically scales

00:02:28.717 --> 00:02:31.408
it up to be a normal sized vector.

00:02:31.408 --> 00:02:32.938
Not a tiny nudge but something that's more

00:02:32.938 --> 00:02:34.480
constant that doesn't shrink as we

00:02:34.480 --> 00:02:36.617
zoom in further and further.

00:02:36.617 --> 00:02:39.149
And then similarly the
change in the y direction,

00:02:39.149 --> 00:02:40.784
right the vertical component of that step

00:02:40.784 --> 00:02:43.185
that was still caused by the dx.

00:02:43.185 --> 00:02:45.493
Right, it's still caused by that initial

00:02:45.493 --> 00:02:46.933
step to the right, that is gonna be

00:02:46.933 --> 00:02:49.516
the tiny, partial change in f2.

00:02:50.607 --> 00:02:52.667
The y component of the output cause

00:02:52.667 --> 00:02:54.527
here we're all just
looking in the output space

00:02:54.527 --> 00:02:58.694
that was caused by a partial
change in the x direction.

00:03:00.749 --> 00:03:02.220
And again I kind of
like to think about this

00:03:02.220 --> 00:03:03.957
we're dividing by a tiny amount.

00:03:03.957 --> 00:03:07.065
This partial f2 is really
a tiny, tiny nudge.

00:03:07.065 --> 00:03:08.847
But by dividing by the size of the initial

00:03:08.847 --> 00:03:11.210
tiny nudge that caused it, we're getting

00:03:11.210 --> 00:03:12.483
something that's basically a number.

00:03:12.483 --> 00:03:13.471
Something that doesn't shrink when

00:03:13.471 --> 00:03:15.592
we consider more and
more zoomed in versions.

00:03:15.592 --> 00:03:17.517
So that, that's all what happens when

00:03:17.517 --> 00:03:19.922
we take a tiny step in the x direction.

00:03:19.922 --> 00:03:23.066
But another thing you could
do, another thing you can

00:03:23.066 --> 00:03:25.960
consider is a tiny step
in the y direction.

00:03:25.960 --> 00:03:27.835
Right cause we wanna know, hey, if

00:03:27.835 --> 00:03:31.153
you take a single step
some tiny unit upward,

00:03:31.153 --> 00:03:35.320
what does that turn into
after the transformation.

00:03:37.644 --> 00:03:41.584
And what that looks like is this vector

00:03:41.584 --> 00:03:43.398
that still has some upward component.

00:03:43.398 --> 00:03:45.328
But it also has a rightward component.

00:03:45.328 --> 00:03:46.413
And now I'm gonna write its components

00:03:46.413 --> 00:03:48.815
as the second column of the matrix.

00:03:48.815 --> 00:03:50.092
Because as we know when
you're representing

00:03:50.092 --> 00:03:52.439
a linear transformation with a matrix,

00:03:52.439 --> 00:03:54.012
the first column tells you where the first

00:03:54.012 --> 00:03:55.596
basis vector goes and the second column

00:03:55.596 --> 00:03:58.195
shows where the second basis vector goes.

00:03:58.195 --> 00:03:59.756
If that feels unfamiliar, either

00:03:59.756 --> 00:04:01.245
check out the refresher video or

00:04:01.245 --> 00:04:03.959
maybe go and look at some of
the linear algebra content.

00:04:03.959 --> 00:04:06.017
But to figure out the
coordinates of this guy,

00:04:06.017 --> 00:04:08.085
we do basically the same thing.

00:04:08.085 --> 00:04:10.671
Let's say first of all, the
change in the x direction

00:04:10.671 --> 00:04:14.682
here, the x component
of this nudge vector.

00:04:14.682 --> 00:04:18.921
That's gonna be given as a
partial change to f1, right,

00:04:18.921 --> 00:04:21.077
to the x component of the output.

00:04:21.077 --> 00:04:22.271
Here we're looking in the outputs base.

00:04:22.271 --> 00:04:25.241
We're dealing with f1, f1 and f2

00:04:25.241 --> 00:04:26.713
and we're asking what that change was

00:04:26.713 --> 00:04:30.880
that was caused by a tiny
change in the y direction.

00:04:32.140 --> 00:04:35.315
So the change in f1 caused
by some tiny step in the y

00:04:35.315 --> 00:04:38.981
direction divided by the
size of that tiny step.

00:04:38.981 --> 00:04:41.811
And then the y component
of our output here.

00:04:41.811 --> 00:04:43.891
The y component of the
step in the outputs base

00:04:43.891 --> 00:04:45.888
that was caused by the initial tiny

00:04:45.888 --> 00:04:47.876
step upward in the input space.

00:04:47.876 --> 00:04:50.376
Well that is the change of f2,

00:04:52.343 --> 00:04:55.537
second component of our
output as caused by dy.

00:04:55.537 --> 00:04:58.953
As caused by that little partial y.

00:04:58.953 --> 00:05:00.477
And of course all of this is very specific

00:05:00.477 --> 00:05:02.545
to the point that we started at right.

00:05:02.545 --> 00:05:05.447
We started at the point (-2,1).

00:05:05.447 --> 00:05:07.527
So each of these partial derivatives

00:05:07.527 --> 00:05:09.287
is something that really we're saying,

00:05:09.287 --> 00:05:12.714
don't take the function,
evaluate it at the point (2,-1),

00:05:12.714 --> 00:05:16.506
and when you evaluate each one of these

00:05:16.506 --> 00:05:18.900
at the point (2,-1)
you'll get some number.

00:05:18.900 --> 00:05:19.972
And that will give you a very

00:05:19.972 --> 00:05:21.753
concrete two by two matrix that's gonna

00:05:21.753 --> 00:05:24.065
represent the linear
transformation that this

00:05:24.065 --> 00:05:26.947
guy looks like once you've zoomed in.

00:05:26.947 --> 00:05:28.268
So this matrix here that's full of all

00:05:28.268 --> 00:05:31.582
of the partial derivatives
has a very special name.

00:05:31.582 --> 00:05:35.553
It's called as you may
have guessed, the Jacobian.

00:05:35.553 --> 00:05:38.775
Or more fully you'd call
it the Jacobian Matrix.

00:05:38.775 --> 00:05:39.980
And one way to think about it is that it

00:05:39.980 --> 00:05:43.186
carries all of the partial
differential information right.

00:05:43.186 --> 00:05:45.489
It's taking into account
both of these components

00:05:45.489 --> 00:05:48.069
of the output and both possible inputs.

00:05:48.069 --> 00:05:49.240
And giving you a kind of a grid of

00:05:49.240 --> 00:05:51.194
what all the partial derivatives are.

00:05:51.194 --> 00:05:52.260
But as I hope you see, it's much

00:05:52.260 --> 00:05:54.526
more than just a way of recording

00:05:54.526 --> 00:05:56.402
what all the partial derivatives are.

00:05:56.402 --> 00:05:57.893
There's a reason for organizing it

00:05:57.893 --> 00:05:59.786
like this in particular and it really

00:05:59.786 --> 00:06:02.439
does come down to this
idea of local linearity.

00:06:02.439 --> 00:06:04.541
If you understand that the Jacobian Matrix

00:06:04.541 --> 00:06:06.148
is fundamentally supposed to represent

00:06:06.148 --> 00:06:08.689
what a transformation
looks like when you zoom

00:06:08.689 --> 00:06:11.462
in near a specific point,
almost everything else

00:06:11.462 --> 00:06:13.999
about it will start to fall in place.

00:06:13.999 --> 00:06:14.981
And in the next video, I'll go ahead

00:06:14.981 --> 00:06:16.429
and actually compute this just to

00:06:16.429 --> 00:06:18.135
show you what the process looks like.

00:06:18.135 --> 00:06:19.383
And how the result we get kind of

00:06:19.383 --> 00:06:20.760
matches with the picture we're

00:06:20.760 --> 00:06:21.593
looking at, see you then.