-
Title:
Algorithms Requiring Rescaling Solution - Intro to Machine Learning
-
Description:
-
So Katie, what do you think?
-
Which ones are the right answers here?
-
>> The right answers, the ones that do need rescaled features, will
-
be the SVM and the k-means clustering.
-
>> So in both support vector machines and k-means clustering, you're
-
really trading off one dimension against the other when you calculate the distance.
-
So take, for example, support vector machines.
-
And you look at the separation line that maximizes distance.
-
In there, you calculate a distance.
-
And that distance calculation trades off one dimension against the other.
-
So if we make one feature twice as big as the other, it counts for twice as much.
-
The same is true, incidentally, for
-
k-means clustering, where you have a cluster center.
-
And you compute the distance from the cluster center to all the data points.
-
And that distance has exactly the same characteristic.
-
If you make one variable twice as big, it's going to count for twice as much.
-
So, as a result, support vector machines and
-
k-means both are affected by feature rescaling.
-
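Here is a minimal sketch of that effect, with made-up numbers: both of these algorithms boil down to a Euclidean distance, and doubling one feature's scale makes that feature count for more in the distance.

import numpy as np

# Two hypothetical points that differ by 1.0 in each feature.
a = np.array([0.0, 0.0])
b = np.array([1.0, 1.0])
print(np.linalg.norm(a - b))  # ~1.414; both features count equally

# Double the scale of feature 0 only.
scale = np.array([2.0, 1.0])
print(np.linalg.norm(a * scale - b * scale))  # ~2.236; feature 0 now counts for more
-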
So, Katie, tell me about the decision trees and linear regression.
-
Why aren't they included?
-
>> Decision trees aren't going to give you a diagonal line like that, right?
-
They're going to give you a series of vertical and horizontal lines.
-
So there's no trade-off.
-
You just make a cut in one direction, and then a cut in another.
-
So, you don't have to worry about what's going on in one dimension,
-
when you're doing something with the other one.
-
>> So if you squeeze this little area over here to half the size,
-
because you rescale the feature where the line lies,
-
well, the line will lie in a different place, but
-
the separation is effectively the same as before.
-
It scales with it,
-
so there's no trade-off between these two different variables.
-
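A quick sketch of that invariance, assuming scikit-learn and a tiny made-up dataset: rescaling a feature just moves the tree's cut by the same factor, so the predictions don't change.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: the class only depends on whether feature 0 is above ~2.5.
X = np.array([[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 6.0]])
y = np.array([0, 0, 1, 1])
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Double feature 0 everywhere; the tree's cut just moves with it.
X2 = X * np.array([2.0, 1.0])
tree2 = DecisionTreeClassifier(random_state=0).fit(X2, y)

print(tree.predict(X))    # [0 0 1 1]
print(tree2.predict(X2))  # [0 0 1 1]; same cuts, same predictions
-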
And how about linear regression?
-
>> Something similar happens in linear regression.
-
Remember that in linear regression,
-
each of our features is going to have a coefficient that's associated with it.
-
And that coefficient and that feature always go together.
-
What's going on with feature A doesn't affect anything with
-
the coefficient of feature B.
-
So they're separated in the same way.
-
>> In fact, if you were to double the scale of one specific variable,
-
the coefficient of that feature will just become half as big.
-
And the output would be exactly the same as before.
-
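And a minimal check of that claim, again with scikit-learn and made-up numbers: doubling a feature's scale halves its fitted coefficient, so the predicted outputs stay exactly the same.

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data following y = 3 * x exactly.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 6.0, 9.0, 12.0])
reg = LinearRegression().fit(X, y)
print(reg.coef_)  # [3.]

# Double the feature's scale; the fitted coefficient halves to compensate.
reg2 = LinearRegression().fit(X * 2.0, y)
print(reg2.coef_)  # [1.5]
print(np.allclose(reg.predict(X), reg2.predict(X * 2.0)))  # True; same outputs
-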
So it's really interesting to see that for some algorithms,
-
rescaling really matters and we should use it; for others, don't even bother.