
One very powerful place where you can use regularization is in regression.

Regularization is a method for

automatically penalizing the extra features that you use in your model.

So, let me make this a little bit more concrete.

There's a type of regularized regression called Lasso regression.

And here's the rough formula for Lasso regression.

A regular linear regression would say,

I just want to minimize the sum of the squared errors in my fit.

I want to minimize the distance between my fit and

any given data point, or the square of that distance.

What Lasso regression says is, yes, we want a small sum of squared errors.

But, in addition to minimizing the sum of the squared errors,

I also want to minimize the number of features that I'm using.

And so I'm going to add in a second term here, in which I

have a penalty parameter multiplied by the coefficients of my regression.

So this is basically the term that describes how many features I'm using.
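In standard notation, the objective being described can be sketched like this (a sketch, not necessarily the exact formula shown in the lesson), where λ is the penalty parameter and the β_j are the regression coefficients. Strictly speaking, the Lasso penalty is the sum of the absolute values of the coefficients rather than a literal count of features, but it acts as a stand-in for that count because unhelpful coefficients get driven to exactly zero:

```latex
\min_{\beta}\;
\underbrace{\sum_{i=1}^{n}\Bigl(y_i-\sum_{j=1}^{m}\beta_j x_{ij}\Bigr)^{2}}_{\text{sum of squared errors}}
\;+\;
\underbrace{\lambda\sum_{j=1}^{m}\lvert\beta_j\rvert}_{\text{penalty term}}
```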

So, here's the result of this formulation.

When I'm performing my fit, I'm considering both the errors that come from that

fit, and also the number of features that are being used.

And so let's say I'm comparing two different fits

that have different numbers of features in them.

The one that has more features included

will almost certainly have a smaller sum of squared errors,

because it can fit the points more precisely.

But I pay a penalty for using that extra feature.

And that comes in through the second term: the penalty parameter and

the regression coefficient that I'm going to get for

that additional feature that I'm using.

And so what this is saying is that the gain I get in precision,

the goodness of fit of my regression, has to be bigger than the loss

that I take as a result of having that additional feature in my regression.

So this precisely formulates, in a mathematical way, the trade-off between

having small errors and having a simpler fit that uses fewer features.
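As a tiny numerical sketch of that trade-off (the numbers here are made up for illustration), compare two fits under an objective of the form SSE + penalty × sum of |coefficients|:

```python
penalty = 1.0  # the penalty parameter (assumed value)

# Fit A: one feature
sse_a, coefs_a = 4.0, [2.0]
# Fit B: an extra feature buys a slightly smaller sum of squared errors
sse_b, coefs_b = 3.8, [2.0, 0.9]

obj_a = sse_a + penalty * sum(abs(c) for c in coefs_a)
obj_b = sse_b + penalty * sum(abs(c) for c in coefs_b)

# The extra feature only pays off if the drop in SSE (0.2 here)
# exceeds the penalty it adds (0.9 here) -- so the simpler fit wins.
print(obj_a, obj_b)
```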

And so what Lasso regression does

is automatically take this penalty into account.

And in so doing, it helps you figure out which features

have the most important effect on your regression.

And once it's found those features, it can actually eliminate, or

set to zero, the coefficients of the features that basically don't help.
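You can see this zeroing-out behavior with scikit-learn's `Lasso` (a minimal sketch; the synthetic data and the chosen `alpha` value are my own assumptions, not from the lesson):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: y really depends only on the first feature;
# the second feature is pure noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)

# alpha plays the role of the penalty parameter in the formula
reg = Lasso(alpha=0.5)
reg.fit(X, y)

# The noise feature's coefficient gets set to exactly zero,
# while the informative feature keeps a large coefficient.
print(reg.coef_)
```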