
Title:
2901 Find_Weight

Description:

In this unit here, we'll have fun.

Somehow in the last couple of days on Facebook,

a discussion came up about what Sebastian's weight is.

And I decided that, rather than telling people how much I weigh, I'd turn this into statistics.

And up front, I want you to put together everything we've done so far

using programming. Since programming has been optional in this class,

consider this unit optional, but it'd be great if you had a chance to try it.

It's not that hard and at the end of the day you'll know something about me

that I rarely discuss in public.

After a comment I made in class, a discussion erupted in our

Facebook STATS 101 discussion group about what my actual weight is.

And here is the form that I posted.

They were asked to submit their best estimate of how much I weigh in kilograms,

and also to submit how much they thought I weighed a year ago.

And within a few hours, there was a good number of guesses

including this one over here that's about as much as the planet Pluto weighs

and also some negative guesses.

Each of these two is the negative weight of Pluto.

But other than that, there were lots of really good guesses.

And you can see in kilograms, some people think I weigh 80 or 65, others think I weigh 250.

I took the good guesses and added them into a large list called weight.

There are just under 100 of them, and now I want to do some statistics on them.

The very first thing I did is I printed the mean estimate and it turns out to be negative.

It's -2.10x10²⁰, and that's a typical situation in statistics.

When you look at those numbers, most of them are actually pretty good guesses.

But these extreme guesses of 10²² or -10²² over here completely

distort the actual statistics.
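To make the effect concrete, here is a minimal sketch comparing the mean with the median. The guesses are made up for illustration and are not the actual class data; the two extreme values are simply chosen to be Pluto-sized.

```python
from statistics import median

# Hypothetical guesses in kilograms (NOT the real class data):
# mostly plausible values, plus one Pluto-sized guess and one negative one.
guesses = [65, 70, 75, 80, 85, 90, 250, 1.3e22, -2.6e22]

mean_guess = sum(guesses) / len(guesses)   # dominated by the two extreme values
median_guess = median(guesses)             # stays among the plausible guesses

print(mean_guess)
print(median_guess)
```

With data like this, the mean lands at a huge negative number even though most guesses are reasonable, while the median is barely affected.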

Now, you've learned how to deal with this. You know everything about statistics.

What I want you to do now is to code a piece of software called calculate_weight

that does three things, and I think you can do all three of them yourself.

First, I want you to remove the outliers by only extracting data

between the lower and upper quartile.

It turns out the number of data points makes it well defined

what the lower and upper quartiles are.

And all the test cases we'll be using will have the property

that the lower and upper quartiles are well-defined elements.
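One way this first step might look is sketched below. The function name `middle_half` is hypothetical, and the sketch assumes two things the lecture does not spell out: that the list has 4k + 3 elements (so both quartiles are actual elements of the sorted list), and that the quartile elements themselves are kept.

```python
def middle_half(data):
    """Return the sorted values from the lower to the upper quartile, inclusive.

    Assumption: len(data) == 4*k + 3, so both quartiles are real elements.
    """
    n = len(data)
    if n % 4 != 3:
        raise ValueError("expected 4*k + 3 data points")
    ordered = sorted(data)
    lower = (n - 3) // 4     # index of the lower quartile in the sorted list
    upper = n - 1 - lower    # index of the upper quartile (mirror image)
    return ordered[lower:upper + 1]


# Hypothetical guesses, including one Pluto-sized and one negative outlier.
print(middle_half([250, 80, 1.3e22, 65, -1.3e22, 75, 90]))
```

If the intended convention excludes the quartile elements, the slice bounds would shift by one on each side.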

Then, I want you to fit a Gaussian using the maximum likelihood estimator.
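As a reminder, the maximum likelihood estimate for a Gaussian is the sample mean together with the variance that divides by n (not n - 1). A minimal sketch, with the hypothetical name `fit_gaussian_mle` and made-up input values:

```python
from math import sqrt


def fit_gaussian_mle(data):
    """Return (mu, sigma), the maximum likelihood Gaussian fit to data."""
    n = len(data)
    mu = sum(data) / n
    # MLE variance: average squared deviation, dividing by n rather than n - 1.
    variance = sum((x - mu) ** 2 for x in data) / n
    return mu, sqrt(variance)


# Hypothetical trimmed guesses in kilograms.
mu, sigma = fit_gaussian_mle([70, 80, 90])
print(mu, sigma)
```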

And from there, I want you to compute the value x that corresponds

to the standard score z. So I'll be giving you not just the weight data

but also the standard score that tells you where my actual weight is.

If you plug in the standard score of -2, which is two standard deviations below the mean

of the data that we will estimate, you'll find out my actual weight, which I measured this morning.

It's amazingly accurate.

But the data that you guys provided for this definitely overestimated my weight,

I'm happy to report, by two standard deviations.
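Going from a standard score back to a value just inverts z = (x - mu) / sigma, which gives x = mu + z * sigma. A small sketch, where the name `value_from_z` and the numbers are hypothetical and not the fitted class data:

```python
def value_from_z(mu, sigma, z):
    """Return the value x whose standard score is z under mean mu and std sigma."""
    return mu + z * sigma


# Hypothetical fit: mean 80 kg, standard deviation 5 kg.
# A standard score of -2 is two standard deviations below the mean.
print(value_from_z(80.0, 5.0, -2))
```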

All these formulas are known,

and I think you have all the coding skills necessary from the past to fill these gaps.

Obviously, the first step is the hardest.

And when you're done with it, this command over here will give you the correct answer.