## Scatterplot Transformation - Data Analysis with R

Showing Revision 2 created 05/25/2016 by Udacity Robot.

1. Now that we have a better understanding of our variables, and
2. the overall demand for diamonds, let's replot the data. This time
3. we'll put price on a log10 scale, and here's what it
4. looks like. This plot looks better than before. On the log
5. scale, the prices look less dispersed at the high end of
6. Carat size and price, but actually we can do better. Let's
7. try using the cube root of Carat in light of our
8. speculation about flaws being exponentially more
9. likely in diamonds with more volume.
10. Remember, volume is on a cubic scale. First, we need
11. a function to transform the Carat variable. If you'd like to
13. the links in the instructor notes. This may seem like a
14. lot of code, but really, there's only one new piece here.
15. It's this cube root trans function. It's a function that takes the
16. cube root of any input variable, and it also has an
17. inverse function to undo that operation, which we need to display
18. the plot correctly. Then, when we get to
19. our actual ggplot command, what we'll do is
20. use the scale_x_continuous function to transform the x
21. axis with this cube root transformation function. Keep in
22. mind we're also transforming the y axis with
23. this log10 transformation that we discussed previously. And, let's
24. see what this plot looks like. Taking a
25. look at the plot, we can actually see that,
26. with these transformations that we used to get
27. our data on this nice scale, things look almost
28. linear. We can now move forward and see
29. about modelling our data using just a linear model.
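
The steps described in the transcript can be sketched roughly as follows. This is a minimal sketch and not necessarily the code shown in the video: it assumes the `diamonds` dataset that ships with ggplot2, and the function name `cuberoot_trans`, the point styling, and the axis breaks are illustrative choices. It builds the transformation with `trans_new()` from the scales package, which takes both a transform and its inverse so that the axis labels can be shown in the original carat units.

```r
library(ggplot2)
library(scales)

# Cube root transformation plus its inverse. The name `cuberoot_trans`
# is an assumption made for this sketch.
cuberoot_trans <- function() {
  trans_new(
    name      = "cuberoot",
    transform = function(x) x^(1 / 3),
    inverse   = function(x) x^3
  )
}

# Price on a log10 scale against the cube root of carat.
# Note: recent ggplot2 releases prefer `transform =` over `trans =`;
# `trans =` is used here because older versions only accept it.
p <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.5, size = 0.75) +
  scale_x_continuous(trans = cuberoot_trans(),
                     breaks = c(0.2, 0.5, 1, 2, 3)) +
  scale_y_continuous(trans = log10_trans(),
                     breaks = c(350, 1000, 5000, 10000, 15000)) +
  ggtitle("Price (log10) by cube root of carat")

print(p)
```

If the relationship looks roughly linear on these scales, as the transcript suggests, that supports trying a linear model on the transformed variables next.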