Ordered Factors - Data Analysis with R

  • 0:00 - 0:04
    Let's look more closely at these factor variables. For
  • 0:04 - 0:07
    now I want to draw your attention to the age.range variable
  • 0:07 - 0:10
    right here. Notice that it says that we have a
  • 0:10 - 0:13
    factor variable with seven different levels. We can examine the
  • 0:13 - 0:17
    levels of a variable by typing in the command levels
  • 0:17 - 0:19
    and then putting in the variable right here. In the
  • 0:19 - 0:22
    console we can see the seven levels of the age.range
  • 0:22 - 0:26
    variable. Now, instead of creating a table of the age.range
  • 0:26 - 0:29
    variable, let's create a plot that shows how many users
  • 0:29 - 0:32
    are in each bin. That is, we want to figure out how
  • 0:32 - 0:34
    many surveyed respondents are between the ages of 18 and
  • 0:34 - 0:38
    24, 25 and 34, and so on. I'm going to create this
  • 0:38 - 0:42
    plot using the ggplot2 package, and the qplot function that
  • 0:42 - 0:45
    comes with it. Again, don't worry about understanding this code too
  • 0:45 - 0:48
    much, we'll have practice with this in the next lesson.
  • 0:48 - 0:51
    When I run this code, I get my plot over here.
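A minimal sketch of the two steps just described: inspecting the levels, then counting respondents per bin. The file name of the Reddit survey data is not given here, so the sketch uses a small made-up data frame named `reddit`. Only the 18-24, 25-34, and under-18 groups are named in the video; the other level labels and all of the counts are assumptions. The chart uses base R's `barplot()` so that it runs without extra packages; the `qplot()` call in the comment is a guess at what the video's code might look like.

```r
# Hypothetical stand-in for the Reddit survey data (labels and counts are assumed).
ages <- c("18-24", "25-34", "Under 18", "35-44", "18-24", "25-34",
          "45-54", "18-24", "55-64", "65 or Above", "Under 18", "25-34")
reddit <- data.frame(age.range = factor(ages))

# factor() sorts levels alphabetically by default, so "Under 18" lands last.
age_levels <- levels(reddit$age.range)
print(age_levels)

# Count respondents in each bin; table() follows the level order.
age_counts <- table(reddit$age.range)
print(age_counts)

# Bar chart of the counts. With ggplot2 installed, the video's approach might
# look like: qplot(x = age.range, data = reddit)
barplot(age_counts, las = 2, ylab = "Respondents")
```

Because the bars follow the level order, the "Under 18" bar ends up at the far right, which is the issue the video points out next.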
  • 0:51 - 0:53
    Zooming in on this plot, I want you to notice
  • 0:53 - 0:56
    that the age groups appear to be in order. This is
  • 0:56 - 0:59
    true for everyone except the survey takers who are under
  • 0:59 - 1:02
    the age of 18. Now, it would be really helpful if
  • 1:02 - 1:05
    this bar were oriented over here. That way we
  • 1:05 - 1:09
    could make comparisons across the groups more easily. Now this is
  • 1:09 - 1:13
    why we would want to have ordered factors. The variable age.range is just
  • 1:13 - 1:16
    a factor with seven levels, but these levels aren't arranged in
  • 1:16 - 1:20
    any particular order. Sometimes we want to introduce order into our
  • 1:20 - 1:23
    data set, so that we can make more readable plots.
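One way to introduce that order, sketched under assumptions: the level labels below are hypothetical apart from the groups named in the video, and the data vector is made up. The idea is to pass the levels to `factor()` in the order you want and set `ordered = TRUE`.

```r
# Hypothetical age labels; only some of them are named in the video.
ages <- c("18-24", "25-34", "Under 18", "35-44", "18-24", "65 or Above")

# Default factor: levels are sorted alphabetically, so "Under 18" comes last.
unordered <- factor(ages)
print(levels(unordered))

# Ordered factor: spell out the levels from youngest to oldest.
age_order <- c("Under 18", "18-24", "25-34", "35-44",
               "45-54", "55-64", "65 or Above")
age_range <- factor(ages, levels = age_order, ordered = TRUE)

print(levels(age_range))
print(is.ordered(age_range))

# Ordered factors support comparisons between levels.
younger_than_25 <- age_range < "25-34"
print(younger_than_25)

# Counts now follow the chosen order, including bins with no respondents.
print(table(age_range))
```

A plot built from `age_range` would draw its bars in this order, with "Under 18" first, which makes comparisons across neighboring age groups easier.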
  • 1:23 - 1:26
    So, knowing a little bit about ordered factors, let's see
  • 1:26 - 1:30
    if you can answer this next question. If you haven't already
  • 1:30 - 1:32
    done so, download the Reddit survey data and look at
  • 1:32 - 1:36
    its structure. After you've looked at the structure of the variables,
  • 1:36 - 1:39
    try and answer this question. Which of these variables in
  • 1:39 - 1:42
    the data set could also be converted to an ordered factor,
  • 1:42 - 1:44
    just like age.range?
  • 1:44 - 1:46
    >> Check any of the variables that apply.
Title:
Ordered Factors - Data Analysis with R
Video Language:
English
Team:
Udacity
Project:
UD651: Exploratory Data Analysis
Duration:
01:47
