
Inside OKCupid: The math of online dating - Christian Rudder

  • 0:18 - 0:20
    Hello, my name is Christian Rudder,
  • 0:20 - 0:22
    and I was one of the founders of OkCupid.
  • 0:22 - 0:25
    It's now one of the biggest
    dating sites in the United States.
  • 0:25 - 0:27
    Like most everyone at the site,
    I was a math major.
  • 0:27 - 0:31
    As you may expect, we're known
    for the analytic approach we take to love.
  • 0:31 - 0:32
    We call it our matching algorithm.
  • 0:32 - 0:35
    Basically, OkCupid's matching
    algorithm helps us decide
  • 0:35 - 0:37
    whether two people should go on a date.
  • 0:37 - 0:39
    We built our entire business around it.
  • 0:39 - 0:41
    Now, algorithm is a fancy word,
  • 0:41 - 0:43
    and people like to drop it
    like it's this big thing.
  • 0:43 - 0:46
    But really, an algorithm
    is just a systematic,
  • 0:46 - 0:48
    step-by-step way to solve a problem.
  • 0:48 - 0:50
    It doesn't have to be fancy at all.
  • 0:50 - 0:51
    Here in this lesson,
  • 0:51 - 0:54
    I'm going to explain how we arrived
    at our particular algorithm,
  • 0:54 - 0:56
    so you can see how it's done.
  • 0:56 - 0:58
    Now, why are algorithms even important?
  • 0:58 - 0:59
    Why does this lesson even exist?
  • 0:59 - 1:03
    Well, notice one very significant
    phrase I used above:
  • 1:03 - 1:05
    they are a step-by-step
    way to solve a problem,
  • 1:05 - 1:08
    and as you probably know, computers
    excel at step-by-step processes.
  • 1:08 - 1:10
    A computer without an algorithm
  • 1:10 - 1:13
    is basically an expensive paperweight.
  • 1:13 - 1:16
    And since computers are such
    a pervasive part of everyday life,
  • 1:16 - 1:17
    algorithms are everywhere.
  • 1:19 - 1:22
    The math behind OkCupid's matching
    algorithm is surprisingly simple.
  • 1:22 - 1:26
    It's just some addition, multiplication,
    a little bit of square roots.
  • 1:26 - 1:28
    The tricky part in designing it
  • 1:28 - 1:30
    was figuring out how to take
    something mysterious,
  • 1:30 - 1:31
    human attraction,
  • 1:31 - 1:34
    and break it into components
    that a computer can work with.
  • 1:34 - 1:37
    The first thing we needed
    to match people up was data,
  • 1:37 - 1:39
    something for the algorithm to work with.
  • 1:39 - 1:42
    The best way to get data quickly
    from people is to just ask for it.
  • 1:42 - 1:45
    So we decided that OkCupid
    should ask users questions,
  • 1:45 - 1:47
    stuff like, "Do you want
    to have kids one day?"
  • 1:47 - 1:49
    "How often do you brush your teeth?"
  • 1:49 - 1:50
    "Do you like scary movies?"
  • 1:51 - 1:53
    And big stuff like,
    "Do you believe in God?"
  • 1:54 - 1:57
    Now, a lot of the questions
    are good for matching like with like,
  • 1:57 - 1:59
    that is, when both people
    answer the same way.
  • 1:59 - 2:02
    For example, two people
    who are both into scary movies
  • 2:02 - 2:05
    are probably a better match
    than one person who is and one who isn't.
  • 2:05 - 2:07
    But what about a question like,
  • 2:07 - 2:09
    "Do you like to be
    the center of attention?"
  • 2:09 - 2:11
    If both people in a relationship
    are saying yes to this,
  • 2:11 - 2:13
    they're going to have massive problems.
  • 2:13 - 2:15
    We realized this early on,
  • 2:15 - 2:18
    and so we decided we needed
    a bit more data from each question.
  • 2:18 - 2:21
    We had to ask people to specify
    not only their own answer,
  • 2:21 - 2:23
    but the answer they wanted
    from someone else.
  • 2:23 - 2:25
    That worked really well.
  • 2:25 - 2:26
    But we needed one more dimension.
  • 2:26 - 2:29
    Some questions tell you more
    about a person than others.
  • 2:29 - 2:32
    For example, a question
    about politics, something like,
  • 2:32 - 2:35
    "Which is worse:
    book burning or flag burning?"
  • 2:35 - 2:37
    might reveal more about someone
    than their taste in movies.
  • 2:37 - 2:40
    And it doesn't make sense
    to weigh all things equally,
  • 2:40 - 2:42
    so we added one final data point.
  • 2:42 - 2:44
    For everything that OkCupid asks you,
  • 2:44 - 2:47
    you have a chance to tell us
    the role it plays in your life.
  • 2:47 - 2:49
    And this ranges
    from irrelevant to mandatory.
  • 2:49 - 2:53
    So now, for every question,
    we have three things for our algorithm:
  • 2:53 - 2:54
    first, your answer;
  • 2:55 - 2:59
    second, how you want someone else --
    your potential match -- to answer;
  • 2:59 - 3:02
    and third, how important
    the question is to you at all.
  • 3:03 - 3:04
    With all this information,
  • 3:04 - 3:07
    OkCupid can figure out
    how well two people will get along.
  • 3:07 - 3:10
    The algorithm crunches the numbers
    and gives us a result.
  • 3:10 - 3:11
    As a practical example,
  • 3:11 - 3:14
    let's look at how we'd match you
    with another person.
  • 3:14 - 3:15
    Let's call him "B."
  • 3:16 - 3:20
    Your match percentage with B is based
    on questions you've both answered.
  • 3:20 - 3:22
    Let's call that set
    of common questions "s."
  • 3:23 - 3:25
    As a very simple example,
    we use a small set "s"
  • 3:25 - 3:27
    with just two questions in common,
  • 3:27 - 3:28
    and compute a match from that.
  • 3:28 - 3:30
    Here are our two example questions.
  • 3:30 - 3:33
    The first one, let's say, is,
    "How messy are you?"
  • 3:33 - 3:35
    And the answer possibilities are:
  • 3:35 - 3:38
    very messy, average and very organized.
  • 3:38 - 3:40
    And let's say you answered
    "very organized,"
  • 3:40 - 3:43
    and you'd like someone else
    to answer "very organized,"
  • 3:43 - 3:45
    and the question is very important to you.
  • 3:45 - 3:47
    Basically, you're a neat freak.
  • 3:47 - 3:50
    You're neat, you want someone else
    to be neat, and that's it.
  • 3:50 - 3:52
    And let's say B is a little bit different.
  • 3:52 - 3:54
    He answered "very organized" for himself,
  • 3:54 - 3:57
    but "average" is OK with him
    as an answer from someone else,
  • 3:57 - 3:59
    and the question is only
    a little important to him.
  • 3:59 - 4:02
    Let's look at the second question,
    from our previous example:
  • 4:02 - 4:04
    "Do you like to be
    the center of attention?"
  • 4:04 - 4:06
    The answers are "yes" and "no."
  • 4:06 - 4:09
    You've answered "no," you want
    someone else to answer "no,"
  • 4:09 - 4:11
    and the question is only
    a little important to you.
  • 4:11 - 4:13
    Now B, he's answered "yes."
  • 4:13 - 4:15
    He wants someone else to answer "no,"
  • 4:15 - 4:17
    because he wants the spotlight on him,
  • 4:17 - 4:19
    and the question is somewhat
    important to him.
  • 4:19 - 4:21
    So, let's try to compute all of this.
  • 4:22 - 4:24
    Our first step is, since we use
    computers to do this,
  • 4:24 - 4:26
    we need to assign numerical values
  • 4:26 - 4:29
    to ideas like "somewhat
    important" and "very important,"
  • 4:29 - 4:31
    because computers need
    everything in numbers.
  • 4:31 - 4:34
    We at OkCupid decided
    on the following scale:
  • 4:34 - 4:36
    "Irrelevant" is worth 0.
  • 4:36 - 4:38
    "A little important" is worth 1.
  • 4:39 - 4:40
    "Somewhat important" is worth 10.
  • 4:41 - 4:43
    "Very important" is 50.
  • 4:43 - 4:46
    And "absolutely mandatory" is 250.
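As a rough sketch, the scale just listed can be written as a lookup table. The point values are the ones given in the talk; the names `IMPORTANCE_POINTS` and `importance_to_points` are made up for illustration and are not OkCupid's actual code.

```python
# Importance labels and point values as listed in the talk.
# The dictionary and function names are hypothetical.
IMPORTANCE_POINTS = {
    "irrelevant": 0,
    "a little important": 1,
    "somewhat important": 10,
    "very important": 50,
    "absolutely mandatory": 250,
}


def importance_to_points(label: str) -> int:
    """Return the point value for an importance label (case-insensitive)."""
    return IMPORTANCE_POINTS[label.strip().lower()]


if __name__ == "__main__":
    print(importance_to_points("Very important"))
```

The gaps between values widen as importance rises, so one "very important" question outweighs many "a little important" ones.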
  • 4:46 - 4:49
    Next, the algorithm makes
    two simple calculations.
  • 4:49 - 4:52
    The first is: How much did
    B's answers satisfy you?
  • 4:52 - 4:56
    That is, how many possible points
    did B score on your scale?
  • 4:56 - 4:59
    Well, you indicated that B's answer
    to the first question,
  • 4:59 - 5:00
    about messiness,
  • 5:00 - 5:02
    was very important to you.
  • 5:02 - 5:04
    It's worth 50 points and B got that right.
  • 5:04 - 5:06
    The second question is worth only 1,
  • 5:06 - 5:08
    because you said
    it was only a little important.
  • 5:08 - 5:10
    B got that wrong,
  • 5:10 - 5:12
    so B's answers were 50
    out of 51 possible points.
  • 5:12 - 5:15
    That's 98% satisfactory. Pretty good.
  • 5:15 - 5:19
    The second question the algorithm
    looks at is: How much did you satisfy B?
  • 5:19 - 5:22
    Well, B placed 1 point on your answer
    to the messiness question
  • 5:22 - 5:24
    and 10 on your answer to the second.
  • 5:25 - 5:28
    Of those 11, that's 1 plus 10,
    you earned 10 --
  • 5:28 - 5:31
    you guys satisfied each other
    on the second question.
  • 5:31 - 5:35
    So your answers were 10 out of 11
    equals 91 percent satisfactory to B.
  • 5:35 - 5:36
    That's not bad.
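The two calculations above can be sketched in a few lines. This is an illustration under stated assumptions, not OkCupid's implementation: the `Response` record, the `satisfaction` function and the question keys are hypothetical names, and B's accepted answer on messiness is taken to be only "average", which is what the 10-out-of-11 figure in the talk implies.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One person's data for one question (hypothetical structure)."""
    answer: str       # the person's own answer
    wanted: set       # answers they accept from a potential match
    importance: int   # points on the 0 / 1 / 10 / 50 / 250 scale


def satisfaction(judge: dict, other: dict) -> float:
    """Fraction of the judge's possible points that the other person earned."""
    possible = sum(r.importance for r in judge.values())
    earned = sum(
        r.importance
        for question, r in judge.items()
        if other[question].answer in r.wanted
    )
    # Assumption: if every question is irrelevant to the judge, report 0.
    return earned / possible if possible else 0.0


# The two example questions from the talk.
you = {
    "messy": Response("very organized", {"very organized"}, 50),
    "attention": Response("no", {"no"}, 1),
}
b = {
    "messy": Response("very organized", {"average"}, 1),
    "attention": Response("yes", {"no"}, 10),
}

if __name__ == "__main__":
    print(f"B satisfies you: {satisfaction(you, b):.0%}")
    print(f"You satisfy B:   {satisfaction(b, you):.0%}")
```

With these inputs the two fractions are 50/51 and 10/11, matching the figures in the talk.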
  • 5:36 - 5:39
    The final step is to take
    these two match percentages
  • 5:39 - 5:41
    and get one number for the both of you.
  • 5:41 - 5:43
    To do this, the algorithm
    multiplies your scores,
  • 5:43 - 5:45
    then takes the nth root,
  • 5:45 - 5:47
    where "n" is the number of questions.
  • 5:47 - 5:50
    Because s, which is the number
    of questions in this sample,
  • 5:50 - 5:52
    is only 2,
  • 5:52 - 5:56
    we have: match percentage
    equals the square root
  • 5:56 - 5:58
    of 98 percent times 91 percent.
  • 5:58 - 6:00
    That equals 94 percent.
  • 6:00 - 6:04
    That 94 percent is your match
    percentage with B.
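The final step can be checked with a short sketch. It assumes the two satisfaction scores from the example (50 out of 51 and 10 out of 11) and uses the square root as in the talk's worked example; the function name `match_percentage` is hypothetical.

```python
import math


def match_percentage(sat_a: float, sat_b: float) -> float:
    """Combine two satisfaction scores (each 0..1) by multiplying them
    and taking the square root, as in the two-question example."""
    return math.sqrt(sat_a * sat_b)


if __name__ == "__main__":
    score = match_percentage(50 / 51, 10 / 11)
    print(f"{score:.0%}")  # prints 94%
```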
  • 6:04 - 6:07
    It's a mathematical expression
    of how happy you'd be with each other,
  • 6:07 - 6:08
    based on what we know.
  • 6:08 - 6:10
    Now, why does the algorithm multiply,
  • 6:10 - 6:13
    as opposed to, say, average
    the two match scores together,
  • 6:13 - 6:14
    and do the square-root business?
  • 6:14 - 6:17
    In general, this formula
    is called the geometric mean.
  • 6:17 - 6:19
    It's a great way to combine
    values that have wide ranges
  • 6:20 - 6:21
    and represent very different properties.
  • 6:21 - 6:24
    In other words, it's perfect
    for romantic matching.
  • 6:24 - 6:27
    You've got wide ranges and you've got
    tons of different data points,
  • 6:27 - 6:31
    like I said, about movies, politics,
    religion -- everything.
  • 6:31 - 6:32
    Intuitively, too, this makes sense.
  • 6:32 - 6:35
    Two people satisfying
    each other 50 percent
  • 6:35 - 6:39
    should be a better match
    than two others who satisfy 0 and 100,
  • 6:39 - 6:41
    because affection needs to be mutual.
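A small comparison shows why multiplying works better than averaging here. The two pairs are the ones mentioned in the talk (50 and 50 percent versus 0 and 100 percent); the function names are made up for illustration.

```python
import math


def arithmetic_mean(a: float, b: float) -> float:
    """Plain average of two satisfaction scores."""
    return (a + b) / 2


def geometric_mean(a: float, b: float) -> float:
    """Square root of the product of two satisfaction scores."""
    return math.sqrt(a * b)


if __name__ == "__main__":
    for pair in [(0.5, 0.5), (0.0, 1.0)]:
        print(pair, arithmetic_mean(*pair), geometric_mean(*pair))
```

The plain average rates both pairs at 0.5, while the geometric mean gives 0.5 to the mutual pair and 0 to the one-sided pair.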
  • 6:41 - 6:44
    After adding a little correction
    for margin of error,
  • 6:44 - 6:46
    in the case where we have
    a small number of questions,
  • 6:46 - 6:48
    like we do in this example,
  • 6:48 - 6:49
    we're good to go.
  • 6:49 - 6:51
    Any time OkCupid matches two people,
  • 6:51 - 6:53
    it goes through the steps
    we just outlined.
  • 6:53 - 6:55
    First it collects data about your answers,
  • 6:55 - 6:58
    then it compares your choices
    and preferences to other people's
  • 6:58 - 7:00
    in simple, mathematical ways.
  • 7:00 - 7:03
    This, the ability to take
    real-world phenomena
  • 7:03 - 7:05
    and make them something
    a microchip can understand,
  • 7:05 - 7:09
    is, I think, the most important skill
    anyone can have these days.
  • 7:09 - 7:11
    Like you use sentences
    to tell a story to a person,
  • 7:11 - 7:14
    you use algorithms
    to tell a story to a computer.
  • 7:14 - 7:17
    If you learn the language,
    you can go out and tell your stories.
  • 7:17 - 7:19
    I hope this will help you do that.
Title:
Inside OKCupid: The math of online dating - Christian Rudder
Speaker:
Christian Rudder
Description:

View full lesson: http://ed.ted.com/lessons/inside-okcupid-the-math-of-online-dating-christian-rudder

When two people join a dating website, they are matched according to shared interests and how they answer a number of personal questions. But how do sites calculate the likelihood of a successful relationship? Christian Rudder, one of the founders of popular dating site OKCupid, details the algorithm behind 'hitting it off.'

Lesson by Christian Rudder, animation by TED-Ed.

Video Language:
English
Team:
closed TED
Project:
TED-Ed
Duration:
07:31
