
Inside OKCupid: The math of online dating - Christian Rudder

  • 0:18 - 0:19
    Hello, my name is Christian Rudder,
  • 0:19 - 0:22
    and I was one of the founders of OK Cupid.
  • 0:22 - 0:25
    It's now one of the biggest dating sites in the United States.
  • 0:25 - 0:26
    Like almost everyone at the site,
  • 0:26 - 0:27
    I was a math major, and, as you might expect,
  • 0:27 - 0:29
    we're known for the analytic approach
  • 0:29 - 0:30
    we have taken to love.
  • 0:30 - 0:32
    We call it our matching algorithm.
  • 0:32 - 0:33
    Basically OK Cupid's matching algorithm
  • 0:33 - 0:36
    helps us decide whether two people should go on a date.
  • 0:36 - 0:39
    We built our entire business around it.
  • 0:39 - 0:41
    Now, algorithm is a fancy word,
  • 0:41 - 0:43
    and people like to drop it like it's this big thing,
  • 0:43 - 0:45
    but, really, an algorithm is just a systematic,
  • 0:45 - 0:48
    step-by-step way to solve a problem.
  • 0:48 - 0:50
    It doesn't have to be fancy at all.
  • 0:50 - 0:52
    Here, in this lesson, I'm going to explain
  • 0:52 - 0:54
    how we arrived at our particular algorithm
  • 0:54 - 0:56
    so you can see how it's done.
  • 0:56 - 0:58
    Now, why are algorithms even important?
  • 0:58 - 0:59
    Why does this lesson even exist?
  • 0:59 - 1:02
    Well, notice one very significant phrase I used above:
  • 1:02 - 1:05
    they are a step-by-step way to solve a problem,
  • 1:05 - 1:06
    and, as you probably know,
  • 1:06 - 1:08
    computers excel at step-by-step processes.
  • 1:08 - 1:10
    A computer without an algorithm
  • 1:10 - 1:13
    is basically an expensive paperweight.
  • 1:13 - 1:15
    And since computers are such a pervasive part of everyday life,
  • 1:15 - 1:17
    algorithms are everywhere.
  • 1:19 - 1:20
    The math behind OK Cupid's matching algorithm
  • 1:20 - 1:22
    is surprisingly simple.
  • 1:22 - 1:23
    It's just some addition,
  • 1:23 - 1:24
    multiplication,
  • 1:24 - 1:25
    a little bit of square roots.
  • 1:25 - 1:28
    The tricky part in designing it, though,
  • 1:28 - 1:30
    was figuring out how to take something mysterious,
  • 1:30 - 1:31
    human attraction,
  • 1:31 - 1:34
    and break it into components that a computer can work with.
  • 1:34 - 1:36
    Well, the first thing we needed to match people up was data,
  • 1:36 - 1:38
    something for the algorithm to work with.
  • 1:38 - 1:40
    The best way to get data quickly from people
  • 1:40 - 1:42
    is to just ask for it.
  • 1:42 - 1:44
    So, we decided that OK Cupid should ask users questions,
  • 1:44 - 1:47
    stuff like, "Do you want to have kids one day?"
  • 1:47 - 1:49
    and "How often do you brush your teeth?",
  • 1:49 - 1:50
    "Do you like scary movies?"
  • 1:50 - 1:54
    and big stuff like "Do you believe in God?"
  • 1:54 - 1:55
    Now, a lot of the questions are good
  • 1:55 - 1:56
    for matching like with like,
  • 1:56 - 1:59
    that is, when both people answer the same way.
  • 1:59 - 2:01
    For example, two people who are both into scary movies
  • 2:01 - 2:03
    are probably a better match
  • 2:03 - 2:04
    than one person who is
  • 2:04 - 2:05
    and one person who isn't.
  • 2:05 - 2:06
    But what about a question like,
  • 2:06 - 2:08
    "Do you like to be the center of attention?"
  • 2:08 - 2:11
    If both people in a relationship are saying yes to this,
  • 2:11 - 2:13
    then they are going to have massive problems.
  • 2:13 - 2:14
    We realized this early on,
  • 2:14 - 2:16
    and so we decided we needed
  • 2:16 - 2:18
    a bit more data from each question.
  • 2:18 - 2:20
    We had to ask people to specify not only their own answer,
  • 2:20 - 2:23
    but the answer they wanted from someone else.
  • 2:23 - 2:24
    That worked really well,
  • 2:24 - 2:26
    but we needed one more dimension.
  • 2:26 - 2:29
    Some questions tell you more about a person than others.
  • 2:29 - 2:32
    For example, a question about politics, something like,
  • 2:32 - 2:35
    "Which is worse: book burning or flag burning?"
  • 2:35 - 2:37
    might reveal more about someone than their taste in movies.
  • 2:37 - 2:39
    And it doesn't make sense to weigh all things equally,
  • 2:39 - 2:42
    so we added one final data point.
  • 2:42 - 2:43
    For everything that OK Cupid asks you,
  • 2:43 - 2:45
    you have a chance to tell us
  • 2:45 - 2:46
    the role it plays in your life,
  • 2:46 - 2:49
    and this ranges from irrelevant to mandatory.
  • 2:49 - 2:51
    So now, for every question,
  • 2:51 - 2:53
    we have three things for our algorithm:
  • 2:53 - 2:54
    first, your answer;
  • 2:54 - 2:56
    second, how you want someone else,
  • 2:56 - 2:57
    your potential match,
  • 2:57 - 2:59
    to answer;
  • 2:59 - 3:02
    and third, how important the question is to you at all.
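The three pieces of data per question can be pictured as one small record. This is only an illustrative sketch: the class name `QuestionResponse` and its field names are hypothetical, and modeling the wanted answer as a single string is an assumption made for simplicity, not a description of OK Cupid's actual storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionResponse:
    """One user's data for one question (hypothetical structure)."""
    own_answer: str     # first: your answer
    wanted_answer: str  # second: how you want a potential match to answer
    importance: str     # third: how important the question is to you


# Example: someone who is neat and wants a neat partic match.
neat_freak = QuestionResponse(
    own_answer="very organized",
    wanted_answer="very organized",
    importance="very important",
)
print(neat_freak)
```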
  • 3:02 - 3:04
    With all this information,
  • 3:04 - 3:07
    OK Cupid can figure out how well two people will get along.
  • 3:07 - 3:09
    The algorithm crunches the numbers and gives us a result.
  • 3:09 - 3:11
    As a practical example,
  • 3:11 - 3:14
    let's look at how we'd match you with another person,
  • 3:14 - 3:16
    let's call him, "B".
  • 3:16 - 3:17
    Your match percentage with B is based on
  • 3:17 - 3:19
    questions you've both answered.
  • 3:19 - 3:22
    Let's call that set of common questions, "s".
  • 3:22 - 3:25
    As a very simple example, we use a small set "s"
  • 3:25 - 3:26
    with just two questions in common
  • 3:26 - 3:28
    and compute a match from that.
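One way to picture the set "s" is as the intersection of the questions each person has answered. In this sketch the dictionaries and the extra, non-shared questions are made up for illustration; only the two shared questions come from the example in this lesson.

```python
# Hypothetical answer sheets: question text -> that person's own answer.
your_answers = {
    "How messy are you?": "very organized",
    "Do you like to be the center of attention?": "no",
    "Do you like scary movies?": "yes",  # B has not answered this one
}
b_answers = {
    "How messy are you?": "very organized",
    "Do you like to be the center of attention?": "yes",
    "Do you want to have kids one day?": "yes",  # you have not answered this one
}

# s is the set of questions both people have answered.
s = your_answers.keys() & b_answers.keys()
print(sorted(s))
```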
  • 3:28 - 3:30
    Here are our two example questions.
  • 3:30 - 3:32
    The first one, let's say, is, "How messy are you?"
  • 3:32 - 3:35
    and the answer possibilities are
  • 3:35 - 3:36
    very messy,
  • 3:36 - 3:36
    average,
  • 3:36 - 3:38
    and very organized.
  • 3:38 - 3:40
    And let's say you answered "very organized,"
  • 3:40 - 3:43
    and you'd like someone else to answer "very organized,"
  • 3:43 - 3:45
    and the question is very important to you.
  • 3:45 - 3:46
    Basically you are a neat freak.
  • 3:46 - 3:47
    You're neat,
  • 3:47 - 3:48
    you want someone else to be neat,
  • 3:48 - 3:49
    and that's it.
  • 3:49 - 3:51
    And let's say B is a little bit different.
  • 3:51 - 3:54
    He answered very organized for himself,
  • 3:54 - 3:55
    but average is OK with him
  • 3:55 - 3:57
    as an answer from someone else,
  • 3:57 - 3:59
    and the question is only a little important to him.
  • 3:59 - 4:00
    Let's look at the second question,
  • 4:00 - 4:02
    it's the one from our previous example:
  • 4:02 - 4:04
    "Do you like to be the center of attention?"
  • 4:04 - 4:05
    The answers are just yes and no.
  • 4:05 - 4:06
    Now you've answered "no,"
  • 4:06 - 4:08
    how you want someone else to answer is "no,"
  • 4:08 - 4:11
    and the question is only a little important to you.
  • 4:11 - 4:12
    Now B, he's answered "yes,"
  • 4:12 - 4:14
    he wants someone else to answer "no,"
  • 4:14 - 4:16
    because he wants the spotlight on him,
  • 4:16 - 4:19
    and the question is somewhat important to him.
  • 4:19 - 4:22
    So, let's try to compute all of this.
  • 4:22 - 4:23
    Our first step is,
  • 4:23 - 4:24
    since we use computers to do this,
  • 4:24 - 4:26
    we need to assign numerical values
  • 4:26 - 4:29
    to ideas like "somewhat important" and "very important"
  • 4:29 - 4:31
    because computers need everything in numbers.
  • 4:31 - 4:34
    We at OK Cupid decided on the following scale:
  • 4:34 - 4:36
    irrelevant is worth 0,
  • 4:36 - 4:38
    a little important is worth 1,
  • 4:38 - 4:40
    somewhat important is worth 10,
  • 4:40 - 4:42
    very important is 50,
  • 4:42 - 4:46
    and absolutely mandatory is 250.
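The scale above maps directly onto a lookup table. The point values are the ones stated in the lesson; the dictionary name `IMPORTANCE_POINTS` and the exact label strings are assumptions chosen for this sketch.

```python
# Importance labels mapped to the point values given in the lesson.
IMPORTANCE_POINTS = {
    "irrelevant": 0,
    "a little important": 1,
    "somewhat important": 10,
    "very important": 50,
    "mandatory": 250,
}

# A "very important" question is worth 50 times a "little important" one.
print(IMPORTANCE_POINTS["very important"] // IMPORTANCE_POINTS["a little important"])
```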
  • 4:46 - 4:49
    Next, the algorithm makes two simple calculations.
  • 4:49 - 4:52
    The first is how much did B's answers satisfy you,
  • 4:52 - 4:56
    that is, how many possible points did B score on your scale?
  • 4:56 - 4:58
    Well, you indicated that B's answer
  • 4:58 - 5:00
    to the first question about messiness
  • 5:00 - 5:01
    was very important to you.
  • 5:01 - 5:04
    It's worth 50 points and B got that right.
  • 5:04 - 5:06
    The second question is worth only 1
  • 5:06 - 5:08
    because you said it was only a little important,
  • 5:08 - 5:09
    and B got that wrong.
  • 5:09 - 5:12
    So B's answers were 50 out of 51 possible points.
  • 5:12 - 5:14
    That's 98% satisfactory.
  • 5:14 - 5:15
    It's pretty good.
  • 5:15 - 5:17
    And the second question the algorithm looks at
  • 5:17 - 5:19
    is how much did you satisfy B.
  • 5:19 - 5:21
    Well, B placed 1 point on your answer
  • 5:21 - 5:22
    to the messiness question
  • 5:22 - 5:25
    and 10 on your answer to the second.
  • 5:25 - 5:27
    Of those, 11, that's 1 plus 10,
  • 5:27 - 5:28
    you earned 10,
  • 5:28 - 5:31
    because your answer satisfied B on the second question.
  • 5:31 - 5:33
    So your answers were 10 out of 11
  • 5:33 - 5:35
    equals 91% satisfactory to B.
  • 5:35 - 5:36
    That's not bad.
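The two calculations above can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the function `satisfaction`, the dictionary layout, and the short question keys are hypothetical, and each person is assumed to want exactly one answer per question. The point values and the example answers are the ones from the lesson.

```python
IMPORTANCE_POINTS = {
    "irrelevant": 0,
    "a little important": 1,
    "somewhat important": 10,
    "very important": 50,
    "mandatory": 250,
}

# question -> (own answer, wanted answer, importance), as in the example.
you = {
    "messy": ("very organized", "very organized", "very important"),
    "attention": ("no", "no", "a little important"),
}
b = {
    "messy": ("very organized", "average", "a little important"),
    "attention": ("yes", "no", "somewhat important"),
}


def satisfaction(judge, other):
    """Fraction of the judge's possible points that the other person earned."""
    possible = 0
    earned = 0
    for question in judge.keys() & other.keys():
        _, wanted, importance = judge[question]
        points = IMPORTANCE_POINTS[importance]
        possible += points
        if other[question][0] == wanted:  # the other person's own answer
            earned += points
    return earned / possible if possible else 0.0


b_satisfies_you = satisfaction(you, b)  # 50 of 51 points, about 98%
you_satisfy_b = satisfaction(b, you)    # 10 of 11 points, about 91%
print(round(b_satisfies_you * 100), round(you_satisfy_b * 100))
```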
  • 5:36 - 5:38
    The final step is to take these two match percentages
  • 5:38 - 5:40
    and get one number for the both of you.
  • 5:40 - 5:43
    To do this, the algorithm multiplies your scores,
  • 5:43 - 5:44
    then takes the nth root,
  • 5:44 - 5:47
    where n is the number of questions.
  • 5:47 - 5:49
    Because s, which is the number of questions,
  • 5:49 - 5:52
    in this sample, is only 2,
  • 5:52 - 5:54
    we have match percentage equals
  • 5:54 - 5:58
    the square root of 98% times 91%.
  • 5:58 - 6:00
    That equals 94%.
  • 6:00 - 6:03
    That 94% is your match percentage with B.
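The final step can be checked numerically. The sketch below follows the description in the lesson (multiply the two scores, then take the nth root, with n the number of common questions); the function name `match_percentage` is hypothetical, and the small-sample correction mentioned later in the lesson is not included.

```python
def match_percentage(score_a, score_b, n_questions):
    """Multiply the two satisfaction scores, then take the nth root."""
    return (score_a * score_b) ** (1 / n_questions)


# Scores from the example: 50 out of 51 and 10 out of 11, with 2 common questions,
# so the nth root is a square root.
match = match_percentage(50 / 51, 10 / 11, 2)
print(f"{match:.0%}")
```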
  • 6:03 - 6:05
    It's a mathematical expression
  • 6:05 - 6:06
    of how happy you'd be with each other
  • 6:06 - 6:08
    based on what we know.
  • 6:08 - 6:10
    Now, why does the algorithm multiply as opposed to, say,
  • 6:10 - 6:12
    average the two match scores together
  • 6:12 - 6:15
    and do the square-root business?
  • 6:15 - 6:16
    In general, this formula is called the geometric mean,
  • 6:16 - 6:18
    which is a great way to combine values
  • 6:18 - 6:19
    that have wide ranges
  • 6:19 - 6:21
    and represent very different properties.
  • 6:21 - 6:23
    In other words, it's perfect for romantic matching.
  • 6:23 - 6:24
    You've got wide ranges
  • 6:24 - 6:26
    and you've got tons of different data points,
  • 6:26 - 6:27
    like I said, about movies,
  • 6:27 - 6:28
    about politics,
  • 6:28 - 6:29
    about religion,
  • 6:29 - 6:30
    about everything.
  • 6:30 - 6:32
    Intuitively, too, this makes sense.
  • 6:32 - 6:35
    Two people satisfying each other 50%
  • 6:35 - 6:36
    should be a better match
  • 6:36 - 6:39
    than two others who satisfy 0 and 100,
  • 6:39 - 6:41
    because affection needs to be mutual.
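A quick comparison shows why the geometric mean fits this intuition better than a plain average. The function names here are illustrative only; the two score pairs are the ones from the example just given.

```python
import math


def arithmetic_mean(a, b):
    return (a + b) / 2


def geometric_mean(a, b):
    return math.sqrt(a * b)


mutual = (0.5, 0.5)     # each person satisfies the other 50%
one_sided = (0.0, 1.0)  # one person is 0% satisfied, the other 100%

# A plain average cannot tell the two couples apart: both come out at 0.5.
print(arithmetic_mean(*mutual), arithmetic_mean(*one_sided))

# The geometric mean keeps the mutual couple at 0.5 and drops the
# one-sided couple to 0.0.
print(geometric_mean(*mutual), geometric_mean(*one_sided))
```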
  • 6:41 - 6:43
    After adding a little correction for margin of error,
  • 6:43 - 6:46
    in the case when we have a very small number of questions,
  • 6:46 - 6:47
    like we do in this example,
  • 6:47 - 6:49
    we're good to go.
  • 6:49 - 6:50
    Any time OK Cupid matches two people,
  • 6:50 - 6:52
    it goes through the steps we just outlined.
  • 6:52 - 6:54
    First it collects data about your answers,
  • 6:54 - 6:57
    then it compares your choices and preferences
  • 6:57 - 7:00
    to other people in simple, mathematical ways.
  • 7:00 - 7:02
    This, the ability to take real world phenomena
  • 7:02 - 7:05
    and make them something a microchip can understand,
  • 7:05 - 7:06
    is, I think,
  • 7:06 - 7:09
    the most important skill anyone can have these days.
  • 7:09 - 7:11
    Like you use sentences to tell a story to a person,
  • 7:11 - 7:14
    you use algorithms to tell a story to a computer.
  • 7:14 - 7:15
    If you learn the language,
  • 7:15 - 7:16
    you can go out and tell your stories.
  • 7:16 - 7:19
    I hope this will help you do that.
Title:
Inside OKCupid: The math of online dating - Christian Rudder
Speaker:
Christian Rudder
Description:

View full lesson: http://ed.ted.com/lessons/inside-okcupid-the-math-of-online-dating-christian-rudder

When two people join a dating website, they are matched according to shared interests and how they answer a number of personal questions. But how do sites calculate the likelihood of a successful relationship? Christian Rudder, one of the founders of popular dating site OKCupid, details the algorithm behind 'hitting it off.'

Lesson by Christian Rudder, animation by TED-Ed.

Video Language:
English
Team:
closed TED
Project:
TED-Ed
Duration:
07:31
Krystian Aparta commented on English subtitles for Inside OKCupid: The math of online dating
Krystian Aparta edited English subtitles for Inside OKCupid: The math of online dating
Bedirhan Cinar approved English subtitles for Inside OKCupid: The math of online dating
Bedirhan Cinar accepted English subtitles for Inside OKCupid: The math of online dating
Bedirhan Cinar edited English subtitles for Inside OKCupid: The math of online dating
Andrea McDonough edited English subtitles for Inside OKCupid: The math of online dating
Andrea McDonough added a translation
