
Jennifer Helsby: Prediction and Control

  • 0:00 - 0:09
    music
  • 0:09 - 0:20
    Herald: Who among you is using Facebook? Twitter?
    Diaspora?
  • 0:20 - 0:28
    concerned noise
    And all of that data you enter there
  • 0:28 - 0:34
    gets to a server, gets into the hands of somebody
    who's using it
  • 0:34 - 0:39
    and the next talk
    is especially about that,
  • 0:39 - 0:44
    because there are also intelligent machines
    and intelligent algorithms
  • 0:44 - 0:47
    that try to make something
    out of that data.
  • 0:47 - 0:51
    So the post-doc researcher Jennifer Helsby
  • 0:51 - 0:56
    of the University of Chicago,
    who works at this
  • 0:56 - 0:59
    intersection between policy and
    technology,
  • 0:59 - 1:05
    will now ask you the question:
    To whom would we give that power?
  • 1:05 - 1:13
    Dr. Helsby: Thanks.
    applause
  • 1:13 - 1:17
    Okay, so, today I'm gonna do a brief tour
    of intelligent systems
  • 1:17 - 1:19
    and how they're currently used
  • 1:19 - 1:22
    and then we're gonna look at some examples
    with respect
  • 1:22 - 1:24
    to the properties that we might care about
  • 1:24 - 1:26
    these systems having,
    and I'll talk a little bit about
  • 1:26 - 1:28
    some of the work that's been done in academia
  • 1:28 - 1:29
    on these topics.
  • 1:29 - 1:32
    And then we'll talk about some
    promising paths forward.
  • 1:32 - 1:37
    So, I wanna start with this:
    Kranzberg's First Law of Technology
  • 1:37 - 1:40
    So, it's not good or bad,
    but it also isn't neutral.
  • 1:40 - 1:43
    Technology shapes our world,
    and it can act as
  • 1:43 - 1:46
    a liberating force-- or an oppressive and
    controlling force.
  • 1:46 - 1:50
    So, in this talk, I'm gonna go
    towards some of the aspects
  • 1:50 - 1:54
    of intelligent systems that might be more
    controlling in nature.
  • 1:54 - 1:56
    So, as we all know,
  • 1:56 - 2:00
    because of the rapidly decreasing cost
    of storage and computation,
  • 2:00 - 2:02
    along with the rise of new sensor technologies,
  • 2:02 - 2:06
    data collection devices
    are being pushed into every
  • 2:06 - 2:08
    aspect of our lives: in our homes, our cars,
  • 2:08 - 2:10
    in our pockets, on our wrists.
  • 2:10 - 2:13
    And data collection systems act as intermediaries
  • 2:13 - 2:15
    for a huge amount of human communication.
  • 2:15 - 2:18
    And much of this data sits in government
  • 2:18 - 2:20
    and corporate databases.
  • 2:20 - 2:23
    So, in order to make use of this data,
  • 2:23 - 2:27
    we need to be able to make some inferences.
  • 2:27 - 2:30
    So, one way of approaching this is I can hire
  • 2:30 - 2:32
    a lot of humans, and I can have these humans
  • 2:32 - 2:35
    manually examine the data, and they can acquire
  • 2:35 - 2:37
    expert knowledge of the domain, and then
  • 2:37 - 2:39
    perhaps they can make some decisions
  • 2:39 - 2:41
    or at least some recommendations
    based on it.
  • 2:41 - 2:43
    However, there's some problems with this.
  • 2:43 - 2:46
    One is that it's slow, and thus expensive.
  • 2:46 - 2:48
    It's also biased. We know that humans have
  • 2:48 - 2:51
    all sorts of biases, both conscious and unconscious,
  • 2:51 - 2:53
    and it would be nice to have a system
    that did not have
  • 2:53 - 2:55
    these inaccuracies.
  • 2:55 - 2:57
    It's also not very transparent: I might
  • 2:57 - 2:59
    not really know the factors that led to
  • 2:59 - 3:01
    some decisions being made.
  • 3:01 - 3:03
    Even humans themselves
    often don't really understand
  • 3:03 - 3:05
    why they came to a given decision, because
  • 3:05 - 3:08
    of decisions being emotional in nature.
  • 3:08 - 3:12
    And, thus, these human decision making systems
  • 3:12 - 3:13
    are often difficult to audit.
  • 3:13 - 3:16
    So, another way to proceed is maybe instead
  • 3:16 - 3:18
    I study the system and the data carefully
  • 3:18 - 3:21
    and I write down the best rules
    for making a decision
  • 3:21 - 3:23
    or, I can have a machine
    dynamically figure out
  • 3:23 - 3:25
    the best rules, as in machine learning.
  • 3:25 - 3:29
    So, maybe this is a better approach.
  • 3:29 - 3:32
    It's certainly fast, and thus cheap.
  • 3:32 - 3:34
    And maybe I can construct
    the system in such a way
  • 3:34 - 3:37
    that it doesn't have the biases that are inherent
  • 3:37 - 3:39
    in human decision making.
  • 3:39 - 3:42
    And, since I've written these rules down,
  • 3:42 - 3:43
    or a computer has learned these rules,
  • 3:43 - 3:45
    then I can just show them to somebody, right?
  • 3:45 - 3:47
    And then they can audit it.
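
A minimal sketch of that idea (the data and feature names below are invented, and scikit-learn is assumed): a small model's learned rules can be exported as text and shown to a reviewer.

```python
# Minimal sketch: learn decision rules from (invented) data, then print the
# rules so a human can inspect them. Assumes scikit-learn is installed.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 3))           # three made-up binary features
y = ((X[:, 0] + X[:, 2]) >= 2).astype(int)      # a synthetic decision to learn

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# The learned rules are explicit, so in principle they can be handed to an auditor.
print(export_text(tree, feature_names=["feature_a", "feature_b", "feature_c"]))
```

Of course, as noted later in the talk, real deployed models are often far more complex than this, and that complexity is itself a transparency problem.
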
  • 3:47 - 3:49
    So, more and more decision making is being
  • 3:49 - 3:51
    done in this way.
  • 3:51 - 3:53
    And so, in this model, we take data
  • 3:53 - 3:56
    we make an inference based on that data
  • 3:56 - 3:58
    using these algorithms, and then
  • 3:58 - 3:59
    we can take actions.
  • 3:59 - 4:02
    And, when we take this more scientific approach
  • 4:02 - 4:04
    to making decisions and optimizing for
  • 4:04 - 4:07
    a desired outcome,
    we can take an experimental approach
  • 4:07 - 4:10
    so we can determine
    which actions are most effective
  • 4:10 - 4:12
    in achieving a desired outcome.
  • 4:12 - 4:14
    Maybe there are some types of communication
  • 4:14 - 4:17
    styles that are most effective
    with certain people.
  • 4:17 - 4:20
    I can perhaps deploy some individualized incentives
  • 4:20 - 4:22
    to get the outcome that I desire.
  • 4:22 - 4:26
    And, maybe even if I carefully design an experiment
  • 4:26 - 4:28
    with the environment in which people make
  • 4:28 - 4:31
    these decisions, perhaps even very small changes
  • 4:31 - 4:34
    can introduce significant changes
    in peoples' behavior.
  • 4:34 - 4:37
    So, through these mechanisms,
    and this experimental approach,
  • 4:37 - 4:40
    I can maximize the probability
    that humans do
  • 4:40 - 4:42
    what I want.
  • 4:42 - 4:45
    So, algorithmic decision making is being used
  • 4:45 - 4:47
    in industry, and is used
    in lots of other areas,
  • 4:47 - 4:50
    from astrophysics to medicine, and is now
  • 4:50 - 4:52
    moving into new domains, including
  • 4:52 - 4:54
    government applications.
  • 4:54 - 4:59
    So, we have recommendation engines like
    Netflix, Yelp, SoundCloud,
  • 4:59 - 5:01
    that direct our attention to what we should
  • 5:01 - 5:04
    watch and listen to.
  • 5:04 - 5:08
    Since 2009, Google uses
    personalized search results,
  • 5:08 - 5:13
    even if you're not logged
    into your Google account.
  • 5:13 - 5:15
    And we also have algorithmic curation and filtering,
  • 5:15 - 5:18
    as in the case of Facebook News Feed,
  • 5:18 - 5:20
    Google News, Yahoo News,
  • 5:20 - 5:23
    which shows you what news articles, for example,
  • 5:23 - 5:24
    you should be looking at.
  • 5:24 - 5:26
    And this is important, because a lot of people
  • 5:26 - 5:29
    get news from these media.
  • 5:29 - 5:32
    We even have algorithmic journalists!
  • 5:32 - 5:35
    So, automatic systems generate articles
  • 5:35 - 5:37
    about weather, traffic, or sports
  • 5:37 - 5:39
    instead of a human.
  • 5:39 - 5:42
    And, another application that's more recent
  • 5:42 - 5:44
    is the use of predictive systems
  • 5:44 - 5:45
    in political campaigns.
  • 5:45 - 5:47
    So, political campaigns also now take this
  • 5:47 - 5:50
    approach to predict on an individual basis
  • 5:50 - 5:53
    which candidate voters
    are likely to vote for.
  • 5:53 - 5:56
    And then they can target,
    on an individual basis,
  • 5:56 - 5:58
    those that can be persuaded otherwise.
  • 5:58 - 6:01
    And, finally, in the public sector,
  • 6:01 - 6:03
    we're starting to use predictive systems
  • 6:03 - 6:06
    in areas from policing, to health,
    to education and energy.
  • 6:06 - 6:09
    So, there are some advantages to this.
  • 6:09 - 6:13
    So, one thing is that we can automate
  • 6:13 - 6:16
    aspects of our lives
    that we consider to be mundane
  • 6:16 - 6:18
    using systems that are intelligent
  • 6:18 - 6:20
    and adaptive enough.
  • 6:20 - 6:22
    We can make use of all the data
  • 6:22 - 6:24
    and really get the pieces of information we
  • 6:24 - 6:26
    really care about.
  • 6:26 - 6:30
    We can spend money in the most effective way,
  • 6:30 - 6:32
    and we can do this with this experimental
  • 6:32 - 6:34
    approach to optimize actions to produce
  • 6:34 - 6:35
    desired outcomes.
  • 6:35 - 6:37
    So, we can embed intelligence
  • 6:37 - 6:40
    into all of these mundane objects
  • 6:40 - 6:41
    and enable them to make decisions for us,
  • 6:41 - 6:43
    and so that's what we're doing more and more,
  • 6:43 - 6:45
    and we can have an object
    that decides for us
  • 6:45 - 6:47
    what temperature we should set our house,
  • 6:47 - 6:49
    what we should be doing, etc.
  • 6:49 - 6:52
    So, there might be some implications here.
  • 6:52 - 6:56
    We want these systems
    that do work on this data
  • 6:56 - 6:58
    to increase the opportunities
    available to us.
  • 6:58 - 7:00
    But it might be that there are some implications
  • 7:00 - 7:02
    that we have not carefully thought through.
  • 7:02 - 7:03
    This is a new area, and people are only
  • 7:03 - 7:06
    starting to scratch the surface of what the
  • 7:06 - 7:07
    problems might be.
  • 7:07 - 7:10
    In some cases, they might narrow the options
  • 7:10 - 7:11
    available to people,
  • 7:11 - 7:13
    and this approach subjects people to
  • 7:13 - 7:16
    suggestive messaging intended to nudge them
  • 7:16 - 7:17
    to a desired outcome.
  • 7:17 - 7:19
    Some people may have a problem with that.
  • 7:19 - 7:21
    Values we care about are not gonna be
  • 7:21 - 7:24
    baked into these systems by default.
  • 7:24 - 7:26
    It's also the case that some algorithmic systems
  • 7:26 - 7:28
    facilitate work that we do not like.
  • 7:28 - 7:30
    For example, in the case of mass surveillance.
  • 7:30 - 7:32
    And even the same systems,
  • 7:32 - 7:34
    used by different people or organizations,
  • 7:34 - 7:36
    have very different consequences.
  • 7:36 - 7:37
    For example, if I can predict
  • 7:37 - 7:40
    with high accuracy, based on say search queries,
  • 7:40 - 7:42
    who's gonna be admitted to a hospital,
  • 7:42 - 7:44
    some people would be interested
    in knowing that.
  • 7:44 - 7:46
    You might be interested
    in having your doctor know that.
  • 7:46 - 7:48
    But that same predictive model
    in the hands of
  • 7:48 - 7:51
    an insurance company
    has a very different implication.
  • 7:51 - 7:53
    So, the point here is that these systems
  • 7:53 - 7:56
    structure and influence how humans interact
  • 7:56 - 7:58
    with each other, how they interact with society,
  • 7:58 - 8:00
    and how they interact with government.
  • 8:00 - 8:03
    And if they constrain what people can do,
  • 8:03 - 8:05
    we should really care about this.
  • 8:05 - 8:08
    So now I'm gonna go to
    sort of an extreme case,
  • 8:08 - 8:12
    just as an example, and that's this
    Chinese Social Credit System.
  • 8:12 - 8:14
    And so this is probably one of the more
  • 8:14 - 8:17
    ambitious uses of data,
  • 8:17 - 8:19
    that is used to rank each citizen
  • 8:19 - 8:21
    in China based on their behavior.
  • 8:21 - 8:24
    So right now, there are various pilot systems
  • 8:24 - 8:28
    deployed by various companies doing this in
    China.
  • 8:28 - 8:31
    They're currently voluntary, and by 2020
  • 8:31 - 8:33
    one of these systems is gonna be decided on,
  • 8:33 - 8:35
    or a combination of the systems,
  • 8:35 - 8:37
    and it's gonna be mandatory for everyone.
  • 8:37 - 8:41
    And so, in this system, there are some citizens,
  • 8:41 - 8:44
    and a huge range of data sources are used.
  • 8:44 - 8:47
    So, some of the data sources are
  • 8:47 - 8:48
    your financial data,
  • 8:48 - 8:50
    your criminal history,
  • 8:50 - 8:52
    how many points you have
    on your driver's license,
  • 8:52 - 8:55
    medical information-- for example,
    if you take birth control pills,
  • 8:55 - 8:57
    that's incorporated.
  • 8:57 - 9:00
    Your purchase history-- for example,
    if you purchase games,
  • 9:00 - 9:02
    you are down-ranked in the system.
  • 9:02 - 9:04
    Some of the systems, not all of them,
  • 9:04 - 9:07
    incorporate social media monitoring,
  • 9:07 - 9:09
    which makes sense if you're a state like China,
  • 9:09 - 9:11
    you probably want to know about
  • 9:11 - 9:15
    political statements that people
    are making on social media.
  • 9:15 - 9:18
    And, one of the more interesting parts is
  • 9:18 - 9:22
    social network analysis:
    looking at the relationships between people.
  • 9:22 - 9:24
    So, if you have a close relationship with
    somebody
  • 9:24 - 9:26
    and they have a low credit score,
  • 9:26 - 9:29
    that can have implications on your credit
    score.
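
Purely as a toy illustration (the real scoring is secret, so the names, numbers, and weighting below are all invented), folding a contact's score into your own could look something like this:

```python
# Toy sketch only: shows how a low score in your social network could be
# blended into your own score. All values and the weighting are invented.
base_scores = {"alice": 720, "bob": 640, "carol": 480}
friends = {"alice": ["bob"], "bob": ["alice", "carol"], "carol": ["bob"]}

def adjusted_score(person, weight=0.2):
    """Blend a person's own score with the average score of their contacts."""
    neighbors = friends[person]
    neighbor_avg = sum(base_scores[n] for n in neighbors) / len(neighbors)
    return (1 - weight) * base_scores[person] + weight * neighbor_avg

for person in base_scores:
    print(person, round(adjusted_score(person)))
# bob's score drops because of his tie to low-scoring carol.
```
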
  • 9:29 - 9:34
    So, the way that these scores
    are generated is secret.
  • 9:34 - 9:38
    And, according to the call for these systems
  • 9:38 - 9:39
    put out by the government,
  • 9:39 - 9:43
    the goal is to
    "carry forward the sincerity and
  • 9:43 - 9:46
    traditional virtues" and
    establish the idea of a
  • 9:46 - 9:48
    "sincerity culture."
  • 9:48 - 9:49
    But wait, it gets better:
  • 9:49 - 9:52
    so, there's a portal that enables citizens
  • 9:52 - 9:55
    to look up the citizen score of anyone.
  • 9:55 - 9:57
    And many people like this system,
  • 9:57 - 9:58
    they think it's a fun game.
  • 9:58 - 10:01
    They boast about it on social media,
  • 10:01 - 10:04
    they put their score in their dating profile,
  • 10:04 - 10:05
    because if you're ranked highly you're
  • 10:05 - 10:07
    part of an exclusive club.
  • 10:07 - 10:10
    You can get VIP treatment
    at hotels and other companies.
  • 10:10 - 10:12
    But the downside is that, if you're excluded
  • 10:12 - 10:16
    from that club, your weak score
    may have other implications,
  • 10:16 - 10:20
    like being unable to get access
    to credit, housing, jobs.
  • 10:20 - 10:23
    There is some reporting that even travel visas
  • 10:23 - 10:27
    might be restricted
    if your score is particularly low.
  • 10:27 - 10:31
    So, a system like this, for a state, is really
  • 10:31 - 10:35
    the optimal solution
    to the problem of the public.
  • 10:35 - 10:37
    It constitutes a very subtle and insidious
  • 10:37 - 10:39
    mechanism of social control.
  • 10:39 - 10:41
    You don't need to spend a lot of money on
  • 10:41 - 10:44
    police or prisons if you can set up a system
  • 10:44 - 10:46
    where people discourage one another from
  • 10:46 - 10:49
    anti-social acts like political action
    in exchange for
  • 10:49 - 10:51
    a coupon for a free Uber ride.
  • 10:51 - 10:55
    So, there are a lot of
    legitimate questions here:
  • 10:55 - 10:58
    What protections does
    user data have in this scheme?
  • 10:58 - 11:01
    Do any safeguards exist to prevent tampering?
  • 11:01 - 11:04
    What mechanism, if any, is there to prevent
  • 11:04 - 11:09
    false input data from creating erroneous inferences?
  • 11:09 - 11:10
    Is there any way that people can fix
  • 11:10 - 11:13
    their score once they're ranked poorly?
  • 11:13 - 11:14
    Or does it end up becoming a
  • 11:14 - 11:16
    self-fulfilling prophecy?
  • 11:16 - 11:18
    Your weak score means you have less access
  • 11:18 - 11:22
    to jobs and credit, and now you will have
  • 11:22 - 11:25
    limited access to opportunity.
  • 11:25 - 11:27
    So, let's take a step back.
  • 11:27 - 11:28
    So, what do we want?
  • 11:28 - 11:32
    So, we probably don't want that,
  • 11:32 - 11:34
    but as advocates we really wanna
  • 11:34 - 11:36
    understand what questions we should be asking
  • 11:36 - 11:38
    of these systems. Right now there's
  • 11:38 - 11:40
    very little oversight,
  • 11:40 - 11:41
    and we wanna make sure that we don't
  • 11:41 - 11:44
    sort of sleepwalk our way to a situation
  • 11:44 - 11:47
    where we've lost even more power
  • 11:47 - 11:50
    to these centralized systems of control.
  • 11:50 - 11:52
    And if you're an implementer, we wanna understand
  • 11:52 - 11:54
    what can we be doing better.
  • 11:54 - 11:56
    Are there better ways that we can be implementing
  • 11:56 - 11:58
    these systems?
  • 11:58 - 11:59
    Are there values that, as humans,
  • 11:59 - 12:01
    we care about that we should make sure
  • 12:01 - 12:02
    these systems have?
  • 12:02 - 12:06
    So, the first thing
    that most people in the room
  • 12:06 - 12:08
    might think about is privacy.
  • 12:08 - 12:11
    Which is, of course, of the utmost importance.
  • 12:11 - 12:13
    We need privacy, and there is a good discussion
  • 12:13 - 12:16
    on the importance of protecting
    user data where possible.
  • 12:16 - 12:18
    So, in this talk, I'm gonna focus
    on the other aspects of
  • 12:18 - 12:19
    algorithmic decision making,
  • 12:19 - 12:21
    that I think have got less attention.
  • 12:21 - 12:25
    Because it's not just privacy
    that we need to worry about here.
  • 12:25 - 12:29
    We also want systems that are fair and equitable.
  • 12:29 - 12:30
    We want transparent systems,
  • 12:30 - 12:35
    we don't want opaque decisions
    to be made about us,
  • 12:35 - 12:37
    decisions that might have serious impacts
  • 12:37 - 12:38
    on our lives.
  • 12:38 - 12:40
    And we need some accountability mechanisms.
  • 12:40 - 12:42
    So, for the rest of this talk
  • 12:42 - 12:43
    we're gonna go through each one of these things
  • 12:43 - 12:45
    and look at some examples.
  • 12:45 - 12:48
    So, the first thing is fairness.
  • 12:48 - 12:50
    And so, as I said in the beginning,
    this is one area
  • 12:50 - 12:53
    where there might be an advantage
  • 12:53 - 12:55
    to making decisions by machine,
  • 12:55 - 12:57
    especially in areas where there have
  • 12:57 - 12:59
    historically been fairness issues with
  • 12:59 - 13:02
    decision making, such as law enforcement.
  • 13:02 - 13:06
    So, this is one way that police departments
  • 13:06 - 13:08
    use predictive models.
  • 13:08 - 13:11
    The idea here is police would like to
  • 13:11 - 13:13
    allocate resources in a more effective way,
  • 13:13 - 13:15
    and they would also like to enable
  • 13:15 - 13:17
    proactive policing.
  • 13:17 - 13:20
    So, if you can predict where crimes
    are going to occur,
  • 13:20 - 13:22
    or who is going to commit crimes,
  • 13:22 - 13:25
    then you can put cops in those places,
  • 13:25 - 13:28
    or perhaps following these people,
  • 13:28 - 13:29
    and then the crimes will not occur.
  • 13:29 - 13:31
    So, it's sort of the pre-crime approach.
  • 13:31 - 13:35
    So, there are a few ways of going about this.
  • 13:35 - 13:38
    One way is doing this individual-level prediction.
  • 13:38 - 13:41
    So you take each citizen
    and estimate the risk
  • 13:41 - 13:44
    that each citizen will participate,
    say, in violence
  • 13:44 - 13:45
    based on some data.
  • 13:45 - 13:47
    And then you can flag those people that are
  • 13:47 - 13:49
    considered particularly violent.
  • 13:49 - 13:52
    So, this is currently done.
  • 13:52 - 13:53
    This is done in the U.S.
  • 13:53 - 13:56
    It's done in Chicago,
    by the Chicago Police Department.
  • 13:56 - 13:58
    And they maintain a heat list of individuals
  • 13:58 - 14:01
    that are considered most likely to commit,
  • 14:01 - 14:04
    or be the victim of, violence.
  • 14:04 - 14:07
    And this is done using data
    that the police maintain.
  • 14:07 - 14:10
    So, the features that are used
    in this predictive model
  • 14:10 - 14:12
    include things that are derived from
  • 14:12 - 14:15
    individuals' criminal history.
  • 14:15 - 14:17
    So, for example, have they been involved in
  • 14:17 - 14:18
    gun violence in the past?
  • 14:18 - 14:21
    Do they have narcotics arrests? And so on.
  • 14:21 - 14:23
    But another thing that's incorporated
  • 14:23 - 14:25
    in the Chicago Police Department model is
  • 14:25 - 14:28
    information derived from
    social media network analysis.
  • 14:28 - 14:31
    So, who you interact with,
  • 14:31 - 14:32
    as noted in police data.
  • 14:32 - 14:35
    So, for example, your co-arrestees.
  • 14:35 - 14:36
    When officers conduct field interviews,
  • 14:36 - 14:38
    who are people interacting with?
  • 14:38 - 14:43
    And then this is all incorporated
    into this risk score.
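
The actual heat-list model is not public, so the following is only a hypothetical sketch (invented features and data, scikit-learn assumed) of how criminal-history and co-arrest features could be combined into a single risk score:

```python
# Hypothetical sketch only: the real heat-list model is not public.
# Invented features: prior gun-violence involvement, narcotics arrests,
# and number of co-arrestees who are themselves flagged as high risk.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 1000
X = np.column_stack([
    rng.integers(0, 2, n),      # involved in gun violence before (0/1)
    rng.poisson(0.5, n),        # number of narcotics arrests
    rng.poisson(0.3, n),        # number of flagged co-arrestees
])
logit = 0.8 * X[:, 0] + 0.4 * X[:, 1] + 0.6 * X[:, 2] - 2.0
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))       # synthetic outcome labels

model = LogisticRegression().fit(X, y)
risk = model.predict_proba(X)[:, 1]                 # one risk score per person
print(np.argsort(risk)[::-1][:10])                  # the ten highest-scoring people
```
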
  • 14:43 - 14:45
    So another way to proceed,
  • 14:45 - 14:47
    which is the method that most companies
  • 14:47 - 14:50
    that sell products like this
    to the police have taken,
  • 14:50 - 14:51
    is instead predicting which areas
  • 14:51 - 14:54
    are likely to have crimes committed in them.
  • 14:54 - 14:57
    So, take my city, I put a grid down,
  • 14:57 - 14:58
    and then I use crime statistics
  • 14:58 - 15:00
    and maybe some ancillary data sources,
  • 15:00 - 15:02
    to determine which areas have
  • 15:02 - 15:05
    the highest risk of crimes occurring in them,
  • 15:05 - 15:06
    and I can flag those areas and send
  • 15:06 - 15:08
    police officers to them.
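
A minimal sketch of that grid-and-count idea, with invented incident locations (real products layer recency weighting and other data sources on top of this):

```python
# Minimal sketch: lay a grid over the city, count past incidents per cell,
# and flag the cells with the most recorded crime. All data here is invented.
import numpy as np

rng = np.random.default_rng(2)
incidents = rng.random((5000, 2))       # (x, y) locations scaled to a unit square

counts, _, _ = np.histogram2d(incidents[:, 0], incidents[:, 1], bins=20)

top = np.argsort(counts.ravel())[::-1][:10]         # ten highest-count cells
rows, cols = np.unravel_index(top, counts.shape)
print(list(zip(rows.tolist(), cols.tolist())))      # cells that would get extra patrols
```
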
  • 15:08 - 15:11
    So now, let's look at some of the tools
  • 15:11 - 15:14
    that are used for this geographic-level prediction.
  • 15:14 - 15:19
    So, here are 3 companies that sell these
  • 15:19 - 15:23
    geographic-level predictive policing systems.
  • 15:23 - 15:26
    So, PredPol has a system that uses
  • 15:26 - 15:27
    primarily crime statistics:
  • 15:27 - 15:30
    only the time, place, and type of crime
  • 15:30 - 15:33
    to predict where crimes will occur.
  • 15:33 - 15:36
    HunchLab uses a wider range of data sources
  • 15:36 - 15:37
    including, for example, weather
  • 15:37 - 15:40
    and then Hitachi is a newer system
  • 15:40 - 15:42
    that has a predictive crime analytics tool
  • 15:42 - 15:45
    that also incorporates social media.
  • 15:45 - 15:48
    The first one, to my knowledge, to do so.
  • 15:48 - 15:49
    And these systems are in use
  • 15:49 - 15:53
    in 50+ cities in the U.S.
  • 15:53 - 15:57
    So, why do police departments buy this?
  • 15:57 - 15:58
    Some police departments are interested in
  • 15:58 - 16:00
    buying systems like this, because they're marketed
  • 16:00 - 16:03
    as impartial systems,
  • 16:03 - 16:06
    so it's a way to police in an unbiased way.
  • 16:06 - 16:08
    And so, these companies make
  • 16:08 - 16:09
    statements like this--
  • 16:09 - 16:11
    by the way, the references
    will all be at the end,
  • 16:11 - 16:13
    and they'll be on the slides--
  • 16:13 - 16:13
    So, for example
  • 16:13 - 16:16
    the predictive crime analytics from Hitachi
  • 16:16 - 16:18
    claims that the system is anonymous,
  • 16:18 - 16:19
    because it shows you an area,
  • 16:19 - 16:23
    it doesn't show you
    to look for a particular person.
  • 16:23 - 16:26
    and PredPol reassures people that
  • 16:26 - 16:30
    it eliminates any civil liberties or profiling concerns.
  • 16:30 - 16:32
    And HunchLab notes that the system
  • 16:32 - 16:35
    fairly represents priorities for public safety
  • 16:35 - 16:39
    and is unbiased by race
    or ethnicity, for example.
  • 16:39 - 16:44
    So, let's take a minute
    to describe in more detail
  • 16:44 - 16:48
    what we mean when we talk about fairness.
  • 16:48 - 16:51
    So, when we talk about fairness,
  • 16:51 - 16:53
    we mean a few things.
  • 16:53 - 16:56
    So, one is fairness with respect to individuals:
  • 16:56 - 16:58
    so if I'm very similar to somebody
  • 16:58 - 17:00
    and we go through some process
  • 17:00 - 17:03
    and there are two very different
    outcomes to that process
  • 17:03 - 17:06
    we would consider that to be unfair.
  • 17:06 - 17:08
    So, we want similar people to be treated
  • 17:08 - 17:10
    in a similar way.
  • 17:10 - 17:13
    But, there are certain protected attributes
  • 17:13 - 17:15
    that we wouldn't want someone
  • 17:15 - 17:17
    to discriminate based on.
  • 17:17 - 17:20
    And so, there's this other property,
    Group Fairness.
  • 17:20 - 17:22
    So, we can look at the statistical parity
  • 17:22 - 17:25
    between groups, based on gender, race, etc.
  • 17:25 - 17:28
    and see if they're treated in a similar way.
  • 17:28 - 17:30
    And we might not expect that in some cases,
  • 17:30 - 17:32
    for example if the base rates in each group
  • 17:32 - 17:35
    are very different.
  • 17:35 - 17:37
    And then there's also Fairness in Errors.
  • 17:37 - 17:40
    All predictive systems are gonna make errors,
  • 17:40 - 17:43
    and if the errors are concentrated,
  • 17:43 - 17:46
    then that may also represent unfairness.
  • 17:46 - 17:50
    And so this concern arose recently with Facebook
  • 17:50 - 17:52
    because people with Native American names
  • 17:52 - 17:54
    had their profiles flagged as fraudulent
  • 17:54 - 17:59
    far more often than those
    with White American names.
  • 17:59 - 18:01
    So these are the sorts of things
    that we worry about
  • 18:01 - 18:02
    and there are metrics for each of these,
  • 18:02 - 18:04
    and if you're interested in more, you should
  • 18:04 - 18:06
    check those 2 papers out.
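
As a minimal sketch of the group-level checks just described (predictions and group labels invented), one can compare positive rates and false positive rates across groups:

```python
# Minimal sketch on made-up predictions: statistical parity between groups,
# and whether errors (here, false positives) fall more heavily on one group.
import numpy as np

rng = np.random.default_rng(3)
group = rng.integers(0, 2, 10000)       # hypothetical protected attribute (0 or 1)
y_true = rng.integers(0, 2, 10000)      # true outcomes
y_pred = rng.integers(0, 2, 10000)      # model decisions

def positive_rate(mask):
    return y_pred[mask].mean()

def false_positive_rate(mask):
    negatives = mask & (y_true == 0)    # people in the group with a negative true outcome
    return y_pred[negatives].mean()

for g in (0, 1):
    m = group == g
    print(f"group {g}: positive rate {positive_rate(m):.3f}, "
          f"false positive rate {false_positive_rate(m):.3f}")
# Large gaps in either number between groups are what the fairness papers
# referenced above formalize and measure.
```
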
  • 18:06 - 18:11
    So, how can potential issues
    with predictive policing
  • 18:11 - 18:14
    have implications for these principles?
  • 18:14 - 18:19
    So, one problem is
    the training data that's used.
  • 18:19 - 18:21
    Some of these systems only use crime statistics,
  • 18:21 - 18:24
    other systems-- all of them use crime statistics
  • 18:24 - 18:26
    in some way.
  • 18:26 - 18:31
    So, one problem is that crime databases
  • 18:31 - 18:35
    contain only crimes that've been detected.
  • 18:35 - 18:39
    Right? So, the police are only gonna detect
  • 18:39 - 18:41
    crimes that they know are happening,
  • 18:41 - 18:44
    either through patrol and their own investigation
  • 18:44 - 18:46
    or because they've been alerted to crime,
  • 18:46 - 18:49
    for example by a citizen calling the police.
  • 18:49 - 18:52
    So, a citizen has to feel like
    they can call the police,
  • 18:52 - 18:54
    like that's a good idea.
  • 18:54 - 18:59
    So, some crimes suffer
    from this problem less than others:
  • 18:59 - 19:02
    for example, gun violence
    is much easier to detect
  • 19:02 - 19:04
    relative to fraud, for example,
  • 19:04 - 19:08
    which is very difficult to detect.
  • 19:08 - 19:12
    Now the racial profiling aspect
    of this might come in
  • 19:12 - 19:16
    because of biased policing in the past.
  • 19:16 - 19:20
    So, for example, for marijuana arrests,
  • 19:20 - 19:23
    black people are arrested in the U.S. at rates
  • 19:23 - 19:25
    4 times that of white people,
  • 19:25 - 19:28
    even though there is statistical parity
  • 19:28 - 19:31
    in usage between these 2 groups, to within a few percent.
  • 19:31 - 19:36
    So, this is where problems can arise.
  • 19:36 - 19:37
    So, let's go back to this
  • 19:37 - 19:39
    geographic-level predictive policing.
  • 19:39 - 19:42
    So the danger here is that, unless this system
  • 19:42 - 19:44
    is very carefully constructed,
  • 19:44 - 19:47
    this sort of crime area ranking might
  • 19:47 - 19:49
    again become a self-fulfilling prophecy.
  • 19:49 - 19:51
    If you send police officers to these areas,
  • 19:51 - 19:53
    you further scrutinize them,
  • 19:53 - 19:56
    and then again you're only detecting a subset
  • 19:56 - 19:58
    of crimes, and the cycle continues.
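
A toy simulation of that feedback loop (all numbers invented): two areas with identical underlying crime, but the area that starts with more patrols keeps looking worse in the recorded data.

```python
# Toy feedback-loop sketch: you only record the crime you go looking for,
# and the next allocation follows the recorded data. All numbers invented.
import numpy as np

rng = np.random.default_rng(4)
true_rate = np.array([50.0, 50.0])      # same underlying crime in both areas
patrols = np.array([0.8, 0.2])          # share of patrol time, initially uneven

for step in range(5):
    detected = rng.poisson(true_rate * patrols)     # only detected crime enters the database
    patrols = detected / detected.sum()             # next allocation follows the data
    print(step, detected, patrols.round(2))
# Area 0 stays "high crime" in the data even though both areas are identical.
```
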
  • 19:58 - 20:02
    So, one obvious issue is that
  • 20:02 - 20:08
    this statement about geographic-based
    crime prediction
  • 20:08 - 20:10
    being anonymous is not true,
  • 20:10 - 20:13
    because race and location are very strongly
  • 20:13 - 20:15
    correlated in the U.S.
  • 20:15 - 20:17
    And this is something that machine-learning
    systems
  • 20:17 - 20:20
    can potentially learn.
  • 20:20 - 20:23
    Another issue is that, for example,
  • 20:23 - 20:26
    for individual fairness, say my home
  • 20:26 - 20:28
    sits within one of these boxes.
  • 20:28 - 20:30
    Some of these boxes
    in these systems are very small,
  • 20:30 - 20:33
    for example PredPol is 500ft x 500ft,
  • 20:33 - 20:36
    so it's maybe only a few houses.
  • 20:36 - 20:39
    So, the implications of this system are that
  • 20:39 - 20:41
    you have police officers maybe sitting
  • 20:41 - 20:43
    in a police cruiser outside your home
  • 20:43 - 20:45
    and a few doors down someone
  • 20:45 - 20:47
    may not be within that box,
  • 20:47 - 20:48
    and doesn't have this.
  • 20:48 - 20:51
    So, that may represent unfairness.
  • 20:51 - 20:55
    So, there are real questions here,
  • 20:55 - 20:58
    especially because there's no opt-out.
  • 20:58 - 21:00
    There's no way to opt-out of this system:
  • 21:00 - 21:02
    if you live in a city that has this,
  • 21:02 - 21:05
    then you have to deal with it.
  • 21:05 - 21:07
    So, it's quite difficult to find out
  • 21:07 - 21:10
    what's really going on
  • 21:10 - 21:11
    because the algorithm is secret.
  • 21:11 - 21:13
    And, in most cases, we don't know
  • 21:13 - 21:15
    the full details of the inputs.
  • 21:15 - 21:17
    We have some idea
    about what features are used,
  • 21:17 - 21:18
    but that's about it.
  • 21:18 - 21:20
    We also don't know the output.
  • 21:20 - 21:22
    That would be knowing police allocation,
  • 21:22 - 21:23
    police strategies,
  • 21:23 - 21:26
    and in order to nail down
    what's really going on here
  • 21:26 - 21:29
    in order to verify the validity of
  • 21:29 - 21:30
    these companies' claims,
  • 21:30 - 21:34
    it may be necessary
    to have a 3rd party come in,
  • 21:34 - 21:36
    examine the inputs and outputs of the system,
  • 21:36 - 21:38
    and say concretely what's going on.
  • 21:38 - 21:39
    And if everything is fine and dandy
  • 21:39 - 21:41
    then this shouldn't be a problem.
  • 21:41 - 21:44
    So, that's potentially one role that
  • 21:44 - 21:45
    advocates can play.
  • 21:45 - 21:47
    Maybe we should start pushing for audits
  • 21:47 - 21:49
    of systems that are used in this way.
  • 21:49 - 21:51
    These could have serious implications
  • 21:51 - 21:53
    for peoples' lives.
  • 21:53 - 21:55
    So, we'll return
    to this idea a little bit later,
  • 21:55 - 21:58
    but for now this leads us
    nicely to Transparency.
  • 21:58 - 21:59
    So, we wanna know
  • 21:59 - 22:02
    what these systems are doing.
  • 22:02 - 22:05
    But it's very hard,
    for the reasons described earlier,
  • 22:05 - 22:06
    but even in the case of something like
  • 22:06 - 22:10
    trying to understand Google's search algorithm,
  • 22:10 - 22:12
    it's difficult because it's personalized.
  • 22:12 - 22:14
    So, by construction, each user is
  • 22:14 - 22:15
    only seeing one endpoint.
  • 22:15 - 22:18
    So, it's a very isolating system.
  • 22:18 - 22:20
    What do other people see?
  • 22:20 - 22:22
    And one reason it's difficult to make
  • 22:22 - 22:24
    some of these systems transparent
  • 22:24 - 22:27
    is because of, simply, the complexity
  • 22:27 - 22:28
    of the algorithms.
  • 22:28 - 22:30
    So, an algorithm can become so complex that
  • 22:30 - 22:32
    it's difficult to comprehend,
  • 22:32 - 22:33
    even for the designer of the system,
  • 22:33 - 22:36
    or the implementer of the system.
  • 22:36 - 22:38
    The designer might know that this algorithm
  • 22:38 - 22:43
    maximizes some metric-- say, accuracy,
  • 22:43 - 22:45
    but they may not always have a solid
  • 22:45 - 22:47
    understanding of what the algorithm is doing
  • 22:47 - 22:48
    for all inputs.
  • 22:48 - 22:51
    Certainly with respect to fairness.
  • 22:51 - 22:56
    So, in some cases,
    it might not be appropriate to use
  • 22:56 - 22:57
    an extremely complex model.
  • 22:57 - 23:00
    It might be better to use a simpler system
  • 23:00 - 23:03
    with human-interpretable features.
  • 23:03 - 23:05
    Another issue that arises
  • 23:05 - 23:08
    from the opacity of these systems
  • 23:08 - 23:09
    and the centralized control
  • 23:09 - 23:12
    is that it makes them very influential.
  • 23:12 - 23:14
    And thus, an excellent target
  • 23:14 - 23:16
    for manipulation or tampering.
  • 23:16 - 23:18
    So, this might be tampering that is done
  • 23:18 - 23:22
    from an organization that controls the system,
  • 23:22 - 23:24
    or an insider at one of the organizations,
  • 23:24 - 23:27
    or anyone who's able to compromise their security.
  • 23:27 - 23:30
    So, there's an interesting piece of academic work
  • 23:30 - 23:32
    that looked at the possibility of
  • 23:32 - 23:34
    slightly modifying search rankings
  • 23:34 - 23:37
    to shift people's political views.
  • 23:37 - 23:39
    So, since people are most likely to
  • 23:39 - 23:41
    click on the top search results,
  • 23:41 - 23:44
    so 90% of clicks go to the
    first page of search results,
  • 23:44 - 23:47
    then perhaps by reshuffling
    things a little bit,
  • 23:47 - 23:49
    or maybe dropping some search results,
  • 23:49 - 23:50
    you can influence people's views
  • 23:50 - 23:52
    in a coherent way,
  • 23:52 - 23:53
    and maybe you can make it so subtle
  • 23:53 - 23:56
    that no one is able to notice.
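
An illustration of why a small reshuffle matters: the per-position click-through rates below are invented, but they follow the stated pattern that the vast majority of clicks go to the top results.

```python
# Illustrative only: invented per-position click-through rates, heavily
# weighted toward the top, as most real click distributions are.
click_rate = [0.30, 0.15, 0.10, 0.07, 0.05, 0.04, 0.03, 0.02, 0.02, 0.02]

before = click_rate[7]   # a result favorable to one candidate sits at position 8...
after = click_rate[0]    # ...and gets quietly promoted to position 1

print(f"expected clicks per search: {before:.2f} -> {after:.2f} "
      f"({after / before:.0f}x more exposure)")
```
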
  • 23:56 - 23:57
    So in this academic study,
  • 23:57 - 24:00
    they did an experiment
  • 24:00 - 24:02
    in the 2014 Indian election.
  • 24:02 - 24:04
    So they used real voters,
  • 24:04 - 24:06
    and they kept the size
    of the experiment small enough
  • 24:06 - 24:08
    that it was not going to influence the outcome
  • 24:08 - 24:10
    of the election.
  • 24:10 - 24:12
    So the researchers took people,
  • 24:12 - 24:14
    they determined their political leaning,
  • 24:14 - 24:17
    and they segmented them into
    control and treatment groups,
  • 24:17 - 24:19
    where the treatment was manipulation
  • 24:19 - 24:21
    of the search ranking results,
  • 24:21 - 24:24
    And then they had these people
    browse the web.
  • 24:24 - 24:26
    And what they found, is that
  • 24:26 - 24:28
    this mechanism is very effective at shifting
  • 24:28 - 24:30
    people's voter preferences.
  • 24:30 - 24:34
    So, in this study, they were able to introduce
  • 24:34 - 24:37
    a 20% shift in voter preferences.
  • 24:37 - 24:39
    Even alerting users to the fact that this
  • 24:39 - 24:42
    was going to be done, telling them
  • 24:42 - 24:44
    "we are going to manipulate your search results,"
  • 24:44 - 24:46
    "really pay attention,"
  • 24:46 - 24:49
    users were totally unable to decrease
  • 24:49 - 24:51
    the magnitude of the effect.
  • 24:51 - 24:55
    So, the margins of error in many elections
  • 24:55 - 24:58
    are incredibly small,
  • 24:58 - 25:00
    and the authors estimate that this shift
  • 25:00 - 25:02
    could change the outcome of about
  • 25:02 - 25:07
    25% of elections worldwide, if this were done.
  • 25:07 - 25:11
    And the bias is so small that no one can tell.
  • 25:11 - 25:14
    So, all humans, no matter how smart
  • 25:14 - 25:17
    and resistant to manipulation
    we think we are,
  • 25:17 - 25:22
    all of us are subject to this sort of manipulation,
  • 25:22 - 25:24
    and we really can't tell.
  • 25:24 - 25:27
    So, I'm not saying that this is occurring,
  • 25:27 - 25:31
    but right now there is no
    regulation to stop this,
  • 25:31 - 25:34
    there is no way we could reliably detect this,
  • 25:34 - 25:37
    so there's a huge amount of power here.
  • 25:37 - 25:40
    So, something to think about.
  • 25:40 - 25:43
    But it's not only corporations that are interested
  • 25:43 - 25:47
    in this sort of behavioral manipulation.
  • 25:47 - 25:51
    In 2010, UK Prime Minister David Cameron
  • 25:51 - 25:55
    created this UK Behavioural Insights Team,
  • 25:55 - 25:57
    which is informally called the Nudge Unit.
  • 25:57 - 26:01
    And so what they do is
    they use behavioral science
  • 26:01 - 26:05
    and this predictive analytics approach,
  • 26:05 - 26:06
    with experimentation,
  • 26:06 - 26:08
    to have people make better decisions
  • 26:08 - 26:10
    for themselves and society--
  • 26:10 - 26:12
    as determined by the UK government.
  • 26:12 - 26:14
    And as of a few months ago,
  • 26:14 - 26:17
    after an executive order signed by Obama
  • 26:17 - 26:19
    in September, the United States now has
  • 26:19 - 26:21
    its own Nudge Unit.
  • 26:21 - 26:24
    So, to be clear, I don't think that this is
  • 26:24 - 26:26
    some sort of malicious plot.
  • 26:26 - 26:27
    I think that there can be huge value
  • 26:27 - 26:29
    in these sorts of initiatives,
  • 26:29 - 26:31
    positively impacting people's lives,
  • 26:31 - 26:34
    but when this sort of behavioral manipulation
  • 26:34 - 26:37
    is being done, in part openly,
  • 26:37 - 26:39
    oversight is pretty important,
  • 26:39 - 26:42
    and we really need to consider
  • 26:42 - 26:46
    what these systems are optimizing for.
  • 26:46 - 26:48
    And that's something that we might
  • 26:48 - 26:52
    not always know, or at least understand,
  • 26:52 - 26:54
    so for example, for industry,
  • 26:54 - 26:58
    we do have a pretty good understanding there:
  • 26:58 - 27:00
    industry cares about optimizing for
  • 27:00 - 27:02
    the time spent on the website,
  • 27:02 - 27:05
    Facebook wants you to spend more time on Facebook,
  • 27:05 - 27:07
    they want you to click on ads,
  • 27:07 - 27:09
    click on newsfeed items,
  • 27:09 - 27:11
    they want you to like things.
  • 27:11 - 27:14
    And, fundamentally: profit.
  • 27:14 - 27:18
    So, already this has some serious implications,
  • 27:18 - 27:20
    and this had pretty serious implications
  • 27:20 - 27:22
    in the last 10 years, in media for example.
  • 27:22 - 27:25
    The optimizing for click-through rate in journalism
  • 27:25 - 27:27
    has produced a race to the bottom
  • 27:27 - 27:28
    in terms of quality.
  • 27:28 - 27:31
    And another issue is that optimizing
  • 27:31 - 27:35
    for what people like might not always be
  • 27:35 - 27:36
    the best approach.
  • 27:36 - 27:39
    So, Facebook officials have spoken publicly
  • 27:39 - 27:41
    about how Facebook's goal is to make you happy,
  • 27:41 - 27:43
    they want you to open that newsfeed
  • 27:43 - 27:45
    and just feel great.
  • 27:45 - 27:47
    But, there's an issue there, right?
  • 27:47 - 27:50
    Because people get their news,
  • 27:50 - 27:52
    like 40% of people according to Pew Research,
  • 27:52 - 27:55
    get their news from Facebook.
  • 27:55 - 27:58
    So, if people don't want to see
  • 27:58 - 28:01
    war and corpses,
    because it makes them feel sad,
  • 28:01 - 28:04
    then this is not a system that is gonna optimize
  • 28:04 - 28:07
    for an informed population.
  • 28:07 - 28:09
    It's not gonna produce a population that is
  • 28:09 - 28:11
    ready to engage in civic life.
  • 28:11 - 28:13
    It's gonna produce an amused population
  • 28:13 - 28:17
    whose time is occupied by cat pictures.
  • 28:17 - 28:19
    So, in politics, we have a similar
  • 28:19 - 28:21
    optimization problem that's occurring.
  • 28:21 - 28:24
    So, these political campaigns that use
  • 28:24 - 28:27
    these predictive systems,
  • 28:27 - 28:29
    are optimizing for votes for the desired candidate,
  • 28:29 - 28:30
    of course.
  • 28:30 - 28:33
    So, instead of a political campaign being
  • 28:33 - 28:36
    --well, maybe this is a naive view, but--
  • 28:36 - 28:38
    being an open discussion of the issues
  • 28:38 - 28:40
    facing the country,
  • 28:40 - 28:43
    it becomes this micro-targeted
    persuasion game,
  • 28:43 - 28:45
    and the people that get targeted
  • 28:45 - 28:47
    are a very small subset of all people,
  • 28:47 - 28:49
    and it's only gonna be people that are
  • 28:49 - 28:51
    you know, on the edge, maybe disinterested,
  • 28:51 - 28:54
    those are the people that are gonna get attention
  • 28:54 - 28:59
    from political candidates.
  • 28:59 - 29:02
    In policy, as with these Nudge Units,
  • 29:02 - 29:04
    they're being used to enable
  • 29:04 - 29:06
    better use of government services.
  • 29:06 - 29:07
    There are some good projects that have
  • 29:07 - 29:09
    come out of this:
  • 29:09 - 29:11
    increasing voter registration,
  • 29:11 - 29:13
    improving health outcomes,
  • 29:13 - 29:14
    improving education outcomes.
  • 29:14 - 29:16
    But some of these predictive systems
  • 29:16 - 29:18
    that we're starting to see in government
  • 29:18 - 29:21
    are optimizing for compliance,
  • 29:21 - 29:24
    as is the case with predictive policing.
  • 29:24 - 29:25
    So this is something that we need to
  • 29:25 - 29:29
    watch carefully.
  • 29:29 - 29:30
    I think this is a nice quote that
  • 29:30 - 29:33
    sort of describes the problem.
  • 29:33 - 29:35
    In some ways we might be narrowing
  • 29:35 - 29:38
    our horizon, and the danger is that
  • 29:38 - 29:42
    these tools are separating people.
  • 29:42 - 29:44
    And this is particularly bad
  • 29:44 - 29:46
    for political action, because political action
  • 29:46 - 29:50
    requires people to have shared experience,
  • 29:50 - 29:54
    and thus be able to collectively act
  • 29:54 - 29:58
    to exert pressure to fix problems.
  • 29:58 - 30:01
    So, finally: accountability.
  • 30:01 - 30:03
    So, we need some oversight mechanisms.
  • 30:03 - 30:07
    For example, in the case of errors--
  • 30:07 - 30:08
    so this is particularly important for
  • 30:08 - 30:11
    civil or bureaucratic systems.
  • 30:11 - 30:14
    So, when an algorithm produces some decision,
  • 30:14 - 30:17
    we don't always want humans to just
  • 30:17 - 30:18
    defer to the machine,
  • 30:18 - 30:22
    and that might represent one of the problems.
  • 30:22 - 30:25
    So, there are starting to be some cases
  • 30:25 - 30:28
    of computer algorithms yielding a decision,
  • 30:28 - 30:30
    and then humans being unable to correct
  • 30:30 - 30:32
    an obvious error.
  • 30:32 - 30:35
    So there's this case in Georgia,
    in the United States,
  • 30:35 - 30:37
    where 2 young people went to
  • 30:37 - 30:39
    the Department of Motor Vehicles,
  • 30:39 - 30:40
    they're twins, and they went
  • 30:40 - 30:42
    to get their driver's license.
  • 30:42 - 30:45
    However, they were both flagged by
  • 30:45 - 30:47
    a fraud algorithm that uses facial recognition
  • 30:47 - 30:49
    to look for similar faces,
  • 30:49 - 30:51
    and I guess the people that designed the system
  • 30:51 - 30:55
    didn't think of the possibility of twins.
  • 30:55 - 30:58
    Yeah.
    So, they just left
  • 30:58 - 31:00
    without their driver's licenses.
  • 31:00 - 31:02
    The people in the Department of Motor Vehicles
  • 31:02 - 31:04
    were unable to correct this.
  • 31:04 - 31:07
    So, this is one implication--
  • 31:07 - 31:09
    it's like something out of Kafka.
  • 31:09 - 31:12
    But there are also cases of errors being made,
  • 31:12 - 31:14
    and people not noticing until
  • 31:14 - 31:16
    after actions have been taken,
  • 31:16 - 31:18
    some of them very serious--
  • 31:18 - 31:19
    because people simply deferred
  • 31:19 - 31:21
    to the machine.
  • 31:21 - 31:23
    So, this is an example from San Francisco.
  • 31:23 - 31:27
    So, an ALPR-- an Automated License Plate Reader--
  • 31:27 - 31:29
    is a device that uses image recognition
  • 31:29 - 31:32
    to detect and read license plates,
  • 31:32 - 31:34
    and usually to compare license plates
  • 31:34 - 31:37
    with a known list of plates of interest.
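
A minimal sketch of that matching step (plate numbers invented): a blurry image means the OCR stage can return a plate that is one character off, and that misread can match a hotlist entry exactly.

```python
# Minimal sketch with invented plate numbers: a single misread character in
# the OCR output is enough to produce a hit against the hotlist.
hotlist = {"7XYZ123"}                # e.g. plates of stolen vehicles

actual_plate = "7XYZ128"             # the innocent driver's real plate
ocr_read = "7XYZ123"                 # blurry image: last character misread

if ocr_read in hotlist:
    # Without a human double-checking the raw image, this false hit goes
    # straight to the officers in the car.
    print("HIT:", ocr_read, "(actual plate:", actual_plate + ")")
```
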
  • 31:37 - 31:40
    And, so, San Francisco uses these
  • 31:40 - 31:42
    and they're mounted on police cars.
  • 31:42 - 31:47
    So, in this case, a San Francisco ALPR
  • 31:47 - 31:49
    got a hit on a car,
  • 31:49 - 31:53
    and it was the car of a 47-year-old woman,
  • 31:53 - 31:55
    with no criminal history.
  • 31:55 - 31:56
    And so it was a false hit
  • 31:56 - 31:58
    because it was a blurry image,
  • 31:58 - 32:00
    and it matched erroneously with
  • 32:00 - 32:01
    one of the plates of interest
  • 32:01 - 32:03
    that happened to be a stolen vehicle.
  • 32:03 - 32:07
    So, they conducted a traffic stop on her,
  • 32:07 - 32:09
    and they take her out of the vehicle,
  • 32:09 - 32:11
    they search her and the vehicle,
  • 32:11 - 32:13
    she gets a pat-down,
  • 32:13 - 32:15
    and they have her kneel
  • 32:15 - 32:18
    at gunpoint, in the street.
  • 32:18 - 32:21
    So, how much oversight should be present
  • 32:21 - 32:24
    depends on the implications of the system.
  • 32:24 - 32:25
    It's certainly the case that
  • 32:25 - 32:27
    for some of these decision-making systems,
  • 32:27 - 32:29
    an error might not be that important,
  • 32:29 - 32:31
    it could be relatively harmless,
  • 32:31 - 32:34
    but in this case,
    an error in this algorithmic decision
  • 32:34 - 32:36
    led to this totally innocent person
  • 32:36 - 32:40
    literally having a gun pointed at her.
  • 32:40 - 32:44
    So, that brings us to: we need some way of
  • 32:44 - 32:45
    getting some information about
  • 32:45 - 32:47
    what is going on here.
  • 32:47 - 32:50
    We don't wanna have to wait for these events
  • 32:50 - 32:53
    before we are able to determine
  • 32:53 - 32:54
    some information about the system.
  • 32:54 - 32:56
    So, auditing is one option:
  • 32:56 - 32:58
    to independently verify the statements
  • 32:58 - 33:01
    of companies, in situations where we have
  • 33:01 - 33:03
    inputs and outputs.
  • 33:03 - 33:05
    So, for example, this could be done with
  • 33:05 - 33:07
    Google, Facebook.
  • 33:07 - 33:09
    If you have the inputs of a system,
  • 33:09 - 33:11
    say you have test accounts,
  • 33:11 - 33:12
    or real accounts,
  • 33:12 - 33:14
    maybe you can collect
    people's information together.
  • 33:14 - 33:16
    So that was something that was done
  • 33:16 - 33:19
    during the 2012 Obama campaign
  • 33:19 - 33:20
    by ProPublica.
  • 33:20 - 33:21
    People noticed that they were getting
  • 33:21 - 33:25
    different emails from the Obama campaign,
  • 33:25 - 33:26
    and were interested to see
  • 33:26 - 33:28
    based on what factors
  • 33:28 - 33:30
    the emails were changing.
  • 33:30 - 33:33
    So, I think about 200 people submitted emails
  • 33:33 - 33:35
    and they were able to determine some information
  • 33:35 - 33:39
    about what the emails
    were being varied based on.
  • 33:39 - 33:41
    So there have been some successful
  • 33:41 - 33:43
    attempts at this.
  • 33:43 - 33:46
    So, compare inputs and then look at
  • 33:46 - 33:49
    why one item was shown to one user
  • 33:49 - 33:50
    and not another, and see if there's
  • 33:50 - 33:52
    any statistical differences.
  • 33:52 - 33:56
    So, there's some potential legal issues
  • 33:56 - 33:58
    with the test accounts, so that's something
  • 33:58 - 34:01
    to think about-- I'm not a lawyer.
  • 34:01 - 34:04
    So, for example, if you wanna examine
  • 34:04 - 34:06
    ad-targeting algorithms,
  • 34:06 - 34:08
    one way to proceed is to construct
  • 34:08 - 34:11
    a browsing profile, and then examine
  • 34:11 - 34:13
    what ads are served back to you.
  • 34:13 - 34:14
    And so this is something that
  • 34:14 - 34:16
    academic researchers have looked at,
  • 34:16 - 34:17
    because, at the time at least,
  • 34:17 - 34:21
    you didn't need to make an account to do this.
  • 34:21 - 34:25
    So, this was a study that was presented at
  • 34:25 - 34:28
    Privacy Enhancing Technologies last year,
  • 34:28 - 34:31
    and in this study, the researchers
  • 34:31 - 34:33
    generate some browsing profiles
  • 34:33 - 34:36
    that differ only by one characteristic,
  • 34:36 - 34:38
    so they're basically identical in every way
  • 34:38 - 34:39
    except for one thing.
  • 34:39 - 34:42
    And that is denoted by Treatment 1 and 2.
  • 34:42 - 34:44
    So this is a randomized, controlled trial,
  • 34:44 - 34:46
    but I left out the randomization part
  • 34:46 - 34:48
    for simplicity.
  • 34:48 - 34:55
    So, in one study,
    they applied a treatment of gender.
  • 34:55 - 34:57
    So, they had the browsing profiles
  • 34:57 - 34:59
    in Treatment 1 be male browsing profiles,
  • 34:59 - 35:02
    and the browsing profiles in Treatment 2
    be female.
  • 35:02 - 35:04
    And they wanted to see: is there any difference
  • 35:04 - 35:06
    in the way that ads are targeted
  • 35:06 - 35:09
    if browsing profiles are effectively identical
  • 35:09 - 35:11
    except for gender?
  • 35:11 - 35:15
    So, it turns out that there was.
  • 35:15 - 35:19
    So, a 3rd-party site was showing Google ads
  • 35:19 - 35:21
    for senior executive positions
  • 35:21 - 35:24
    at a rate 6 times higher to the fake men
  • 35:24 - 35:27
    than for the fake women in this study.
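
A sketch of the comparison step in an audit like this (the counts below are invented, and scipy is assumed): did the ad of interest appear at different rates for the two otherwise-identical treatment groups?

```python
# Sketch of the audit comparison on invented counts: rows are the two
# treatment groups, columns are page loads showing / not showing the ad.
from scipy.stats import fisher_exact

table = [[180, 820],     # treatment 1 (male profiles)
         [ 30, 970]]     # treatment 2 (female profiles)

odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio {odds_ratio:.1f}, p = {p_value:.2g}")
# A very small p-value says the difference is unlikely to be chance; it does
# not by itself say which part of the ad ecosystem caused it.
```
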
  • 35:27 - 35:30
    So, this sort of auditing is not going to
  • 35:30 - 35:33
    be able to determine everything
  • 35:33 - 35:35
    that algorithms are doing, but it can
  • 35:35 - 35:37
    sometimes uncover interesting,
  • 35:37 - 35:41
    at least statistical differences.
  • 35:41 - 35:47
    So, this leads us to the fundamental issue:
  • 35:47 - 35:49
    Right now, we're really not in control
  • 35:49 - 35:51
    of some of these systems,
  • 35:51 - 35:54
    and we really need these predictive systems
  • 35:54 - 35:56
    to be controlled by us,
  • 35:56 - 35:58
    in order for them not to be used
  • 35:58 - 36:00
    as a system of control.
  • 36:00 - 36:03
    So there are some technologies that I'd like
  • 36:03 - 36:07
    to point you all to.
  • 36:07 - 36:08
    We need tools in the digital commons
  • 36:08 - 36:11
    that can help address some of these concerns.
  • 36:11 - 36:13
    So, the first thing is that of course
  • 36:13 - 36:15
    we know that minimizing the amount of
  • 36:15 - 36:17
    data available can help in some contexts,
  • 36:17 - 36:19
    which we can do by making systems
  • 36:19 - 36:23
    that are private by design, and by default.
  • 36:23 - 36:25
    Another thing is that these audit tools
  • 36:25 - 36:26
    might be useful.
  • 36:26 - 36:31
    And, so, these 2 nice examples in academia...
  • 36:31 - 36:34
    the ad experiment that I just showed was done
  • 36:34 - 36:36
    using AdFisher.
  • 36:36 - 36:38
    So, these are 2 toolkits that you can use
  • 36:38 - 36:41
    to start doing this sort of auditing.
  • 36:41 - 36:45
    Another technology that is generally useful,
  • 36:45 - 36:47
    but particularly in the case of prediction
  • 36:47 - 36:49
    it's useful to maintain access to
  • 36:49 - 36:50
    as many sites as possible,
  • 36:50 - 36:53
    through anonymity systems like Tor,
  • 36:53 - 36:54
    because it's impossible to personalize
  • 36:54 - 36:56
    when everyone looks the same.
  • 36:56 - 36:59
    So this is a very important technology.
  • 36:59 - 37:02
    Something that doesn't really exist,
  • 37:02 - 37:04
    but that I think is pretty important,
  • 37:04 - 37:06
    is having some tool to view the landscape.
  • 37:06 - 37:08
    So, as we know from these few studies
  • 37:08 - 37:10
    that have been done,
  • 37:10 - 37:12
    different people are not seeing the internet
  • 37:12 - 37:13
    in the same way.
  • 37:13 - 37:16
    This is one reason why we don't like censorship.
  • 37:16 - 37:18
    But, rich and poor people,
  • 37:18 - 37:20
    from academic research we know that
  • 37:20 - 37:24
    there is widespread price discrimination
    on the internet,
  • 37:24 - 37:26
    so rich and poor people see a different view
  • 37:26 - 37:27
    of the Internet,
  • 37:27 - 37:28
    men and women see a different view
  • 37:28 - 37:30
    of the Internet.
  • 37:30 - 37:31
    We wanna know how different people
  • 37:31 - 37:32
    see the same site,
  • 37:32 - 37:34
    and this could be the beginning of
  • 37:34 - 37:36
    a defense system for this sort of
  • 37:36 - 37:42
    manipulation/tampering that I showed earlier.
  • 37:42 - 37:46
    Another interesting approach is obfuscation:
  • 37:46 - 37:47
    injecting noise into the system.
  • 37:47 - 37:49
    So there's an interesting browser extension
  • 37:49 - 37:52
    called AdNauseam, that's for Firefox,
  • 37:52 - 37:55
    which clicks on every single ad you're served,
  • 37:55 - 37:56
    to inject noise.
  • 37:56 - 37:57
    So that's, I think, an interesting approach
  • 37:57 - 38:00
    that people haven't looked at too much.
  • 38:00 - 38:04
    So in terms of policy,
  • 38:04 - 38:07
    Facebook and Google, these internet giants,
  • 38:07 - 38:09
    have billions of users,
  • 38:09 - 38:12
    and sometimes they like to call themselves
  • 38:12 - 38:14
    new public utilities,
  • 38:14 - 38:15
    and if that's the case then
  • 38:15 - 38:18
    it might be necessary to subject them
  • 38:18 - 38:21
    to additional regulation.
  • 38:21 - 38:22
    Another problem that's come up,
  • 38:22 - 38:24
    for example with some of the studies
  • 38:24 - 38:25
    that Facebook has done,
  • 38:25 - 38:29
    is sometimes a lack of ethics review.
  • 38:29 - 38:31
    So, for example, in academia,
  • 38:31 - 38:34
    if you're gonna do research involving humans,
  • 38:34 - 38:35
    there's an Institutional Review Board
  • 38:35 - 38:37
    that you go to that verifies that
  • 38:37 - 38:39
    you're doing things in an ethical manner.
  • 38:39 - 38:41
    And some companies do have internal
  • 38:41 - 38:43
    review processes like this, but it might
  • 38:43 - 38:45
    be important to have an independent
  • 38:45 - 38:48
    ethics board that does this sort of thing.
  • 38:48 - 38:51
    And we really need 3rd-party auditing.
  • 38:51 - 38:55
    So, for example, some companies
  • 38:55 - 38:56
    don't want auditing to be done
  • 38:56 - 38:59
    because of IP concerns,
  • 38:59 - 39:01
    and if that's the concern
  • 39:01 - 39:03
    maybe having a set of people
  • 39:03 - 39:06
    that are not paid by the company
  • 39:06 - 39:07
    to check how some of these systems
  • 39:07 - 39:09
    are being implemented,
  • 39:09 - 39:11
    could help give us confidence that
  • 39:11 - 39:17
    things are being done in a reasonable way.
  • 39:17 - 39:20
    So, in closing,
  • 39:20 - 39:23
    algorithmic decision making is here,
  • 39:23 - 39:26
    and it's barreling forward
    at a very fast rate,
  • 39:26 - 39:28
    and we need to figure out what
  • 39:28 - 39:30
    the guide rails should be,
  • 39:30 - 39:31
    and how to install them
  • 39:31 - 39:33
    to handle some of the potential threats.
  • 39:33 - 39:35
    There's a huge amount of power here.
  • 39:35 - 39:38
    We need more openness in these systems.
  • 39:38 - 39:40
    And, right now,
  • 39:40 - 39:42
    with the intelligent systems that do exist,
  • 39:42 - 39:44
    we don't know what's occurring really,
  • 39:44 - 39:47
    and we need to watch carefully
  • 39:47 - 39:49
    where and how these systems are being used.
  • 39:49 - 39:51
    And I think this community has
  • 39:51 - 39:54
    an important role to play in this fight,
  • 39:54 - 39:56
    to study what's being done,
  • 39:56 - 39:57
    to show people what's being done,
  • 39:57 - 39:59
    to raise the debate and advocate,
  • 39:59 - 40:01
    and, where necessary, to resist.
  • 40:01 - 40:03
    Thanks.
  • 40:03 - 40:13
    applause
  • 40:13 - 40:18
    Herald: So, let's have a question and answer.
  • 40:18 - 40:19
    Microphone 2, please.
  • 40:19 - 40:20
    Mic 2: Hi there.
  • 40:20 - 40:23
    Thanks for the talk.
  • 40:23 - 40:26
Since this pre-crime software has also
  • 40:26 - 40:27
    arrived here in Germany
  • 40:27 - 40:30
    with the start of the so-called CopWatch system
  • 40:30 - 40:33
    in southern Germany,
in Bavaria and Nuremberg especially,
  • 40:33 - 40:35
    where they try to predict burglary crime
  • 40:35 - 40:37
using criminal records and
  • 40:37 - 40:40
    geographical analysis, like you explained,
  • 40:40 - 40:43
that leads me to a 2-fold question:
  • 40:43 - 40:48
    first, have you heard of any research
  • 40:48 - 40:50
    that measures the effectiveness
  • 40:50 - 40:54
    of such measures, at all?
  • 40:54 - 40:57
    And, second:
  • 40:57 - 41:01
    What do you think of the game theory
  • 41:01 - 41:03
    if the thieves or the bad guys
  • 41:03 - 41:08
    know the system, and when they
    game the system,
  • 41:08 - 41:10
    they will probably win,
  • 41:10 - 41:12
    since one police officer in an interview said
  • 41:12 - 41:14
    this system is used to reduce
  • 41:14 - 41:16
    the personal costs of policing,
  • 41:16 - 41:19
    so they just send the guys
    where the red flags are,
  • 41:19 - 41:22
    and the others take the day off.
  • 41:22 - 41:24
    Dr. Helsby: Yup.
  • 41:24 - 41:27
    Um, so, with respect to
  • 41:27 - 41:31
    testing the effectiveness of predictive policing,
  • 41:31 - 41:32
    the companies,
  • 41:32 - 41:34
    some of them do randomized, controlled trials
  • 41:34 - 41:35
and claim a reduction in crime.
  • 41:35 - 41:38
    The best independent study that I've seen
  • 41:38 - 41:41
    is by this RAND Corporation
  • 41:41 - 41:43
    that did a study in, I think,
  • 41:43 - 41:45
    Shreveport, Louisiana,
  • 41:45 - 41:48
    and in their report they claim
  • 41:48 - 41:50
    that there was no statistically significant
  • 41:50 - 41:53
    difference, they didn't find any reduction.
  • 41:53 - 41:54
    And it was specifically looking at
  • 41:54 - 41:57
    property crime, which I think you mentioned.
  • 41:57 - 41:59
    So, I think right now there's sort of
  • 41:59 - 42:01
    conflicting reports between
  • 42:01 - 42:06
    the independent auditors
    and these company claims.
  • 42:06 - 42:09
    So there definitely needs to be more study.
  • 42:09 - 42:12
    And then, the 2nd thing...sorry,
    remind me what it was?
  • 42:12 - 42:15
    Mic 2: What about the guys gaming the system?
  • 42:15 - 42:17
    Dr. Helsby: Oh, yeah.
  • 42:17 - 42:19
    I think it's a legitimate concern.
  • 42:19 - 42:22
    Like, if all the outputs
    were just immediately public,
  • 42:22 - 42:25
    then, yes, everyone knows the location
  • 42:25 - 42:27
    of all police officers,
  • 42:27 - 42:29
    and I imagine that people would have
  • 42:29 - 42:31
    a problem with that.
  • 42:31 - 42:33
    Yup.
  • 42:33 - 42:36
Herald: Microphone #4, please.
  • 42:36 - 42:39
    Mic 4: Yeah, this is not actually a question,
  • 42:39 - 42:41
    but just a comment.
  • 42:41 - 42:43
    I've enjoyed your talk very much,
  • 42:43 - 42:48
    in particular after watching
  • 42:48 - 42:52
    the talk in Hall 1 earlier in the afternoon.
  • 42:52 - 42:56
    The "Say Hi to Your New Boss", about
  • 42:56 - 43:00
    algorithms that are trained with big data,
  • 43:00 - 43:02
    and finally make decisions.
  • 43:02 - 43:08
    And I think these 2 talks are kind of complementary,
  • 43:08 - 43:11
    and if people are interested in the topic
  • 43:11 - 43:15
    they might want to check out the other talk
  • 43:15 - 43:16
    and watch it later, because these
  • 43:16 - 43:17
    fit very well together.
  • 43:17 - 43:20
    Dr. Helsby: Yeah, it was a great talk.
  • 43:20 - 43:22
    Herald: Microphone #2, please.
  • 43:22 - 43:25
    Mic 2: Um, yeah, you mentioned
  • 43:25 - 43:27
    the need to have some kind of 3rd-party auditing
  • 43:27 - 43:31
    or some kind of way to
  • 43:31 - 43:32
    peek into these algorithms
  • 43:32 - 43:33
    and to see what they're doing,
  • 43:33 - 43:34
    and to see if they're being fair.
  • 43:34 - 43:36
    Can you talk a little bit more about that?
  • 43:36 - 43:38
    Like, going forward,
  • 43:38 - 43:41
    some kind of regulatory structures
  • 43:41 - 43:44
    would probably have to emerge
  • 43:44 - 43:47
    to analyze and to look at
  • 43:47 - 43:49
    these black boxes that are just sort of
  • 43:49 - 43:51
    popping up everywhere and, you know,
  • 43:51 - 43:53
    controlling more and more of the things
  • 43:53 - 43:56
    in our lives, and important decisions.
  • 43:56 - 43:59
    So, just, what kind of discussions
  • 43:59 - 43:59
    are there for that?
  • 43:59 - 44:02
    And what kind of possibility
    is there for that?
  • 44:02 - 44:05
    And, I'm sure that companies would be
  • 44:05 - 44:08
    very, very resistant to
  • 44:08 - 44:10
    any kind of attempt to look into
  • 44:10 - 44:14
    algorithms, and to...
  • 44:14 - 44:15
    Dr. Helsby: Yeah, I mean, definitely
  • 44:15 - 44:18
    companies would be very resistant to
  • 44:18 - 44:20
    having people look into their algorithms.
  • 44:20 - 44:22
    So, if you wanna do a very rigorous
  • 44:22 - 44:23
    audit of what's going on
  • 44:23 - 44:26
    then it's probably necessary to have
  • 44:26 - 44:27
    a few people come in
  • 44:27 - 44:29
    and sign NDAs, and then
  • 44:29 - 44:31
    look through the systems.
  • 44:31 - 44:33
    So, that's one way to proceed.
  • 44:33 - 44:35
    But, another way to proceed that--
  • 44:35 - 44:39
    so, these academic researchers have done
  • 44:39 - 44:40
    a few experiments
  • 44:40 - 44:43
    and found some interesting things,
  • 44:43 - 44:46
and that's sort of all the attempts at auditing
  • 44:46 - 44:46
    that we've seen:
  • 44:46 - 44:48
    there was 1 attempt in 2012
    for the Obama campaign,
  • 44:48 - 44:50
    but there's really not been any
  • 44:50 - 44:52
    sort of systematic attempt--
  • 44:52 - 44:53
    you know, like, in censorship
  • 44:53 - 44:55
    we see a systematic attempt to
  • 44:55 - 44:57
    do measurement as often as possible,
  • 44:57 - 44:58
    check what's going on,
  • 44:58 - 44:59
    and that itself, you know,
  • 44:59 - 45:01
    can act as an oversight mechanism.
  • 45:01 - 45:02
    But, right now,
  • 45:02 - 45:04
    I think many of these companies
  • 45:04 - 45:05
    realize no one is watching,
  • 45:05 - 45:07
    so there's no real push to have
  • 45:07 - 45:10
    people verify: are you being fair when you
  • 45:10 - 45:12
    implement this system?
  • 45:12 - 45:13
    Because no one's really checking.
  • 45:13 - 45:14
    Mic 2: Do you think that,
  • 45:14 - 45:15
    at some point, it would be like
  • 45:15 - 45:19
    an FDA or SEC, to give some American examples...
  • 45:19 - 45:21
    an actual government regulatory agency
  • 45:21 - 45:25
    that has the power and ability to
  • 45:25 - 45:28
    not just sort of look and try to
  • 45:28 - 45:32
    reverse engineer some of these algorithms,
  • 45:32 - 45:34
    but actually peek in there and make sure
  • 45:34 - 45:36
    that things are fair, because it seems like
  • 45:36 - 45:38
    there's just-- it's so important now
  • 45:38 - 45:42
    that, again, it could be the difference between
  • 45:42 - 45:43
    life and death, between
  • 45:43 - 45:45
    getting a job, not getting a job,
  • 45:45 - 45:46
    being pulled over,
    not being pulled over,
  • 45:46 - 45:48
    being racially profiled,
    not racially profiled,
  • 45:48 - 45:49
    things like that.
    Dr. Helsby: Right.
  • 45:49 - 45:50
    Mic 2: Is it moving in that direction?
  • 45:50 - 45:52
    Or is it way too early for it?
  • 45:52 - 45:55
    Dr. Helsby: I mean, so some people have...
  • 45:55 - 45:57
    someone has called for, like,
  • 45:57 - 45:59
    a Federal Search Commission,
  • 45:59 - 46:01
    or like a Federal Algorithms Commission,
  • 46:01 - 46:03
    that would do this sort of oversight work,
  • 46:03 - 46:06
    but it's in such early stages right now
  • 46:06 - 46:10
    that there's no real push for that.
  • 46:10 - 46:13
    But I think it's a good idea.
  • 46:13 - 46:16
    Herald: And again, #2 please.
  • 46:16 - 46:17
    Mic 2: Thank you again for your talk.
  • 46:17 - 46:19
    I was just curious if you can point
  • 46:19 - 46:20
    to any examples of
  • 46:20 - 46:23
    either current producers or consumers
  • 46:23 - 46:24
    of these algorithmic systems
  • 46:24 - 46:26
    who are actively and publicly trying
  • 46:26 - 46:28
    to do so in a responsible manner
  • 46:28 - 46:30
    by describing what they're trying to do
  • 46:30 - 46:31
    and how they're going about it?
  • 46:31 - 46:37
    Dr. Helsby: So, yeah, there are some companies,
  • 46:37 - 46:39
    for example, like DataKind,
  • 46:39 - 46:43
    that try to deploy algorithmic systems
  • 46:43 - 46:45
    in as responsible a way as possible,
  • 46:45 - 46:47
    for like public policy.
  • 46:47 - 46:50
    Like, I actually also implement systems
  • 46:50 - 46:52
    for public policy in a transparent way.
  • 46:52 - 46:54
    Like, all the code is in GitHub, etc.
  • 46:54 - 47:00
And, to give credit to
  • 47:00 - 47:02
    Google, and these giants,
  • 47:02 - 47:06
    they're trying to implement transparency systems
  • 47:06 - 47:08
    that help you understand.
  • 47:08 - 47:09
    This has been done with respect to
  • 47:09 - 47:12
    how your data is being collected,
  • 47:12 - 47:15
    but for example if you go on Amazon.com
  • 47:15 - 47:18
    you can see a recommendation has been made,
  • 47:18 - 47:19
    and that is pretty transparent.
  • 47:19 - 47:21
    You can see "this item
    was recommended to me,"
  • 47:21 - 47:25
    so you know that prediction
    is being used in this case,
  • 47:25 - 47:27
    and it will say why prediction is being used:
  • 47:27 - 47:29
    because you purchased some item.
  • 47:29 - 47:30
    And Google has a similar thing,
  • 47:30 - 47:32
    if you go to like Google Ad Settings,
  • 47:32 - 47:35
    you can even turn off personalization of ads
  • 47:35 - 47:36
    if you want,
  • 47:36 - 47:38
    and you can also see some of the inferences
  • 47:38 - 47:39
    that have been learned about you.
  • 47:39 - 47:41
    A subset of the inferences that have been
  • 47:41 - 47:42
    learned about you.
  • 47:42 - 47:44
    So, like, what interests...
  • 47:44 - 47:48
    Herald: A question from the internet, please?
  • 47:48 - 47:51
    Signal Angel: Yes, billetQ is asking
  • 47:51 - 47:54
    how do you avoid biases in machine learning?
  • 47:54 - 47:57
I assume an analysis system, for example,
  • 47:57 - 48:00
    could be biased against women and minorities,
  • 48:00 - 48:05
    if used for hiring decisions
    based on known data.
  • 48:05 - 48:06
    Dr. Helsby: Yeah, so one thing is to
  • 48:06 - 48:09
    just explicitly check.
  • 48:09 - 48:12
    So, you can check to see how
  • 48:12 - 48:14
    positive outcomes are being distributed
  • 48:14 - 48:17
    among those protected classes.
  • 48:17 - 48:19
    You could also incorporate these sort of
  • 48:19 - 48:21
    fairness constraints in the function
  • 48:21 - 48:24
    that you optimize when you train the system,
  • 48:24 - 48:26
    and so, if you're interested in reading more
  • 48:26 - 48:29
    about this, the 2 papers--
  • 48:29 - 48:32
    let me go to References--
  • 48:32 - 48:33
    there's a good paper called
  • 48:33 - 48:35
    Fairness Through Awareness that describes
  • 48:35 - 48:37
    how to go about doing this,
  • 48:37 - 48:40
    so I recommend this person read that.
  • 48:40 - 48:41
    It's good.
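As a rough illustration of both suggestions, on synthetic data and with one simple penalty (not the formulation from the cited paper): first audit how positive predictions are distributed across a protected attribute, then train a logistic model whose objective includes a demographic-parity penalty.

```python
# Sketch: (1) audit positive-outcome rates per protected group,
# (2) train a logistic model with a demographic-parity penalty in the loss.
# The data is synthetic and the penalty is one simple formulation among many.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
group = rng.integers(0, 2, size=n)                 # protected attribute, 0 or 1
y = (X[:, 0] + 0.5 * group + rng.normal(scale=0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def loss_and_grad(w, lam):
    p = sigmoid(X @ w)
    # Standard logistic loss and its gradient.
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad = X.T @ (p - y) / n
    # Demographic-parity penalty: squared gap in mean predicted score per group.
    in1, in0 = group == 1, group == 0
    gap = p[in1].mean() - p[in0].mean()
    d_gap = (X[in1].T @ (p[in1] * (1 - p[in1]))) / in1.sum() \
          - (X[in0].T @ (p[in0] * (1 - p[in0]))) / in0.sum()
    return loss + lam * gap ** 2, grad + lam * 2.0 * gap * d_gap

w = np.zeros(d)
for _ in range(1000):                              # plain gradient descent
    _, g = loss_and_grad(w, lam=5.0)
    w -= 0.1 * g

pred = sigmoid(X @ w) > 0.5
for g_val in (0, 1):
    print(f"positive-prediction rate, group {g_val}: {pred[group == g_val].mean():.2%}")
```

Setting `lam` to zero recovers an ordinary classifier; raising it trades some accuracy for a smaller gap between the two groups' positive-prediction rates.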
  • 48:41 - 48:43
    Herald: Microphone 2, please.
  • 48:43 - 48:45
    Mic2: Thanks again for your talk.
  • 48:45 - 48:50
    Umm, hello?
  • 48:50 - 48:51
    Okay.
  • 48:51 - 48:53
    Umm, I see of course a problem with
  • 48:53 - 48:55
    all the black boxes that you describe
  • 48:55 - 48:57
    with regards for the crime systems,
  • 48:57 - 49:00
    but when we look at the advertising systems
  • 49:00 - 49:02
    in many cases they are very networked.
  • 49:02 - 49:04
    There are many different systems collaborating
  • 49:04 - 49:07
    and exchanging data via open APIs:
  • 49:07 - 49:09
    RESTful APIs, and various
  • 49:09 - 49:12
    demand-side platforms
    and audience-exchange platforms,
  • 49:12 - 49:13
    and everything.
  • 49:13 - 49:15
    So, can that help to at least
  • 49:15 - 49:22
increase awareness of where targeting and personalization
  • 49:22 - 49:24
    might be happening?
  • 49:24 - 49:26
    I mean, I'm looking at systems like
  • 49:26 - 49:30
BuiltWith, that surfaces what kind of
  • 49:30 - 49:31
    JavaScript libraries are used elsewhere.
  • 49:31 - 49:33
    So, is that something that could help
  • 49:33 - 49:36
    at least to give a better awareness
  • 49:36 - 49:39
    and listing all the points where
  • 49:39 - 49:41
    you might be targeted...
  • 49:41 - 49:43
    Dr. Helsby: So, like, with respect to
  • 49:43 - 49:46
    advertising, the fact that
    there is behind the scenes
  • 49:46 - 49:48
    this like complicated auction process
  • 49:48 - 49:51
    that's occurring, just makes things
  • 49:51 - 49:52
    a lot more complicated.
  • 49:52 - 49:54
    So, for example, I said briefly
  • 49:54 - 49:57
    that they found that there's this
    statistical difference
  • 49:57 - 49:59
    between how men and women are treated,
  • 49:59 - 50:01
    but it doesn't necessarily mean that
  • 50:01 - 50:04
    "Oh, the algorithm is definitely biased."
  • 50:04 - 50:06
    It could be because of this auction process,
  • 50:06 - 50:11
    it could be that women are considered
  • 50:11 - 50:13
    more valuable when it comes to advertising,
  • 50:13 - 50:15
    and so these executive ads are getting
  • 50:15 - 50:17
    outbid by some other ads,
  • 50:17 - 50:19
    and so there's a lot of potential
  • 50:19 - 50:20
    causes for that.
  • 50:20 - 50:23
    So, I think it just makes things
    a lot more complicated.
  • 50:23 - 50:26
    I don't know if it helps
    with the bias at all.
  • 50:26 - 50:27
    Mic 2: Well, the question was more
  • 50:27 - 50:30
    a direction... can it help to surface
  • 50:30 - 50:32
    and make people aware of that fact?
  • 50:32 - 50:35
    I mean, I can talk to my kids probably,
  • 50:35 - 50:36
    and they will probably understand,
  • 50:36 - 50:38
    but I can't explain that to my grandma,
  • 50:38 - 50:43
    who's also, umm, looking at an iPad.
  • 50:43 - 50:44
    Dr. Helsby: So, the fact that
  • 50:44 - 50:46
    the systems are...
  • 50:46 - 50:49
    I don't know if I understand.
  • 50:49 - 50:51
    Mic 2: OK. I think that the main problem
  • 50:51 - 50:54
is that we lag behind the industry's efforts
  • 50:54 - 50:57
at targeting us, and many people
  • 50:57 - 51:01
    do know, but a lot more people don't know,
  • 51:01 - 51:03
    and making them aware of the fact
  • 51:03 - 51:07
    that they are a target, in a way,
  • 51:07 - 51:11
    is something that can only be shown
  • 51:11 - 51:15
by a 3rd party that has that data at its disposal,
  • 51:15 - 51:16
and can make audits in a way--
  • 51:16 - 51:18
    maybe in an automated way.
  • 51:18 - 51:19
    Dr. Helsby: Right.
  • 51:19 - 51:21
    Yeah, I think it certainly
    could help with advocacy
  • 51:21 - 51:23
    if that's the point, yeah.
  • 51:23 - 51:26
    Herald: Another question
    from the internet, please.
  • 51:26 - 51:29
    Signal Angel: Yes, on IRC they are asking
  • 51:29 - 51:31
    if we know that prediction in some cases
  • 51:31 - 51:34
exerts an influence that cannot be controlled.
  • 51:34 - 51:38
    So, r4v5 would like to know from you
  • 51:38 - 51:42
    if there are some cases or areas where
  • 51:42 - 51:45
    machine learning simply shouldn't go?
  • 51:45 - 51:48
    Dr. Helsby: Umm, so I think...
  • 51:48 - 51:53
    I mean, yes, I think that it is the case
  • 51:53 - 51:55
    that in some cases machine learning
  • 51:55 - 51:56
    might not be appropriate.
  • 51:56 - 51:58
    For example, if you use machine learning
  • 51:58 - 52:01
    to decide who should be searched.
  • 52:01 - 52:03
    I don't think it should be the case that
  • 52:03 - 52:04
    machine learning algorithms should
  • 52:04 - 52:05
    ever be used to determine
  • 52:05 - 52:08
    probable cause, or something like that.
  • 52:08 - 52:12
    So, if it's just one piece of evidence
  • 52:12 - 52:13
    that you consider,
  • 52:13 - 52:15
    and there's human oversight always,
  • 52:15 - 52:19
    maybe it's fine, but
  • 52:19 - 52:21
    we should be very suspicious and hesitant
  • 52:21 - 52:22
    in certain contexts where
  • 52:22 - 52:25
    the ramifications are very serious.
  • 52:25 - 52:27
    Like the No Fly List, and so on.
  • 52:27 - 52:29
    Herald: And #2 again.
  • 52:29 - 52:31
    Mic 2: A second question
  • 52:31 - 52:34
    that just occurred to me, if you don't mind.
  • 52:34 - 52:35
    Umm, until the advent of
  • 52:35 - 52:37
    algorithmic systems,
  • 52:37 - 52:40
    when there've been cases of serious harm
  • 52:40 - 52:43
that's been caused to individuals or groups,
  • 52:43 - 52:45
    and it's been demonstrated that
  • 52:45 - 52:46
    it's occurred because of
  • 52:46 - 52:49
    an individual or a system of people
  • 52:49 - 52:53
    being systematically biased, then often
  • 52:53 - 52:55
    one of the actions that's taken is
  • 52:55 - 52:57
    pressure's applied, and then
  • 52:57 - 53:00
    people are required to change,
  • 53:00 - 53:01
and hopefully be held responsible,
  • 53:01 - 53:03
    and then change the way that they do things
  • 53:03 - 53:06
    to try to remove bias from that system.
  • 53:06 - 53:08
    What's the current thinking about
  • 53:08 - 53:10
    how we can go about doing that
  • 53:10 - 53:13
    when the systems that are doing that
  • 53:13 - 53:14
    are algorithmic?
  • 53:14 - 53:16
    Is it just going to be human oversight,
  • 53:16 - 53:17
    and humans are gonna have to be
  • 53:17 - 53:18
    held responsible for the oversight?
  • 53:18 - 53:21
    Dr. Helsby: So, in terms of bias,
  • 53:21 - 53:23
    if we're concerned about bias towards
  • 53:23 - 53:24
    particular types of people,
  • 53:24 - 53:26
    that's something that we can optimize for.
  • 53:26 - 53:29
    So, we can train systems that are unbiased
  • 53:29 - 53:30
    in this way.
  • 53:30 - 53:32
    So that's one way to deal with it.
  • 53:32 - 53:34
    But there's always gonna be errors,
  • 53:34 - 53:35
    so that's sort of a separate issue
  • 53:35 - 53:38
    from the bias, and in the case
  • 53:38 - 53:39
    where there are errors,
  • 53:39 - 53:41
    there must be oversight.
  • 53:41 - 53:45
    So, one way that one could improve
  • 53:45 - 53:46
    the way that this is done
  • 53:46 - 53:48
    is by making sure that you're
  • 53:48 - 53:51
    keeping track of confidence of decisions.
  • 53:51 - 53:54
    So, if you have a low confidence prediction,
  • 53:54 - 53:56
    then maybe a human
    should come in and check things.
  • 53:56 - 53:59
    So, that might be one way to proceed.
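A minimal sketch of that routing rule, with made-up scores and a made-up threshold: predictions whose confidence is too close to chance get queued for a human reviewer rather than acted on automatically.

```python
# Sketch: route low-confidence model outputs to a human reviewer.
# The scores and the threshold are illustrative, not from any real system.
from dataclasses import dataclass

@dataclass
class Decision:
    case_id: int
    score: float           # model's estimated probability of the positive class
    needs_human: bool

def triage(scores: dict, threshold: float = 0.2) -> list:
    """Flag any case whose score lies within `threshold` of 0.5 (maximum uncertainty)."""
    return [
        Decision(case_id, score, needs_human=abs(score - 0.5) < threshold)
        for case_id, score in scores.items()
    ]

model_scores = {101: 0.97, 102: 0.55, 103: 0.08, 104: 0.42}   # hypothetical outputs
for d in triage(model_scores):
    route = "human review" if d.needs_human else "automatic decision"
    print(f"case {d.case_id}: score={d.score:.2f} -> {route}")
```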
  • 54:02 - 54:04
Herald: So, there are no more questions.
  • 54:04 - 54:06
    I close this talk now,
  • 54:06 - 54:08
    and thank you very much
  • 54:08 - 54:09
    and a big applause to
  • 54:09 - 54:12
    Jennifer Helsby!
  • 54:12 - 54:16
    roaring applause
  • 54:16 - 54:28
    subtitles created by c3subtitles.de
    Join, and help us!