How we found the worst place to park in New York City — using big data
-
0:01 - 0:04Six thousand miles of road,
-
0:04 - 0:06 600 miles of subway track,
-
0:06 - 0:07 400 miles of bike lanes
-
0:07 - 0:09and a half a mile of tram track,
-
0:09 - 0:11if you've ever been to Roosevelt Island.
-
0:11 - 0:14These are the numbers that make up
the infrastructure of New York City. -
0:14 - 0:17These are the statistics
of our infrastructure. -
0:17 - 0:21They're the kind of numbers you can find
released in reports by city agencies. -
0:21 - 0:24For example, the Department
of Transportation will probably tell you -
0:24 - 0:26how many miles of road they maintain.
-
0:26 - 0:29The MTA will boast how many miles
of subway track there are. -
0:29 - 0:30Most city agencies give us statistics.
-
0:30 - 0:32This is from a report this year
-
0:32 - 0:34from the Taxi and Limousine Commission,
-
0:34 - 0:37where we learn that there's about
13,500 taxis here in New York City. -
0:37 - 0:38Pretty interesting, right?
-
0:38 - 0:41But did you ever think about
where these numbers came from? -
0:41 - 0:44Because for these numbers to exist,
someone at the city agency -
0:44 - 0:48had to stop and say, hmm, here's a number
that somebody might want to know. -
0:48 - 0:50Here's a number
that our citizens want to know. -
0:50 - 0:52So they go back to their raw data,
-
0:52 - 0:54they count, they add, they calculate,
-
0:54 - 0:55and then they put out reports,
-
0:55 - 0:57and those reports
will have numbers like this. -
0:57 - 1:00The problem is, how do they know
all of our questions? -
1:00 - 1:01We have lots of questions.
-
1:01 - 1:05In fact, in some ways there's literally
an infinite number of questions -
1:05 - 1:06that we can ask about our city.
-
1:06 - 1:08The agencies can never keep up.
-
1:08 - 1:12So the paradigm isn't exactly working,
and I think our policymakers realize that, -
1:12 - 1:16because in 2012, Mayor Bloomberg
signed into law what he called -
1:16 - 1:20the most ambitious and comprehensive
open data legislation in the country. -
1:20 - 1:21In a lot of ways, he's right.
-
1:21 - 1:24In the last two years,
the city has released 1,000 datasets -
1:24 - 1:26on our open data portal,
-
1:26 - 1:27and it's pretty awesome.
-
1:27 - 1:29So you go and look at data like this,
-
1:29 - 1:32and instead of just counting
the number of cabs, -
1:32 - 1:34we can start to ask different questions.
-
1:34 - 1:35So I had a question.
-
1:35 - 1:36When's rush hour in New York City?
-
1:36 - 1:39It can be pretty bothersome.
When is rush hour exactly? -
1:39 - 1:42And I thought to myself,
these cabs aren't just numbers, -
1:42 - 1:44these are GPS recorders
driving around in our city streets -
1:44 - 1:46recording each and every ride they take.
-
1:46 - 1:49There's data there,
and I looked at that data, -
1:49 - 1:53and I made a plot of the average speed of
taxis in New York City throughout the day. -
1:53 - 1:56You can see that from about midnight
to around 5:18 in the morning, -
1:56 - 2:00speed increases, and at that point,
things turn around, -
2:00 - 2:04and they get slower and slower and slower
until about 8:35 in the morning, -
2:04 - 2:06when they end up at around
11 and a half miles per hour. -
2:06 - 2:10The average taxi is going 11 and a half
miles per hour on our city streets, -
2:10 - 2:12and it turns out it stays that way
-
2:12 - 2:15for the entire day.
-
2:15 - 2:16(Laughter)
-
2:16 - 2:20So I said to myself, I guess
there's no rush hour in New York City. -
2:20 - 2:21There's just a rush day.
-
2:21 - 2:24Makes sense. And this is important
for a couple of reasons. -
2:24 - 2:28If you're a transportation planner,
this might be pretty interesting to know. -
2:28 - 2:30But if you want to get somewhere quickly,
-
2:30 - 2:33you now know to set your alarm for
4:45 in the morning and you're all set. -
2:33 - 2:34New York, right?
-
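The average-speed-by-time-of-day plot described above can be approximated with a few lines of code. This is only a minimal sketch, not the analysis used in the talk: it assumes each trip record is a tuple of pickup time, drop-off time, and distance in miles (a hypothetical layout), and the sample trips are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

def average_speed_by_hour(trips):
    """Return {hour_of_day: mean trip speed in mph}, each trip weighted equally."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of speeds, trip count]
    for pickup, dropoff, miles in trips:
        start = datetime.fromisoformat(pickup)
        end = datetime.fromisoformat(dropoff)
        hours = (end - start).total_seconds() / 3600
        if hours <= 0 or miles <= 0:
            continue  # skip records with bad timestamps or distances
        totals[start.hour][0] += miles / hours
        totals[start.hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}

# Invented sample records: (pickup, dropoff, miles). Not real taxi data.
sample_trips = [
    ("2013-01-01 05:10:00", "2013-01-01 05:20:00", 4.0),
    ("2013-01-01 05:30:00", "2013-01-01 05:45:00", 5.0),
    ("2013-01-01 08:30:00", "2013-01-01 09:00:00", 5.5),
]
speeds = average_speed_by_hour(sample_trips)
```

Grouping by the pickup hour is a simplification; a finer plot would bucket by minute and handle trips that span several buckets.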
2:34 - 2:36But there's a story behind this data.
-
2:36 - 2:38This data wasn't
just available, it turns out. -
2:38 - 2:42It actually came from something called
a Freedom of Information Law Request, -
2:42 - 2:43or a FOIL Request.
-
2:43 - 2:46This is a form you can find on the
Taxi and Limousine Commission website. -
2:46 - 2:49In order to access this data,
you need to go get this form, -
2:49 - 2:51fill it out, and they will notify you,
-
2:51 - 2:53and a guy named Chris Whong
did exactly that. -
2:53 - 2:55Chris went down, and they told him,
-
2:55 - 2:58"Just bring a brand new hard drive
down to our office, -
2:58 - 3:01leave it here for five hours,
we'll copy the data and you take it back." -
3:01 - 3:03And that's where this data came from.
-
3:03 - 3:06Now, Chris is the kind of guy
who wants to make the data public, -
3:06 - 3:10and so it ended up online for all to use,
and that's where this graph came from. -
3:10 - 3:14And the fact that it exists is amazing.
These GPS recorders -- really cool. -
3:14 - 3:17But the fact that we have citizens
walking around with hard drives -
3:17 - 3:19picking up data from city agencies
to make it public -- -
3:19 - 3:22it was already kind of public,
you could get to it, -
3:22 - 3:23but it was "public," it wasn't public.
-
3:23 - 3:25And we can do better than that as a city.
-
3:25 - 3:28We don't need our citizens
walking around with hard drives. -
3:28 - 3:31Now, not every dataset
is behind a FOIL Request. -
3:31 - 3:34Here is a map I made with the most
dangerous intersections in New York City -
3:34 - 3:36based on cyclist accidents.
-
3:36 - 3:38So the red areas are more dangerous.
-
3:38 - 3:41And what it shows is first
the East side of Manhattan, -
3:41 - 3:44especially in the lower area of Manhattan,
has more cyclist accidents. -
3:44 - 3:45That might make sense
-
3:45 - 3:48because there are more cyclists
coming off the bridges there. -
3:48 - 3:50But there's other hotspots worth studying.
-
3:50 - 3:53There's Williamsburg.
There's Roosevelt Avenue in Queens. -
3:53 - 3:56And this is exactly the kind of data
we need for Vision Zero. -
3:56 - 3:58This is exactly what we're looking for.
-
3:58 - 4:00But there's a story
behind this data as well. -
4:00 - 4:02This data didn't just appear.
-
4:02 - 4:04How many of you guys know this logo?
-
4:04 - 4:06Yeah, I see some shakes.
-
4:06 - 4:08Have you ever tried to copy
and paste data out of a PDF -
4:08 - 4:10and make sense of it?
-
4:10 - 4:11I see more shakes.
-
4:11 - 4:14More of you tried copying and pasting
than knew the logo. I like that. -
4:14 - 4:18So what happened is, the data
that you just saw was actually on a PDF. -
4:18 - 4:21In fact, hundreds and hundreds
and hundreds of pages of PDF -
4:21 - 4:23put out by our very own NYPD,
-
4:23 - 4:26and in order to access it,
you would either have to copy and paste -
4:26 - 4:28for hundreds and hundreds of hours,
-
4:28 - 4:29or you could be John Krauss.
-
4:29 - 4:30John Krauss was like,
-
4:30 - 4:34I'm not going to copy and paste this data.
I'm going to write a program. -
4:34 - 4:36It's called the NYPD Crash Data Band-Aid,
-
4:36 - 4:39and it goes to the NYPD's website
and it would download PDFs. -
4:39 - 4:42Every day it would search;
if it found a PDF, it would download it -
4:42 - 4:44and then it would run
some PDF-scraping program, -
4:44 - 4:46and out would come the text,
-
4:46 - 4:49and it would go on the Internet,
and then people could make maps like that. -
4:49 - 4:53And the fact that the data's here,
the fact that we have access to it -- -
4:53 - 4:55Every accident, by the way,
is a row in this table. -
4:55 - 4:57You can imagine how many PDFs that is.
-
4:57 - 4:59The fact that we
have access to that is great, -
4:59 - 5:01but let's not release it in PDF form,
-
5:01 - 5:04because then we're having our citizens
write PDF scrapers. -
5:04 - 5:06It's not the best use
of our citizens' time, -
5:06 - 5:08and we as a city can do better than that.
-
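To give a sense of what a PDF scraper has to do once the text is out of the PDF, here is a minimal sketch of the last step: turning extracted text lines into table rows. It is not the NYPD Crash Data Band-Aid itself. The column layout (intersection name, collision count, cyclists injured) and the sample text are hypothetical, and the PDF-to-text extraction step is assumed to have already happened.

```python
import csv
import io
import re

# Hypothetical row layout: a name, then two or more spaces, then two integer columns.
ROW = re.compile(
    r"^(?P<intersection>.+?)\s{2,}(?P<collisions>\d+)\s+(?P<cyclists_injured>\d+)$"
)

def text_to_rows(page_text):
    """Keep only lines that look like data rows; headers and footers are skipped."""
    rows = []
    for line in page_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append({
                "intersection": match["intersection"],
                "collisions": int(match["collisions"]),
                "cyclists_injured": int(match["cyclists_injured"]),
            })
    return rows

def rows_to_csv(rows):
    """Serialize the rows as CSV text so they can be mapped or analyzed."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["intersection", "collisions", "cyclists_injured"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Invented sample of text as it might come out of a PDF page.
sample_page = """Intersection                   Collisions  Cyclists Injured
BROADWAY and CANAL STREET      3   1
ALLEN STREET and DELANCEY STREET   5   2
Page 1 of 300
"""
rows = text_to_rows(sample_page)
csv_text = rows_to_csv(rows)
```

Every layout change in the source PDF would break a pattern like this, which is part of why publishing the underlying table directly is the better option.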
5:08 - 5:11Now, the good news is that
the de Blasio administration -
5:11 - 5:13actually recently released this data
a few months ago, -
5:13 - 5:15and so now we can
actually have access to it, -
5:15 - 5:18but there's a lot of data
still entombed in PDF. -
5:18 - 5:21For example, our crime data
is still only available in PDF. -
5:21 - 5:25And not just our crime data,
our own city budget. -
5:25 - 5:29Our city budget is only readable
right now in PDF form. -
5:29 - 5:31And it's not just us
that can't analyze it -- -
5:31 - 5:34our own legislators
who vote for the budget -
5:34 - 5:36also only get it in PDF.
-
5:36 - 5:40So our legislators cannot analyze
the budget that they are voting for. -
5:40 - 5:43And I think as a city we can do
a little better than that as well. -
5:43 - 5:46Now, there's a lot of data
that's not hidden in PDFs. -
5:46 - 5:47This is an example of a map I made,
-
5:47 - 5:50and this is the dirtiest waterways
in New York City. -
5:50 - 5:52Now, how do I measure dirty?
-
5:52 - 5:54Well, it's kind of a little weird,
-
5:54 - 5:56but I looked at the level
of fecal coliform, -
5:56 - 5:59which is a measurement of fecal matter
in each of our waterways. -
5:59 - 6:03The larger the circle,
the dirtier the water, -
6:03 - 6:06so the large circles are dirty water,
the small circles are cleaner. -
6:06 - 6:08What you see is inland waterways.
-
6:08 - 6:11This is all data that was sampled
by the city over the last five years. -
6:11 - 6:14And inland waterways are,
in general, dirtier. -
6:14 - 6:15That makes sense, right?
-
6:15 - 6:18And the bigger circles are dirty.
And I learned a few things from this. -
6:18 - 6:21Number one: Never swim in anything
that ends in "creek" or "canal." -
6:21 - 6:26But number two: I also found
the dirtiest waterway in New York City, -
6:26 - 6:28by this measure, one measure.
-
6:28 - 6:31In Coney Island Creek, which is not
the Coney Island you swim in, luckily. -
6:31 - 6:32It's on the other side.
-
6:32 - 6:36But Coney Island Creek, 94 percent
of samples taken over the last five years -
6:36 - 6:38have had fecal levels so high
-
6:38 - 6:41that it would be against state law
to swim in the water. -
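A figure like "94 percent of samples" comes from counting, per waterway, how many samples exceed a limit. The sketch below shows that calculation under stated assumptions: the sample readings are invented, and the limit is a placeholder parameter, not the actual legal threshold.

```python
from collections import defaultdict

def share_over_limit(samples, limit):
    """Return {site: fraction of that site's samples strictly above limit}."""
    counts = defaultdict(lambda: [0, 0])  # site -> [samples over limit, total samples]
    for site, value in samples:
        counts[site][1] += 1
        if value > limit:
            counts[site][0] += 1
    return {site: over / total for site, (over, total) in counts.items()}

# Invented readings: (site, fecal coliform count). Not real measurements.
sample_readings = [
    ("Creek A", 900), ("Creek A", 1500), ("Creek A", 2500), ("Creek A", 3000),
    ("Harbor B", 40), ("Harbor B", 1200),
]
HYPOTHETICAL_LIMIT = 1000  # placeholder value, not the legal limit
shares = share_over_limit(sample_readings, HYPOTHETICAL_LIMIT)
```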
6:41 - 6:44And this is not the kind of fact
that you're going to see -
6:44 - 6:46boasted in a city report, right?
-
6:46 - 6:48It's not going to be
the front page on nyc.gov. -
6:48 - 6:50You're not going to see it there,
-
6:50 - 6:52but the fact that we can get
to that data is awesome. -
6:52 - 6:54But once again, it wasn't super easy,
-
6:54 - 6:56because this data was not
on the open data portal. -
6:56 - 6:58If you were to go to the open data portal,
-
6:58 - 7:01you'd see just a snippet of it,
a year or a few months. -
7:01 - 7:04It was actually on the Department
of Environmental Protection's website. -
7:04 - 7:08And each one of these links is an Excel
sheet, and each Excel sheet is different. -
7:08 - 7:11Every heading is different:
you copy, paste, reorganize. -
7:11 - 7:14When you do you can make maps
and that's great, but once again, -
7:14 - 7:17we can do better than that
as a city, we can normalize things. -
7:17 - 7:20And we're getting there, because
there's this website that Socrata makes -
7:20 - 7:22called the Open Data Portal NYC.
-
7:22 - 7:24This is where 1,100 data sets
that don't suffer -
7:24 - 7:26from the things I just told you live,
-
7:26 - 7:28and that number is growing,
and that's great. -
7:28 - 7:31You can download data in any format,
be it CSV or PDF or Excel document. -
7:31 - 7:34Whatever you want,
you can download the data that way. -
7:34 - 7:35The problem is, once you do,
-
7:35 - 7:39you will find that each agency
codes their addresses differently. -
7:39 - 7:41So one is street name,
intersection street, -
7:41 - 7:43street, borough, address, building,
building address. -
7:43 - 7:47So once again, you're spending time,
even when we have this portal, -
7:47 - 7:49you're spending time
normalizing our address fields. -
7:49 - 7:52And that's not the best use
of our citizens' time. -
7:52 - 7:53We can do better than that as a city.
-
7:53 - 7:55We can standardize our addresses,
-
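Normalizing address fields mostly means mapping each agency's column names onto one shared schema. A minimal sketch follows, assuming a hypothetical target schema of house number, street, and borough; the alias lists are illustrative and are not taken from any real agency dataset.

```python
# Hypothetical mapping from a shared schema to the column names agencies might use.
FIELD_ALIASES = {
    "house_number": ["building", "house_number", "building_number"],
    "street": ["street", "street_name"],
    "borough": ["borough", "boro"],
}

def normalize_address(record):
    """Map one agency-specific record onto the shared schema."""
    # Make column names comparable: trimmed, lowercase, underscores for spaces.
    lowered = {key.strip().lower().replace(" ", "_"): value
               for key, value in record.items()}
    normalized = {}
    for target, aliases in FIELD_ALIASES.items():
        # Take the first alias that is present and non-empty.
        value = next((lowered[a] for a in aliases if lowered.get(a)), "")
        # Uppercase and collapse repeated whitespace in the value.
        normalized[target] = " ".join(str(value).upper().split())
    return normalized

# Two invented records describing the same place in different layouts.
agency_one = {"Street Name": "  canal st ", "Building": "100", "Borough": "manhattan"}
agency_two = {"street": "Canal St", "building_number": 100, "boro": "Manhattan"}
same_place = normalize_address(agency_one) == normalize_address(agency_two)
```

This only aligns column names and casing. Matching "Canal St" with "Canal Street" would need a proper street-name standard, which is the kind of shared standard the talk argues for.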
7:55 - 7:57and if we do,
we can get more maps like this. -
7:57 - 8:00This is a map of fire hydrants
in New York City, -
8:00 - 8:01but not just any fire hydrants.
-
8:01 - 8:06These are the top 250 grossing fire
hydrants in terms of parking tickets. -
8:06 - 8:08(Laughter)
-
8:08 - 8:11So I learned a few things from this map,
and I really like this map. -
8:11 - 8:14Number one, just don't park
on the Upper East Side. -
8:14 - 8:17Just don't. It doesn't matter where
you park, you will get a hydrant ticket. -
8:17 - 8:21Number two, I found the two highest
grossing hydrants in all of New York City, -
8:21 - 8:23and they're on the Lower East Side,
-
8:23 - 8:28and they were bringing in over
55,000 dollars a year in parking tickets. -
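Ranking hydrants by ticket revenue is a group-and-sum over ticket records. Here is a minimal sketch, assuming each ticket has already been matched to a hydrant location (the hard part, given the address problem above); the locations and fine amounts are invented.

```python
from collections import Counter

def top_grossing(tickets, n):
    """Return the n locations with the highest total fines, highest first."""
    revenue = Counter()
    for location, fine in tickets:
        revenue[location] += fine
    return revenue.most_common(n)

# Invented tickets: (hydrant location, fine in dollars). Not real ticket data.
sample_tickets = [
    ("Hydrant A", 115), ("Hydrant A", 115), ("Hydrant A", 115),
    ("Hydrant B", 115), ("Hydrant B", 115),
    ("Hydrant C", 115),
]
top_two = top_grossing(sample_tickets, 2)
```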
8:28 - 8:31And that seemed a little strange
to me when I noticed it, -
8:31 - 8:34so I did a little digging and it turns out
what you had is a hydrant -
8:34 - 8:36and then something called
a curb extension, -
8:36 - 8:38which is like a seven-foot
space to walk on, -
8:38 - 8:39and then a parking spot.
-
8:39 - 8:42And so these cars came along,
and the hydrant -- -
8:42 - 8:44"It's all the way over there, I'm fine,"
-
8:44 - 8:47and there was actually a parking spot
painted there beautifully for them. -
8:47 - 8:50They would park there, and the NYPD
disagreed with this designation -
8:50 - 8:51and would ticket them.
-
8:51 - 8:54And it wasn't just me
who found a parking ticket. -
8:54 - 8:56This is the Google
Street View car driving by -
8:56 - 8:57finding the same parking ticket.
-
8:57 - 9:02So I wrote about this on my blog,
on I Quant NY, and the DOT responded, -
9:02 - 9:03and they said,
-
9:03 - 9:06"While the DOT has not received
any complaints about this location, -
9:06 - 9:11we will review the roadway markings
and make any appropriate alterations." -
9:11 - 9:14And I thought to myself,
typical government response, -
9:14 - 9:16all right, moved on with my life.
-
9:16 - 9:20But then, a few weeks later,
something incredible happened. -
9:20 - 9:22They repainted the spot,
-
9:22 - 9:25and for a second I thought I saw
the future of open data, -
9:25 - 9:27because think about what happened here.
-
9:27 - 9:32For five years, this spot was being
ticketed, and it was confusing, -
9:32 - 9:36and then a citizen found something,
they told the city, and within a few weeks -
9:36 - 9:38the problem was fixed.
-
9:38 - 9:41It's amazing. And a lot of people
see open data as being a watchdog. -
9:41 - 9:43It's not, it's about being a partner.
-
9:43 - 9:46We can empower our citizens
to be better partners for government, -
9:46 - 9:48and it's not that hard.
-
9:48 - 9:49All we need are a few changes.
-
9:49 - 9:50If you're FOILing data,
-
9:50 - 9:53if you're seeing your data
being FOILed over and over again, -
9:53 - 9:57let's release it to the public, that's
a sign that it should be made public. -
9:57 - 9:59And if you're a government agency
releasing a PDF, -
9:59 - 10:03let's pass legislation that requires you
to post it with the underlying data, -
10:03 - 10:05because that data
is coming from somewhere. -
10:05 - 10:07I don't know where, but it's
coming from somewhere, -
10:07 - 10:09and you can release it with the PDF.
-
10:09 - 10:11And let's adopt and share
some open data standards. -
10:11 - 10:14Let's start with our addresses
here in New York City. -
10:14 - 10:16Let's just start
normalizing our addresses. -
10:16 - 10:18Because New York is a leader in open data.
-
10:18 - 10:21Despite all this, we are absolutely
a leader in open data, -
10:21 - 10:24and if we start normalizing things,
and set an open data standard, -
10:24 - 10:28others will follow. The state will follow,
and maybe the federal government. -
10:28 - 10:29Other countries could follow,
-
10:29 - 10:32and we're not that far off from a time
where you could write one program -
10:32 - 10:34and map information from 100 countries.
-
10:34 - 10:37It's not science fiction.
We're actually quite close. -
10:37 - 10:39And by the way, who are we
empowering with this? -
10:39 - 10:42Because it's not just John Krauss
and it's not just Chris Whong. -
10:42 - 10:45There are hundreds of meetups
going on in New York City right now, -
10:45 - 10:46active meetups.
-
10:46 - 10:49There are thousands of people
attending these meetups. -
10:49 - 10:51These people are going after work
and on weekends, -
10:51 - 10:54and they're attending these meetups
to look at open data -
10:54 - 10:55and make our city a better place.
-
10:55 - 10:59Groups like BetaNYC, who just last week
released something called citygram.nyc -
10:59 - 11:02that allows you to subscribe
to 311 complaints -
11:02 - 11:04around your own home,
or around your office. -
11:04 - 11:06You put in your address,
you get local complaints. -
11:06 - 11:09And it's not just the tech community
that are after these things. -
11:09 - 11:12It's urban planners like
the students I teach at Pratt. -
11:12 - 11:14It's policy advocates, it's everyone,
-
11:14 - 11:17it's citizens from a diverse
set of backgrounds. -
11:17 - 11:19And with some small, incremental changes,
-
11:19 - 11:23we can unlock the passion
and the ability of our citizens -
11:23 - 11:26to harness open data
and make our city even better, -
11:26 - 11:29whether it's one dataset,
or one parking spot at a time. -
11:29 - 11:32Thank you.
-
11:32 - 11:35(Applause)
- Title:
- How we found the worst place to park in New York City — using big data
- Speaker:
- Ben Wellington
- Description:
-
City agencies have access to a wealth of data and statistics reflecting every part of urban life. But as data analyst Ben Wellington suggests in this entertaining talk, sometimes they just don't know what to do with it. He shows how a combination of unexpected questions and smart data crunching can produce strangely useful insights, and shares tips on how to release large sets of data so that anyone can use them.
- Video Language:
- English
- Team:
- closed TED
- Project:
- TEDTalks
- Duration:
- 11:48