How we found the worst place to park in New York City — using big data | Ben Wellington | TEDxNewYork
-
0:17 - 0:20Six thousand miles of road,
-
0:20 - 0:22 600 miles of subway track,
-
0:22 - 0:24 400 miles of bike lanes,
-
0:24 - 0:26and a half a mile of tram track,
-
0:26 - 0:28if you've ever been to Roosevelt Island.
-
0:28 - 0:31These are the numbers that make up
the infrastructure of NYC, -
0:31 - 0:33these are the statistics
of our infrastructure. -
0:33 - 0:36They're the kind of numbers
released in reports by city agencies. -
0:36 - 0:39For example, the Department
of Transportation will probably tell you -
0:39 - 0:41how many miles of road they maintain.
-
0:41 - 0:44The MTA will boast how many miles
of subway track there are. -
0:44 - 0:46But most city agencies give us statistics.
-
0:46 - 0:49This is from a report this year
from the Taxi & Limousine Commission, -
0:49 - 0:53where we've learned that there are
about 13,500 taxis here in NYC. -
0:53 - 0:54Pretty interesting, right?
-
0:54 - 0:57But did you ever think about
where these numbers came from? -
0:57 - 1:00Because for these numbers to exist
somebody at the city agency -
1:00 - 1:04has to stop and say hmm, here's a number
that somebody might want to know. -
1:04 - 1:06Here's a number
that our citizens want to know. -
1:06 - 1:08So they go back to their raw data,
-
1:08 - 1:09they count, they add, they calculate,
-
1:09 - 1:11and then they put out reports.
-
1:11 - 1:14And those reports
will have numbers like this. -
1:14 - 1:16The problem is, how do they know
all of our questions? -
1:16 - 1:17We have lots of questions.
-
1:17 - 1:21In fact, in some ways there's literally
an infinite number of questions -
1:21 - 1:22that we can ask about our city.
-
1:22 - 1:24So the agencies can never keep up.
-
1:24 - 1:26So the paradigm isn't exactly working
-
1:26 - 1:28and I think our policy makers realize that
-
1:28 - 1:32because in 2012, Mayor Bloomberg
signed into law what he called -
1:32 - 1:36the most ambitious and comprehensive
open data legislation in the country. -
1:36 - 1:38In a lot of ways he's right.
-
1:38 - 1:42In the last two years the city's released
1,000 data sets on our open data portal -
1:42 - 1:44and it's pretty awesome.
-
1:44 - 1:46You look at data like this,
-
1:46 - 1:48and instead of counting
the number of cabs, -
1:48 - 1:50we can start to ask different questions.
-
1:50 - 1:52So I had a question:
When is rush hour in NYC? -
1:52 - 1:55It can be pretty bothersome.
When is rush hour exactly? -
1:55 - 1:58And I thought to myself,
these cabs aren't just numbers, -
1:58 - 2:01these are GPS recorders driving around
in our city's streets recording -
2:01 - 2:03each and every ride they take.
-
2:03 - 2:04There's data there.
-
2:04 - 2:06And I looked at that data
and I made a plot -
2:06 - 2:09of the average speed of taxis in NYC
throughout the day. -
2:09 - 2:13You can see that from around midnight
to around 5:18 AM, speed increases, -
2:13 - 2:16and at that point, things turn around.
-
2:16 - 2:20They get slower, slower and slower
until about 8:35 AM -
2:20 - 2:23when they end up at 11.5 mph.
-
2:23 - 2:26The average taxi is going at 11.5 mph
in our city streets, -
2:26 - 2:28and it turns out it stays that way
-
2:28 - 2:31for the entire day.
-
2:31 - 2:33(Laughter)
-
2:33 - 2:36So I said to myself, I guess
there's no rush hour in NYC, -
2:36 - 2:37there's just a "rush day."
-
2:37 - 2:38(Laughter)
-
2:38 - 2:39Makes sense.
-
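The speed-by-time-of-day plot described above can be approximated with a short script. This is only a minimal sketch: the record layout (pickup time, dropoff time, trip distance) and the sample values are assumptions made for illustration, not the actual schema or contents of the FOILed taxi data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical trip records: (pickup time, dropoff time, distance in miles).
# Illustrative values only, not real taxi data.
trips = [
    ("2013-01-01 05:10:00", "2013-01-01 05:25:00", 5.0),
    ("2013-01-01 05:40:00", "2013-01-01 05:50:00", 3.0),
    ("2013-01-01 08:30:00", "2013-01-01 09:00:00", 5.5),
    ("2013-01-01 08:45:00", "2013-01-01 09:15:00", 6.0),
]


def average_speed_by_hour(trips):
    """Return {pickup hour: average speed in mph}.

    Speed is total miles divided by total hours driven, so long trips
    count for more than short ones.
    """
    miles = defaultdict(float)
    hours = defaultdict(float)
    fmt = "%Y-%m-%d %H:%M:%S"
    for pickup, dropoff, distance in trips:
        start = datetime.strptime(pickup, fmt)
        end = datetime.strptime(dropoff, fmt)
        duration_h = (end - start).total_seconds() / 3600
        if duration_h <= 0:
            continue  # skip records with a broken or zero duration
        miles[start.hour] += distance
        hours[start.hour] += duration_h
    return {hour: miles[hour] / hours[hour] for hour in miles}


speeds = average_speed_by_hour(trips)
for hour in sorted(speeds):
    print(f"{hour:02d}:00  {speeds[hour]:.1f} mph")
```

Grouping by the pickup hour is a simplification; a finer plot would bucket by minute and split trips that span several buckets.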
2:39 - 2:41This is important
for a couple of reasons. -
2:41 - 2:45If you are a transportation planner,
this might be pretty interesting to know. -
2:45 - 2:47But if you want to get somewhere quickly
-
2:47 - 2:50you now know to set your alarm
for 4:45 AM and you're all set. -
2:50 - 2:50New York, right?
-
2:50 - 2:52But there's a story behind this data,
-
2:52 - 2:54it wasn't just available, as it turns out.
-
2:54 - 2:58It actually came from something called
a Freedom of Information Law Request, -
2:58 - 2:59or a FOIL Request.
-
2:59 - 3:02This is a form you can find on
the Taxi & Limousine Commission website. -
3:02 - 3:04In order to access this data,
you need to go get this form, -
3:04 - 3:06fill it out, and they will notify you.
-
3:06 - 3:09And a guy named Chris Whong
did exactly that. -
3:09 - 3:11Chris went down and they told him,
-
3:11 - 3:14"Just bring a brand new hard drive
to our office, -
3:14 - 3:17leave it here for 5 hours,
we'll copy the data and you take it back." -
3:17 - 3:19And that's where this data came from.
-
3:19 - 3:22Now, Chris is the kind of guy
that wants to make the data public, -
3:22 - 3:26so it ended up online for all to use
and that's where this graph came from. -
3:26 - 3:28And the fact that it exists is amazing.
-
3:28 - 3:30These GPS recorders - really cool!
-
3:30 - 3:33But the fact that we have citizens
walking around with hard drives -
3:33 - 3:35picking up data from city agencies
to make it public - -
3:35 - 3:38it was already kind of public,
you could get to it, -
3:38 - 3:40but it was "public", it wasn't public.
-
3:40 - 3:42And we can do better than that as a city,
-
3:42 - 3:44we don't need our citizens
walking around with hard drives. -
3:44 - 3:47Now, not every dataset
is behind a FOIL request. -
3:47 - 3:51Here's a map I made with
the most dangerous intersections in NYC -
3:51 - 3:53based on cyclist accidents.
-
3:53 - 3:55So the red areas are more dangerous.
-
3:55 - 3:57What it shows is first
the East Side of Manhattan, -
3:57 - 4:01especially in the lower area of Manhattan,
has more cycle accidents. -
4:01 - 4:02That might make sense
-
4:02 - 4:05because there are more cyclists
coming off the bridges over there. -
4:05 - 4:07But there's other hotspots worth studying.
-
4:07 - 4:10There's Williamsburg.
There's Roosevelt Avenue in Queens. -
4:10 - 4:13This is exactly the type of data
we need for Vision Zero. -
4:13 - 4:15This is exactly what we're looking for.
-
4:15 - 4:17But there's a story
behind this data as well. -
4:17 - 4:18This data didn't just appear.
-
4:18 - 4:21How many of you guys know this logo?
-
4:21 - 4:22Yeah, I see some shakes.
-
4:22 - 4:25Have you ever tried to copy
and paste data out of a PDF -
4:25 - 4:26and make sense of it?
-
4:26 - 4:27I see more shakes.
-
4:27 - 4:31More of you tried copying and pasting
than knew the logo. I like that. -
4:31 - 4:34What happened is, the data
that you just saw was actually in a PDF. -
4:34 - 4:39In fact, hundreds, and hundreds,
of pages of PDF put out by our own NYPD, -
4:39 - 4:41and in order to access it,
-
4:41 - 4:44you either have to copy and paste
for hundreds and hundreds of hours, -
4:44 - 4:46or you could be John Krauss.
-
4:46 - 4:47John Krauss is like,
-
4:47 - 4:50I'm not going to copy and paste this data,
I'm going to write a program. -
4:50 - 4:52It's called the NYPD Crash Data Band-Aid.
-
4:52 - 4:55And it would go to the NYPD's website
and it would download PDFs. -
4:55 - 4:57Every day it would search;
-
4:57 - 4:59if it found a PDF, it would download it,
-
4:59 - 5:01and it would run
some PDF-scraping program, -
5:01 - 5:02and out would come the text
-
5:02 - 5:06and it would go on the Internet,
and people could make maps like that. -
5:06 - 5:09And the fact that the data is here,
that we can have access to it - -
5:09 - 5:11every accident, by the way, is a row
on this table. -
5:11 - 5:13You can imagine how many PDFs that is.
-
5:13 - 5:15The fact that we
have access to that is great. -
5:15 - 5:18But let's not release it in PDF form.
-
5:18 - 5:21Because then we're having our citizens
write PDF scrapers. -
5:21 - 5:23It's not the best use
of our citizens' time, -
5:23 - 5:25and we, as a city,
can do better than that. -
5:25 - 5:27The good news is that
the de Blasio Administration -
5:27 - 5:30actually released this data
a few months ago, -
5:30 - 5:32so now, we can have access to it.
-
5:32 - 5:34But there's a lot of data
still entombed in PDF. -
5:34 - 5:38For example our crime data,
still is only available in PDF. -
5:38 - 5:39And not just our crime data,
-
5:39 - 5:42our own city budget.
-
5:42 - 5:45Our city budget is only
readable right now in PDF form. -
5:45 - 5:47And it's not just us
that can't analyze it - -
5:47 - 5:50our own legislators
who vote for the budget, -
5:50 - 5:52also only get it in PDF.
-
5:52 - 5:56So our legislators cannot analyze
the budget that they are voting for. -
5:56 - 6:00And I think as a city we can do
a little better than that as well. -
6:00 - 6:02Now, there's a lot of data
that's not hidden in PDFs. -
6:02 - 6:04This is an example of a map I made.
-
6:04 - 6:07And these are the dirtiest waterways in NYC.
-
6:07 - 6:08How do I measure dirty?
-
6:08 - 6:10Well, it's kind of a little weird,
-
6:10 - 6:12but I looked at the level
of fecal coliform, -
6:12 - 6:16which is a measurement of fecal matter
in each of our waterways. -
6:16 - 6:19The larger the circle,
the dirtier the water. -
6:19 - 6:22The large circles are dirty waters,
the smaller circles are cleaner. -
6:22 - 6:24What you see is inland waterways.
-
6:24 - 6:27This is all data that was sampled
by the city over the last 5 years. -
6:27 - 6:30And inland waterways are,
in general, dirtier. -
6:30 - 6:31That makes sense, right?
-
6:31 - 6:33And I learned a few things from this.
-
6:33 - 6:39Number 1: never swim in anything
that ends in creek or canal. -
6:39 - 6:42Number 2: I also found
the dirtiest waterways in New York City -
6:42 - 6:44by this measure, one measure.
-
6:44 - 6:45In Coney Island Creek,
-
6:45 - 6:48which is not the Coney Island you swim in,
luckily, it's on the other side. -
6:48 - 6:53But Coney Island Creek, 94% of samples
taken over the last 5 years -
6:53 - 6:55have had fecal levels so high,
-
6:55 - 6:58that it would be against state law
to swim in the water. -
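A figure like the 94% above is simply the share of samples at a site that exceed a limit. The sketch below shows that calculation under loudly stated assumptions: the `LIMIT` value is a placeholder and not the actual state standard, "Open Harbor Site" is a made-up site name, and all readings are invented for illustration.

```python
from collections import defaultdict

# Placeholder threshold for illustration only. This is NOT the actual
# legal limit referred to in the talk.
LIMIT = 1000

# Hypothetical (site, fecal coliform reading) samples. Invented values.
samples = [
    ("Coney Island Creek", 5200),
    ("Coney Island Creek", 800),
    ("Coney Island Creek", 3100),
    ("Coney Island Creek", 12000),
    ("Open Harbor Site", 40),
    ("Open Harbor Site", 1500),
]


def share_over_limit(samples, limit):
    """Return {site: fraction of that site's samples above the limit}."""
    total = defaultdict(int)
    over = defaultdict(int)
    for site, level in samples:
        total[site] += 1
        if level > limit:
            over[site] += 1
    return {site: over[site] / total[site] for site in total}


shares = share_over_limit(samples, LIMIT)
for site, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{site}: {share:.0%} of samples over the limit")
```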
6:58 - 7:01And this is not the kind of fact
that you're going to see -
7:01 - 7:04boasted in a city report
or on the front page of nyc.gov. -
7:04 - 7:05You're not going to see it there,
-
7:05 - 7:08but the fact that we can
get to that data, is awesome. -
7:08 - 7:10Once again, it wasn't super easy,
-
7:10 - 7:12because this data was not
on the open data portal. -
7:12 - 7:14If you were to go to the open data portal,
-
7:14 - 7:17you'd see just a snippet of it,
a year or a few months. -
7:17 - 7:20It was actually on the Department
of Environmental Protection's website. -
7:20 - 7:24Each one of these links is an Excel sheet,
and this Excel sheet is different. -
7:24 - 7:27Every heading is different:
you copy, paste, reorganize. -
7:27 - 7:30When you do you can make maps
and that's great, but once again, -
7:30 - 7:32we can do better than that as a city,
we can normalize things. -
7:32 - 7:36We're getting there because
there's this website that Socrata makes, -
7:36 - 7:37called the Open Data Portal NYC.
-
7:37 - 7:39This is where 1,100 data sets
that don't suffer -
7:39 - 7:41from the things I told you about live,
-
7:41 - 7:43and that number is growing,
and that's great. -
7:43 - 7:47You can download data in any format,
be it CSV or PDF or Excel document. -
7:47 - 7:50Whatever you want,
you can download the data that way. -
7:50 - 7:51The problem is, once you do,
-
7:51 - 7:56you'll find that each agency
codes their addresses differently. -
7:56 - 7:58So, one is street name,
intersection street, -
7:58 - 8:00street, borough, address building,
building, address. -
8:00 - 8:03So, once again, you're spending time,
even when we have this portal, -
8:03 - 8:06you're spending time
normalizing our address field. -
8:06 - 8:08I think that's not the best use
of our citizens' time, -
8:08 - 8:10we can do better than that as a city.
-
8:10 - 8:12We can standardize our addresses.
-
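To make the address problem concrete, here is a minimal sketch of the normalization step that each analyst currently ends up writing. The two record layouts, the field names (`address`, `building`, `street_name`, `borough`), and the abbreviation list are all hypothetical; real agency exports differ and would need a far longer set of rules.

```python
import re

# Hypothetical abbreviations to expand. A real list would be much longer,
# and "st" can also mean "saint", which this sketch ignores.
ABBREVIATIONS = {
    "st": "street",
    "ave": "avenue",
    "blvd": "boulevard",
    "e": "east",
    "w": "west",
}


def clean(text):
    """Lowercase, drop punctuation, collapse spaces, expand abbreviations."""
    words = re.sub(r"[^\w\s]", "", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def normalize(record):
    """Return (building number, street, borough) from either record layout."""
    if "address" in record:
        # Layout A: one free-text field such as "123 E. 4th St."
        match = re.match(r"\s*(\d+)\s+(.+)", record["address"])
        number, street = match.groups() if match else ("", record["address"])
    else:
        # Layout B: separate "building" and "street_name" fields.
        number = record.get("building", "")
        street = record.get("street_name", "")
    return (number.strip(), clean(street), clean(record.get("borough", "")))


a = normalize({"address": "123 E. 4th St.", "borough": "MANHATTAN"})
b = normalize({"building": "123", "street_name": "East 4th Street",
               "borough": "Manhattan"})
print(a == b, a)
```

Once both layouts map to the same tuple, records from different agencies can be joined or mapped together, which is the payoff of a shared address standard.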
8:12 - 8:14If we do, we can get more maps like this.
-
8:14 - 8:16This is a map of fire hydrants
in New York City. -
8:16 - 8:18But not just any fire hydrant.
-
8:18 - 8:20These are the top 250
grossing fire hydrants -
8:20 - 8:23in terms of parking tickets.
-
8:23 - 8:25(Laughter)
-
8:25 - 8:27So I learned a few things from this map.
-
8:27 - 8:30Number 1: just don't park
on the Upper East Side. -
8:30 - 8:34Just don't. No matter where you park,
you will get a hydrant ticket. -
8:34 - 8:38Number 2: I found the two highest
grossing hydrants in all of New York City. -
8:38 - 8:39They are on the Lower East Side,
-
8:39 - 8:45and they are bringing in over
55,000 dollars a year in parking tickets. -
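Ranking hydrants by ticket revenue comes down to summing fines per location and sorting. The following is a rough sketch, assuming each ticket has already been matched to a hydrant; the hydrant labels and fine amounts are invented for illustration and are not taken from the parking ticket data.

```python
from collections import Counter

# Hypothetical (hydrant location, fine in dollars) records. Invented values.
tickets = [
    ("Hydrant A", 115),
    ("Hydrant B", 115),
    ("Hydrant B", 115),
    ("Hydrant A", 115),
    ("Hydrant B", 115),
    ("Hydrant C", 115),
]


def top_grossing(tickets, n):
    """Return the n locations with the highest total fines, largest first."""
    revenue = Counter()
    for location, fine in tickets:
        revenue[location] += fine
    return revenue.most_common(n)


top_two = top_grossing(tickets, 2)
for location, dollars in top_two:
    print(f"{location}: ${dollars}")
```

In practice the hard part is the matching step, which depends on the address normalization discussed earlier in the talk.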
8:45 - 8:47And that seemed a little strange to me
when I noticed it, -
8:47 - 8:49so I did a little digging,
and it turns out -
8:49 - 8:53what you had is a hydrant
and something called a curb extension, -
8:53 - 8:55which is like a seven-foot space
to walk on, -
8:55 - 8:56and then a parking spot.
-
8:56 - 8:58So these cars came along and the hydrant -
-
8:58 - 9:00"It's all the way over there, I'm fine,"
-
9:00 - 9:03and there was actually a parking spot
painted there beautifully for them. -
9:03 - 9:06They would park there and the NYPD
disagreed with the designation, -
9:06 - 9:08and would ticket them.
-
9:08 - 9:10And it wasn't just me
who found a parking ticket. -
9:10 - 9:14This is the Google street view car
driving by, finding the same parking ticket. -
9:14 - 9:16So I wrote about this
on my blog, on I Quant NY, -
9:16 - 9:18and the DOT responded and they said,
-
9:18 - 9:23"While the DOT has not received
any complaints about this location, -
9:23 - 9:27we will review the roadway markings
and make any appropriate alterations." -
9:27 - 9:30I thought to myself, you know,
typical government response, -
9:30 - 9:32all right, moved on with my life.
-
9:32 - 9:37But then, a few weeks later,
something incredible happened. -
9:37 - 9:39They repainted the spot.
-
9:39 - 9:41And for a second I thought
I saw the future of open data -
9:41 - 9:43because think about what happened here.
-
9:43 - 9:48For five years, this spot
was being ticketed, and it was confusing. -
9:48 - 9:53And then a citizen found something,
they told the city and within a few weeks, -
9:53 - 9:55the problem was fixed. It's amazing.
-
9:55 - 9:58A lot of people see open data
as being a watchdog. It's not. -
9:58 - 9:59It's about being a partner.
-
9:59 - 10:03We can empower our citizens to be
better partners for government, -
10:03 - 10:04and it's not that hard.
-
10:04 - 10:06All we need are a few changes.
-
10:06 - 10:07If you're FOILing data,
-
10:07 - 10:09if you're seeing your data
being FOILed over and over again, -
10:09 - 10:12let's release it to the public, that's
a sign that it should be made public. -
10:12 - 10:15And if you're a government agency
releasing a PDF, -
10:15 - 10:19let's pass legislation that requires you
to post it with your underlying data, -
10:19 - 10:21because that data
is coming from somewhere. -
10:21 - 10:24I don't know where,
but you can release it with the PDF. -
10:24 - 10:26And let's adopt and share
some open data standards. -
10:26 - 10:29Let's start with our addresses
here in New York City. -
10:29 - 10:31Let's just start
normalizing our addresses. -
10:31 - 10:33Because New York is a leader in open data.
-
10:33 - 10:35Despite all this, we're absolutely
a leader in open data, -
10:35 - 10:38and if we start normalizing things,
and set an open data standard, -
10:38 - 10:39others will follow.
-
10:39 - 10:42The state will follow,
maybe the federal government, -
10:42 - 10:43other countries could follow,
-
10:43 - 10:47and we're not that far off from a time
where you can write one program -
10:47 - 10:49and map information from 100 countries.
-
10:49 - 10:52It's not science fiction,
we're actually quite close. -
10:52 - 10:54And by the way, who are we
empowering with this? -
10:54 - 10:58Because it's not just John Krauss,
it's not just Chris Whong. -
10:58 - 11:01There are hundreds of meetups
going around in New York City right now, -
11:01 - 11:02active meetups.
-
11:02 - 11:05There are thousands of people
attending these meetups. -
11:05 - 11:07These people are going after work
and on weekends, -
11:07 - 11:10and they're attending these meetups
to look at open data, -
11:10 - 11:11and make our city a better place.
-
11:11 - 11:16Groups like BetaNYC, who just last week
released something called citygram.nyc -
11:16 - 11:18that allows you to subscribe
to 311 complaints -
11:18 - 11:20around your own home,
or around your office. -
11:20 - 11:23You put in your address,
you get local complaints. -
11:23 - 11:26And it's not just the tech community
that are after these things. -
11:26 - 11:28It's urban planners like the students
I teach at Pratt. -
11:28 - 11:30It's policy advocates, it's everyone,
-
11:30 - 11:33it's citizens from a diverse
set of backgrounds. -
11:33 - 11:36And with some small incremental changes,
-
11:36 - 11:39we can unlock the passion
and the ability of our citizens -
11:39 - 11:42to harness open data
and make our city even better, -
11:42 - 11:46whether it's one data set
or one parking spot at a time. -
11:46 - 11:47Thank you.
-
11:47 - 11:50(Applause)
- Description:
-
City agencies have access to a wealth of data and statistics reflecting every part of urban life. But as data analyst Ben Wellington suggests in this entertaining talk, sometimes they just don't know what to do with it. He shows how a combination of unexpected questions and smart data crunching can produce strangely useful insights, and shares tips on how to release large sets of data so that anyone can use them.
- Video Language:
- English
- Team:
- closed TED
- Project:
- TEDxTalks
- Duration:
- 11:52