hacc preroll music
Herald: And a lovely welcome back to the
haccs stage on the third day this
Congress, we are here with a talk on "A
few quantitative thoughts on parking in
Marburg" by Martin L. He's interested in
data analytics and in infrastructure and
traffic in general. And because of that,
he started scraping publicly available
parking data in Marburg and just went on
and analyzed it and found a lot of
interesting things which he is going to
present in this talk to you right now. In
case you didn't know, there is an IRC
client on live.hacc.media where you can ask
questions later or with the #rC3hacc tag
on Twitter.
Martin Lellep: Welcome to my talk "A few
quantitative thoughts on parking in
Marburg". I am delighted to speak here on
this Congress because I love the yearly
conferences. Also, thank you to the
organizing team for making all this
possible. You do an absolutely fabulous
job. Now, the first question that you
should ask is: why? The following is a
pure hobby project. I came up with the
question because transportation is
important, but unfortunately, it's also
difficult. The most popular vehicles these
days are cars and hence the question, how
do people park in Marburg? Who am I? My
name is Martin, and I analyze publicly
available data. I live close to Marburg,
therefore the parking in Marburg. Now, a
little bit of background regarding
Marburg, it's a small, picturesque, vibrant
university town. There are a few highlights,
such as the castle, the old town and the
river, just to name a few. It has around
80,000 residents and a somewhat dense core
around the old town. You can see a few
pictures here of the castle, the old town
and the river, respectively. Now, at this
point, I would like to give my props to
David Kriesel because all this work was
inspired by his amazing data science
talks. You can find them on YouTube. And I
absolutely encourage you to look for the
Bahnmining, Spiegelmining and the Xerox
story talks. OK, so if you have questions,
then please ask, I will be there live
during the Q&A of this conference and also
you can send me an email with whatever you
like, essentially. OK, so first of all, I
would like to give a quick introduction to
the data source. Now, the data, the
parking data from Marburg is publicly,
well it's published live on a system that
is implemented by the city, by the city
council, I believe. It's called
Parkleitsystem Marburg, or PLS for short, and
it publishes the data such as the parking
decks, the number of free parking spots
and the location. The address here is
pls.marburg.de. And let's see how it
looks. Yeah, so obviously it's still
online and you can see here the parking
deck names listed, the number of free
parking spots. Color coded is if it is
rather full or if it's rather empty, you
can see here all of them are in the green.
The green color coding here, it's
because it's probably close to Christmas.
Nobody wants to really park in the city.
And the only one with some load is this one
here, the Marktdreieck Parkdeck. Then also
there's a button
called route. So whenever you click
on this button, say we pick the
Erlenring-Center button, we are redirected
to Google Maps and we can see here the
location of this parking deck, for
example. Let's go back. Last but not
least, there's also the maximum vehicle
allowance and of course, the time stamp of
the data. OK, back to the presentation
now. This is a very simple website, so of
course it's easy to scrape and that's what
I did. Regarding the scraper, I used a
Linux computer and a Docker container. This
scraper, of which you can see a small sketch
here to the left, simply visits the
website every 3 minutes inside the Docker
container and writes the data into, I
believe it was, CSV files, which are
subsequently used for the data analysis.
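A minimal sketch of such a scraping loop, using only the Python standard library, could look as follows. The talk does not show the HTML structure of pls.marburg.de, so the regular expression and the class names in it are pure assumptions; a real scraper would adapt the pattern (or use an HTML parser) to the actual page markup.

```python
import csv
import re
import time
import urllib.request
from datetime import datetime

URL = "https://pls.marburg.de"

# Hypothetical pattern: a deck-name cell followed by a free-spots cell. The
# actual markup of the PLS page will differ; adjust the pattern accordingly.
ROW_RE = re.compile(r'class="name">([^<]+)<.*?class="free">(\d+)<', re.S)

def parse_pls(html):
    """Return (deck_name, free_spots) pairs extracted from the page HTML."""
    return [(name.strip(), int(free)) for name, free in ROW_RE.findall(html)]

def scrape_once(path="parking.csv"):
    """Fetch the page once and append one CSV row per parking deck."""
    html = urllib.request.urlopen(URL, timeout=30).read().decode("utf-8")
    stamp = datetime.now().isoformat()
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for deck, free in parse_pls(html):
            writer.writerow([stamp, deck, free])

def run(interval_s=180):
    """Visit the website every 3 minutes, as described in the talk."""
    while True:
        scrape_once()
        time.sleep(interval_s)
```

Running `run()` on a Linux box (or inside a Docker container, as in the talk) would keep appending timestamped rows to the CSV file.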
All of it, the scraper and the analysis
scripts, is written in Python. OK, the
data format is pretty simple: it's
processed internally with data frames,
using the package pandas. Everybody who
knows Python probably knows pandas anyway.
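As a small illustration of this layout and of the free-to-used conversion described here, with invented numbers standing in for the scraped values:

```python
import pandas as pd

# Toy version of the scraped table: rows are timestamps, columns are parking
# decks, and cells are the number of *free* spots (values invented).
free = pd.DataFrame(
    {"Erlenring-Center": [400, 250, 380], "Oberstadt": [90, 40, 85]},
    index=pd.to_datetime(
        ["2019-08-01 08:00", "2019-08-01 12:00", "2019-08-01 20:00"]
    ),
)

# Used spots = per-deck maximum of free spots over time, minus free spots.
used = free.max() - free
```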
The data format is as follows: a row
corresponds to a time, a column
corresponds to a specific parking deck,
and a cell corresponds to the number of
free parking spots of that parking deck at
that time. Now, in order to make the
numbers a bit more usable, I transformed
the number of free parking spots into the
number of used parking spots by
subtracting it from the maximum along
time. OK, the intro was just to get used
to the data; now we'd like to take a look
at the locations of the parking garages,
or parking decks. This is a screenshot.
There's an interactive version. Let me
open it here. It's an interactive map. You
can see two types of markers, the first
one red, the second one green, and that's
because the red ones are the ones that are
given, well, encoded in the links of the
PLS system, and they are actually wrong.
So when you click on, for instance, the
Erlenring-Center parking deck, as I've
done before, the longitude and latitude
are actually incorrect, and Google Maps
corrects them on the fly. Therefore, I
have shown the ones given on the website,
which are incorrect, in red, and the
corrected ones in green. So you can safely
focus only on the green ones. Um, a quick
overview: here is the train station region,
overview here is the train station region,
there are two. And then they are scattered
around the city. Um, sometimes there are
two parking decks very close by, for
instance, these two and these two. And
that's because it's essentially one
parking deck with two parking sections
typically inside the building and on top
of the building. OK, let's go back to the
presentation. With that in place, we
take a look at the joined data, meaning I
accumulate the number of used parking
spots across all the parking decks. You
can see that here now; it's quite a
comprehensive picture. I started data
scraping in August 2019 and stopped it at
the end of February 2020.
The data here is shown at different
resample frequencies of the original, raw
data (just a reminder: the true sampling
interval is three minutes). I started with
a resample frequency of one hour, which is
not very easy to understand at that scale
here; then one day, the orange curve now;
and lastly one week. We can learn
different things
from it. So in particular, the orange
curve of one day shows that there might be
some periodicity in the signal. And the
green one shows that there are times or
weeks that are particularly... where
there's particularly little parking
demand, for instance, here around
Christmas 2019. OK, so again, from the
orange signal, you can see that there's
probably some periodicity, and in order to
quantify that, I computed the
autocorrelation function. The
autocorrelation function essentially takes a
time signal and computes the overlap
between the time signal and the same
signal shifted by some time; whenever
there's a large overlap, that points
towards periodicity. Here we can see that
the first maximum of the autocorrelation
corresponds to one week, and therefore the
periodicity can be safely assumed to be at
seven days. Of course, when there's
periodicity in a signal at seven days,
there's also periodicity at 14 days and at
21 days, but the correlation coefficients
typically decay. OK, now we have the
periodicity
with respect to days in place. Now let's
take a look at the day and hour demand.
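The resampling and the autocorrelation check just described might be sketched in pandas like this. A synthetic series with a built-in seven-day cycle stands in for the scraped used-spots signal, so the numbers are illustrative only:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the accumulated used-spots series: 3-minute samples
# with an artificial 7-day cycle plus noise (the real series is scraped data).
idx = pd.date_range("2019-08-01", "2019-10-01", freq="3min")
t = np.arange(len(idx))
samples_per_week = 7 * 24 * 60 / 3
rng = np.random.default_rng(0)
total = pd.Series(
    100 + 50 * np.sin(2 * np.pi * t / samples_per_week)
    + rng.normal(0, 5, len(t)),
    index=idx,
)

# Resample the 3-minute signal to coarser frequencies, as on the slide.
hourly = total.resample("1h").mean()
daily = total.resample("1D").mean()

# Autocorrelation of the daily signal at lags of 1..30 days: the peaks sit at
# multiples of 7 days, with the coefficients decaying for larger multiples.
acf = [daily.autocorr(lag) for lag in range(1, 31)]
peak_lag = max(range(1, 31), key=lambda lag: acf[lag - 1])
```

For this toy signal, `peak_lag` lands on a multiple of seven days, mirroring the weekly periodicity found in the real data.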
And for that, I computed a two dimensional
histogram with the day Monday to Sunday on
the one axis and the other axis
corresponds to the hour. And here we can
clearly see that the majority of the
parking demand is around the noon hour. So
starting from 11 to approximately, let's
say, 5 p.m. or so. Interestingly, and that
was a point where I was surprised, Sunday
is a day where there's little parking
demand in Marburg. I would've guesstimated
that on Sunday, when everybody has spare
time, they typically rush into the city.
But that's obviously not the case.
not the case. Another interesting fact is
that Monday morning seemed to be very
difficult to get up because you can see
the parking demand is smaller than on
other mornings. OK, now, after that, I
come to the separate analysis, where I
take a look at the individual parking
decks. So first of all, again, the time
series: it's a bit
dense and it's very hard to see. So there
are a few things to learn from the
picture. So first of all, the green
signal, which corresponds to the
Erlenring-Center (reminder: I opened it at
the very beginning of this talk), seems to
be the dominant one. Then there are quite
a few data gaps. Take, for instance, well,
it's very apparent here for the violet
one, the Furthstraße-Parkdeck, this one
here. And that's an extreme case. It had
obviously some kind of problem. It was
open for some time and then closed for
some other times. Typically, parking
garages or parking decks are open 24/7,
but there are also quite a few that
close overnight. OK, next I was interested
in the statistics of parking demand for
individual parking decks, so I
concentrated only on, say, one parking
deck and computed the histograms of the
used parking spots also, depending on the
time. Let's focus here on the Oberstadt,
it's the old town and you can see that the
overall parking demand peaks at around,
let's say, maybe 20 used parking spots, so
that's the average, but that statement
does not hold for all times. Depending on
the time of day, for instance, in the
morning we can see it's approximately
the same. But when we go towards noon, we
can see that the number of parking spots
or used parking spots increases. There are
even a few times when it's at the maximum
around noon. Now, when we go towards later
hours, the maximum shifts towards smaller
values again. Now, this behavior of the
maximum shifting so clearly depending on
the hour is not apparent for all the
parking decks. For instance, the
Parkdreieck here... Marktdreieck, sorry,
doesn't show the signal as clearly as the
Oberstadt one. OK, from this
all, now we can also quantify what I call
the integral parking demand; simply put,
it's the number of parking spots that have
been provided per parking deck. Now the
picture here, it's normalized to the
maximum and one can see from this picture
here very easily that the
Erlenring-Center, as we've estimated or
guessed previously already, is the one that's
dominating the whole city. It's providing
the most parking spots by a large margin,
actually. The next one is the Lahn-Center
and then maybe the Oberstadt and the other
ones follow after these. Another
interesting point here is that the
proportion of parking spots provided on
weekends differs for the different parking
decks. For instance, you can see this one
here, the Erlenring-Center, has quite a
big portion also on weekends.
In contrast, the Marktdreieck-Parkdeck has
only a very small portion of, um, of
parking spots provided on weekends. It
might be interesting to know that this
particular parking station is the one that
is used if you want to go to a doctor,
because it's very close. Most doctors are
not open on Saturdays and Sundays, and
therefore the parking demand is probably
quite low. Now, there's
a temporal version also where I rendered a
small video that I'm opening now, and you
can see essentially the same as in the
previous graph, but against time. Again,
it's very apparent that there's a
periodicity and here my scraper crashed
and it's back in business again, and I
found it interesting to see that there are
parking decks that have cars... well that
host cars, even at night, for instance,
here the Erlenring-Center again and the
Lahn-Center, the largest ones; they offer
parking also overnight.
And there are some cars in there,
probably. OK, let's close that again. Now,
I come lastly to the prediction part now.
The goal here is to measure the parking
demand through the parking decks, but then
to interpolate between the parking decks,
so say I have the Oberstadt, the old town,
and the Erlenring, which was the largest
one; I would like to know what the parking
demand is in between, for instance.
For doing so, I use a spatial fit, and I
use a machine learning model for that: in
this particular case, a non-parametric
model called Gaussian Process Regression.
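A minimal sketch of such a spatial fit, using scikit-learn's `GaussianProcessRegressor` as one possible implementation. The coordinates and demand values below are made up for illustration; the real fit uses the scraped deck locations and their demands:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Made-up (longitude, latitude) positions of a few "parking decks" and their
# mean demand; the real fit uses the scraped deck locations and demands.
X_train = np.array(
    [[8.770, 50.800], [8.760, 50.810], [8.780, 50.810], [8.770, 50.820]]
)
y_train = np.array([500.0, 120.0, 80.0, 60.0])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.01), normalize_y=True)
gp.fit(X_train, y_train)

# Predict the demand, and its uncertainty, at a position between the decks.
X_between = np.array([[8.765, 50.805]])
mean, std = gp.predict(X_between, return_std=True)

# The uncertainty shrinks near the training points and grows in between and
# far away, which is exactly the property discussed in the talk.
std_near = gp.predict(X_train[:1], return_std=True)[1]
std_far = gp.predict(np.array([[8.900, 50.900]]), return_std=True)[1]
```

`return_std=True` is what exposes the model's uncertainty alongside its prediction, which is the reason to pick a Gaussian process here in the first place.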
And the nice thing about that is that it
also returns the uncertainty. Because say,
for instance, you would like to use these
machine learning predictions to, say,
build some kind of parking deck or to
get rid of one. All these operations, all
these derived actions would be very
expensive. So you would like to know if
the uncertainty is large or small for
whatever the machine learning model
predicts. Just for the math oriented
people. If you're interested in that
model, definitely take a look at the, I
would call it, Gaussian process bible by
Rasmussen. It's amazing to read. Yeah,
there are two, um, evaluations that I did.
The first one is based on the whole data
set, so there's no spatial or... sorry...
no temporal resolution. And what I did,
well, I rendered a video and I would like
to explain the outcome of it while it is
running. The top picture
here shows you the prediction by the
machine learning model. And the bottom
picture shows you the uncertainty. The
training data, meaning the parking decks,
is denoted by the black points. Now, first
of all, the uncertainty, you can see that
wherever there is training data, the
uncertainty goes down. So the model is,
um, certain about its prediction because,
well, there's training data there, and in
between, the uncertainty rises again.
Now the prediction, you can see some small
hill. It's exactly the Erlenring-Center,
which was the largest one. Now, what is
shown in the video is rotating. You can
see the coordinates of Marburg on the
bottom plane. And at
some point, the view rotates upwards and
gives you a top-down perspective with a
corresponding color bar, or corresponding
color map. So, again, here's the
maximum, the Erlenring-Center. And I did
that because next we would like to finally
measure the parking demand between
stations. OK, there's another small video
again, and now we start right from the top
down, color-coded view. Again, the black
points are the training data, but now the
red points are a kind of test data,
meaning positions in between. I
concentrated now on the Mensa
because I have a special relation with the
Mensa, the physics department, the
university library, the train station and
the cinema. And just to demonstrate from
this spatial fit, we can derive the
parking demand at these positions also.
Here, this yellow spike, it's the
Erlenring-Center again. Now, that's only a
qualitative result, of course; I don't
want to derive anything quantitative at
this point. It's just a proof of concept that
it is possible to derive something like
that from the publicly available data.
Now, I forgot to mention at the beginning that
there's a bonus and I would like to come
to the bonus now. It is about the Corona
crisis or pandemic, of course. What I did
is: the initial data acquisition phase is
here in black; the whole talk was about
that black portion here. I stopped it
around the end of February and restarted
the whole data acquisition process again
in approximately April, just to capture
something from the
Corona crisis as well. And you can see
here again, the time series. I think the
most interesting bit about it and the most
comprehensive bit is the mean. You can see
the mean across the whole time
denoted by this dashed line. And you can
see that the mean is smaller. So during
the Corona pandemic fewer people parked in
Marburg, which is reasonable, I would say.
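This comparison of the means could be sketched as follows, with invented daily totals standing in for the real series:

```python
import pandas as pd

# Invented daily totals of used parking spots, pre-Corona vs. during Corona;
# the real comparison uses the scraped series from both acquisition phases.
pre = pd.Series(
    [900, 950, 870, 920],
    index=pd.to_datetime(
        ["2019-12-01", "2019-12-02", "2019-12-03", "2019-12-04"]
    ),
)
during = pd.Series(
    [500, 450, 480, 520],
    index=pd.to_datetime(
        ["2020-04-01", "2020-04-02", "2020-04-03", "2020-04-04"]
    ),
)

# Relative drop of the mean demand during the pandemic.
drop = 1 - during.mean() / pre.mean()
```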
But there are also times where the number
of used parking spots decreased significantly.
So for instance, right when the Corona
crisis started in April and now the second
wave in October, November, December, it is
visible that the parking demand decreased
a lot. And I went one step further and
wanted to know the differences between
pre-Corona and during Corona for each of
the parking decks as well; that's what I did
here. It's now not the normalized parking
demand, but the absolute parking demand.
So now we can see also the absolute
numbers; the black bars you've seen
previously already. Now the red bars are
for during the Corona crisis. And then
I defined these, the first wave and the
second wave as serious corona times. So I
also plotted a third bar... set of bars
here. And it's interesting to see that
most of the parking decks, of course,
suffered in terms of parking demand; most
of them provided fewer parking spots. But
there are a few, like, for instance, the
Marktdreieck-Parkdeck here that, well,
almost increased. We can see that during
Corona in general it increased a bit. And
then during the heavy corona, it increased
even more. And as I mentioned before, this
is the parking deck that corresponds to,
yeah, a whole collection of doctors. So I
derive that, well, during Corona times, the
parking demand in front of doctors even
increased a tiny bit. Yeah, with that, I
would like to come to my conclusions.
Thank you for sticking with me until now.
So I scraped publicly available data here
with a small scraper set up. I analyzed
it, for instance, for day and hour
patterns. And last but not least, did some
machine learning in order to quantify the
demand in between the stations. There is
also an accompanying blog article. You can
find it down here; there are all the
figures in higher resolution, and you can
play around with an interactive map as well, if
you like. Um, and to finally conclude the
presentation: I would like to hear from
you what you think about this analysis.
I'd like to improve with these kinds of
mini studies, and therefore I would be
very interested in your critique regarding
the content, the presentation, and general
comments. Again,
you can email me at this email address
here, or alternatively, I set up a Google,
um, Google Form. The Google Forms document
comprises exactly these questions, and you
can simply type your answers in
if you're interested. Thank you very much.
Herald: All right, first of all, thank you
for this amazing talk. I have a few
questions that have been relayed to me, and
I'm just going to ask them one after the
other. And let's not waste any time and
start with the first one. Have you found
parking decks that are usually heavily
overloaded or never completely used?
Martin: Um, so. Given that there are only
around, what was it, 8 or 9 or 10 in the
data set, honestly, I never looked into
that question. So, um, the short answer
is: no. The long answer: yes, I could
have, or I still could, I would say.
H: OK. Have you tried prediction in time,
so guessing which parking decks will be
exhausted soon?
M: No, no. So that's obviously... I would
consider that something like the
predictive maintenance of the traffic
business, kind of. It's definitely a thing
that people who have more time and are
willing to invest more definitely should
and could do, I would say. I mean, there's
lots of additional data that might be of
interest,
like weather data. And, for instance, is
it a public holiday, yes or no, and all
that kind of stuff. So, again, short
answer: no. Long answer: yes, it would be
possible.
H: OK, so if anyone watching has the time
or energy to do that, they could.
M: Absolutely. Yes.
H: OK, and the last question I have right
now is, will the code or especially the
scraping part be available publicly or
like in the GitHub or somewhere?
M: Um, I could do that. So I was quite
hesitant with it. So obviously
publishing the data could be problematic.
I have no experience with it on the legal
side. So I would probably not publish the
data, which is, I mean, old data anyway.
But then regarding the code, I was just
waiting to see if anybody's interested. So
given that somebody stated an interest, I
would probably publish it.
Yes.
H: OK, yeah, I think that's it from the
question side.
M: Hmhm.
H: And they were all answered quite
nicely. And judging by that, I don't get
any more questions right now. So, yeah, I
would conclude this talk. Maybe you can
also, like, have a last word. From my side, I'm
done here.
M: Yes. So, um, well, thank you very much
for watching the talk. And I try to
improve; I think I said it on the last
slide, if I'm right: let me know if you
have any doubts or things to improve on,
essentially. And then regarding maybe
the last question of publishing it, I
believe that I put a link there to find my
blog and I would probably just add another
blog post stating, well, there's a GitHub
repository; you can go there and just find
the code and stuff like that. So if you're
interested, just, you know, find my
website. My name is Martin Lellep. Um, and
then you will find it in a few days, I
guess, probably only in 2021. So I
won't be able to publish it in the next
two days. But then the code will be
public. Yes.
H: OK, then. Have a great day. Great time
at Congress, and bye-bye.
postroll music
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!