0:00:00.000,0:00:18.772
Music
0:00:18.772,0:00:25.332
Herald: Hi! Welcome, welcome to the Wikipaka-[br]WG, in this extremely crowded Esszimmer.
0:00:25.332,0:00:32.079
I'm Jakob, I'm your Herald for tonight[br]until 10:00 and I'm here to welcome you
0:00:32.079,0:00:36.690
and to welcome these wonderful three guys[br]on the stage. They're going to talk about
0:00:36.690,0:00:44.710
the infrastructure of Wikipedia.[br]And yeah, they are Lucas, Amir, and Daniel
0:00:44.710,0:00:52.970
and I hope you'll have fun![br]Applause
0:00:52.970,0:00:57.059
Amir Sarabadani: Hello, my name is[br]Amir, um, I'm a software engineer at
0:00:57.059,0:01:01.130
Wikimedia Deutschland, which is the German[br]chapter of Wikimedia Foundation. Wikimedia
0:01:01.130,0:01:06.520
Foundation runs Wikipedia. Here is Lucas.[br]Lucas is also a software engineer, at
0:01:06.520,0:01:10.300
Wikimedia Deutschland, and Daniel here is[br]a software architect at Wikimedia
0:01:10.300,0:01:15.110
Foundation. We are all based in Germany,[br]Daniel in Leipzig, we are in Berlin. And
0:01:15.110,0:01:21.420
today we want to talk about how we run[br]Wikipedia using donors' money and
0:01:21.420,0:01:29.910
not lots of advertisements and data[br]collection. So in this talk, first we are going
0:01:29.910,0:01:34.860
to take an inside-out approach: we are[br]going to first talk about the application
0:01:34.860,0:01:39.830
layer and then the outside layers, and[br]then we go to an outside-in approach and
0:01:39.830,0:01:48.635
then talk about how you're going to hit[br]Wikipedia from the outside.
0:01:48.635,0:01:53.320
So first of all,[br]let me give you some information. First of
0:01:53.320,0:01:57.259
all, all of the Wikimedia and Wikipedia[br]infrastructure is run by the Wikimedia
0:01:57.259,0:02:01.810
Foundation, an American nonprofit[br]charitable organization. We don't run any
0:02:01.810,0:02:07.960
ads and we are only 370 people. If you[br]count Wikimedia Deutschland or all other
0:02:07.960,0:02:12.500
chapters, it's around 500 people in total.[br]It's nothing compared to the companies
0:02:12.500,0:02:19.530
outside. But all of the content is[br]managed by volunteers. Even our staff
0:02:19.530,0:02:24.170
doesn't edit or add content to[br]Wikipedia. And we support 300 languages,
0:02:24.170,0:02:29.501
which is a very large number. And[br]Wikipedia is eighteen years old, so it
0:02:29.501,0:02:37.950
can vote now. And also, Wikipedia has some[br]really, really weird articles. Um, I want
0:02:37.950,0:02:42.510
to ask you: have you[br]encountered any really weird articles
0:02:42.510,0:02:47.970
in Wikipedia? My favorite is a list of[br]people who died on the toilet. But if you
0:02:47.970,0:02:54.620
know anything, raise your hands. Uh, do[br]you know any weird articles in Wikipedia?
0:02:54.620,0:02:58.750
Do you know some?[br]Daniel Kinzler: Oh, the classic one….
0:02:58.750,0:03:03.600
Amir: You need to unmute yourself. Oh,[br]okay.
0:03:03.600,0:03:09.551
Daniel: This is technology. I don't know[br]anything about technology. OK, no. The, my
0:03:09.551,0:03:13.900
favorite example is "people killed by[br]their own invention". That's yeah. That's
0:03:13.900,0:03:20.510
a lot of fun. Look it up. It's amazing.[br]Lucas Werkmeister: There's also a list,
0:03:20.510,0:03:24.810
there is also a list of prison escapes[br]using helicopters. I almost said
0:03:24.810,0:03:28.790
helicopter escapes using prisons, which[br]doesn't make any sense. But that was also
0:03:28.790,0:03:31.830
a very interesting list.[br]Daniel: I think we also have a category of
0:03:31.830,0:03:35.310
lists of lists of lists.[br]Amir: That's a page.
0:03:35.310,0:03:39.040
Lucas: And every few months someone thinks[br]it's funny to redirect it to Russell's
0:03:39.040,0:03:42.940
paradox or so.[br]Daniel: Yeah.
0:03:42.940,0:03:49.209
Amir: But also beside that, people cannot[br]read Wikipedia in Turkey or China. But
0:03:49.209,0:03:54.450
three days ago, actually, the block in[br]Turkey was ruled unconstitutional, but
0:03:54.450,0:04:01.000
it's not lifted yet. Hopefully they will[br]lift it soon. Um, so Wikimedia
0:04:01.000,0:04:05.660
projects are not just Wikipedia. There are lots[br]and lots of projects. Some of them are not
0:04:05.660,0:04:11.650
as successful as Wikipedia, uh,[br]like Wikinews. But, for example,
0:04:11.650,0:04:16.190
Wikipedia is the most successful one, and[br]there's another one, that's Wikidata. It's
0:04:16.190,0:04:21.680
being developed by Wikimedia Deutschland.[br]I mean the Wikidata team, with Lucas, um,
0:04:21.680,0:04:26.520
and it's being used in infoboxes – it[br]has the data that Wikipedia or the Google
0:04:26.520,0:04:31.449
Knowledge Graph or Siri or Alexa use.[br]It's basically sort of a backbone of
0:04:31.449,0:04:37.981
all of the data, uh, through the whole[br]Internet. Um, so our infrastructure. Let
0:04:37.981,0:04:42.910
me… So first of all, our infrastructure is[br]all open source. On principle, we never
0:04:42.910,0:04:48.081
use any commercial software. We could[br]use lots of things – they were even
0:04:48.081,0:04:54.330
sometimes given to us for free – but we[br]refused to use them. The second
0:04:54.330,0:04:59.060
thing is we have two primary data centers[br]for failover: when, for example, a
0:04:59.060,0:05:03.960
whole data center goes offline, we can[br]fail over to the other data center. We have
0:05:03.960,0:05:11.100
three caching points of presence or[br]CDNs. Our CDNs are all over the world. Uh,
0:05:11.100,0:05:15.180
also, we have our own CDN. We don't[br]use Cloudflare, because
0:05:15.180,0:05:20.960
we care about the privacy of[br]the users, and it is very important that, for
0:05:20.960,0:05:25.490
example, people edit from countries where[br]it might be dangerous for them to edit
0:05:25.490,0:05:29.810
Wikipedia. So we really care to keep the[br]data as protected as possible.
0:05:29.810,0:05:32.400
Applause
0:05:32.400,0:05:39.460
Amir: Uh, we have 17 billion page views[br]per month, which goes up and down
0:05:39.460,0:05:44.350
based on the season and everything, we[br]have around 100 to 200 thousand requests
0:05:44.350,0:05:48.449
per second. That's different from[br]pageviews because requests can be requests
0:05:48.449,0:05:54.540
for objects, API calls, lots of[br]things. And we have 300,000 new editors
0:05:54.540,0:06:03.120
per month and we run all of this with 1300[br]bare metal servers. So right now, Daniel
0:06:03.120,0:06:07.010
is going to talk about the application[br]layer and the inside of that
0:06:07.010,0:06:11.830
infrastructure.[br]Daniel: Thanks, Amir. Oh, the clicky
0:06:11.830,0:06:20.330
thing. Thank you. So the application layer[br]is basically the software that actually
0:06:20.330,0:06:25.050
does what a wiki does, right? It lets you[br]create or update pages and
0:06:25.050,0:06:29.650
then serves the page views. interference[br]noise The challenge for Wikipedia, of
0:06:29.650,0:06:37.150
course, is serving all the many page views[br]that Amir just described. The core of the
0:06:37.150,0:06:42.690
application is a classic LAMP application.[br]interference noise I have to stop
0:06:42.690,0:06:50.130
moving. Yes? Is that it? It's a classic[br]LAMP stack application. So it's written in
0:06:50.130,0:06:57.080
PHP, it runs on an Apache server. It uses[br]MySQL as a database in the backend. We
0:06:57.080,0:07:01.630
used to use HHVM instead of the… Yeah,[br]we…
0:07:01.630,0:07:13.830
Herald: Here. Sorry. Take this one.[br]Daniel: Hello. We used to use HHVM as the
0:07:13.830,0:07:20.810
PHP engine, but we just switched back to[br]the mainstream PHP, using PHP 7.2 now,
0:07:20.810,0:07:24.720
because Facebook decided that HHVM is[br]going to be incompatible with the standard
0:07:24.720,0:07:35.430
and they were just basically developing it[br]for themselves. Right. So we have
0:07:35.430,0:07:42.740
separate clusters of servers for[br]serving different kinds of requests:
0:07:42.740,0:07:48.020
page views on the one hand, and[br]handling edits on the other. Then we have a cluster for
0:07:48.020,0:07:55.350
handling API calls and then we have a[br]bunch of servers set up to handle
0:07:55.350,0:08:01.050
asynchronous jobs, things that happen in[br]the background, the job runners, and…
0:08:01.050,0:08:05.240
I guess video scaling is a very obvious[br]example of that. It just takes too long to
0:08:05.240,0:08:11.720
do it on the fly. But we use it for many[br]other things as well. MediaWiki, MediaWiki
0:08:11.720,0:08:15.930
is kind of an amazing thing because you[br]can just install it on your own shared-
0:08:15.930,0:08:23.419
hosting, ten-bucks-a-month webspace and[br]it will run. But you can also use it to,
0:08:23.419,0:08:29.270
you know, serve half the world. And so[br]it's a very powerful and versatile system,
0:08:29.270,0:08:34.479
which also… I mean, this, this wide span[br]of different applications also creates
0:08:34.479,0:08:41.000
problems. That's something that I will[br]talk about tomorrow. But for now, let's
0:08:41.000,0:08:49.230
look at the fun things. So if you want to[br]serve a lot of page views, you have to do
0:08:49.230,0:08:55.550
a lot of caching. And so we have a whole…[br]yeah, a whole set of different caching
0:08:55.550,0:09:00.880
systems. The most important one is[br]probably the parser cache. So as you
0:09:00.880,0:09:07.431
probably know, wiki pages are created in,[br]in a markup language, Wikitext, and they
0:09:07.431,0:09:13.290
need to be parsed and turned into HTML.[br]And the result of that parsing is, of
0:09:13.290,0:09:19.940
course, cached. And that cache is semi-[br]persistent, it… nothing really ever drops
0:09:19.940,0:09:25.060
out of it. It's a huge thing. And it[br]lives in a dedicated MySQL database system.
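A minimal sketch of the parser cache idea described here; the cache interface and key scheme are hypothetical, not MediaWiki's actual code:

```python
# Hypothetical sketch: look up rendered HTML before re-parsing wikitext.
import hashlib

def get_rendered_page(title: str, revision_id: int, cache, parse) -> str:
    # The key varies by revision, so an edit naturally misses the old entry.
    key = "parsercache:" + hashlib.sha1(
        f"{title}:{revision_id}".encode()
    ).hexdigest()
    html = cache.get(key)
    if html is None:
        html = parse(title, revision_id)  # the expensive wikitext -> HTML step
        cache.set(key, html)              # stored semi-persistently
    return html
```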
0:09:25.060,0:09:33.490
Yeah. We use memcached a lot for[br]all kinds of miscellaneous things,
0:09:33.490,0:09:38.930
anything that we need to keep around and[br]share between server instances. And we
0:09:38.930,0:09:43.589
have been using redis for a while, for[br]anything that we want to have available,
0:09:43.589,0:09:47.560
not just between different servers, but[br]also between different data centers,
0:09:47.560,0:09:53.200
because redis is a bit better about[br]synchronizing things between
0:09:53.200,0:09:59.820
different systems. We still use it for[br]session storage especially, though we are
0:09:59.820,0:10:09.600
about to move away from that and we'll be[br]using Cassandra for session storage. We
0:10:09.600,0:10:19.310
have a bunch of additional services[br]running for specialized purposes, like
0:10:19.310,0:10:27.120
scaling images, rendering math[br]formulas… ORES is pretty interesting. ORES
0:10:27.120,0:10:33.400
is a system for automatically detecting[br]vandalism or rating edits. So this is a
0:10:33.400,0:10:38.120
machine learning based system for[br]detecting problems and highlighting edits
0:10:38.120,0:10:45.060
that may not be great and need[br]more attention. We have some additional
0:10:45.060,0:10:50.940
services that process our content for[br]consumption on mobile devices, chopping
0:10:50.940,0:10:56.480
pages up into bits and pieces that then[br]can be consumed individually and many,
0:10:56.480,0:11:08.200
many more. In the background, we also have[br]to manage events, right, we use Kafka for
0:11:08.200,0:11:14.640
message queuing, and we use that to notify[br]different parts of the system about
0:11:14.640,0:11:19.980
changes. On the one hand, we use that to[br]feed the job runners that I just
0:11:19.980,0:11:27.540
mentioned. But we also use it, for[br]instance, to purge the entries in the
0:11:27.540,0:11:35.050
CDN when pages get updated and things[br]like that.
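As a hedged illustration of that event flow – the topic name and payload are made up, only the kafka-python calls are real:

```python
# Hypothetical sketch: publish a purge event when a page changes, so
# asynchronous consumers (job runners, CDN purgers) can react to it.
import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers=["kafka1001:9092"],  # hypothetical broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def notify_page_changed(url: str) -> None:
    producer.send("cdn-purge-events", {"action": "purge", "url": url})

notify_page_changed("https://en.wikipedia.org/wiki/Example")
producer.flush()  # make sure the event is actually sent
```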
0:11:35.050,0:11:40.269
OK, the next section is going[br]to be about the databases. Very quickly – we will have quite a bit of time
0:11:40.269,0:11:45.230
for discussion afterwards. But are there[br]any questions right now about what we said
0:11:45.230,0:11:57.120
so far? Everything extremely crystal[br]clear? OK, no clarity is left? I see. Oh,
0:11:57.120,0:12:07.570
one question, in the back.[br]Q: Can you maybe turn the volume up a
0:12:07.570,0:12:20.220
little bit? Thank you.[br]Daniel: Yeah, I think this is your
0:12:20.220,0:12:27.959
section, right? Oh, it's Amir again. Sorry.[br]Amir: So I want to talk about my favorite
0:12:27.959,0:12:32.279
topic, the dungeon of every[br]production system: databases. The database
0:12:32.279,0:12:39.580
of Wikipedia is really interesting and[br]complicated on its own. We use MariaDB, we
0:12:39.580,0:12:45.870
switched from MySQL in 2013 for lots of[br]complicated reasons. As I said,
0:12:45.870,0:12:50.200
because we are really open source, you can[br]go and not just check our database tree,
0:12:50.200,0:12:55.310
which shows how it looks and what[br]the replicas and masters are. Actually, you
0:12:55.310,0:12:59.650
can even query Wikipedia's database[br]live. You can just go
0:12:59.650,0:13:02.930
to that address and log in with your[br]Wikipedia account and do whatever
0:13:02.930,0:13:07.430
you want. Like, it was a funny thing that[br]a couple of months ago, someone sent me a
0:13:07.430,0:13:12.970
message like, oh, I[br]found a security issue: you can just query
0:13:12.970,0:13:18.000
Wikipedia's database. I was like, no, no,[br]we actually let this happen.
0:13:18.000,0:13:21.900
It's sanitized; we removed the[br]password hashes and everything. But still,
0:13:21.900,0:13:27.779
you can use this. Now, about how[br]the clusters work, the
0:13:27.779,0:13:32.029
database clusters: because the data got too[br]big, we first started sharding, but now
0:13:32.029,0:13:36.279
we have sections that are basically[br]different clusters. Uh, really large wikis
0:13:36.279,0:13:42.839
have their own section. For example,[br]English Wikipedia is s1. German Wikipedia
0:13:42.839,0:13:50.820
with two or three other small wikis is in[br]s5. Wikidata is on s8, and so on.
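A sketch of what that wiki-to-section mapping might look like; the real mapping lives in the public configuration, this is a simplified, partly hypothetical rendering:

```python
import random

# Section assignments as described in the talk (simplified).
SECTION_BY_WIKI = {
    "enwiki": "s1",        # English Wikipedia has its own section
    "dewiki": "s5",        # German Wikipedia shares s5 with small wikis
    "wikidatawiki": "s8",
}

def pick_database(wiki, replicas_by_section, write=False):
    # replicas_by_section maps a section to (master, [replicas]).
    section = SECTION_BY_WIKI.get(wiki, "s3")  # hypothetical default section
    master, replicas = replicas_by_section[section]
    # Writes go to the section master; reads are spread over replicas.
    return master if write else random.choice(replicas)
```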
0:13:50.820,0:13:56.250
And each section has a master and several[br]replicas. But one of the replicas is
0:13:56.250,0:14:01.700
actually a master in another data center[br]because of the failover that I told you.
0:14:01.700,0:14:08.079
So basically, two layers of[br]replication exist. What I'm
0:14:08.079,0:14:13.070
telling you here is about metadata. But for[br]wikitext, we need a completely
0:14:13.070,0:14:19.450
different set of databases. There, we use[br]consistent hashing to scale it
0:14:19.450,0:14:27.630
horizontally, so we can just add more[br]databases to it. Uh, but I don't
0:14:27.630,0:14:32.070
know if you know it, but Wikipedia stores[br]every edit. So you have the
0:14:32.070,0:14:36.930
wikitext of every edit in the whole[br]history in the database. Also, we have the
0:14:36.930,0:14:41.910
parser cache that Daniel explained, and the[br]parser cache also uses consistent hashing,
0:14:41.910,0:14:47.000
so we can scale it horizontally too.
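A minimal consistent-hashing sketch, assuming a textbook hash ring rather than the exact algorithm used in production; the point is that adding a server only remaps a small share of keys:

```python
import bisect, hashlib

def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each node gets many virtual points on the ring for smoother balance.
        self.ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self._points = [p for p, _ in self.ring]

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self._points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["pc1007", "pc1008", "pc1009"])  # hypothetical host names
print(ring.node_for("enwiki:Barack_Obama"))      # always the same node
```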
0:14:47.000,0:14:56.440
But for metadata, it is slightly more complicated.[br]Metadata is what is used to render the page. So in order
0:14:56.440,0:15:01.680
to do this, this is, for example, a very[br]short version of the database tree that I
0:15:01.680,0:15:07.019
showed you. You can even go and look at[br]other ones, but this is s1, s1 in eqiad:
0:15:07.019,0:15:12.100
that is the main data center. The master is this[br]number, and it replicates to these replicas,
0:15:12.100,0:15:16.860
and then this one, numbered in the 2000s[br]because it's in the second data
0:15:16.860,0:15:24.750
center, is the master of the other one,[br]and it has its own replication
0:15:24.750,0:15:30.680
there. The master[br]data center is in
0:15:30.680,0:15:37.399
Ashburn, Virginia, and the second data center[br]is in Dallas, Texas. So they need
0:15:37.399,0:15:43.220
cross-DC replication, and that happens[br]over TLS to make sure that no one starts
0:15:43.220,0:15:49.200
listening in between these two. And we[br]have snapshots and even dumps of the whole
0:15:49.200,0:15:53.440
history of Wikipedia. You can go to[br]dumps.wikimedia.org and download the whole
0:15:53.440,0:15:59.130
history of every wiki you want, except the[br]things that we had to remove for privacy
0:15:59.130,0:16:04.899
reasons. And we have lots and lots of[br]backups. I recently realized we have lots
0:16:04.899,0:16:15.149
of backups. In total it is 570 TB of data[br]across 150 database servers, and the
0:16:15.149,0:16:20.269
load on them is around[br]350,000 queries per second and, in total,
0:16:20.269,0:16:29.459
it requires 70 terabytes of RAM. We[br]also have another storage system,
0:16:29.459,0:16:35.000
Elasticsearch, which, as you can guess,[br]is used for search – the box at the top
0:16:35.000,0:16:39.050
right, if you're using desktop. It's[br]different in mobile, I think. And also it
0:16:39.050,0:16:44.610
depends on whether yours is an RTL language as well.[br]But it is run by a team called Search
0:16:44.610,0:16:47.550
Platform, and since none of us are from[br]Search Platform, we cannot explain it in
0:16:47.550,0:16:54.010
much detail; we only know it slightly.[br]Also, we have media storage for
0:16:54.010,0:16:58.420
all of the free pictures that are[br]uploaded to Wikimedia. For example,
0:16:58.420,0:17:02.400
if you have a category in Commons. Commons[br]is our wiki that holds all of the free
0:17:02.400,0:17:08.130
media – there is a category in Commons[br]called cats looking at left and a
0:17:08.130,0:17:15.630
category cats looking at right, so we have[br]lots and lots of images. It's 390 terabytes
0:17:15.630,0:17:20.620
of media, 1 billion objects, and it uses Swift.[br]Swift is the object storage component
0:17:20.620,0:17:29.190
of OpenStack, and it has several[br]layers of caching: frontend, backend.
0:17:29.190,0:17:36.799
Yeah, that's mostly it. And we want to[br]talk about traffic now and so this picture
0:17:36.799,0:17:43.929
is from when Sweden in 1967 moved from[br]driving on the left to driving on the
0:17:43.929,0:17:48.999
right. This is basically what happens in[br]Wikipedia infrastructure as well. So we
0:17:48.999,0:17:54.942
have five caching layers and the most[br]recent one is eqsin which is in Singapore,
0:17:54.942,0:17:59.310
and three of them – sorry, ulsfo, esams and
0:17:59.310,0:18:06.590
eqsin are just CDNs. We have also two[br]points of presence, one in Chicago and the
0:18:06.590,0:18:15.080
other one also in Amsterdam, but we[br]won't get to that. So, as I said,
0:18:15.080,0:18:20.230
we have our own content delivery network.[br]Our traffic allocation is done by
0:18:20.230,0:18:26.860
GeoDNS which actually is written and[br]maintained by one of the traffic people,
0:18:26.860,0:18:32.140
and we can pool and depool DCs. It has a[br]time to live of 10 minutes, so
0:18:32.140,0:18:37.950
if a data center goes down, it[br]takes 10 minutes for it to actually be
0:18:37.950,0:18:47.110
depooled and repooled again.
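A hedged sketch of the GeoDNS routing idea – the real DNS daemon works from configuration and network maps, so this logic and the region table are illustrative only:

```python
# Site codes are Wikimedia's; the region mapping is hypothetical.
POOLED = {"eqiad": True, "codfw": True, "esams": True,
          "ulsfo": True, "eqsin": True}
NEAREST = {
    "EU":   ["esams", "eqiad", "codfw"],
    "US":   ["eqiad", "codfw", "ulsfo"],
    "ASIA": ["eqsin", "ulsfo", "eqiad"],
}

def resolve(region: str) -> str:
    for dc in NEAREST[region]:
        if POOLED[dc]:        # skip depooled data centers
            return dc
    return "eqiad"            # fall back to the primary data center

# With a 10-minute TTL, resolvers may keep a depooled DC for up to 10 minutes.
```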
0:18:47.110,0:18:55.799
And we use LVS as the transport-layer load[br]balancer – layers 3 and 4, a load balancer for Linux – and it supports consistent hashing.
0:18:55.799,0:19:00.679
Also, we grew so big that we[br]needed to have something that manages the
0:19:00.679,0:19:07.100
load balancers, so we wrote our own[br]system, called PyBal.
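What such a load-balancer manager does can be pictured roughly like this – a hedged sketch with a hypothetical health endpoint, not PyBal's real monitoring protocol:

```python
import urllib.request

def healthy(host: str) -> bool:
    # Probe a (hypothetical) health URL on each backend server.
    try:
        url = f"http://{host}/healthz"
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

def refresh_pool(hosts):
    # Keep only healthy servers pooled; LVS then balances across these.
    return [h for h in hosts if healthy(h)]
```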
0:19:07.100,0:19:11.210
Also, lots of companies actually peer with us;[br]we, for example, connect directly to
0:19:11.210,0:19:20.440
AMS-IX in Amsterdam. So this is how the[br]caching works – anyway,
0:19:20.440,0:19:24.779
there are lots of reasons for this. Let's[br]just get started. We use TLS – we
0:19:24.779,0:19:31.080
support TLS 1.2 – and in[br]the first layer we have nginx-. Do you
0:19:31.080,0:19:40.049
know – does anyone know what nginx-[br]means? So, that's related but not
0:19:40.049,0:19:46.780
correct. So there is nginx, which is the free[br]version, and there is nginx plus, which is
0:19:46.780,0:19:51.729
the commercial version. But we[br]don't use nginx to do load balancing or
0:19:51.729,0:19:56.389
anything so we stripped out everything[br]from it, and we just use it for TLS
0:19:56.389,0:20:02.019
termination, so we call it nginx minus; it's an[br]internal joke. And then we have the Varnish
0:20:02.019,0:20:09.809
frontend. Varnish is also a caching layer,[br]and the frontend is in memory,
0:20:09.809,0:20:15.000
which is very, very fast, and you have the[br]backend, which is on storage, the
0:20:15.000,0:20:22.559
hard disk, but that is slow. The fun thing[br]is that the CDN caching layer alone takes 90%
0:20:22.559,0:20:26.869
of our requests: 90% of the requests[br]just get to Varnish and are
0:20:26.869,0:20:34.720
answered there, and only when that doesn't work does the request go[br]through to the application layer. The Varnish
0:20:34.720,0:20:41.259
cache has a TTL of 24 hours, and if you[br]change an article, it also gets invalidated
0:20:41.259,0:20:47.159
by the application. So if someone edits, the[br]CDN actually purges the result. And the
0:20:47.159,0:20:52.330
thing is, the frontend is picked per[br]request: the load
0:20:52.330,0:20:56.470
balancer just randomly sends your request[br]to a frontend, but the backend is
0:20:56.470,0:21:00.989
different: if the frontend can't find it,[br]it sends the request to the backend, and the backend
0:21:00.989,0:21:09.700
is actually – how is it called? –[br]hashed by request, so, for
0:21:09.700,0:21:15.402
example, the article on Barack Obama is only[br]ever served from one node in the data
0:21:15.402,0:21:22.059
center in the CDN. If none of this works, the request[br]actually hits the other data center.
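A sketch of that two-tier routing, under the stated assumptions (random frontend, URL-hashed backend); host names are hypothetical:

```python
import random
from hashlib import md5

FRONTENDS = ["cp3050-fe", "cp3052-fe", "cp3054-fe"]
BACKENDS  = ["cp3050-be", "cp3052-be", "cp3054-be"]

def route(url: str):
    frontend = random.choice(FRONTENDS)   # load balancer: any frontend
    idx = int(md5(url.encode()).hexdigest(), 16) % len(BACKENDS)
    return frontend, BACKENDS[idx]        # backend is fixed per URL

# The Barack Obama article always lands on the same backend in this DC:
print(route("https://en.wikipedia.org/wiki/Barack_Obama"))
```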
0:21:22.059,0:21:29.940
So, yeah, I actually explained all of this. So[br]we have two caching clusters: one
0:21:29.940,0:21:35.820
is called text and the other one is called[br]upload – it's not confusing at all – and if
0:21:35.820,0:21:42.559
you want to find out, you can just run mtr[br]en.wikipedia.org and the end
0:21:42.559,0:21:49.909
node is text-lb.wikimedia.org, which is[br]our text cluster, but if you go to
0:21:49.909,0:21:57.789
upload.wikimedia.org, you hit the[br]upload cluster. Yeah, this is it so far,
0:21:57.789,0:22:03.669
and it has lots of problems, because[br]a) Varnish is open core, so the version
0:22:03.669,0:22:09.309
that we use is open source – we don't use[br]the commercial one – but the open source one
0:22:09.309,0:22:21.009
doesn't support TLS. What? What happened?[br]Okay. No, no, no! You're not supposed
0:22:21.009,0:22:35.789
to see this. Okay,[br]sorry for the- huh? Okay, okay, sorry. So
0:22:35.789,0:22:40.119
Varnish has lots of problems, Varnish is[br]open core, it doesn't support TLS
0:22:40.119,0:22:45.220
termination, which makes us have this[br]nginx- system just to do TLS
0:22:45.220,0:22:49.539
termination, and makes our system complicated.[br]It also doesn't work very well over time, which
0:22:49.539,0:22:55.970
causes us to have a cron job to restart[br]every Varnish node twice a week. We have a
0:22:55.970,0:23:04.330
cron job that restarts every Varnish[br]node, which is embarrassing. But also, on
0:23:04.330,0:23:08.809
the other hand, when the Varnish[br]backend wants to talk to the
0:23:08.809,0:23:13.010
application layer, it also doesn't support[br]TLS termination, so we use
0:23:13.010,0:23:19.970
IPsec, which is even more embarrassing, but[br]we are changing it. So now we
0:23:19.970,0:23:25.080
are using Apache Traffic Server, which[br]is very, very nice, and it's also open
0:23:25.080,0:23:31.070
source, fully open source, with the[br]Apache Foundation. ATS does the TLS
0:23:31.070,0:23:37.169
termination, and still[br]for now we have a Varnish frontend that
0:23:37.169,0:23:44.809
still exists, but the backend is also going[br]to change to ATS, so we call this the ATS
0:23:44.809,0:23:49.970
sandwich: two ATS layers, and[br]in the middle there's Varnish. The
0:23:49.970,0:23:55.269
good thing is that when the TLS termination[br]moves to ATS, you can actually use
0:23:55.269,0:24:01.499
TLS 1.3, which is more modern, more[br]secure, and even faster, so it
0:24:01.499,0:24:05.889
basically drops 100 milliseconds from[br]every request that goes to Wikipedia.
0:24:05.889,0:24:12.350
That translates to centuries of our[br]users' time every month. The ATS work is going
0:24:12.350,0:24:19.480
on and hopefully it will go live soon –[br]this is the new
0:24:19.480,0:24:25.669
version. And, as I said, when[br]we can do this, we can actually use
0:24:25.669,0:24:36.519
TLS instead of IPsec for the traffic[br]between data centers. Yes. And now it's
0:24:36.519,0:24:42.260
time that Lucas talks about what happens[br]when you type in en.wikipedia.org.
0:24:42.260,0:24:44.879
Lucas: Yes, this makes sense, thank you.
0:24:44.879,0:24:49.070
So, first of all, what you see on the[br]slide here as the image doesn't really
0:24:49.070,0:24:52.299
have anything to do with what happens when[br]you type in wikipedia.org because it's an
0:24:52.299,0:24:57.249
offline Wikipedia reader but it's just a[br]nice image. So this is basically a summary
0:24:57.249,0:25:02.850
of everything they already said. If,[br]which is the most common case, you are
0:25:02.850,0:25:10.969
lucky and request a URL which is cached,[br]then first your computer asks for the IP
0:25:10.969,0:25:15.619
address of en.wikipedia.org. It reaches[br]the GeoDNS daemon, and because we're at
0:25:15.619,0:25:19.239
Congress here it tells you the closest[br]data center is the one in Amsterdam, so
0:25:19.239,0:25:25.759
esams, and it's going to hit the edge –[br]what we call the load balancer/router there – then
0:25:25.759,0:25:31.929
go through TLS termination in[br]nginx-, and then it's going to hit the
0:25:31.929,0:25:36.809
Varnish caching server, either frontend or[br]backend, and then you get a response, and
0:25:36.809,0:25:40.940
that's already it and nothing else is ever[br]bothered again. It doesn't even reach any
0:25:40.940,0:25:46.320
other data center, which is very nice,[br]and that's around 90% of the requests we get.
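The whole lookup chain, condensed into a hedged sketch where every function stands in for a subsystem described in the talk:

```python
def handle_request(url, edge_cache, core_cache, parser_cache, mediawiki):
    html = edge_cache.get(url)       # Varnish at the edge (e.g. esams)
    if html is None:
        html = core_cache.get(url)   # Varnish in the primary DC (eqiad)
    if html is None:
        # Application layer: MediaWiki renders, using the parser cache.
        html = mediawiki.render(url, parser_cache)
        core_cache.set(url, html)
    edge_cache.set(url, html)        # future requests stop at the edge
    return html
```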
0:25:46.320,0:25:52.419
If you're unlucky and[br]the URL you requested is not in the
0:25:52.419,0:25:57.400
Varnish in the Amsterdam data center then[br]it gets forwarded to the eqiad data
0:25:57.400,0:26:01.519
center, which is the primary one and there[br]it still has a chance to hit the cache and
0:26:01.519,0:26:04.840
perhaps this time it's there and then the[br]response is going to get cached in the
0:26:04.840,0:26:09.739
frontend, no, in the Amsterdam Varnish and[br]you're also going to get a response and we
0:26:09.739,0:26:13.639
still don't have to run any application[br]stuff. If we do have to hit any
0:26:13.639,0:26:17.450
application stuff, then Varnish is[br]going to forward that: if it's
0:26:17.450,0:26:22.970
upload.wikimedia.org, it goes to the media[br]storage Swift, if it's any other domain it
0:26:22.970,0:26:28.450
goes to MediaWiki and then MediaWiki does[br]a ton of work to connect to the database,
0:26:28.450,0:26:33.529
in this case the first shard for English[br]Wikipedia, get the wikitext from there,
0:26:33.529,0:26:38.599
get the wikitext of all the related pages[br]and templates. No, wait, I forgot
0:26:38.599,0:26:43.519
something. First it checks if the HTML for[br]this page is available in the parser cache, so
0:26:43.519,0:26:46.909
that's another caching layer, and this[br]application cache - this parser cache
0:26:46.909,0:26:53.529
might either be memcached or the database[br]cache behind it, and if it's not there,
0:26:53.529,0:26:57.679
then it has to go get the wikitext, get[br]all the related things and render that
0:26:57.679,0:27:03.679
into HTML, which takes a long time and goes[br]through some pretty ancient code. And if
0:27:03.679,0:27:07.779
you are doing an edit or an upload, it's[br]even worse, because then it always has to go
0:27:07.779,0:27:13.969
to MediaWiki and then it not only has to[br]store this new edit, either in the media
0:27:13.969,0:27:19.629
backend or in the database, it also has to[br]update a bunch of stuff. Especially:
0:27:19.629,0:27:25.200
first of all, it has to purge the[br]cache; it has to tell all the Varnish
0:27:25.200,0:27:28.999
servers that there's a new version of this[br]URL available so that it doesn't take a
0:27:28.999,0:27:33.940
full day until the time-to-live expires.[br]It also has to update a bunch of things,
0:27:33.940,0:27:38.639
for example, if you edited a template, it[br]might be used in a million pages
0:27:38.639,0:27:43.750
and the next time anyone requests one of[br]those million pages, those should also
0:27:43.750,0:27:49.019
actually be rendered again using the new[br]version of the template, so it has to
0:27:49.019,0:27:54.149
invalidate the cache for all of those, and[br]all that is deferred through the job queue.
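A hedged sketch of that deferred-update pattern; the job name mirrors MediaWiki's terminology, but the queue interface here is invented:

```python
def on_template_edit(template, queue, pages_using):
    # Re-rendering millions of pages synchronously would block the edit,
    # so we enqueue cache-invalidation jobs in batches instead.
    pages = pages_using(template)             # possibly millions of titles
    for batch in chunks(pages, 1000):
        queue.push({"type": "htmlCacheUpdate", "titles": batch})

def chunks(seq, n):
    for i in range(0, len(seq), n):
        yield seq[i:i + n]
```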
0:27:54.149,0:28:01.440
It might also have to calculate thumbnails[br]if you uploaded a file, or
0:28:01.440,0:28:06.609
re-transcode media files, because maybe you[br]uploaded – what do we support? – you
0:28:06.609,0:28:09.839
uploaded WebM and the browser only[br]supports some other media codec or
0:28:09.839,0:28:12.869
something, we transcode that and also[br]encode it down to the different
0:28:12.869,0:28:19.740
resolutions, so then it goes through that[br]whole dance and, yeah, that was already
0:28:19.740,0:28:23.769
those slides. Is Amir going to talk again[br]about how we manage -
0:28:23.769,0:28:29.519
Amir: Okay, yeah, I quickly come back[br]just for a short break to talk about
0:28:29.519,0:28:36.690
management, because managing[br]1300 bare metal servers plus a Kubernetes
0:28:36.690,0:28:42.700
cluster is not easy, so what we do is that[br]we use Puppet for configuration
0:28:42.700,0:28:48.220
management on our bare metal systems. It's[br]fun: around 50,000 lines of Puppet code. I
0:28:48.220,0:28:52.119
mean, lines of code is not a great[br]indicator but you can roughly get an
0:28:52.119,0:28:59.149
estimate of how things work – and we[br]have 100,000 lines of Ruby. And we have our
0:28:59.149,0:29:04.429
own CI and CD cluster – we don't[br]host anything on GitHub or GitLab, we
0:29:04.429,0:29:10.559
have our own system, which is based on[br]Gerrit, and for that we have a set of
0:29:10.559,0:29:15.539
Jenkins servers, and Jenkins does all of these[br]kinds of things. Also, because we have a
0:29:15.539,0:29:21.960
Kubernetes cluster for some of[br]our services, if you merge a change
0:29:21.960,0:29:26.440
in Gerrit, it also builds the Docker[br]files and containers and pushes them to
0:29:26.440,0:29:35.440
production. Also, in order to run remote[br]SSH commands, we have cumin, an in-
0:29:35.440,0:29:39.200
house automation tool we built[br]for our systems. For example, you
0:29:39.200,0:29:45.570
go there and say, OK, depool this node, or[br]run this command on all of the
0:29:45.570,0:29:52.889
Varnish nodes that I told you about, like when you[br]want to restart them.
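Conceptually, a cumin-style "run this on all matching hosts" looks like the following hedged sketch (cumin has its own query grammar and transport; this just shows the fan-out idea):

```python
import paramiko  # pip install paramiko

def run_on_hosts(hosts, command):
    results = {}
    for host in hosts:
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(host)                       # assumes key-based auth
        _, stdout, _ = client.exec_command(command)
        results[host] = stdout.read().decode()
        client.close()
    return results

# e.g. restart Varnish on some cache nodes (hypothetical host name):
run_on_hosts(["cp3050.esams.wmnet"], "sudo systemctl restart varnish")
```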
0:29:52.889,0:29:57.899
And with this I get back to Lucas.[br]Lucas: So, I am going to talk a bit more
0:29:57.899,0:30:01.929
about Wikimedia Cloud Services which is a[br]bit different in that it's not really our
0:30:01.929,0:30:06.269
production stuff but it's where you[br]people, the volunteers of the Wikimedia
0:30:06.269,0:30:11.489
movement can run their own code, so you[br]can request a project which is kind of a
0:30:11.489,0:30:15.509
group of users, and then you get assigned a[br]pool of resources – this much CPU and this
0:30:15.509,0:30:20.999
much RAM and you can create virtual[br]machines with those resources and then do
0:30:20.999,0:30:29.119
stuff there and run basically whatever you[br]want. To create, boot, and shut down the
0:30:29.119,0:30:33.360
VMs and stuff we use OpenStack and there's[br]a Horizon frontend for that which you use
0:30:33.360,0:30:36.409
through the browser, and it has its[br]outages, but otherwise it works pretty
0:30:36.409,0:30:42.619
well. Internally, ideally you manage the[br]VMs using Puppet but a lot of people just
0:30:42.619,0:30:47.860
SSH in and then do whatever they need to[br]set up the VM manually – it happens,
0:30:47.860,0:30:52.759
well. And there are a few big projects like[br]Toolforge, where you can run your own web-
0:30:52.759,0:30:57.499
based tools or the beta cluster which is[br]basically a copy of some of the biggest
0:30:57.499,0:31:02.499
wikis like there's a beta English[br]Wikipedia, beta Wikidata, beta Wikimedia
0:31:02.499,0:31:08.320
Commons using mostly the same[br]configuration as production but using the
0:31:08.320,0:31:12.450
current master version of the software[br]instead of whatever we deploy once a week so
0:31:12.450,0:31:15.840
if there's a bug, we see it earlier[br]hopefully, even if we didn't catch it
0:31:15.840,0:31:20.279
locally, because the beta cluster is more[br]similar to the production environment and
0:31:20.279,0:31:24.230
also the continuous[br]integration services run in Wikimedia Cloud
0:31:24.230,0:31:28.979
Services as well. Yeah, and you also have[br]to have Kubernetes somewhere on these
0:31:28.979,0:31:33.609
slides right, so you can use that to[br]distribute work between the tools in
0:31:33.609,0:31:37.179
Toolforge or you can use the grid engine[br]which does a similar thing but it's like
0:31:37.179,0:31:42.519
three decades old and has been through five forks[br]now; I think the current fork we use is Son
0:31:42.519,0:31:46.999
of Grid Engine, and I don't know what it[br]was called before. But that's Cloud
0:31:46.999,0:31:54.789
Services.[br]Amir: So in a nutshell, these are our
0:31:54.789,0:32:01.090
systems. We have 1300 bare metal servers[br]with lots and lots of caching, like lots
0:32:01.090,0:32:06.919
of layers of caching, because mostly we[br]serve reads, and we can just keep a
0:32:06.919,0:32:12.179
cached version. And all of this is open[br]source; you can contribute to it if you
0:32:12.179,0:32:18.089
want to, and a lot of the configuration[br]is also open. And this is the way I got
0:32:18.089,0:32:21.940
hired: I just started contributing[br]to the system, and they were like, yeah, you can
0:32:21.940,0:32:31.549
come and work for us, so this is a -[br]Daniel: That's actually how all of us got
0:32:31.549,0:32:38.350
hired.[br]Amir: So yeah, and this is the whole thing
0:32:38.350,0:32:47.570
that happens in Wikimedia. And if you want[br]to help us, we are
0:32:47.570,0:32:51.419
hiring. You can just go to jobs at[br]wikimedia.org, if you want to work for
0:32:51.419,0:32:54.379
Wikimedia Foundation. If you want to work[br]with Wikimedia Deutschland, you can go to
0:32:54.379,0:32:59.179
wikimedia.de and at the bottom there's a[br]link for jobs because the links got too
0:32:59.179,0:33:03.469
long. If you want[br]to contribute to us, there are so many ways
0:33:03.469,0:33:07.929
to contribute. As I said, there are so many[br]bugs; we have our own monitoring system,
0:33:07.929,0:33:12.721
you can just look at the monitoring, and[br]Phabricator is our bug tracker – you can
0:33:12.721,0:33:20.639
just go there, find a bug, and fix[br]things. Actually, we have one repository
0:33:20.639,0:33:26.469
that is private, but it only holds the[br]certificates for TLS and things that are
0:33:26.469,0:33:31.499
really, really private that we cannot[br]publish. But also there is
0:33:31.499,0:33:33.779
documentation: the documentation for the[br]infrastructure is at
0:33:33.779,0:33:40.409
wikitech.wikimedia.org and documentation[br]for configuration is at noc.wikimedia.org
0:33:40.409,0:33:46.599
plus the documentation of our codebase.[br]The documentation for MediaWiki itself is
0:33:46.599,0:33:52.989
at mediawiki.org. And also we have our[br]own URL shortener: you can go to
0:33:52.989,0:33:58.789
w.wiki and shorten any URL in the[br]Wikimedia infrastructure. We reserved the
0:33:58.789,0:34:08.779
dollar sign for the donate site. And yeah,[br]if you have any questions, please.
0:34:08.779,0:34:16.540
Applause
0:34:16.540,0:34:21.679
Daniel: So, you know, we have quite a bit of[br]time for questions, so if anything wasn't
0:34:21.679,0:34:27.149
clear or you're curious about anything,[br]please, please ask.
0:34:27.149,0:34:37.200
AM: So, one question about what is not in the[br]presentation: do you have to deal with
0:34:37.200,0:34:42.460
hacking attacks?[br]Amir: So the first rule of security issues
0:34:42.460,0:34:49.210
is that we don't talk about security issues[br]but let's say this baby has all sorts of
0:34:49.210,0:34:56.240
attacks happening; we usually have[br]DDoS. There was one a couple of
0:34:56.240,0:34:59.819
months ago that was very successful. I[br]don't know if you read the news about
0:34:59.819,0:35:05.200
that, but we have infrastructure[br]to handle this, and we have a security team
0:35:05.200,0:35:12.740
that handles these cases, yes.[br]AM: Hello, how do you manage access to your
0:35:12.740,0:35:20.069
infrastructure for your employees?[br]Amir: So we have an LDAP
0:35:20.069,0:35:25.390
group, and LDAP for the web-based[br]systems, but for SSH we
0:35:25.390,0:35:30.660
have strict protocols: you get a[br]private key, and some people
0:35:30.660,0:35:35.480
protect their private key using YubiKeys,[br]and then you can SSH into the
0:35:35.480,0:35:40.420
system basically.[br]Lucas: Yeah, well, there's some
0:35:40.420,0:35:44.720
firewalling set up, but there's only one[br]server per data center that you can
0:35:44.720,0:35:48.221
actually reach through SSH and then you[br]have to tunnel through that to get to any
0:35:48.221,0:35:51.359
other server.[br]Amir: And also, like, we have an
0:35:51.359,0:35:55.500
internal firewall, and basically, if[br]you are inside production, you
0:35:55.500,0:36:01.450
cannot talk to the outside. If you, for[br]example, do git clone from github.com, it
0:36:01.450,0:36:07.200
doesn't work. You[br]can only access tools that are inside the
0:36:07.200,0:36:13.390
Wikimedia Foundation infrastructure.[br]AM: Okay, hi, you said you do TLS
0:36:13.390,0:36:18.640
termination through nginx; do you still[br]allow non-HTTPS, non-secure access?
0:36:18.640,0:36:22.780
Amir: No, we dropped it a really long[br]time ago, but also
0:36:22.780,0:36:25.069
Lucas: 2013 or so[br]Amir: Yeah, 2015
0:36:25.069,0:36:28.651
Lucas: 2015[br]Amir: In 2013 we started serving most of the
0:36:28.651,0:36:35.740
traffic over HTTPS, but in '15 we dropped all of the[br]non-HTTPS protocols, and recently we even
0:36:35.740,0:36:43.940
stopped; we are not serving any SSL[br]requests anymore, and TLS 1.1 is also being
0:36:43.940,0:36:48.460
phased out, so we are sending a warning[br]to the users: you're using TLS 1.1,
0:36:48.460,0:36:54.810
please migrate to these new things that[br]came out around 10 years ago, so yeah
0:36:54.810,0:36:59.849
Lucas: Yeah, I think the deadline for that[br]is like February 2020 or something, then
0:36:59.849,0:37:04.710
we'll only have TLS 1.2[br]Amir: And soon we are going to support TLS
0:37:04.710,0:37:06.640
1.3[br]Lucas: Yeah
0:37:06.640,0:37:12.460
Are there any questions?[br]Q: So, does read-only traffic
0:37:12.460,0:37:18.029
from logged-in users hit all the way[br]through to the parser cache, or is there
0:37:18.029,0:37:22.280
another layer of caching for that?[br]Amir: Yes, you bypass all of
0:37:22.280,0:37:28.470
that, you can.[br]Daniel: We need one more microphone. Yes,
0:37:28.470,0:37:33.869
it actually does and this is a pretty big[br]problem and something we want to look into
0:37:33.869,0:37:38.930
clears throat but it requires quite a[br]bit of rearchitecting. If you are
0:37:38.930,0:37:44.250
interested in this kind of thing, maybe[br]come to my talk tomorrow at noon.
0:37:44.250,0:37:48.819
Amir: Yeah, one thing we are[br]planning to do is active-active, so we have
0:37:48.819,0:37:56.500
two primaries, and read requests[br]from the users can hit
0:37:56.500,0:37:58.460
the secondary data center instead of the[br]main one.
0:37:58.460,0:38:03.990
Lucas: I think there was a question way in[br]the back there, for some time already
0:38:03.990,0:38:13.950
AM: Hi, I got a question. I read on[br]Wikitech that you are using Ganeti as a
0:38:13.950,0:38:19.040
virtualization platform for some parts; can[br]you tell us something about that, or what
0:38:19.040,0:38:24.619
parts of Wikipedia or Wikimedia are hosted[br]on this platform?
0:38:24.619,0:38:29.589
Amir: Oh, sorry. So I don't[br]know this for very, very sure, so take
0:38:29.589,0:38:34.390
it with a grain of salt, but as far as I[br]know Ganeti is used to build very small
0:38:34.390,0:38:39.829
VMs in production that we need for very,[br]very small microsites that we serve to
0:38:39.829,0:38:45.619
the users. So we build just one or two VMs;[br]we don't use it very often, I think.
0:38:45.619,0:38:54.819
AM: Do you also think about open hardware?
0:38:54.819,0:39:03.950
Amir: I don't know, you can…[br]Daniel: Not for servers. I think for
0:39:03.950,0:39:07.500
the offline Reader project, but this is not[br]actually run by the Foundation, it's
0:39:07.500,0:39:10.289
supported but it's not something that the[br]Foundation does. They were sort of
0:39:10.289,0:39:15.100
thinking about open hardware, but really[br]open hardware in practice usually means
0:39:15.100,0:39:19.609
that, you know, if you really[br]want to go down to the chip design, it's
0:39:19.609,0:39:25.210
pretty tough, so yeah, it's[br]usually not practical, sadly.
0:39:25.210,0:39:31.660
Amir: And one thing I can say is[br]that we have some machines that
0:39:31.660,0:39:37.150
are really powerful that we give to[br]researchers to run analysis on
0:39:37.150,0:39:43.369
the data itself, and we needed to have GPUs for[br]those, but the problem was there
0:39:43.369,0:39:49.109
wasn't any open source driver for them, so[br]we migrated to AMD, I think, but the AMD card
0:39:49.109,0:39:53.609
didn't fit in the rack; it was quite an[br]endeavor to get it to work for our
0:39:53.609,0:40:03.710
researchers to use the GPUs.[br]AM: I'm still impressed that you answer
0:40:03.710,0:40:10.920
90% out of the cache. Do all people access[br]the same pages or is the cache that huge?
0:40:10.920,0:40:21.160
So what percentage of the whole[br]database is in the cache then?
0:40:21.160,0:40:29.760
Daniel: I don't have the exact numbers to[br]be honest, but a large percentage of the
0:40:29.760,0:40:36.769
whole database is in the cache. I mean it[br]expires after 24 hours so really obscure
0:40:36.769,0:40:43.430
stuff isn't there, but I mean, it's a[br]power-law distribution,
0:40:43.430,0:40:47.890
right? You have a few pages that are[br]accessed a lot and many, many, many
0:40:47.890,0:40:55.420
pages that are not actually accessed[br]at all for a week or so except maybe for a
0:40:55.420,0:41:01.740
crawler, so I don't know a number. My[br]guess would be it's less than 50% that is
0:41:01.740,0:41:06.520
actually cached but, you know, that still[br]covers 90%-- it's probably the top 10% of
0:41:06.520,0:41:11.630
pages would still cover 90% of the[br]pageviews, but I don't-- this would be
0:41:11.630,0:41:15.509
actually… I should look this up, those would[br]be interesting numbers to have, yes.
0:41:15.509,0:41:20.710
Lucas: Do you know if this is 90% of the[br]pageviews or 90% of the GET requests,
0:41:20.710,0:41:24.279
because, like, requests for the JavaScript[br]would also be cached more often, I assume
0:41:24.279,0:41:27.529
Daniel: I would expect that for non-[br]pageviews, it's even higher
0:41:27.529,0:41:30.010
Lucas: Yeah[br]Daniel: Yeah, because you know all the
0:41:30.010,0:41:34.150
icons and, you know, JavaScript[br]bundles and CSS and stuff don't ever
0:41:34.150,0:41:40.309
change[br]Lucas: I'm gonna say for every 180 min 90%
0:41:40.309,0:41:50.790
but there's a question back there[br]AM: Hey. Do your data centers run on green
0:41:50.790,0:41:55.220
energy?[br]Amir: Very valid question. So, the
0:41:55.220,0:42:03.450
Amsterdam one is fully green, but the[br]other ones are partially green, partially
0:42:03.450,0:42:10.840
coal and like gas. As far as I know, there[br]are some plans to make them move away from
0:42:10.840,0:42:15.170
it. But on the other hand, we realized that[br]we don't produce that much carbon
0:42:15.170,0:42:21.349
emission, because we don't have many servers[br]and we don't use much data. There was an
0:42:21.349,0:42:26.789
estimate, and we realized our carbon[br]emission – the data centers
0:42:26.789,0:42:34.720
plus all of the travel that all of us[br]have to do, and all of
0:42:34.720,0:42:37.880
the events – is about 250 households' worth. It's very,[br]very small; I think it's one
0:42:37.880,0:42:44.890
thousandth of the comparable[br]traffic with Facebook, even if you just
0:42:44.890,0:42:50.650
compare the same amount of traffic, because[br]Facebook collects data and runs very
0:42:50.650,0:42:54.269
sophisticated machine learning algorithms,[br]which is really complicated. But at
0:42:54.269,0:43:01.119
Wikimedia, we don't do this, so we don't[br]need much energy. Does that answer
0:43:01.119,0:43:04.920
your question?[br]Herald: Do we have any other
0:43:04.920,0:43:15.720
questions left? Yeah, sorry.[br]AM: Hi, how many developers do you need to
0:43:15.720,0:43:19.789
maintain the whole infrastructure, and how[br]many developers – or, let's say,
0:43:19.789,0:43:24.500
developer hours – did you need to build the[br]whole infrastructure? The question is,
0:43:24.500,0:43:29.329
because what I find very interesting about[br]the talk is that it's a non-profit, so as an
0:43:29.329,0:43:34.109
example for other nonprofits: how much[br]money are we talking about in order to
0:43:34.109,0:43:38.760
build something like this as a digital[br]common.
0:43:45.630,0:43:48.980
Daniel: If this is just about actually[br]running all this, so just operations, it is
0:43:48.980,0:43:53.530
less than 20 people, I think, which means, if[br]you basically divide the requests
0:43:53.530,0:43:59.869
per second by people you get to something[br]like 8,000 requests per second per
0:43:59.869,0:44:04.369
operations engineer, which I think is a[br]pretty impressive number. This is probably
0:44:04.369,0:44:09.809
a lot higher than elsewhere; I would really like[br]to know if there's any organization that
0:44:09.809,0:44:17.270
tops that. I don't actually know the[br]actual operations budget; I know it is
0:44:17.270,0:44:24.559
in the two-digit millions annually. Total[br]hours for building this over the last 18
0:44:24.559,0:44:29.069
years, I have no idea. For the[br]first five or so years, the people doing
0:44:29.069,0:44:34.609
it were actually volunteers. We still had[br]volunteer database administrators and
0:44:34.609,0:44:42.160
stuff until maybe ten years ago, eight[br]years ago. So, yeah, really, nobody
0:44:42.160,0:44:44.589
did any accounting of this; I can only[br]guess.
0:44:56.669,0:45:03.810
AM: Hello, a tools question. A few years[br]back I saw some interesting examples of
0:45:03.810,0:45:09.089
Saltstack use at Wikimedia, but right now[br]I see only Puppet and cumin mentioned,
0:45:09.089,0:45:17.819
so, kind of, what happened with that?[br]Amir: I think we ditched Saltstack.
0:45:17.819,0:45:22.970
I cannot say for sure, because none of us are in[br]the Cloud Services team, and I don't think
0:45:22.970,0:45:27.380
I can answer you, but if you look at[br]wikitech.wikimedia.org,
0:45:27.380,0:45:30.869
last time I checked it said[br]it's deprecated and obsolete and we don't use
0:45:30.869,0:45:32.144
it anymore.
0:45:37.394,0:45:39.920
AM: Do you use the batch jobs, like the job
0:45:39.920,0:45:46.130
runners, to fill spare capacity on the web-[br]serving servers, or do you have dedicated
0:45:46.130,0:45:51.589
servers for those roles?[br]Lucas: I think they're dedicated.
0:45:51.589,0:45:56.390
Amir: The job runners, if you're asking about job runners,[br]are dedicated, yes. They are, I
0:45:56.390,0:46:02.910
think, 5 per primary data center, so…[br]Daniel: Yeah, I mean, do we
0:46:02.910,0:46:06.559
actually have any spare capacity on[br]anything? We don't have that much hardware;
0:46:06.559,0:46:08.700
everything is pretty much at a hundred[br]percent.
0:46:08.700,0:46:14.109
Lucas: I think we still have some server[br]that is just called misc1111 or something
0:46:14.109,0:46:18.620
which runs five different things at once;[br]you can look for those on Wikitech.
0:46:18.620,0:46:25.820
Amir: Oh sorry, it's not five,[br]it's 20 per primary
0:46:25.820,0:46:31.440
data center – those are our job runners, and they[br]run 700 jobs per second.
0:46:31.440,0:46:35.690
Lucas: And I think that does not include[br]the video scalers, so those are separate
0:46:35.690,0:46:38.109
again.[br]Amir: No, they merged them like a month
0:46:38.109,0:46:40.040
ago[br]Lucas: Okay, cool
0:46:47.470,0:46:51.420
AM: Maybe a little bit off topic: can you[br]tell us a little bit about the decision-making
0:46:51.420,0:46:55.750
process for technical decisions,[br]architecture decisions? How does it work
0:46:55.750,0:47:01.890
in an organization like this – the decision-[br]making process for architectural
0:47:01.890,0:47:03.409
decisions for example.
0:47:08.279,0:47:11.009
Daniel: Yeah so Wikimedia has a
0:47:11.009,0:47:16.539
committee for making high-level technical[br]decisions; it's called the Wikimedia
0:47:16.539,0:47:23.609
Technical Committee, TechCom, and we run an[br]RFC process. Any decision that is
0:47:23.609,0:47:27.540
cross-cutting, strategic, or especially[br]hard to undo should go through this
0:47:27.540,0:47:33.579
process, and it's pretty informal:[br]basically you file a ticket and start
0:47:33.579,0:47:38.000
this process. It gets announced[br]on the mailing list, hopefully you get
0:47:38.000,0:47:45.009
input and feedback, and at some point it[br]is approved for implementation. We're
0:47:45.009,0:47:48.640
currently looking into improving this[br]process – sometimes it works
0:47:48.640,0:47:52.200
pretty well, sometimes things don't get[br]that much feedback, but it still makes
0:47:52.200,0:47:55.890
sure that people are aware of these high-[br]level decisions
0:47:55.890,0:47:59.790
Amir: Daniel is the chair of that[br]committee
0:48:02.160,0:48:07.839
Daniel: Yeah, if you want to complain[br]about the process, please do.
0:48:13.549,0:48:21.440
AM: Yes, regarding CI and CD along the[br]pipeline: of course, with that much traffic,
0:48:21.440,0:48:27.359
you want to keep everything consistent,[br]right? So are there any testing
0:48:27.359,0:48:32.150
strategies that you have set internally?[br]Of course unit tests and integration
0:48:32.150,0:48:35.790
tests, but do you do something like[br]continuous end-to-end testing on beta
0:48:35.790,0:48:40.100
instances?[br]Amir: So we have the beta cluster, but also
0:48:40.100,0:48:44.670
we do deploys – we call it the train – so[br]we deploy once a week: all of the changes
0:48:44.670,0:48:50.349
get merged to one branch, and the[br]branch gets cut every Tuesday, and it
0:48:50.349,0:48:54.680
first goes to the test wikis and[br]then it goes to all of the wikis that are
0:48:54.680,0:48:59.270
not Wikipedia, plus Catalan and Hebrew[br]Wikipedia. So basically Hebrew and Catalan
0:48:59.270,0:49:03.759
Wikipedia volunteered to be the guinea pigs[br]for the next wikis, and if everything works
0:49:03.759,0:49:07.599
fine it usually goes there, and sometimes it's like, oh,[br]look at the fatal monitor – we have logging – and
0:49:07.599,0:49:12.579
then it's like, okay, we need to fix this,[br]and we fix it immediately, and then it goes
0:49:12.579,0:49:18.690
live to all wikis. This is one way of[br]looking at it.
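The rollout order of the train, as described here, condensed into a hedged sketch (the stage names are illustrative, not the real deployment tooling):

```python
TRAIN_STAGES = [
    "test-wikis",                  # first stop after the Tuesday branch cut
    "non-wikipedias-plus-he-ca",   # everything that isn't Wikipedia,
                                   # plus Hebrew and Catalan Wikipedia
    "all-wikis",                   # finally, every wiki
]

def promote(stage: str, errors_seen: bool) -> str:
    # Only roll forward when the logs look clean; otherwise fix first.
    if errors_seen:
        return stage
    nxt = TRAIN_STAGES.index(stage) + 1
    return TRAIN_STAGES[min(nxt, len(TRAIN_STAGES) - 1)]
```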
0:49:18.690,0:49:23.279
Daniel: So, our test coverage is not as[br]great as it should be, and so we kind of,
0:49:23.279,0:49:30.970
you know, abuse our users for this. We[br]are, of course, working to improve this
0:49:30.970,0:49:37.230
and one thing that we started recently is[br]a program for creating end-to-end tests
0:49:37.230,0:49:43.460
for all the API modules we have, in the[br]hope that we can thereby cover pretty much
0:49:43.460,0:49:49.849
all of the application logic, bypassing the[br]user interface. I mean, full end-to-end
0:49:49.849,0:49:52.770
should, of course, include the user[br]interface but user interface tests are
0:49:52.770,0:49:58.180
pretty brittle and often test, you know,[br]where things are on the screen, and it just
0:49:58.180,0:50:02.559
seems to us that it makes a lot of sense[br]to have tests that actually
0:50:02.559,0:50:07.259
test the application logic for what the[br]system actually should be doing, rather
0:50:07.259,0:50:15.910
than what it should look like. And, yeah,[br]so
0:50:15.910,0:50:20.210
basically this has been a proof of[br]concept, and we're currently working to
0:50:20.210,0:50:27.079
actually integrate it in CI. That[br]should perhaps land once everyone is back
0:50:27.079,0:50:34.560
from vacation, and then we have to[br]write about a thousand or so tests, I
0:50:34.560,0:50:37.930
guess.[br]Lucas: I think there's also a plan to move
0:50:37.930,0:50:42.559
to a system where we actually deploy[br]basically after every commit and can
0:50:42.559,0:50:45.910
immediately roll back if something goes[br]wrong, but that's more mid-term stuff, and
0:50:45.910,0:50:48.339
I'm not sure what the current status of[br]that proposal is
0:50:48.339,0:50:50.450
Amir: And it will be in Kubernetes, so it[br]will be completely different
0:50:50.450,0:50:55.529
Daniel: That would be amazing[br]Lucas: But right now, we are on this
0:50:55.529,0:50:59.730
weekly basis, if something goes wrong, we[br]roll back to the last week's version of
0:50:59.730,0:51:06.049
the code.[br]Herald: Are there any
0:51:06.049,0:51:18.549
questions left? Sorry. Yeah. Okay, um, I[br]don't think so. So, yeah, thank you for
0:51:18.549,0:51:25.329
this wonderful talk. Thank you for all[br]your questions. Um, yeah, I hope you liked
0:51:25.329,0:51:29.750
it. Um, see you around, yeah.
0:51:29.750,0:51:33.725
Applause
0:51:33.725,0:51:39.270
Music
0:51:39.270,0:52:01.000
Subtitles created by c3subtitles.de[br]in the year 2021. Join, and help us!