WEBVTT
00:00:00.000 --> 00:00:18.890
Music
00:00:18.890 --> 00:00:24.350
Herald: Hello everybody, we are ready to
get started. We have Lucas and Amir here
00:00:24.350 --> 00:00:29.170
and they want to give us a quick
introduction of a project from the
00:00:29.170 --> 00:00:33.540
Wikimedia Foundation called "Cloud
Services" and how it might be
00:00:33.540 --> 00:00:39.110
useful to all of us. So let's give a round
of welcoming applause to Lucas and Amir.
00:00:39.110 --> 00:00:42.850
Applause
00:00:42.850 --> 00:00:49.490
Lucas: Thanks! Yeah, hello. So "Wikimedia
Cloud Services" is basically this big
00:00:49.490 --> 00:00:55.230
collection of all kinds of different
things which are useful if you want to do
00:00:55.230 --> 00:00:58.780
techy things in the Wikimedia
universe, like with Wikipedia or other
00:00:58.780 --> 00:01:05.920
projects and you get them free of charge
or you can just use them and the only
00:01:05.920 --> 00:01:09.880
requirement is that you use them for
something that's kind of relevant to the
00:01:09.880 --> 00:01:14.400
mission of Wikimedia of promoting free
knowledge and that kind of stuff. And it's
00:01:14.400 --> 00:01:18.360
kind of split into the things that you can
do with your regular Wikimedia account,
00:01:18.360 --> 00:01:22.750
which any registered user can do and then
there's also things you need a special
00:01:22.750 --> 00:01:26.250
account for, on a different system called
Wikitech, and Amir is going to talk more
00:01:26.250 --> 00:01:30.030
about those later but first let's just
look into some of the things you can do
00:01:30.030 --> 00:01:34.420
with your regular Wikimedia account. And
if you want to follow any of these links
00:01:34.420 --> 00:01:39.220
there's a shortcut here. I was about to
switch to the next tab, so let's just stay
00:01:39.220 --> 00:01:45.211
here for a few seconds yeah. So the first
thing is the API sandbox which is if you
00:01:45.211 --> 00:01:51.590
want to use the MediaWiki API to figure
out what you have on a page or to make
00:01:51.590 --> 00:01:55.820
edits or any kind of stuff. The API
sandbox is a special page that's really
00:01:55.820 --> 00:02:00.630
useful to find out how to use the API. For
example, here's all the different actions I
00:02:00.630 --> 00:02:08.179
can use. Let's say query, which is the kind of
general catch-all action that's here and
00:02:08.179 --> 00:02:12.700
then I get down here a list of all the
parameters I can use with query, such as:
00:02:12.700 --> 00:02:20.160
I want to have all the user info and what
kind of user info do I want? I want
00:02:20.160 --> 00:02:24.280
options, blablabla. I would like to have
some different format versions. So it
00:02:24.280 --> 00:02:29.070
gives you all these nice inputs for
figuring out exactly how to use the API
00:02:29.070 --> 00:02:33.000
what's valid, what's not valid, and then you
can make the API request and there you get
00:02:33.000 --> 00:02:38.440
a response and we can't read anything
because it's zoomed in way too much. But
00:02:38.440 --> 00:02:42.880
it's very helpful when trying to use the
API and then in the end here you can see
00:02:42.880 --> 00:02:49.280
what you need to do in your own code to
make the same API request. And for
00:02:49.280 --> 00:02:54.949
anything that you can't do with the normal
API - so if you want to do some kind of
00:02:54.949 --> 00:02:59.570
more expensive analysis - you can often do
that with Quarry, which is a tool that
00:02:59.570 --> 00:03:04.910
lets you write SQL queries against
databases that are almost like the ones in
00:03:04.910 --> 00:03:09.310
production. Like, you don't have user
passwords and stuff, but you'll have all
00:03:09.310 --> 00:03:16.100
the database tables with page metadata and
connections between them and the logs and
00:03:16.100 --> 00:03:20.370
all kinds of stuff, and you can just write
your SQL here, send it, and you get the
00:03:20.370 --> 00:03:25.310
results. For example, here's the number of
lexemes published per day, so it's some kind
00:03:25.310 --> 00:03:30.790
of selecting from the page table where the
namespace is the lexeme namespace and
00:03:30.790 --> 00:03:41.260
grouping that by the date and then we get
something like all the way down to
00:03:41.260 --> 00:03:46.260
September, which is apparently when I ran
this query. Here there were 116
00:03:46.260 --> 00:03:52.630
lexemes created on this day. Or here
someone had a list of edits to JavaScript
00:03:52.630 --> 00:03:57.810
and CSS pages on Hungarian Wikipedia so
you can run these queries against any Wiki
00:03:57.810 --> 00:04:07.090
you like, like this Hungarian Wikipedia one.
And if you can't get by with just SQL what
00:04:07.090 --> 00:04:13.340
you also have is this thing called PAWS,
which gives you a Jupyter instance, if
00:04:13.340 --> 00:04:18.970
you've heard of that. You can basically
write your own Python code here and do it
00:04:18.970 --> 00:04:23.900
in a very convenient way because there's
all kinds of auto-completion and helpful
00:04:23.900 --> 00:04:36.780
things. So I can just try to copy this and
run the code (then I needed a new cell
00:04:36.780 --> 00:04:43.000
below it… there we go, Thanks!) and if I
type item, I should get helpful hints on what
00:04:43.000 --> 00:04:50.350
I can do with the item (if it's not
hanging or something... or Tab, Control,
00:04:50.350 --> 00:04:58.650
Space... no... oh, there we go, yeah). And it's
also a very useful way to work with
00:04:58.650 --> 00:05:07.310
Pywikibot, or you can also directly get
a normal shell here. And one thing (oops, did
00:05:07.310 --> 00:05:11.890
I click the wrong thing? I would like to
have... oh no, I don't want a bash notebook, I
00:05:11.890 --> 00:05:20.750
want a new terminal, that's what I want).
And here you have for example database
00:05:20.750 --> 00:05:33.010
dumps in (where was it?) public/dumps/
something public again… So if you want to
00:05:33.010 --> 00:05:40.660
do some kind of analysis here on the data
dumps you can get them here and then have
00:05:40.660 --> 00:05:47.530
all the computing that you want I guess to
analyze the wiki more thoroughly and all
00:05:47.530 --> 00:05:51.720
of this is hosted in the Wikimedia Cloud
for you and you don't need your own server
00:05:51.720 --> 00:05:56.950
or anything. Oh yeah I had two more
examples of that. For example here, I used
00:05:56.950 --> 00:06:01.750
that too: there were a lot of items on
Wikipedia where there was some encoding
00:06:01.750 --> 00:06:06.360
error, this should be an apostrophe like
down here and instead it was this kind of
00:06:06.360 --> 00:06:12.680
I with an accent and I hacked together
some ugly Java/Python code to make all of
00:06:12.680 --> 00:06:16.609
these edits, and I was already logged in
as well, so I didn't need to worry about
00:06:16.609 --> 00:06:21.060
logging in or having a password or
anything. So it's a very convenient way to
00:06:21.060 --> 00:06:29.280
make edits as well. Or you can build
something nicer: here you can insert, like,
00:06:29.280 --> 00:06:36.950
Markdown cells to explain what you're
doing and how the code works and build
00:06:36.950 --> 00:06:41.700
nice notebooks like that, which are almost
self-explanatory. And those are some of
00:06:41.700 --> 00:06:44.880
the things you can do just with your
Wikimedia account and now Amir is going to
00:06:44.880 --> 00:06:49.180
talk about some other things.
Amir: Thanks Lucas! So the thing that we
00:06:49.180 --> 00:06:55.010
can do is that maybe some of you, like me,
think that doing things in the browser is for
00:06:55.010 --> 00:07:00.130
kids: I need to do things in a terminal, I
need to be connected to the system. Then you
00:07:00.130 --> 00:07:05.340
can ask for a Wikitech account, which
you can just make in
00:07:05.340 --> 00:07:11.980
this place called Wikitech. (where is the
li… no no but I do'… the main thing, the
00:07:11.980 --> 00:07:20.520
main list. yeah okay) And so in here so
and then you make a Wikitech account and
00:07:20.520 --> 00:07:24.650
it gets approved quickly and then you get
the shell and then you can just quickly go
00:07:24.650 --> 00:07:30.440
there (where is yer…) and you can go to
this shell and just log in and then you
00:07:30.440 --> 00:07:35.260
have access to a big set of nodes in
the cloud and you can just do whatever you
00:07:35.260 --> 00:07:40.520
want. Also you have access to the
dumps and you have access to the replica
00:07:40.520 --> 00:07:58.670
database. Let me show it to you.
[mumbling] So for example you can do ls
00:07:58.670 --> 00:08:13.750
/public/dumps/public/wikidatawiki/ and
then you get - oh there's like all sorts
00:08:13.750 --> 00:08:18.790
of time and everything that you want to,
but if you also… you can do something else
00:08:18.790 --> 00:08:32.780
is that you can just do sql wikidatawiki
and then you are inside Wikidata's
00:08:32.780 --> 00:08:36.150
database. I mean, you don't have
write access, you cannot write to
00:08:36.150 --> 00:08:40.329
the replica because it's a replica, and
also it's sanitized, so it doesn't have
00:08:40.329 --> 00:08:49.740
the, like, hashes of user passwords and stuff
like that. But still you can just do SELECT
00:08:49.740 --> 00:09:09.130
* FROM recentchanges LIMIT 5 and
yeah and then you get all of the things
00:09:09.130 --> 00:09:15.140
that you want. You can even describe
anything you want to directly in the
00:09:15.140 --> 00:09:20.171
system. And then there is also... we have
something called the job grid, so you can
00:09:20.171 --> 00:09:25.310
just put a cron on anything that you
want to, or just run
00:09:25.310 --> 00:09:30.690
something directly, and it goes to a
big node of the cloud Kubernetes and then just
00:09:30.690 --> 00:09:35.820
runs everything that you want to. And
here there's more information about it.
00:09:35.820 --> 00:09:42.690
In here there's, like, a long help that
says, like, oh, I use this to run this job, and
00:09:42.690 --> 00:09:48.260
then what the job does, and you can get
this. So you just need to... it's a bash
00:09:48.260 --> 00:09:54.780
command. You can run any bash command and
say, okay, return me the output to this
00:09:54.780 --> 00:10:00.140
place. And the other thing that
you can do is, there's also a web server
00:10:00.140 --> 00:10:05.380
where you can access everything directly, so
you can just put a PHP file there, into
00:10:05.380 --> 00:10:12.720
the Apache, and then, yeah, for example this
is an example that we built
00:10:12.720 --> 00:10:19.380
together, I think, two Christmases ago.
But this was like, you can just see, this is
00:10:19.380 --> 00:10:23.710
a PHP file, the source code is
available, and you just copy-pasted that
00:10:23.710 --> 00:10:28.350
source code into, like, a directory and it
was there. And every time you click on it
00:10:28.350 --> 00:10:32.210
you get most of the edits that happen
on descriptions on Wikidata that might be
00:10:32.210 --> 00:10:38.040
vandalism, and we can fix them. Also, this is
not the only thing that you can do
00:10:38.040 --> 00:10:45.050
with this: you can also put a
Python Flask application... this is the file
00:10:45.050 --> 00:10:50.714
implants(?), and then this can be just a
Python application and you can just have
00:10:50.714 --> 00:10:57.830
the file there. And also Node.js, Java,
there's so many of them. Also you can have
00:10:57.830 --> 00:11:01.230
your own database. Like, I have something that
has its own database, for example QuickCategories
00:11:01.230 --> 00:11:09.520
in here has jobs that are here.
This tool has its own built-in
00:11:09.520 --> 00:11:15.480
database inside our, like, Cloud Services
and it uses it just fine. You can do that
00:11:15.480 --> 00:11:21.930
as well. And also there's Cloud VPS, which
doesn't do any Kubernetes, you just
00:11:21.930 --> 00:11:27.070
can make a VPS of your own and then do
whatever you want with it. So for example
00:11:27.070 --> 00:11:31.770
you get a project and you get a
quota. It's slightly more limited, but
00:11:31.770 --> 00:11:35.799
also you have access to the whole VPS, you
have sudo rights on it, you can do whatever
00:11:35.799 --> 00:11:40.400
you feel like with it. So we have, like, for
example this project in here, and it's
00:11:40.400 --> 00:11:47.050
called tools, and then there's proxies, and
you can for example go into that instance
00:11:47.050 --> 00:11:52.170
and reboot it and do whatever you want, and
you can make a new instance and look at your
00:11:52.170 --> 00:11:58.770
quota and look at everything else there.
And you can even make a wiki
00:11:58.770 --> 00:12:05.740
on one of those Cloud VPS systems, which
for example we did in here. If you
00:12:05.740 --> 00:12:10.750
look at it, it's just a wiki, and the
difference is that for other ones, for
00:12:10.750 --> 00:12:14.590
example for the vandalism dashboard, you
have tools.wmflabs.org and then
00:12:14.590 --> 00:12:22.149
slash wdvd, which is the tool itself, but
in here we get our own subdomain, which
00:12:22.149 --> 00:12:28.870
will be wikidata-lexeme(?).wmflabs.org,
and you can even add
00:12:28.870 --> 00:12:35.900
all sorts of subdomains of
wmflabs.org as long as it's not taken. So you can
00:12:35.900 --> 00:12:42.260
build a MediaWiki instance, or
you can just put completely new software,
00:12:42.260 --> 00:12:47.810
anything. You can put a word processor there, who
cares, and then you can use it. It's very
00:12:47.810 --> 00:12:58.970
simple, your own thing, and you can help
lots of experience(?). Anything else?
00:12:58.970 --> 00:13:00.250
Lucas: I don't think so. Most
00:13:00.250 --> 00:13:06.430
important, I would say, is Toolforge to run
your websites, or if that's not enough for
00:13:06.430 --> 00:13:10.550
you, Cloud VPS, and then you get your own
VM where you can do absolutely anything you
00:13:10.550 --> 00:13:19.970
want as long as it matches those rules and
stuff. And I think that's it. Are there any questions?
00:13:19.970 --> 00:13:24.890
Herald: Hello, thank you very much for
the talk. That was very quick, so maybe
00:13:24.890 --> 00:13:34.600
anybody has a question here? I'll give you
my microphone to ask it. I don't see any
00:13:34.600 --> 00:13:41.630
hands. Nope. Okay, I don't think we have
questions, but if you're just too shy to
00:13:41.630 --> 00:13:47.019
ask, I think these guys are always hanging
around here, around the wikipaka wiki, so
00:13:47.019 --> 00:13:52.510
if you have anything you want to talk
about you'll find them later. Okay, then
00:13:52.510 --> 00:13:56.130
give a round of applause again
for Lucas and Amir.
00:13:56.130 --> 00:13:58.830
Applause
00:13:58.830 --> 00:14:26.000
Music