WEBVTT
00:00:00.000 --> 00:00:18.420
36C3 preroll music
00:00:18.420 --> 00:00:24.530
Herald-Angel: This Talk will be about… I
have to read this. Mathematical diseases
00:00:24.530 --> 00:00:28.600
in climate models and how to cure them.
And I don't have the slightest idea what
00:00:28.600 --> 00:00:33.850
these two guys are talking about now. And
when I asked them, they said, just tell
00:00:33.850 --> 00:00:40.039
the people it's about next generation
climate models and how to build them.
00:00:40.039 --> 00:00:45.789
Which is cool. Throw that on Twitter.
Please welcome Ali Ramadhan and Valentin
00:00:45.789 --> 00:00:47.229
Churavy.
00:00:47.229 --> 00:00:55.239
applause
00:00:55.239 --> 00:00:58.509
Ali Ramadhan: Can you guys hear us? Is
this OK?
00:00:58.509 --> 00:01:00.210
Valentin Churavy: I'll stand back...
Ramadhan: I'll stand back a little bit.
00:01:00.210 --> 00:01:05.030
OK, cool. Thank you. So if you guys saw
the last talk by karlab... karlabyrinth or
00:01:05.030 --> 00:01:08.530
something. So we're kind of expanding on
her talk a little bit. So she talked a lot
00:01:08.530 --> 00:01:11.209
about kind of uncertainties…
audio feedback from microphone
00:01:11.209 --> 00:01:15.670
uncertainties in climate models. And one
point that she did make was that most of
00:01:15.670 --> 00:01:18.410
the uncertainty actually comes from
humans. But there's a really huge
00:01:18.410 --> 00:01:22.070
uncertainty that also comes from… comes
from the models. So we're talking more
00:01:22.070 --> 00:01:26.170
about the model uncertainties, which is
kind of uncertainties because of unknown
00:01:26.170 --> 00:01:31.590
or missing physics, kind of how to cure
them. So it'll be kind of a weird talk. So
00:01:31.590 --> 00:01:35.119
I'll talk a little bit more about the
climate modeling part and then kind of how
00:01:35.119 --> 00:01:39.840
to cure them involves using new programing
languages. And that's where Valentin will
00:01:39.840 --> 00:01:44.810
talk about Julia. So we'll kind of just
start with maybe just giving kind of an
00:01:44.810 --> 00:01:50.080
idea of why it's so hard to model the
climate. So if you've… maybe you've seen
00:01:50.080 --> 00:01:54.550
images like this a lot where it's like a…
it's a satellite image basically of
00:01:54.550 --> 00:01:58.530
clouds. It's used for like weather
forecasting. But you can immediately see
00:01:58.530 --> 00:02:02.320
there's lots of, you know, lots of really
small clouds. So basically, if you want to
00:02:02.320 --> 00:02:06.250
build a climate model, you've got to be
able to resolve all the physics in these
00:02:06.250 --> 00:02:10.770
clouds. So you can actually zoom in a lot.
And see the clouds look pretty big over
00:02:10.770 --> 00:02:15.140
here. But if you zoom in on kind of
Central America, then you see even smaller
00:02:15.140 --> 00:02:21.509
clouds. And if you zoom in even more so,
so you zoom in on the Yucatan Peninsula,
00:02:21.509 --> 00:02:24.780
then you can see the clouds are really,
really small. So you're… there are maybe
00:02:24.780 --> 00:02:28.980
five smaller clouds, some of the
clouds are, you know, a hundred meters or
00:02:28.980 --> 00:02:33.709
something. And as the last talk kind of
suggests that most climate models are…
00:02:33.709 --> 00:02:38.990
they resolve things down to, you know, 50
kilometers. So anything smaller than 50
00:02:38.990 --> 00:02:43.670
kilometers, the climate model can't really
see. So you have to kind of take that… it
00:02:43.670 --> 00:02:46.290
kind of has to account for that because
clouds are important, and if you have more
00:02:46.290 --> 00:02:51.779
clouds, then that reflects some of the
heat out. So maybe you cool. But it also
00:02:51.779 --> 00:02:55.569
traps more of the heat in so maybe you
warm. And if you have more clouds, maybe
00:02:55.569 --> 00:03:00.130
you warm more. But if you have less
clouds, maybe you warm even more. So it's
00:03:00.130 --> 00:03:04.549
kind of unsure. We actually don't know if
clouds will make the climate warmer or if
00:03:04.549 --> 00:03:07.760
they'll make the climate cooler. So it's
important for your climate models to kind
00:03:07.760 --> 00:03:13.200
of resolve or see these little clouds. So
kind of where the mathematical disease
00:03:13.200 --> 00:03:16.940
comes in, is that you don't… we don't know
what equation to solve. We don't know
00:03:16.940 --> 00:03:21.399
exactly what physics to solve, to see, to
kind of resolve the effect of these little
00:03:21.399 --> 00:03:25.120
clouds. So it's kind of the the
mathematical disease. We don't know how to
00:03:25.120 --> 00:03:29.939
do it. So you instead use a… well, it's
called a parametrization, which is the
00:03:29.939 --> 00:03:33.120
mathematical disease. So in the
atmosphere, the big mathematical disease
00:03:33.120 --> 00:03:40.240
is clouds. But if you look at the ocean,
you kind of get a similar… You have also
00:03:40.240 --> 00:03:43.719
similar mathematical diseases. So, for
example, this is model output. We
00:03:43.719 --> 00:03:48.830
don't have good satellite imagery of the
oceans. So if you if you look at, for
00:03:48.830 --> 00:03:53.400
example, model output from an ocean model,
high resolution ocean model here, it's
00:03:53.400 --> 00:03:58.159
kind of centered on the Pacific. So you
can kind of see Japan and China and the
00:03:58.159 --> 00:04:03.060
white kind of lines. Those are streamlines
or that the lines tell you where the water
00:04:03.060 --> 00:04:07.019
is going. So you could see a lot of kind
of straight lines. You see this curved
00:04:07.019 --> 00:04:13.669
current off of Japan, but you see lots of
circles. So the circles are these eddies
00:04:13.669 --> 00:04:17.910
and they're kind of the turbulence of the
ocean. They move, they kind of stir and
00:04:17.910 --> 00:04:23.740
mix and transport a lot of salt or heat or
carbon or nutrients or… you know, marine
00:04:23.740 --> 00:04:28.270
life or anything. It's the main way the
ocean kind of moves heat from the equator
00:04:28.270 --> 00:04:32.970
to the pole. It kind of stirs things
around. So they're really important for
00:04:32.970 --> 00:04:38.480
kind of how carbon moves in the ocean, for
how the ocean heats up. And here they look
00:04:38.480 --> 00:04:42.259
pretty big. But again, you can zoom in and
you'll see lots of small scale structures.
00:04:42.259 --> 00:04:46.500
So we're going to switch to a different
model output and different colors. So here
00:04:46.500 --> 00:04:51.260
here's kind of the same area. So you see
Japan in the top left. But what's being
00:04:51.260 --> 00:04:55.720
plotted is vorticity. So you have to know
what that is. It's kind of a measure of
00:04:55.720 --> 00:05:00.030
how much the fluid or the water is
spinning. But the point is that you have
00:05:00.030 --> 00:05:05.340
lots of structure. So there's lots of, you
know, big circles, but there also lots of
00:05:05.340 --> 00:05:10.070
really little circles. And again, your
climate model can only see something like
00:05:10.070 --> 00:05:14.919
50 kilometers or 100 kilometers. But as
you can see here, there's lots of stuff
00:05:14.919 --> 00:05:18.860
that's much smaller than a hundred
kilometers. So if you superimpose kind of
00:05:18.860 --> 00:05:23.880
this this grid, maybe that's your climate
model grid. And, you know, basically for
00:05:23.880 --> 00:05:27.840
the climate model, every one of these
boxes is like one number. So you can't
00:05:27.840 --> 00:05:31.880
really see anything smaller than that. But
there's important dynamics and physics
00:05:31.880 --> 00:05:35.380
that happens in like 10 kilometers, which
is a lot smaller than what the climate
00:05:35.380 --> 00:05:39.470
model can see. And there's even important
physics that happens at like 100 meters
00:05:39.470 --> 00:05:44.290
or 200 meters. So if you want to know,
you know, what the climate will look
00:05:44.290 --> 00:05:50.909
like, we need to… we need to know about
the physics that happens at 200 meters. So
00:05:50.909 --> 00:05:55.020
to give an example of some of the physics
that happens at 10 kilometers, here's kind
00:05:55.020 --> 00:06:01.040
of a little animation where this kind of
explains why you get all these eddies or
00:06:01.040 --> 00:06:05.530
all the circles in the ocean. So a lot of
times you have, say, hot water, say, in
00:06:05.530 --> 00:06:10.520
the north. So the hot water here is all in
orange or yellow and you have a lot of
00:06:10.520 --> 00:06:15.340
cold water. So the cold water is in the
south and it's purple. And then once this…
00:06:15.340 --> 00:06:20.099
once you add rotation, you end up with
these eddies because what the hot water
00:06:20.099 --> 00:06:24.350
wants to do, the hot water is lighter,
it's less dense. So it actually wants to
00:06:24.350 --> 00:06:28.810
go on top of the cold water. So usually
have cold at the bottom, hot at the top.
00:06:28.810 --> 00:06:34.120
So you have heavy at the bottom and light
at the top. So when you add… without
00:06:34.120 --> 00:06:38.050
rotation, the hot water will just go on
top of the cold water. But when you have
00:06:38.050 --> 00:06:42.540
rotation, you end up… it kind of wants to
tip over. But it's also rotating. So you
00:06:42.540 --> 00:06:47.479
kind of get this beautiful swirling
patterns and these are kind of the same
00:06:47.479 --> 00:06:52.380
circular eddies that you see in the real
ocean. But this model here is like two
00:06:52.380 --> 00:06:55.139
hundred and fifty kilometers by five
hundred kilometers and it's like one
00:06:55.139 --> 00:06:59.710
kilometer deep. So you need a lot of
resolution to be able to resolve this
00:06:59.710 --> 00:07:04.750
stuff, but not… your climate model doesn't
have that much resolution. So some of the
00:07:04.750 --> 00:07:08.449
features here, like the sharp fronts
between the cold and the hot water, your
00:07:08.449 --> 00:07:12.879
climate model might not see that. So maybe
if you if you don't resolve this properly,
00:07:12.879 --> 00:07:17.270
you get the mixing rate wrong or maybe
that the ocean is the wrong temperature or
00:07:17.270 --> 00:07:22.039
something. So it's kind of important to
resolve this stuff. Another one, the color
00:07:22.039 --> 00:07:26.170
scheme here is really bad. laughs I'm
sorry, but another one, for example, is
00:07:26.170 --> 00:07:32.389
here. Everything's under 100 meter, so
it's a cube of 100 meters on each side and
00:07:32.389 --> 00:07:37.340
you're starting with 20 degrees Celsius
water at the top. You have 19 degrees
00:07:37.340 --> 00:07:41.370
Celsius water at the bottom initially. So
it's kind of you're… as you go deeper in
00:07:41.370 --> 00:07:46.270
the ocean, the water gets colder. And then
if you can imagine, the ocean kind of at
00:07:46.270 --> 00:07:50.669
night, it's kind of cold. So the top is
being cooled and you end up with cold
00:07:50.669 --> 00:07:54.639
water on the top. The cold water wants to be
at the bottom. So it ends up sinking and you
00:07:54.639 --> 00:07:59.030
get all this convection going on. So this
is happening at a lot of places in the
00:07:59.030 --> 00:08:03.650
ocean. You get a lot of mixing at the top.
You get this kind of layer at the top of
00:08:03.650 --> 00:08:08.090
the ocean. It's kind of constant color,
constant temperature. So this mixed layer is
00:08:08.090 --> 00:08:12.240
important for the ocean. So knowing how
deep that mixed layer is and knowing how
00:08:12.240 --> 00:08:16.000
much of the water is being mixed is also
important for for climate. But as you can
00:08:16.000 --> 00:08:19.860
imagine, you know, this happens on very
small scales. So your climate model has
00:08:19.860 --> 00:08:24.870
to know something about what's happening
at this scale. So this is, I guess, the
00:08:24.870 --> 00:08:28.921
mathematical disease in the ocean: the
climate model cannot see this, so it has
00:08:28.921 --> 00:08:32.970
to do something else that's maybe
unphysical to resolve this stuff. And
00:08:32.970 --> 00:08:38.420
that's a mathematical disease, I guess.
Aside from the ocean and the atmosphere.
00:08:38.420 --> 00:08:41.820
You also have the same problem with sea
ice. So this is kind of just a satellite
00:08:41.820 --> 00:08:47.010
picture of where sea ice is forming off
the coast of Antarctica. So you get winds
00:08:47.010 --> 00:08:48.680
that kind of come off the
continent and they're
00:08:48.680 --> 00:08:51.080
kind of blowing all the
ice that's being formed
00:08:51.080 --> 00:08:53.900
away. So you get all these
little lines and streaks and they kind of
00:08:53.900 --> 00:08:58.390
merge into sea ice. But in this whole
picture is like 20 kilometers. So the
00:08:58.390 --> 00:09:01.950
climate model doesn't see this, but
somehow it has to represent all the
00:09:01.950 --> 00:09:08.561
physics. And you have kind of similar
things happening with soil moisture, land
00:09:08.561 --> 00:09:16.240
and dynamic vegetation, aerosols. So, you
know, these are kind of three places with
00:09:16.240 --> 00:09:21.270
pretty pictures. But see, if you look at
the atmosphere, so it's not just clouds.
00:09:21.270 --> 00:09:27.350
You also have aerosols, which are like
little particles, or sulfates that are
00:09:27.350 --> 00:09:31.680
important for kind of cloud formation and
maybe atmospheric chemistry. But again, we
00:09:31.680 --> 00:09:34.700
don't fully understand the physics of
these aerosols. So again, you have to kind
00:09:34.700 --> 00:09:40.270
of parametrize them. Same thing with kind
of convictions. You maybe your climate
00:09:40.270 --> 00:09:44.160
model doesn't resolve all the very deep
convection in the atmosphere so as to get
00:09:44.160 --> 00:09:48.190
all sides to parametrize it. So I guess
you have many kind of mathematical
00:09:48.190 --> 00:09:51.291
diseases in the atmosphere. So I'm not
expecting you to understand everything in
00:09:51.291 --> 00:09:55.110
this in this picture. But the idea is: The
atmosphere is complicated. There's no way
00:09:55.110 --> 00:09:59.740
a climate model is going to kind of, you
know, figure all this out by itself. And
00:09:59.740 --> 00:10:04.600
again, you could you could do something
similar for the ocean. So we can just show
00:10:04.600 --> 00:10:07.370
an image for like two little parts of
these. But the point is, you know, the
00:10:07.370 --> 00:10:11.490
ocean is not kind of just a bucket of
water standing there. So there's lots of
00:10:11.490 --> 00:10:15.180
stuff happening deep inside the ocean. And
some of it, we think is important for
00:10:15.180 --> 00:10:19.180
climate. Some of it we don't know. Some
might not be important. But again, a lot
00:10:19.180 --> 00:10:25.430
of this happens on very small spatial
scales. So we don't know or the climate
00:10:25.430 --> 00:10:30.441
model can't always resolve all this stuff.
And again, same thing with kind of sea
00:10:30.441 --> 00:10:34.370
ice. Lots of small scale stuff is
important for sea ice. And I think one
00:10:34.370 --> 00:10:37.770
person asked about kind of tipping points
and there's kind of two with like sea ice
00:10:37.770 --> 00:10:43.880
that are pretty important. One of them is
this CSL biofeedback. So if you have sea
00:10:43.880 --> 00:10:48.630
ice that melts. Now you have more ocean
and the ocean can absorb more heat. But
00:10:48.630 --> 00:10:51.980
now the earth is warmer, so it melts more
sea ice. So as soon as you kind of start
00:10:51.980 --> 00:10:55.670
melting sea ice, maybe you melt even more
sea ice and eventually you reach an earth
00:10:55.670 --> 00:11:00.020
with no sea ice. So there's kind of
research into that stuff going on, but
00:11:00.020 --> 00:11:04.410
it's a possible tipping point. Another one
is this kind of marine ice sheet
00:11:04.410 --> 00:11:08.970
instability, at the bottom of
the ice shelf. So if you start melting
00:11:08.970 --> 00:11:13.500
water, if you start melting ice from the
bottom of the ice shelf, then we create
00:11:13.500 --> 00:11:17.530
kind of a larger area for more ice to
melt. So maybe once you start melting and
00:11:17.530 --> 00:11:20.940
increasing sea level, you just keep
melting more and more and increasing sea
00:11:20.940 --> 00:11:27.300
level even more. But again, it's kind of
hard to quantify these things on like 50
00:11:27.300 --> 00:11:34.910
or 100 year timescales because it all
happens on very small scales. So yeah, the
00:11:34.910 --> 00:11:39.760
point is there's lots of these kind of
parametrizations or mathematical diseases.
00:11:39.760 --> 00:11:43.800
And once you start adding them all up, you
end up with lots and lots of kind of
00:11:43.800 --> 00:11:48.180
parameters. So this is a really boring
table. But the point is, so this is like
00:11:48.180 --> 00:11:53.110
one parametrization for like vertical
mixing in the ocean. It's basically the
00:11:53.110 --> 00:11:57.140
process that I showed the rainbow color
movie about. So a climate model that
00:11:57.140 --> 00:12:02.060
is trying to kind of parametrize
that physics might have like 20
00:12:02.060 --> 00:12:05.860
parameters. And, you know, some of them
are crazy, like a surface layer fraction of
00:12:05.860 --> 00:12:10.120
like zero point one or something. And
usually they keep the same constants for
00:12:10.120 --> 00:12:14.620
all these values. Usually it's like
someone in like 1994 came up with these 20
00:12:14.620 --> 00:12:18.730
numbers and now we all use the same 20
numbers. But you know, maybe they're
00:12:18.730 --> 00:12:21.850
different. And like the Pacific or the
Atlantic or like maybe they're different
00:12:21.850 --> 00:12:25.740
when it's summer and winter and the
problem is, there's many of these
00:12:25.740 --> 00:12:28.920
parametrizations. So you know here's like
20 parameters, but then you have a lot
00:12:28.920 --> 00:12:32.430
more for clouds. You have a lot more sea
ice. We add them all up. Suddenly you have
00:12:32.430 --> 00:12:38.050
like 100, maybe up to a thousand kind of
tunable parameters. Kind of going back to
00:12:38.050 --> 00:12:42.700
this plot that was shown at the last talk.
You can see kind of the all the models
00:12:42.700 --> 00:12:47.880
kind of agree really well from like 1850
to 2000, because
00:12:47.880 --> 00:12:50.950
they all have different kind of
parameters, but they all get kind of tuned
00:12:50.950 --> 00:12:54.890
or optimized so they get the 20th
century correct, so they get the black
00:12:54.890 --> 00:12:59.980
line correct. But then when you run them
forward, you run them to like 2300. They
00:12:59.980 --> 00:13:02.860
all are slightly different. So they all
start producing different physics and
00:13:02.860 --> 00:13:07.550
suddenly you get a huge like red band. So
that's saying you have lots of model
00:13:07.550 --> 00:13:12.870
uncertainty. So some
people might say, like, oh, this tuning
00:13:12.870 --> 00:13:17.510
process is like optimization, it's
not very scientific to kind of be right.
00:13:17.510 --> 00:13:22.440
It's kind of what's been done in the past, it's kind
of the best we've had. But I
00:13:22.440 --> 00:13:26.420
think, you know, we should be able to do a
little bit better. Better than that. So
00:13:26.420 --> 00:13:28.700
just to give you the idea, you know, some
people would say, you know, why don't you
00:13:28.700 --> 00:13:31.650
just, you know, resolve all the
00:13:31.650 --> 00:13:36.640
physics, you know, but see if you want to
do like a direct numerical simulation, so
00:13:36.640 --> 00:13:39.410
it's basically saying you want to resolve
all the motions in the ocean, in the
00:13:39.410 --> 00:13:43.310
atmosphere. You basically need to resolve
things down to like one millimeter. So if
00:13:43.310 --> 00:13:46.850
you have like a grid spacing of one
millimeter and you consider the volume of
00:13:46.850 --> 00:13:50.760
the ocean and the atmosphere, you
basically say you need like 10 to the 28
00:13:50.760 --> 00:13:55.470
grid points. You know, that's like imagine
putting cubes of like one millimeter
00:13:55.470 --> 00:13:56.850
everywhere in the
ocean and atmosphere.
00:13:56.850 --> 00:13:58.630
That's how many grid
points you would need.
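To get a rough feel for that number, here is a back-of-the-envelope check in Julia; the round figures for the ocean volume and an atmosphere roughly 15 kilometers deep are my own assumptions, not from the talk:

    ocean_volume_m3 = 1.3e18              # ocean: roughly 1.3 billion cubic kilometers
    atmos_volume_m3 = 5.1e14 * 15_000     # Earth's surface area times ~15 km of atmosphere
    cells_per_m3    = 1_000^3             # one-millimeter cubes per cubic meter
    (ocean_volume_m3 + atmos_volume_m3) * cells_per_m3   # ≈ 9e27, i.e. about 10^28 grid points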
00:13:58.630 --> 00:14:01.830
So in principle you could do that,
but unfortunately there's not enough computing power or
00:14:01.830 --> 00:14:05.800
storage space in the world to do that. So
you're kind of stuck doing something a bit
00:14:05.800 --> 00:14:11.120
coarser. Usually most climate models use
like 10 to the 8 grid points, so you're
00:14:11.120 --> 00:14:16.390
about 10 to the 20 short. You don't
want to just run a big climate model once;
00:14:16.390 --> 00:14:19.870
you know you need to run them for very
long times, usually like you run them for
00:14:19.870 --> 00:14:23.290
a thousand years or ten thousand years
and you want to run many of them because
00:14:23.290 --> 00:14:27.170
you want to collect statistics. So
generally you don't run at the highest
00:14:27.170 --> 00:14:31.460
resolution possible. You run kind of at a
lower resolution so you can run many, many
00:14:31.460 --> 00:14:36.900
models. So because you can only use so
much resolution, it seems that these
00:14:36.900 --> 00:14:39.790
parametrizations or these kind of mathematical
things, you have to live with them, you've
00:14:39.790 --> 00:14:45.310
got to use them. But at least one idea is,
you know, instead of using numbers that
00:14:45.310 --> 00:14:49.030
somebody came up with in 1994, you
might as well try to figure out, you know,
00:14:49.030 --> 00:14:50.980
better numbers or maybe
you know if the numbers
00:14:50.980 --> 00:14:52.210
are kind of different
in different places,
00:14:52.210 --> 00:14:55.910
you should find that out. So one
thing you could do, one thing we are
00:14:55.910 --> 00:15:01.340
trying to do, is get the parametrizations
to kind of agree with like basic physics
00:15:01.340 --> 00:15:05.210
or agree with observations. So we have
lots of observations, or maybe we can run
00:15:05.210 --> 00:15:07.300
kind of high resolution
simulations to resolve
00:15:07.300 --> 00:15:09.100
a lot of the physics
and then make sure
00:15:09.100 --> 00:15:12.100
when you put the parametrization in
the climate model, it actually gives you
00:15:12.100 --> 00:15:16.840
the right numbers according to basic
physics or observations. But sometimes
00:15:16.840 --> 00:15:20.810
that might mean, you know, different
numbers in the Atlantic, in the Pacific or
00:15:20.810 --> 00:15:24.480
different numbers for the winter and the
summer. And you have to run many high
00:15:24.480 --> 00:15:28.660
resolution simulations to get enough data
to do this. But indeed, you know, these
00:15:28.660 --> 00:15:33.890
days I think we have enough computing
power to do that. So it's kind of do all
00:15:33.890 --> 00:15:37.070
these high resolution simulations. We
ended up building a new kind of ocean
00:15:37.070 --> 00:15:42.370
model that we run on GPUs because they
are a lot faster for giving us these
00:15:42.370 --> 00:15:46.520
results. Usually most
climate modeling is done in Fortran. We
00:15:46.520 --> 00:15:53.310
decided to go with Julia for a number
of reasons, which I'll talk about. But the
00:15:53.310 --> 00:15:58.220
left figure is kind of that mixed layer or
boundary layer turbulence kind of movie.
00:15:58.220 --> 00:16:01.630
But instead of the rainbow color map, now
it's using a more reasonable color map.
00:16:01.630 --> 00:16:07.050
It looks like the ocean, the right is that
old movie. So we're generating tons and
00:16:07.050 --> 00:16:11.670
tons of data from using simulations like
this and then hopefully we can get enough
00:16:11.670 --> 00:16:15.440
data and like figure out a way to explain
the parametrizations. But it's kind of a
00:16:15.440 --> 00:16:19.810
work in progress. So a different idea that
might be more popular here, I don't know.
00:16:19.810 --> 00:16:25.770
Is instead of kind of using the existing
parametrizations, you could say, OK,
00:16:25.770 --> 00:16:29.960
well, now you have tons and tons of data.
Maybe you just throw in like a neural
00:16:29.960 --> 00:16:34.620
network into the differential equations.
Basically, you put in the physics, you
00:16:34.620 --> 00:16:37.760
know, and then the neural network is
responsible for the physics you don't
00:16:37.760 --> 00:16:43.780
know. So, for example, you know, most
people here might not know them. I also don't want
00:16:43.780 --> 00:16:46.230
to talk about differential equations
because it would take a long time. So just
00:16:46.230 --> 00:16:48.280
imagine that the equation
in the middle is kind of
00:16:48.280 --> 00:16:49.960
what a climate model
needs to solve.
00:16:49.960 --> 00:16:53.240
And the question marks are kind of
physics we don't know. So we don't know
00:16:53.240 --> 00:16:58.720
what to put there. But maybe you could put
in a neural network. So number one is
00:16:58.720 --> 00:17:03.320
kind of a possible characterisation or a
possible way you could try to parametrize
00:17:03.320 --> 00:17:05.750
the missing physics where the neural
networks kind of responsible for
00:17:05.750 --> 00:17:09.940
everything. We find that doesn't work as
well. So instead, maybe you tell it some
00:17:09.940 --> 00:17:13.990
of the physics, maybe tell it about Q,
which is like the heating or cooling at
00:17:13.990 --> 00:17:17.540
the surface. And then it's kind of
responsible for resolving the other stuff.
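As a hedged sketch of that idea, using Flux and DifferentialEquations.jl generically; the "known physics" term, the network size and the function names here are made up for illustration, not the project's actual setup:

    using Flux, DifferentialEquations

    nn = Chain(Dense(2, 16, tanh), Dense(16, 2))   # stands in for the physics we don't know

    known_physics(u) = -0.1 .* u                   # stands in for the physics we do know

    function rhs!(du, u, p, t)
        du .= known_physics(u) .+ nn(u)            # known physics plus learned correction
    end

    prob = ODEProblem(rhs!, [1.0, 0.0], (0.0, 10.0))
    sol  = solve(prob, Tsit5())
    # Training would then adjust the network weights until the solution matches
    # the high-resolution simulation data (not shown here).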
00:17:17.540 --> 00:17:22.300
But it's still a work in progress because
the blue is kind of supposed to be your data. The
00:17:22.300 --> 00:17:25.690
orange is supposed to be the neural network and
they don't agree. So it's still a work in
00:17:25.690 --> 00:17:28.960
progress, but hopefully we'll be able to
do that better. So this is kind of stuff
00:17:28.960 --> 00:17:34.400
that's like a week or two old. But to kind of
reach a conclusion, at least from my half
00:17:34.400 --> 00:17:39.750
of the talk. So the reason I personally
like Julia as a climate modeler is we were
00:17:39.750 --> 00:17:45.000
able to kind of build an ocean model from
scratch basically in less than a year. And
00:17:45.000 --> 00:17:47.471
one of the nice things
is that the user interface
00:17:47.471 --> 00:17:49.561
or the scripting and the model
backend is all in one language,
00:17:49.561 --> 00:17:52.030
whereas in the past
you used to usually write the high
00:17:52.030 --> 00:17:57.850
level in like Python and maybe the back
end is like Fortran or C. And we find, you
00:17:57.850 --> 00:18:01.400
know, when we use Julia, it's just as fast as
our legacy model, which was written in
00:18:01.400 --> 00:18:07.230
Fortran. And one of the nicest things was
that we're basically able to write code once,
00:18:07.230 --> 00:18:10.510
using Julia's native GPU compiler. So
basically you write your code one single
00:18:10.510 --> 00:18:15.180
code base and you go to CPUs and GPUs.
So you'd want to write two different code
00:18:15.180 --> 00:18:20.610
bases. And yeah, we find generally because
it's a high level language, we're all kind
00:18:20.610 --> 00:18:24.860
of more productive. We can give a more
powerful user API and Julia kind of has a
00:18:24.860 --> 00:18:30.240
nice multiple dispatch backend so that we
find that makes it easy for the users to
00:18:30.240 --> 00:18:35.330
kind of extend the model or hack the
model. And there's. Some people would say
00:18:35.330 --> 00:18:38.480
the Julia community is pretty small. But
we find there's a pretty big Julia
00:18:38.480 --> 00:18:41.470
community interest in scientific
computing. So we find kind of all the
00:18:41.470 --> 00:18:46.540
packages we need are pretty much
available. So I'd like to conclude my
00:18:46.540 --> 00:18:50.240
half by saying that most of the
uncertainty in climate modeling basically
00:18:50.240 --> 00:18:53.850
comes from humans, because we don't know
what humans will do. But there's a huge
00:18:53.850 --> 00:18:57.550
model uncertainty basically because of
physics we don't understand or physics,
00:18:57.550 --> 00:19:01.610
the model kind of cannot see. You can't
resolve every cloud and you know, every
00:19:01.610 --> 00:19:05.390
wave in the ocean, so you've got
to figure out a way to account for them.
00:19:05.390 --> 00:19:09.940
So that's what our parametrizations do.
And we're trying to kind of use a lot of
00:19:09.940 --> 00:19:14.680
computing power to kind of make sure we
train or come up with good parametrizations
00:19:14.680 --> 00:19:19.600
instead of kind of tuning the model at the
end. And we're hoping this will lead to
00:19:19.600 --> 00:19:23.750
better climate predictions. Maybe it
will, maybe it won't. But at least, you
00:19:23.750 --> 00:19:28.990
know, even if it doesn't. Hopefully we can
say we got rid of the model tuning problem
00:19:28.990 --> 00:19:33.110
and hopefully we can make… We find
that software development for climate
00:19:33.110 --> 00:19:37.930
modeling is easier than if we did it in
Fortran. I will say this is kind of an
00:19:37.930 --> 00:19:41.760
advertisement, but I'm looking to bike
around Germany for a week and apparently you can't
00:19:41.760 --> 00:19:46.390
take a Nextbike out of Leipzig. So if
anyone is looking to sell their bicycle or
00:19:46.390 --> 00:19:51.230
wants to make some cash, I'm looking to
rent a bicycle. So yeah, if you have one,
00:19:51.230 --> 00:19:55.070
come talk to me, please. Thank you. Danke.
00:19:55.070 --> 00:20:02.190
applause
00:20:02.190 --> 00:20:09.490
Churavy: So one big question for me always
is how can we as technologists help? I
00:20:09.490 --> 00:20:14.580
think most of us in this room are fairly
decent with computers. The internet is not
00:20:14.580 --> 00:20:19.410
necessarily Neuland for us. But how
do we use that knowledge to actually
00:20:19.410 --> 00:20:25.060
impact real change? And if you haven't
seen it, there's a fantastic article:
00:20:25.060 --> 00:20:30.030
worrydream.com/ClimateChange. Which lists
all the possible or not all the possible
00:20:30.030 --> 00:20:36.610
but a lot of good ideas to think about and
go like, okay, do my skills apply in that
00:20:36.610 --> 00:20:43.490
area? Well, I'm a computer scientist. I do
programing language research. So how do my
00:20:43.490 --> 00:20:50.430
skills really apply to climate change? How
can I help? And one of the things that
00:20:50.430 --> 00:20:54.880
struck me in this article, and one
of the realizations of why I do my work,
00:20:54.880 --> 00:20:59.790
is that the tools that we have built for
scientists and engineers, they are that
00:20:59.790 --> 00:21:05.550
poor. Computer scientists like myself have
focused a lot on making programming easier,
00:21:05.550 --> 00:21:11.580
more accessible. But we haven't necessarily
kept the scientific community as a
00:21:11.580 --> 00:21:18.250
target audience. And then you get into
this position where models are written in
00:21:18.250 --> 00:21:23.150
a language. Fortran 74 and isn't that a
nice language, but it's still not one that
00:21:23.150 --> 00:21:28.320
is easily picked up and where you find
enthusiasm in younger students for using
00:21:28.320 --> 00:21:36.450
it. So I work on Julia and my goal is
basically to make scientific computing
00:21:36.450 --> 00:21:41.410
easier, more accessible and make it easier
to access the huge computing power we have
00:21:41.410 --> 00:21:49.960
available to do climate modeling. Ideas,
if you are interested in this space:
00:21:49.960 --> 00:21:53.590
you don't need to work on Julia
necessarily, but you can think about maybe
00:21:53.590 --> 00:21:59.420
looking at modeling for physical
systems. Like, one of the
00:21:59.420 --> 00:22:04.250
questions is, can we model air conditioning
units more precisely, make them more
00:22:04.250 --> 00:22:09.211
efficient? Or any other technical system.
How do we get that efficiency? But we need
00:22:09.211 --> 00:22:15.340
better tools to do that. So the language
down here as an example is Modelica.
00:22:15.340 --> 00:22:19.560
There is a project right now around Modelica
trying to see how we can push the
00:22:19.560 --> 00:22:24.770
boundary there. The language up here is Fortran.
You might have seen a little bit of that
00:22:24.770 --> 00:22:32.310
in the talk beforehand and it's most often
used to do climate science. So why
00:22:32.310 --> 00:22:37.100
programing languages? Why do I think that
my time is best spent to actually work on
00:22:37.100 --> 00:22:44.250
programing languages and do that in order
to help people? Well, Wittgenstein says:
00:22:44.250 --> 00:22:48.620
"The limits of my language are the limits
of my world." What I can express is what I
00:22:48.620 --> 00:22:52.190
think about. And I think people who are
multilingual know that sometimes
00:22:52.190 --> 00:22:56.050
it's easier for them to think about certain
things in one language than in the
00:22:56.050 --> 00:23:00.660
other one. But language is about
communication. It's about communication
00:23:00.660 --> 00:23:05.670
with scientists, but it's also about
communication with the computer. And too
00:23:05.670 --> 00:23:09.950
often programming languages fall into that
trap where it's about, oh, I want to
00:23:09.950 --> 00:23:15.720
express my one particular problem or I
wanna express my problem very well for the
00:23:15.720 --> 00:23:20.420
compiler, for the computer. I want to talk
to the machine. But what I found is that
00:23:20.420 --> 00:23:24.640
programming languages are very good to
talk to other scientists, to talk in a
00:23:24.640 --> 00:23:29.220
community and to actually collaborate? And
so the project that Ali and I are both
00:23:29.220 --> 00:23:35.420
part of has, I think, 30-ish people. I don't
know the exact numbers, but it's a big
00:23:35.420 --> 00:23:40.580
group of climate scientists and modelers. And
we have a couple of numerical scientists,
00:23:40.580 --> 00:23:44.650
computer scientists and engineers and we
are all working in the same language, being able
00:23:44.650 --> 00:23:49.260
to collaborate and actually work on the
same code instead of me working on some
00:23:49.260 --> 00:23:56.610
low level implementation and Ali telling
me what to write. That wouldn't be really
00:23:56.610 --> 00:24:05.050
efficient. So, yes, my goal is to make
this research easier. Do we really need yet
00:24:05.050 --> 00:24:09.110
another high level language? That is a
question I often get. It's like why Julia?
00:24:09.110 --> 00:24:14.660
And why are you not spending your time
and effort doing this for Python? Well, so
00:24:14.660 --> 00:24:21.580
this is as a small example, this is Julia
code. It looks rather readable, I find. It
00:24:21.580 --> 00:24:28.930
doesn't use semantic whitespace. You may
like that or not. It has all the typical
00:24:28.930 --> 00:24:33.960
features that you would expect from a high
level dynamic language. It is using the
00:24:33.960 --> 00:24:37.190
MIT license. It has a built-in package
manager. It's very good for interactive
00:24:37.190 --> 00:24:45.040
development, but it has a couple of
unusual properties and those matter.
00:24:45.040 --> 00:24:49.830
If you want to simulate a climate model,
you need to get top performance on a
00:24:49.830 --> 00:24:56.470
supercomputer. Otherwise you won't get an
answer in the time that it matters. Julia
00:24:56.470 --> 00:25:02.830
uses just-in-time, or rather just-ahead-of-time,
compilation, the other great feature is
00:25:02.830 --> 00:25:07.670
that it's actually written in Julia. So I can
just look at implementations. I can dive
00:25:07.670 --> 00:25:12.610
and dive and dive deeper into somebody's
code and don't have a comprehension
00:25:12.610 --> 00:25:20.210
barrier. If you ever have spent some
time and tried to figure out how Python
00:25:20.210 --> 00:25:26.360
sums numbers under the hood to make it
reasonably fast. Good luck. It's hard.
00:25:26.360 --> 00:25:30.060
It's written in C, and there are a lot
of barriers in order to understand what's
00:25:30.060 --> 00:25:35.380
actually going on. Then reflection and
meta programming. You can do a lot of fun
00:25:35.380 --> 00:25:41.140
stuff which we're going to talk about. And
then the big point for me is that you have
00:25:41.140 --> 00:25:46.150
native GPU code generation support so
you can actually take Julia code and run
00:25:46.150 --> 00:25:49.920
it on the GPU. You're not
relying on libraries because libraries
00:25:49.920 --> 00:25:59.450
can only express the things that
were written in there. So early on, last
00:25:59.450 --> 00:26:03.550
December, I think we met up for the
climate science project and after deciding
00:26:03.550 --> 00:26:08.450
on using Julia for the entire project,
they were like, we're happy with the
00:26:08.450 --> 00:26:12.710
performance, but we have a problem. We
have to duplicate our code for GPUs
00:26:12.710 --> 00:26:20.530
and CPUs. What really? It can't be! I mean, I
designed the damn thing, it should be working.
00:26:20.530 --> 00:26:25.559
Well, what they had at that point was
basically always a copy of two functions
00:26:25.559 --> 00:26:31.170
where one side of it was writing the CPU
code and the other side was implementing a
00:26:31.170 --> 00:26:36.710
GPU code. And really, there were only a
couple of GPU specific parts in there. And
00:26:36.710 --> 00:26:43.210
if anybody has ever written GPU Code, it's
this pesky which index am I calculation.
00:26:43.210 --> 00:26:49.710
Whereas the for loop on the CPU just
looks quite natural. And I was like, what?
00:26:49.710 --> 00:26:55.610
Wait. Come on. What we can do is we can
just write a kernel launcher that takes the body of
00:26:55.610 --> 00:27:00.490
the for loop, extracts it into a new
function, adds a little bit of sugar and
00:27:00.490 --> 00:27:06.460
magic to create GPU kernels and CPU
functions and then we're done. Problem
00:27:06.460 --> 00:27:12.730
solved. What the code roughly looks
like is actually this. You can
00:27:12.730 --> 00:27:19.670
copy and paste this and it should work.
And so you have two functions. One of them
00:27:19.670 --> 00:27:23.679
is the launcher, where you extract your kernel.
Then you write a function that takes
00:27:23.679 --> 00:27:29.580
another function and runs that function in a
for loop or it launches that function on
00:27:29.580 --> 00:27:34.600
the GPU. And then you have this little GPU
snippet, which is the only bit that is actually
00:27:34.600 --> 00:27:39.250
GPU specific: it calculates the index and
then calls the function F with an index
00:27:39.250 --> 00:27:45.780
argument. And with that, I was done here; my
contribution to this project was done.
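As a hedged sketch of that two-function pattern, using CUDAnative-style intrinsics; the function names and the hard-coded launch configuration here are mine for illustration, not the project's actual code:

    using CUDAnative, CuArrays

    # The loop body, extracted into a plain function of an index.
    body!(i, A, B) = (A[i] = 2 * B[i]; nothing)

    # CPU version of the launcher: just run the body in an ordinary for loop.
    function launch!(f, A::Array, args...)
        for i in eachindex(A)
            f(i, A, args...)
        end
    end

    # The only genuinely GPU-specific bit: the "which index am I" calculation.
    function gpu_kernel(f, A, args...)
        i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
        i <= length(A) && f(i, A, args...)
        return nothing
    end

    # GPU version of the launcher: launch that little snippet as a CUDA kernel.
    launch!(f, A::CuArray, args...) =
        @cuda threads=256 blocks=cld(length(A), 256) gpu_kernel(f, A, args...)

    launch!(body!, rand(1024), rand(1024))   # runs the plain for loop on the CPU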
00:27:45.780 --> 00:27:49.650
Well, they came back to me and were like,
no, it's not good enough. And I was like,
00:27:49.650 --> 00:27:54.550
why? Well, the issue is they needed kernel
fusion. So that's the process of taking
00:27:54.550 --> 00:28:00.110
two functions and merging them together.
I'm like, okay, fine. Why do they need
00:28:00.110 --> 00:28:04.790
that? Because if you want to write
very efficient GPU code, you need to be
00:28:04.790 --> 00:28:09.980
really concerned about the numbers of
global memory loads and stores. If you
00:28:09.980 --> 00:28:13.720
have too many of them or if they are
irregular, you lose a lot of performance
00:28:13.720 --> 00:28:20.470
and you need good performance. Otherwise,
we can't run the simulations we need. They
00:28:20.470 --> 00:28:24.600
also actually wanted to use GPU
functionality and low level control.
00:28:24.600 --> 00:28:29.610
They wanted to look at their kernels and
use shared memory constructs. They wanted
00:28:29.610 --> 00:28:36.190
to do precise register work, minimizing the
number of registers used and they really cared
00:28:36.190 --> 00:28:40.751
about low level performance. They were
like, well, we can't do this with the
00:28:40.751 --> 00:28:47.790
abstraction you gave us because it builds
up too many barriers. And I could have
00:28:47.790 --> 00:28:53.970
given them a more typical computer
science answer, which would have been: OK,
00:28:53.970 --> 00:28:59.429
Give me two years and I'll come back to
you and there will be a perfect solution, which
00:28:59.429 --> 00:29:03.350
is like a cloud castle in the sky. And I'll
write you a bespoke language that does
00:29:03.350 --> 00:29:06.850
exactly what you need to do. And at the
end, we have a domain specific language
00:29:06.850 --> 00:29:10.590
for climate simulation that will do finite
volume and discontinuous Galerkin and
00:29:10.590 --> 00:29:17.410
everything you want. And I will have a
PhD. Great. Fantastic. Well, we don't have
00:29:17.410 --> 00:29:21.880
the time. The whole climate science
project that we are on has an accelerated
00:29:21.880 --> 00:29:27.040
timeline, because the philanthropists that
are funding the research said: well, if
00:29:27.040 --> 00:29:35.090
you can't give us better answers anytime
soon, it won't matter anymore. So I sat
00:29:35.090 --> 00:29:40.580
down and was like, okay, I need a hack. I
need something that has minimal effort,
00:29:40.580 --> 00:29:45.150
quick delivery. I need to be able to fix
it if I do get it wrong the first time
00:29:45.150 --> 00:29:49.600
around (and I did). It needs to be hackable.
My collaborator needs to understand it and
00:29:49.600 --> 00:29:55.040
actually be able to change it. And it
needs to have happened yesterday. Well,
00:29:55.040 --> 00:29:59.370
Julia is good at these kinds of hacks. And
as I've learned, you can actually go
00:29:59.370 --> 00:30:06.259
from bespoke solutions to better
abstractions after the fact. So that
00:30:06.259 --> 00:30:10.220
you can actually do the fancy
computer science that I really wanted to
00:30:10.220 --> 00:30:14.860
do. The project is called GPUifyLoops
because I couldn't come up with a better
00:30:14.860 --> 00:30:23.520
name, and nobody else could either. So we stuck with
it. It's macro based. And so, in Julia, you
00:30:23.520 --> 00:30:30.730
can write syntax macros that transform
the written statements into
00:30:30.730 --> 00:30:37.330
similar statements so you can insert code
or remove code if you want to. Right
now we target CPUs and GPUs and we are
now target CPUs and GPUs and we are
talking about how do we get multi threaded
00:30:41.330 --> 00:30:46.480
into the story, how do we target more and
different GPUs? There are other projects
00:30:46.480 --> 00:30:51.470
that are very similar. So there's OCCA,
which is where a lot of these ideas are
00:30:51.470 --> 00:30:58.610
coming from and Open ACC in C++ does
something really similar. But basically
00:30:58.610 --> 00:31:02.290
you write a for loop, you write an at
loop in front of it, which is the magic
00:31:02.290 --> 00:31:09.480
macro that takes a transformation. And you
have two indexed statements and now you
00:31:09.480 --> 00:31:15.090
just say I want to launch it on the GPU
and it magically does a job. Get,
00:31:15.090 --> 00:31:22.150
fantastic. So let's pick up the entire
implementation of the macro at loop
00:31:22.150 --> 00:31:29.270
without the error checking that didn't fit
on the screen a couple of nights. So
00:31:29.270 --> 00:31:38.140
everything is here and basically I'm just
manipulating the for loop so that on the
00:31:38.140 --> 00:31:46.350
GPU it only iterates one iteration per
index and on CPU it iterates all of the
00:31:46.350 --> 00:31:50.890
indices because CPU is single threaded
and a GPU is many, many
00:31:50.890 --> 00:31:56.810
multithreaded. Of course there's a little
bit of magic hidden in the device function
00:31:56.810 --> 00:32:00.880
because how do I know where I'm running?
And if you're curious how to do that and
00:32:00.880 --> 00:32:06.780
then we can talk afterwards. But
otherwise, it's a very simple,
00:32:06.780 --> 00:32:11.440
straightforward transformation. It's
written in Julia. It's a Julia function.
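To give a flavor of what such a transformation can look like, here is a deliberately stripped-down toy version; this is my own sketch, not the real GPUifyLoops implementation, and device_index is a fake stand-in for the real GPU intrinsics:

    abstract type Device end
    struct CPU <: Device end
    struct GPU <: Device end

    # Stand-in for the "where am I running / which index am I" magic.
    device_index(::GPU) = 1   # a real backend would use threadIdx()/blockIdx()

    macro loop(device, forloop)
        forloop.head === :for || error("@loop expects a for loop")
        iterspec, body = forloop.args            # `i = 1:N` and the loop body
        ivar, rng = iterspec.args
        quote
            if $(esc(device)) isa GPU
                # GPU flavor: exactly one iteration, picked by the thread index.
                let $(esc(ivar)) = ($(esc(rng)))[device_index($(esc(device)))]
                    $(esc(body))
                end
            else
                # CPU flavor: iterate the whole range.
                for $(esc(ivar)) in $(esc(rng))
                    $(esc(body))
                end
            end
        end
    end

    function twice!(dev, A, B)
        @loop dev for i in 1:length(A)
            A[i] = 2 * B[i]
        end
    end

    twice!(CPU(), zeros(4), ones(4))   # runs all four iterations on the "CPU"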
00:32:11.440 --> 00:32:17.600
And. Yeah. So you don't need to understand
the code here. I just want to show how quick it
00:32:17.600 --> 00:32:24.760
can be to write something like this. If
you know anything about GPU Programming at
00:32:24.760 --> 00:32:28.620
all, there should be a little voice in the
head, of the back of your head is like,
00:32:28.620 --> 00:32:34.780
wait a second. How can you run a dynamic
programming language on a GPU? That shouldn't be
00:32:34.780 --> 00:32:43.100
possible. Well, Julia can run on the GPU
because it has a lot of metaprogramming
00:32:43.100 --> 00:32:47.770
facilities and support for staged
programming. So I can generate code based
00:32:47.770 --> 00:32:51.679
on a specific call signature. It has
introspection, reflection mechanisms that
00:32:51.679 --> 00:32:56.760
allow me to do some interesting stuff in
the background. It is built upon LLVM,
00:32:56.760 --> 00:33:02.470
which is a common compiler infrastructure.
And so I can actually write a staged
00:33:02.470 --> 00:33:08.440
function that will generate LLVM-specific
code for my one function, and
00:33:08.440 --> 00:33:16.820
do that during compile time. And it is a
dynamic language that tries really hard to
00:33:16.820 --> 00:33:20.480
avoid runtime uncertainties. And this is
one of the challenges if you're getting
00:33:20.480 --> 00:33:26.900
into Julia is to understand that when
you're writing code that has a lot of
00:33:26.900 --> 00:33:32.230
runtime uncertainties, you get relatively
slow performance, about as fast as Python.
00:33:32.230 --> 00:33:36.780
But if you work with the compiler and you
avoid runtime uncertainties, you can get
00:33:36.780 --> 00:33:40.490
very fast code and you can run your code
on the GPU. That's basically the
00:33:40.490 --> 00:33:45.340
litmus test: if you can run your code on
the GPU, you did your job well. And it
00:33:45.340 --> 00:33:50.750
provides tools to understand the behavior
of your code, so warning about runtime
00:33:50.750 --> 00:33:55.130
uncertainty, it does that. And I don't have
the time to go too deep into the answers.
00:33:55.130 --> 00:33:59.240
There is actually a paper about this. It
has a type system that allows you to do
00:33:59.240 --> 00:34:03.260
some sophisticated reasoning, type
inference, to figure out what your code is
00:34:03.260 --> 00:34:07.660
doing. Multiple dispatch is actually
helping us quite a lot in making it easier
00:34:07.660 --> 00:34:11.409
to devirtualize code. There is a lot of
specialization and just-in-time
00:34:11.409 --> 00:34:18.250
compilation. And so just looking a little
bit closer at some of these topics, if you
00:34:18.250 --> 00:34:25.060
want to look at the entire pipeline, the
flow from when you start with your
00:34:25.060 --> 00:34:30.130
function call to what happens through
the Julia compiler. You have tools to
00:34:30.130 --> 00:34:33.830
introspect and all of these on the right
hand side here and then you have tools to
00:34:33.830 --> 00:34:43.300
interact on the left hand side. You can
inject code back into the compiler. The
00:34:43.300 --> 00:34:48.490
other thing is Julia has dynamic
semantics. So, for example, you can
00:34:48.490 --> 00:34:54.200
at runtime redefine your function and
re-call the new function. And it uses
00:34:54.200 --> 00:35:00.930
multiple dispatch. So if you look at the
absolute value call here, which of the 13
00:35:00.930 --> 00:35:06.280
possible methods will it call? In C++ or
in other programming languages this is called
00:35:06.280 --> 00:35:10.930
a virtual function call. So in Julia, is
everything a virtual function call? No.
00:35:10.930 --> 00:35:18.250
This is one of the important points:
when we call a function, let's say
00:35:18.250 --> 00:35:24.230
sin of x, we look at the type of the
input arguments and then we first of all
00:35:24.230 --> 00:35:33.590
look at which function is applicable to
our input argument. So in this case, it
00:35:33.590 --> 00:35:41.920
would be the real down here because float
64 is a subtype of real. So we choose the
00:35:41.920 --> 00:35:50.740
right method using dispatch and then we
specialize that method for the signature.
00:35:50.740 --> 00:35:54.320
So the rule to remember in multiple dispatch
is: we call the most specific
00:35:54.320 --> 00:35:58.670
method, whatever specific might mean. So
if you have this small example, where we
00:35:58.670 --> 00:36:06.220
have a function F, which has three
different methods and we have an integer
00:36:06.220 --> 00:36:10.960
argument that can be matched on X, or on Y,
and then we have a floating point argument
00:36:10.960 --> 00:36:16.690
on Y and we call this with a "1,Hello".
Well, we will select the method that is
00:36:16.690 --> 00:36:24.420
most specific for this argument, which
would be method number 1 here. On the other
00:36:24.420 --> 00:36:28.820
hand, if we have a Float64 in
the second position, then we will call the
00:36:28.820 --> 00:36:34.730
second method. Now what happens if I pass
in an integer in the first position and a
00:36:34.730 --> 00:36:38.859
floating point in the second position?
Well, you would get a run time error
00:36:38.859 --> 00:36:44.180
because we can't make this decision about
which is the most specific method. That's just something to keep in mind.
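A minimal sketch of those dispatch rules, as my own example in the spirit of the slide rather than its exact code:

    f(x::Int, y)     = 1   # method 1: integer in the first position
    f(x, y::Float64) = 2   # method 2: floating point in the second position
    f(x, y)          = 3   # method 3: the generic fallback

    f(1, "Hello")    # -> 1, the Int method is the most specific match
    f("Hi", 2.0)     # -> 2, the Float64 method is the most specific match
    f("Hi", "Ho")    # -> 3, only the fallback applies
    f(1, 2.0)        # MethodError: methods 1 and 2 are equally specific -> ambiguous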
00:36:44.180 --> 00:36:49.170
Method specialization works really similarly: when
00:36:49.170 --> 00:36:55.350
you call a method for the first time. This
method sin right now has no
00:36:55.350 --> 00:37:01.380
specializations. And then, look, I
call it once and Julia will insert a
00:37:01.380 --> 00:37:06.710
specialization just for Float64. It
could also have been a Float32; now there's a
00:37:06.710 --> 00:37:11.430
Float64 specialization for this method. So
Julia specializes and compiles methods on
00:37:11.430 --> 00:37:15.050
concrete call signatures instead of
keeping everything dynamic or everything
00:37:15.050 --> 00:37:23.210
ambiguous. You can introspect this process
and there are several macros, like
00:37:23.210 --> 00:37:30.510
@code_lowered or @code_typed, that will help you
understand that process. I think I don't
00:37:30.510 --> 00:37:35.160
have enough time to go into detail here,
but just as a note, if you have a look at
00:37:35.160 --> 00:37:40.600
this, the percent four (%4) means it's an
assignment, so you can reference it later:
00:37:40.600 --> 00:37:48.660
in line 5, we will iterate on the %4
value. And then we can look at the type
00:37:48.660 --> 00:37:52.730
information that Julia infers out of that
call. We're calling the function mandel
00:37:52.730 --> 00:37:59.490
with a UInt32 and you can see how that
information propagates through the
00:37:59.490 --> 00:38:05.109
function itself. And then we do aggressive
00:38:05.109 --> 00:38:09.750
inlining and optimizations and
devirtualization. And so in the end, we
00:38:09.750 --> 00:38:15.960
don't have calls anymore. We only have the
intrinsics that Julia provides on which
00:38:15.960 --> 00:38:23.870
programs are actually implemented. So this
is an unsigned less-than integer function.
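You can drive that introspection yourself from the REPL; here is a small sketch with a stand-in function (the mandel example is on the slide, this g is just an illustration):

    g(n) = n < UInt32(10) ? n + one(n) : zero(n)

    @code_lowered g(UInt32(3))   # lowered SSA form, with %-numbered assignments
    @code_typed   g(UInt32(3))   # the same, but with inferred types attached
    @code_llvm    g(UInt32(3))   # after inlining/optimization, down to LLVM IR
    @code_native  g(UInt32(3))   # and finally the machine code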
00:38:23.870 --> 00:38:27.830
So we are using type inference as an
optimization to find static or near static
00:38:27.830 --> 00:38:32.260
subprograms. It allows us to do
aggressive devirtualization, inlining and
00:38:32.260 --> 00:38:36.630
constant propagation. But it raises
problems of cache invalidation. So in
00:38:36.630 --> 00:38:42.250
bygone days, this used to be the case: I
could define
00:38:42.250 --> 00:38:48.140
a new function
f after calling g once, and I would get the
00:38:48.140 --> 00:38:53.250
old result back. That's bad. That's
counter-intuitive. That's not dynamic. So
00:38:53.250 --> 00:38:59.560
in Julia 1.0 and I think 0.5 and 0.6
already, we fixed that: we invalidate
00:38:59.560 --> 00:39:06.400
the functions that have dependencies on
the function we just changed.
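A small sketch of that behavior in today's Julia (a plain REPL example of my own):

    g(x) = f(x) + 1      # g depends on f
    f(x) = 2x
    g(1)                 # -> 3, compiles a specialization of g that uses this f
    f(x) = 10x           # redefining f invalidates the cached code for g
    g(1)                 # -> 11, g is recompiled against the new f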
00:39:06.400 --> 00:39:09.710
But that can increase the latency of your program: if you change
a lot of the functions and you re-call them,
00:39:09.710 --> 00:39:20.781
well, we need to do a lot of work every
time. We do constant propagation. So in this
00:39:20.781 --> 00:39:27.420
very simple example, we try to
exploit as much
00:39:27.420 --> 00:39:32.210
information as possible. And so if you
have a function f and you call
00:39:32.210 --> 00:39:36.360
the function sin with a constant value,
we actually end up just returning you the
00:39:36.360 --> 00:39:40.520
constant, avoiding the calculation of the
sine entirely. And that can be very
00:39:40.520 --> 00:39:49.930
important during hot calls in a loop.
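As a small sketch of that (whether the fold actually happens depends on the Julia version and on the heuristics mentioned next):

    h() = sin(0.5)     # the argument is a compile-time constant
    @code_typed h()    # ideally the body is just `return 0.479425...`:
                       # the sine was computed once, at compile time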
This can sometimes go wrong, or rather, Julia
00:39:49.930 --> 00:39:53.930
has heuristics in order to decide
whether or not these optimizations are
00:39:53.930 --> 00:40:00.680
valuable. And so when you introspect your
code, you might see results that are
00:40:00.680 --> 00:40:05.670
not quite what you want. So
we don't know what the return value here
00:40:05.670 --> 00:40:10.150
is. It's just a tuple. We know it's a
tuple, nothing else. The heuristic said: do not
specialize. But the nice thing about Julia
specialize. But the nice thing about Julia
and where we get performance, is that we
can actually force specialization, and
can actually do for specialisation and
hopefully at some point we make the
00:40:17.970 --> 00:40:26.050
compiler smart enough that these edge
cases disappear. So I can use some tricks
to force specialization to happen, and
and foresee specialization to happen and
then I can actually infer the precise
00:40:33.280 --> 00:40:40.270
return type of my function. Another thing
to know when you're coming for more
00:40:40.270 --> 00:40:45.050
traditional object oriented programing
language is that types are not extensible,
00:40:45.050 --> 00:40:50.190
extendable. So you can't inherit from
something like Int64. You can only subtype
00:40:50.190 --> 00:40:55.710
abstract types. We do that because
otherwise we couldn't do a lot of
00:40:55.710 --> 00:41:01.110
optimizations. When we look at
programs, we can never assume that you
00:41:01.110 --> 00:41:05.310
won't add code. It's a dynamic programming
language; at any time in the runtime of your
00:41:05.310 --> 00:41:11.760
program you can add code. And so we don't
have closed-world semantics, which doesn't
allow us to say: hey, by the way,
doesn't allow us to say, hey, by the way,
we know all possible subtypes here; you
might add a new type later on. By
might add a new type. Later on by
saying concrete types are not extendable,
we get a lot of the performance back. So
We get a lot of the performance back. So
personally, for me, why do I like Julia?
00:41:27.740 --> 00:41:32.510
Or why do I work on Julia? It walks like
Python, it talks like Lisp and runs like
00:41:32.510 --> 00:41:38.120
Fortran. That's my five-second sales pitch.
It's very hackable and extendable.
00:41:38.120 --> 00:41:46.500
I can poke at the internals
and I can bend them if I need to. It's
00:41:46.500 --> 00:41:52.330
built upon LLVM. So in reality, for me as
a compiler writer, it's my favorite LLVM
00:41:52.330 --> 00:41:59.190
front end. I can get the LLVM code
that I need to actually run. But for users,
00:41:59.190 --> 00:42:04.870
that's hopefully not a concern, if we do
our job right and it has users in
00:42:04.870 --> 00:42:09.430
scientific computing. And in a prior
life whilst doing a lot of scientific
00:42:09.430 --> 00:42:15.040
computing in cognitive science wanting
models. And I care about these users
00:42:15.040 --> 00:42:21.190
because I've seen how hard it can be to
actually make progress when the tools you
00:42:21.190 --> 00:42:28.780
have are bad. And my personal goal is to
enable scientists and engineers to
00:42:28.780 --> 00:42:37.500
collaborate efficiently and actually make
change. Julia is a big project and Climate
00:42:37.500 --> 00:42:44.450
is a big project, and there are many people to thank.
And with that, I would like to extend you
00:42:44.450 --> 00:42:50.180
an invitation if you're interested. There
is JuliaCon every year, where we have a
00:42:50.180 --> 00:42:57.830
developer meetup. Last year we were about
60 people, much smaller than CCC. But
00:42:57.830 --> 00:43:02.240
next year it will be in Lisbon. So come
join us if you're interested and if you
00:43:02.240 --> 00:43:05.970
want to meet scientists who have
interesting problems and are looking for
00:43:05.970 --> 00:43:08.570
solutions. Thank you.
00:43:08.570 --> 00:43:17.120
applause
00:43:17.120 --> 00:43:23.860
Herald A: Time for questions and answers,
are there any questions?
00:43:23.860 --> 00:43:29.050 line:1
Herald H: Yeah, we've got microphones over
there. So just jump to the microphone and
00:43:29.050 --> 00:43:33.010
ask your questions so that
everybody can hear.
00:43:33.010 --> 00:43:38.510
Question: What do you mean when you say
that Julia talks like Lisp, and how is
00:43:38.510 --> 00:43:43.280
that a good thing? laughter
Churavy: Well, it talks like Lisp, but it
00:43:43.280 --> 00:43:48.490
doesn't look like Lisp. I assume that's
what you mean. It doesn't have that many
00:43:48.490 --> 00:43:53.960
braces. But no, Lisp has very powerful meta-
programming capabilities and macros. And
00:43:48.490 --> 00:43:53.960
so we have a lot of that. If you read a
little bit about the history of Lisp, the
00:43:53.960 --> 00:43:58.320
original intention was to write M-expressions,
which would be Lisp with a nicer syntax. And
00:43:58.320 --> 00:44:03.650
in my personal opinion, Julia is that Lisp.
It has all these nice features, but it
00:44:03.650 --> 00:44:07.320
doesn't have the bracket syntax.
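A small taste of that Lisp-style metaprogramming in Julia, as a generic sketch (not code from the talk):

```julia
# Julia code is data: expressions are trees you can inspect and transform.
ex = :(a + b * 2)        # quote an expression without evaluating it
dump(ex)                 # shows the nested Expr structure, much like Lisp

# Macros take expressions and return new expressions at compile time.
macro twice(e)
    quote
        $(esc(e))
        $(esc(e))
    end
end

@twice println("hello")  # expands into two println calls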
Herald A: OK. Thank you.
00:44:13.390 --> 00:44:18.580
Question: Thanks for the talk. My question
is regarding the first part of the talk.
00:44:18.580 --> 00:44:22.920
If I understand correctly, you are
simulating a deterministic system there.
00:44:22.920 --> 00:44:26.180
So there's no additional noise
term or anything, right?
00:44:26.180 --> 00:44:29.990
Ramadhan: Well, if you had infinite
precision, I think it would be
00:44:29.990 --> 00:44:34.850
deterministic. But I think, kind of by its nature,
turbulence itself is not deterministic.
00:44:34.850 --> 00:44:38.400
Well, it's a chaotic system.
Question: But the discretized version
00:44:38.400 --> 00:44:41.400
itself is deterministic. You don't have
the Monte Carlo part where you have
00:44:41.400 --> 00:44:44.470
some noise that you would add to it,
which might actually be justified
00:44:44.470 --> 00:44:49.800
from the physics side. Right?
Ramadhan: Well, I mean, if you think, if
00:44:49.800 --> 00:44:53.650
you ran the same simulation again, you
would not get that. Well, I think if you
00:44:53.650 --> 00:44:55.940
ran on the exact same machine,
you would get the
00:44:55.940 --> 00:44:58.240
same answer. So in that
sense, it is deterministic.
00:44:58.240 --> 00:45:00.910
But if you ran on a slightly
different machine, a truncation
00:45:00.910 --> 00:45:04.010
error in, like, the 16th decimal place
could give you a completely different
00:45:04.010 --> 00:45:08.180
answer. Question: Sure. So the point I'm
trying to make… Am I allowed to continue?
00:45:08.180 --> 00:45:12.270
Herald H: Yes, of course. There's no one
else. Well, there is one more person. So you
00:45:12.270 --> 00:45:17.250
can continue for a few minutes if you want to.
Thanks. laughter
00:45:17.250 --> 00:45:20.030
Question: So the point I was
trying to make is,
00:45:20.030 --> 00:45:22.420
if you add noise in the
sense that it's a physical
00:45:22.420 --> 00:45:24.790
system, you have noise in
there, it might actually allow you to
00:45:24.790 --> 00:45:28.890
solve a PDE or discretize a PDE, but get a
stochastic simulation itself, which might
00:45:28.890 --> 00:45:34.480
be interesting because it often can make
things easier. And also, you mentioned
00:45:34.480 --> 00:45:39.010
neural differential equations, right? And
in particular, with physical systems, if
00:45:39.010 --> 00:45:43.231
you have discontinuities, for example,
the time integration can actually be quite a
00:45:43.231 --> 00:45:47.890
problem. And there is work on, to just
plug my colleague's work, controlled neural
00:45:47.890 --> 00:45:50.860
differential equations, where you can
actually also build in these
00:45:50.860 --> 00:45:53.580
discontinuities, which might also be
interesting for you guys.
00:45:53.580 --> 00:45:56.470
Ramadhan: That's why maybe we should talk,
because I don't know much about that stuff.
00:45:56.470 --> 00:45:59.690
We're kind of just starting out. I
think so far we've been doing this maybe
00:45:59.690 --> 00:46:03.478
hopefully continuous, but maybe we'll hit
discontinuities. I don't know. We should
00:46:03.478 --> 00:46:07.359
talk, though. Question: And also the math is
beautiful and has no diseases. It's the
00:46:07.359 --> 00:46:10.359
physics that might need to change. I'm a
mathematician. I have to say that. Ramadhan: I know
00:46:10.359 --> 00:46:15.480
that the physics is ugly, trust me.
Churavy: Just quickly, we do have
00:46:15.480 --> 00:46:24.570
stickers and cookies, too. They are
in the cookie box. And I think
00:46:24.570 --> 00:46:29.530
somebody from our community is giving a
Julia workshop, and we're trying to
00:46:29.530 --> 00:46:34.290
set up an assembly space, and hopefully
that works out as well.
00:46:34.290 --> 00:46:38.490
Herald H: Go on please.
Question: Also, one question for the first
00:46:38.490 --> 00:46:44.840
part of the talk. I wanted to ask
if it's possible or if you are using
00:46:44.840 --> 00:46:49.190
dynamic resolution in your climate models.
Where you would maybe have a smaller grid
00:46:49.190 --> 00:46:54.900
size near the (???) and larger
in the areas that are not that
00:46:54.900 --> 00:46:58.140
interesting.
Ramadhan: Like adaptive grids? So I
00:46:58.140 --> 00:47:02.520
think we mostly do that in the vertical.
So usually in the ocean, the
00:47:02.520 --> 00:47:04.260
interesting things are,
you know,
00:47:04.260 --> 00:47:06.070
close to the surface, so
we have more resolution
00:47:06.070 --> 00:47:08.780
there. But as you go deeper,
things get less interesting. So you put
00:47:08.780 --> 00:47:14.380
less resolution there. I think
in general, people have asked
00:47:14.380 --> 00:47:17.760
that before, you know, why do you always
use constant grids? Why don't you use
00:47:17.760 --> 00:47:21.700
these adaptive grids on your global, you
know, models? And the answer I've
00:47:21.700 --> 00:47:24.670
heard, I don't know if it's very
convincing. I think generally there hasn't
00:47:24.670 --> 00:47:28.950
been that much research, or the people who do
research into adaptive grids for these kinds of
00:47:28.950 --> 00:47:34.960
models, their funding gets cut. But
the answer I've heard is that a lot of the
00:47:34.960 --> 00:47:38.070
time, a lot of the atmosphere and ocean is
turbulent. So especially if you do
00:47:38.070 --> 00:47:42.750
kind of adaptive refinement, then you just
kind of adapt everywhere because there's
00:47:42.750 --> 00:47:48.400
kind of turbulence everywhere. But yeah, I
don't… I'm not sure. I guess for our
00:47:48.400 --> 00:47:53.320
simulations, some of
the numerical methods are only fast if you
00:47:53.320 --> 00:47:57.070
run them on a regular grid. So
that's the reason we don't use adaptive
00:47:57.070 --> 00:48:01.310
grids for our simulations. But in general,
adaptive grids for climate models are
00:48:01.310 --> 00:48:05.010
interesting. But yeah, it seems like
there needs to be more research in that
00:48:05.010 --> 00:48:07.990
area. So I don't know if I answered your
question, but I kind of just ranted a bit.
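To make the vertical stretching concrete, a hedged sketch in plain Julia (the numbers and stretching law are illustrative, not the model's actual grid API): cells are thin near the ocean surface and thicken with depth.

```julia
# Hypothetical stretched vertical grid: fine near the surface (z = 0),
# coarse towards the bottom (z = -Lz).
Nz, Lz = 32, 1000.0                      # number of cells, depth in metres
k = 0:Nz
z_faces = @. -Lz * (exp(3 * (1 - k / Nz)) - 1) / (exp(3) - 1)
Δz = diff(z_faces)    # cell thicknesses: smallest right below the surface
```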
00:48:07.990 --> 00:48:10.890
Question: You did, thanks.
Herald H: Go ahead, please.
00:48:10.890 --> 00:48:16.160
Question: Yeah, it's just a guess,
I think. I have wrapped quite a
00:48:16.160 --> 00:48:22.170
bit of legacy Fortran code in Python. And
my question is, would there be a simple
00:48:22.170 --> 00:48:28.740
path for converting Fortran code to Julia,
preferably automatically. Do you have any
00:48:28.740 --> 00:48:33.060
ideas about this one?
Churavy: You can do it. Your Julia code
00:48:33.060 --> 00:48:39.100
will look like Fortran code. So you
haven't won anything. But yes, as a
00:48:39.100 --> 00:48:42.170
starting point, you can do that.
Absolutely. But you can also just call
00:48:42.170 --> 00:48:46.080
Fortran from Julia and then totally move
over. I generally don't want people to
00:48:46.080 --> 00:48:50.060
rework their code, except if there's a
good reason. Like starting from scratch
00:48:50.060 --> 00:48:55.470
sometimes helps. It can be a good reason.
Or if you say, well, we don't have
00:48:55.470 --> 00:49:00.880
the necessary experts to work with the
old solution anymore. But generally, if
00:49:00.880 --> 00:49:04.700
you have Fortran code, I would just say,
well, call Fortran from Julia or Julia from
00:49:04.700 --> 00:49:11.040
Fortran, get it up to speed and then start
transitioning, piece by piece. Does that make
00:49:11.040 --> 00:49:16.770
sense?
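For the "call Fortran from Julia" route, a hedged sketch (the shared library and subroutine names are hypothetical; the @ccall mechanism itself is standard Julia):

```julia
# Calling an existing Fortran subroutine without rewriting it.
# Assume libmymodel.so exports:  subroutine step(n, x)
# (gfortran typically appends an underscore to the symbol name).
x = zeros(Float64, 10)
n = Ref{Int32}(length(x))       # Fortran passes scalars by reference
@ccall "libmymodel.so".step_(n::Ref{Int32}, x::Ptr{Float64})::Cvoid
```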
Herald H: So any more questions? No more
00:49:16.770 --> 00:49:23.530
questions. Then we're done a bit early. Ali Ramadhan
and Valentin Churavy, thank you very much.
00:49:23.530 --> 00:49:25.980
applause
00:49:25.980 --> 00:49:29.370
36C3 postroll music
00:49:29.370 --> 00:49:51.000
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!