36C3 preroll music
Herald-Angel: This Talk will be about… I
have to read this. Mathematical diseases
in climate models and how to cure them.
And I don't have the slightest idea what
these two guys are talking about now. And
when I asked them, they said, just tell
the people it's about next generation
climate models and how to build them.
Which is cool. Throw that on Twitter.
Please welcome Ali Ramadhan and Valentin
Churavy.
applause
Ali Ramadhan: Can you guys hear us? Is
this OK?
Valentin Churavy: I'll stand back...
Ramadhan: I'll stand back a little bit.
OK, cool. Thank you. So if you guys saw
the last talk by karlab... karlabyrinth or
something. So we're kind of expanding on
her talk a little bit. So she talked a lot
about kind of uncertainties…
audio feedback from microphone
uncertainties in climate models. And one
point that she did make was that most of
the uncertainty actually comes from
humans. But there's a really huge
uncertainty that also comes from… comes
from the models. So we're talking more
about the model uncertainties, which is
kind of uncertainties because of unknown
or missing physics, kind of how to cure
them. So it'll be kind of a weird talk. So
I'll talk a little bit more about the
climate modeling part and then kind of how
to cure them involves using new programming
languages. And that's where Valentin will
talk about Julia. So we'll kind of just
start with maybe just giving kind of an
idea of why it's so hard to model the
climate. So if you've… maybe you've seen
images like this a lot where it's like a…
it's a satellite image basically of
clouds. It's used for like weather
forecasting. But you can immediately see
there's lots of, you know, lots of really
small clouds. So basically, if you want to
build a climate model, you've got to be
able to resolve all the physics in these
clouds. So you can actually zoom in a lot.
And see the clouds look pretty big over
here. But if you zoom in on kind of
Central America, then you see even smaller
clouds. And if you zoom in even more so,
so you zoom in on the Yucatan Peninsula,
then you can see the clouds are really,
really small. There are maybe clouds five kilometers across or smaller; some of the clouds are, you know, a hundred meters or
something. And as the last talk kind of
suggested, most climate models resolve things down to, you know, 50 kilometers. So anything smaller than 50 kilometers, the climate model can't really see. So it kind of has to account for that, because
clouds are important, and if you have more
clouds, then that reflects some of the
heat out, so maybe you cool. But clouds also trap heat in, so maybe you warm. More clouds could mean more warming, or it could mean more cooling; it's kind of uncertain. We actually don't know if
clouds will make the climate warmer or if
they'll make the climate cooler. So it's
important for your climate models to kind
of resolve or see these little clouds. So
kind of where the mathematical disease comes in: we don't know what equation to solve, what physics to solve, to kind of resolve the effect of these little clouds. That's the mathematical disease. We don't know how to do it, so instead you use what's called a parametrization, and the parametrization is the mathematical disease. So in the
atmosphere, the big mathematical disease
is clouds. But if you look at the ocean,
you kind of get a similar… You have also
similar mathematical diseases. This, for example, is model output; we don't have good satellite imagery of the oceans. So if you look at output from a high resolution ocean model, here kind of centered on the Pacific, you can see Japan and China, and the white lines are streamlines: the lines tell you where the water is going. You can see a lot of straight lines, and this curvy current off of Japan, but you also see lots of
circles. So the circles are these eddies
and they're kind of the turbulence of the
ocean. They move, they kind of stir and
mix and transport a lot of salt or heat or
carbon or nutrients or… you know, marine
life or anything. It's the main way the
ocean kind of moves heat from the equator
to the pole. It kind of stirs things
around. So they're really important for
kind of how carbon moves in the ocean, for
how the ocean heats up. And here they look
pretty big. But again, you can zoom in and
you'll see lots of small scale structures.
So we're going to switch to a different
model output and different colors. So here's kind of the same area; you see Japan in the top left. But what's being plotted is vorticity. You don't have to know what that is; it's kind of a measure of how much the fluid, the water, is spinning. But the point is that you have
lots of structure. So there's lots of, you
know, big circles, but there are also lots of
really little circles. And again, your
climate model can only see something like
50 kilometers or 100 kilometers. But as
you can see here, there's lots of stuff
that's much smaller than a hundred
kilometers. So if you superimpose kind of
this grid, maybe that's your climate
model grid. And, you know, basically for
the climate model, every one of these
boxes is like one number. So you can't
really see anything smaller than that. But
there's important dynamics and physics
that happens in like 10 kilometers, which
is a lot smaller than what the climate
model can see. And there's even important
physics that happens at like 100 meters or 200 meters. So if you want to know what the climate will look like, we need to know about the physics that happens at 200 meters. So
to give an example of some of the physics
that happens at 10 kilometers, here's kind
of a little animation where this kind of
explains why you get all these eddies or
all the circles in the ocean. So a lot of
times you have, say, hot water, say, in
the north. So the hot water here is all in
orange or yellow and you have a lot of
cold water. So the cold water is in the
south and it's purple. Once you add rotation, you end up with these eddies, because the hot water is lighter, it's less dense, so it wants to go on top of the cold water. Usually you have cold at the bottom, hot at the top; heavy at the bottom and light at the top. Without rotation, the hot water will just go on top of the cold water. But with rotation, it kind of wants to tip over while it's also rotating, so you get these beautiful swirling patterns, and these are kind of the same
circular eddies that you see in the real
ocean. But this model here is like two
hundred and fifty kilometers by five
hundred kilometers and it's like one
kilometer deep. So you need a lot of
resolution to be able to resolve this
stuff, but your climate model doesn't
have that much resolution. So some of the
features here, like the sharp fronts between the cold and the hot water, your climate model might not see. So maybe if you don't resolve this properly, you get the mixing rate wrong, or the ocean ends up the wrong temperature or something. So it's kind of important to
resolve this stuff. Another one, the color
scheme here is really bad. laughs I'm
sorry, but another example is here. Everything here is under 100 meters: it's a cube of 100 meters on each side, and you start with 20 degrees Celsius water at the top and 19 degrees Celsius water at the bottom. So
as you go deeper in the ocean, the water gets colder. And then imagine the ocean at night: it's kind of cold, so the top is being cooled and you end up with cold water on top. The cold water wants to be at the bottom, so it ends up sinking and you get all this convection. This is happening in a lot of places in the ocean. You get a lot of mixing at the top, and you get this layer at the top of the ocean that's kind of constant color, constant temperature. This mixed layer is important for the ocean. Knowing how deep that mixed layer is, and how much of the water is being mixed, is also important for climate. But as you can imagine, this happens on very small scales, so your climate model has to know something about what's happening at this scale. So this, I guess, is the mathematical disease in the ocean: the climate model cannot see this, so it has to do something else, maybe something unphysical, to represent this stuff. And that's a mathematical disease, I guess.
Aside from the ocean and the atmosphere, you also have the same problem with sea
ice. So this is kind of just a satellite
picture of where sea ice is forming off
the coast of Antarctica. So you get winds
that kind of come off the
continent and they're
kind of blowing all the
ice that's being formed
away. So you get all these
little lines and streaks and they kind of
merge into sea ice. But this whole picture is only like 20 kilometers across. So the
climate model doesn't see this, but
somehow it has to represent all the
physics. And you have kind of similar
things happening with soil moisture, land
and dynamic vegetation, aerosols. So, you
know, these are kind of three places with
pretty pictures. But if you look at
the atmosphere, so it's not just clouds.
You also have aerosols, which are like
little particles, or sulfates that are
important for kind of cloud formation and
maybe atmospheric chemistry. But again, we
don't fully understand the physics of
these aerosols. So again, you have to kind
of parametrize them. Same thing with convection: maybe your climate model doesn't resolve all the very deep convection in the atmosphere, so it also has to parametrize it. So I guess
you have many kind of mathematical
diseases in the atmosphere. So I'm not
expecting you to understand everything in
this picture. But the idea is: the
atmosphere is complicated. There's no way
a climate model is going to kind of, you
know, figure all this out by itself. And
again, you could do something similar for the ocean; we can only show an image for like two little parts of this. But the point is, you know, the
ocean is not kind of just a bucket of
water standing there. So there's lots of
stuff happening deep inside the ocean. And
some of it, we think is important for
climate. Some of it we don't know. Some
might not be important. But again, a lot
of this happens on very small spatial
scales. So we don't know or the climate
model can't always resolve all this stuff.
And again, same thing with kind of sea
ice. Lots of small scale stuff is
important for sea ice. And I think one
person asked about kind of tipping points
and there are kind of two with sea ice that are pretty important. One of them is this sea ice albedo feedback. If you have sea ice that melts, now you have more open ocean, and the ocean can absorb more heat. But
now the earth is warmer, so it melts more
sea ice. So as soon as you kind of start
melting sea ice, maybe you melt even more
sea ice and eventually you reach an earth
with no sea ice. So there's kind of
research into that stuff going on, but
it's a possible tipping point. Another one
is this kind of marine ice sheet instability at the bottom of the ice shelf. If you start melting ice from the bottom of the ice shelf, then you create a larger area for more ice to melt. So maybe once you start melting and
increasing sea level, you just keep
melting more and more and increasing sea
level even more. But again, it's kind of
hard to quantify these things on like 50
or 100 year timescales because it all
happens on very small scales. So yeah, the
point is there's lots of these kind of
parametrizations or mathematical diseases.
And once you start adding them all up, you
end up with lots and lots of kind of
parameters. So this is a really boring
table. But the point is, this is like one parametrization for vertical mixing in the ocean; it's basically the process that I showed the rainbow-color movie of. A climate model trying to parametrize that physics might have like 20 parameters. And, you know, some of them are crazy, like a surface layer fraction of zero point one or something. And
usually they keep the same constants for
all these values. Usually it's like
someone in like 1994 came up with these 20
numbers and now we all use the same 20
numbers. But, you know, maybe they're different in the Pacific and the Atlantic, or maybe they're different in summer and winter. And the problem is, there are many of these parametrizations. So here's like
20 parameters, but then you have a lot
more for clouds, a lot more for sea ice. Add them all up, and suddenly you have like 100, maybe up to a thousand tunable parameters. Going back to
this plot that was shown at the last talk.
You can see all the models agree really well from like 1850 to 2000, because they all have different parameters, but they all get tuned or optimized so that they get the 20th century correct, so they get the black line correct. But then when you run them forward, to like 2300, they're all slightly different; they all start producing different physics and suddenly you get this huge red band. So
that's saying you have lots of model
uncertainty. Some people might say, oh, this tuning process is just optimization, it's not very scientific. In the past it's kind of been the best we've had. But I think, you know, we should be able to do a little bit better than that. So
just to give you the idea: some people would say, why don't you just resolve all the physics? But if you want to do a direct numerical simulation, which basically means you resolve all the motions in the ocean and the atmosphere, you need to resolve things down to like one millimeter. So if
you have like a grid spacing of one
millimeter and you consider the volume of
the ocean and the atmosphere, you
basically say you need like 10 to the 28
grid points. That's like imagining putting cubes of one millimeter everywhere in the ocean and atmosphere; that's how many grid points you would need.
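That back-of-the-envelope number can be checked in a few lines of Julia. This is my own illustration, and the volumes are rough textbook figures I'm assuming, not numbers from the talk:

```julia
# Order-of-magnitude check: how many 1 mm grid cells would a direct
# numerical simulation of the ocean plus atmosphere need?
# Rough assumed figures: ocean volume ~1.3e18 m^3, Earth surface
# ~5.1e14 m^2, atmosphere taken as the lowest ~10 km.
ocean_volume = 1.3e18              # m^3
atmos_volume = 5.1e14 * 1.0e4      # m^3
dx = 1.0e-3                        # m, 1 mm grid spacing

grid_points = (ocean_volume + atmos_volume) / dx^3
println(round(log10(grid_points), digits=1))  # ≈ 27.8, i.e. ~10^28
```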
Unfortunately, you can't do that; there's not enough computing power or storage space in the world. So you're kind of stuck doing something a bit coarser. Most climate models use like 10 to the 8 grid points, so you're about 10 to the 20 grid points short. You don't
want to just run a big climate model once, either; you need to run them for very long times, usually a thousand or ten thousand years, and you want to run many of them because you want to collect statistics. So
generally you don't run at the highest
resolution possible. You run kind of at a
lower resolution so you can run many, many
models. So because you can only use so much resolution, it seems these parametrizations, these mathematical diseases, you have to live with them, you've got to use them. But at least one idea is: instead of using numbers that somebody came up with in 1994, you might as well try to figure out better numbers, or, if the numbers are different in different places, you should find that out. So one thing we are trying to do is get the parametrizations to agree with basic physics or with observations. So we have
lots of observations, and we can run high resolution simulations to resolve a lot of the physics, and then make sure that when you put the parametrization in the climate model, it actually gives you the right numbers according to basic physics or observations. But sometimes
that might mean, you know, different
numbers in the Atlantic, in the Pacific or
different numbers for the winter and the
summer. And you have to run many high
resolution simulations to get enough data
to do this. But these days, I think, we have enough computing power to do that, to do all these high resolution simulations. We
ended up building a new kind of ocean
model that we run on GPUs because these
are a lot faster for giving us these
results. Usually most climate modeling is done in Fortran; we decided to go with Julia for a number of reasons, which I'll talk about. The left figure is that mixed layer or boundary layer turbulence movie, but instead of the rainbow color map it's now using a more reasonable color map that looks like the ocean; the right is the old movie. So we're generating tons and tons of data using simulations like this, and hopefully we can get enough data to figure out a way to improve the parametrizations. But it's kind of a
work in progress. So a different idea, which might be more popular here, I don't know, is: instead of using the existing parametrizations, you could say, OK, well, now you have tons and tons of data.
Maybe you just throw in like a neural
network into the differential equations.
Basically, you put in the physics, you
know, and then the neural network is
responsible for the physics you don't
know. Most people here might not know differential equations, and I also don't want to talk about them because it would take a long time. So just imagine that the equation in the middle is kind of what a climate model needs to solve.
And the question marks are kind of
physics we don't know. So we don't know
what to put there, but maybe you could put in a neural network. So number one is a possible parametrization, a possible way you could try to parametrize the missing physics, where the neural network is responsible for everything. We find that doesn't work as well. So instead, maybe you tell it some of the physics, maybe tell it about Q, which is like the heating or cooling at the surface, and then it's responsible for resolving the other stuff.
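That second idea can be sketched in a few lines. This is entirely illustrative, not the project's code: the grid size, network shape, and weights are made up, and in practice the network would be trained against high-resolution simulation data rather than left random:

```julia
# Toy 1D column model: dT/dt = rhs(T), where the known physics (a surface
# cooling Q) is written down explicitly and a small neural network stands
# in for the unknown turbulent mixing. Untrained, purely for illustration.
using Random
Random.seed!(1)

nz = 16                                      # vertical grid points
T  = collect(range(19.0, 20.0, length=nz))   # °C, warm at the top
Q  = zeros(nz); Q[end] = -0.1                # cooling applied at the surface

# One-hidden-layer network mapping the whole profile to a mixing tendency.
W1, b1 = 0.1 .* randn(8, nz), zeros(8)
W2, b2 = 0.1 .* randn(nz, 8), zeros(nz)
nn(T) = W2 * tanh.(W1 * T .+ b1) .+ b2

rhs(T) = Q .+ nn(T)              # known physics + learned closure

dt = 0.01
for _ in 1:100                   # forward-Euler time stepping
    T .= T .+ dt .* rhs(T)
end
```

Training would then adjust `W1, b1, W2, b2` until the coarse model's profiles match the high-resolution data.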
But it's still a work in progress, because the blue is supposed to be your data and the orange is supposed to be the neural network, and they don't agree. So it's still a work in progress, but hopefully we'll be able to do better. So this is kind of stuff
that's like a week or two old. But to kind of reach a conclusion, at least for my half
of the talk. So the reason I personally
like Julia as a climate modeler is we were
able to kind of build an ocean model from
scratch basically in less than a year. And
one of the nice things
is that the user interface
or the scripting and the model
backend is all in one language,
whereas in the past
used to usually write the high
level and like Python and maybe the back
end is like Fortran or C. And we find, you
know, when we Julia, it's just as fast as
our legacy model, which was written in
Fortran. And one of the nicest things was
that basically able to write code once and
using there's a need of GPU compiler. So
basically you write your code one single
code base and you go to CPUs and GPUs.
So you'd want to write two different code
bases. And yeah, we find generally because
it's high level language, we're all kind
of more productive. We can give a more
powerful user API, and Julia has a nice multiple dispatch backend that we find makes it easy for users to extend the model or hack on the model. Some people would say the Julia community is pretty small, but we find there's a pretty big Julia community interested in scientific computing, so all the packages we need are pretty much available. So I will conclude my
half by saying: most of the uncertainty in climate modeling basically comes from humans, because we don't know what humans will do. But there's a huge model uncertainty, basically because of physics we don't understand or physics the model cannot see. You can't resolve every cloud and every wave in the ocean, so you've got to figure out a way to account for them. That's what parametrizations do.
And we're trying to kind of use a lot of
computing power to kind of make sure we
train or come up with good parametrizations instead of tuning the model at the end. And we're hoping this will lead to better climate predictions. Maybe it will, maybe it won't. But even if it doesn't, hopefully we can say we got rid of the model tuning problem, and we find that software development for climate modeling is easier than if we did it in Fortran. I will say, this is kind of an
advertisement, but I'm looking to bike around Germany for a week, and apparently you can't take a Nextbike out of Leipzig. So if
anyone is looking to sell their bicycle or
wants to make some cash, I'm looking to
rent a bicycle. So yeah, if you have one,
come talk to me, please. Thank you. Danke.
applause
Churavy: So one big question for me always is: how can we as technologists help? I think most of us in this room are fairly decent with computers; the internet is not necessarily Neuland for us. But how do we use that knowledge to actually effect real change? If you haven't read it, there's a fantastic article, worrydream.com/ClimateChange, which lists
all the possible or not all the possible
but a lot of good ideas to think about and
go like, okay, do my skills apply in that
area? Well, I'm a computer scientist. I do
programming language research. So how do my
skills really apply to climate change? How
can I help? One of the things that struck me in this article, and one of the realizations behind why I do my work, is that the tools that we have built for scientists and engineers are rather poor. Computer scientists like myself have focused a lot on making programming easier, more accessible, but we haven't necessarily kept the scientific community as a target audience. And then you get into this position where models are written in a language like Fortran 77; not that it isn't a nice language, but it's still not one that is easily picked up, or where you find enthusiasm in younger students for using it. So I work on Julia, and my goal is
basically to make scientific computing easier, more accessible, and to make it easier to access the huge computing power we have available to do climate modeling. If you are interested in this space: you don't need to work on Julia necessarily, but you can think about, for example, modeling of physical systems. One of the questions is: can we model air conditioning units more precisely and get them more efficient? Or any other technical system; how do we get that efficiency? But we need better tools to do that. The language down here as an example is Modelica, and there is a project right now trying to see how we can push the boundary there. The language up here is Fortran.
You might have seen a little bit of that
in the talk beforehand and it's most often
used to do climate science. So why
programming languages? Why do I think that my time is best spent actually working on programming languages in order to help people? Well, Wittgenstein says: "The limits of my language are the limits of my world." What I can express is what I can think about. And I think people who are multilingual know that sometimes it's easier for them to think certain things in one language than in the other. But language is about
communication. It's about communication
with scientists, but it's also about
communication with the computer. And too often programming languages fall into the trap where it's about, oh, I want to express my one particular problem, or I want to express my problem very well for the compiler, for the computer; I want to talk to the machine. But I found that programming languages are very good for talking to other scientists, for talking in a community and actually collaborating. And
so the project that Ali and I are both part of has, I think, 30-ish people, I don't know the exact numbers, a big group of climate scientists and modelers. And we have a couple of numerical scientists, computer scientists and engineers, and we're all working in the same language, being able to collaborate and actually work on the same code, instead of me working on some low level implementation and Ali telling me what to write. That wouldn't be really efficient. So, yes, my goal is to make
this research easier. Do we really need yet
another high level language? That is a
question I often get: why Julia? Why are you not spending your time and effort doing this for Python? Well, so
this is as a small example, this is Julia
code. It looks rather readable, I find. It doesn't use semantic whitespace; you may like that or not. It has all the typical features that you would expect from a high level dynamic language. It uses the MIT license. It has a built-in package manager. It's very good for interactive development. But it has a couple of unusual traits, and those matter. If you want to simulate a climate model,
you need to get top performance on a
supercomputer. Otherwise you won't get an
answer in the time that it matters. Julia uses just-in-time, ahead-of-time compilation. The other great feature is that Julia is actually written in Julia. So I can just look at implementations; I can dive and dive and dive deeper into somebody's code and don't hit a comprehension barrier. If you have ever spent some time trying to figure out how Python sums numbers under the hood to make it reasonably fast: good luck. It's hard.
It's written in C, and there are a lot of barriers to understanding what's actually going on. Then, reflection and metaprogramming: you can do a lot of fun stuff, which we're going to talk about. And then the big point for me is that you have native GPU code generation support, so you can actually take Julia code and run it on the GPU. You're not relying on libraries, because libraries can only express the things that were written into them. So early last
December, I think, we met up for the climate science project, after they had decided on using Julia for the entire project. They were like: we're happy with the performance, but we have a problem. We have to duplicate our code for GPUs and CPUs. What, really? It can't be! I mean, I designed the damn thing, it should be working.
Well, what they had at that point was
basically always a copy of two functions
where one side of it was writing the CPU
code and the other side was implementing a
GPU code. And really, there were only a
couple of GPU specific parts in there. And
if anybody has ever written GPU code: it's this pesky which-index-am-I calculation, where the for loop on the CPU would just look quite natural. And I was like, what? Come on. What we can do is, we can just write a kernel abstraction that takes the body of the for loop and extracts it into a new function, add a little bit of sugar and magic to create GPU kernels and CPU functions, and then we're done. Problem
solved. What the code roughly looks like is actually this; you can copy and paste this and it should work. So you have two functions. One of them is where you extract your kernel. Then you write a function that takes another function and either runs that function in a for loop or launches it on the GPU. And then you have this little GPU snippet, the only bit that's actually GPU-specific, which calculates the index and then calls the function f with an index argument. I'm done here; my contribution to this project was done.
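A minimal sketch of that pattern, my reconstruction with made-up names rather than the project's actual code, assuming CUDA.jl for the GPU side:

```julia
# The loop body is extracted into a plain function of one index.
# The CPU launcher is just a for loop; the GPU kernel only adds the
# pesky which-index-am-I calculation.
using CUDA   # assumed available; the CPU path below works without a GPU

body!(out, a, b, i) = (out[i] = a[i] + b[i])   # extracted loop body

function launch_cpu!(f, out, a, b)
    for i in eachindex(out)          # CPU: iterate all indices
        f(out, a, b, i)
    end
end

function kernel!(f, out, a, b)       # GPU: one iteration per thread
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    i <= length(out) && f(out, a, b, i)
    return nothing
end
launch_gpu!(f, out, a, b) =
    @cuda threads=256 blocks=cld(length(out), 256) kernel!(f, out, a, b)

a, b, out = rand(4), rand(4), zeros(4)
launch_cpu!(body!, out, a, b)   # the same body! would go to launch_gpu!
```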
Well, they came back to me and were like: no, it's not good enough. And I was like, why? Well, the issue is they needed kernel fusion, that's the process of taking two functions and merging them together. I'm like, okay, fine, why do they need that? Because if you want to write efficient GPU code, you need to be really concerned about the number of global memory loads and stores. If you have too many of them, or if they are irregular, you lose a lot of performance. And you need good performance; otherwise we can't run the simulation. They
also actually wanted to use GPU functionality and low level control. They wanted to look at their kernels and use shared memory constructs. They wanted to do precise register work, minimizing the number of registers used, and they really cared about low level performance. They were like: well, we can't do this with the abstraction you gave us, because it builds up too many barriers. And I could have given the typical computer science answer, which would have been: OK, give me two years and I'll come back to you, and there will be a perfect solution, which is like a castle in the clouds. And I'll write you a bespoke language that does exactly what you need. And at the
end we'll have a domain specific language for climate simulation that will do finite volume and discontinuous Galerkin and everything you want. And I will have a PhD. Great, fantastic. Well, we don't have the time. The whole climate science project that we are on has an accelerated timeline, because the philanthropists that are funding the research said: well, if you can't give us a better answer anytime soon, it won't matter anymore. So I sat
down and was like: okay, I need a hack. I need something with minimal effort and quick delivery. I need to be able to fix it if I get it wrong the first time around (and I did), so it needs to be hackable. My collaborators need to understand it and actually be able to change it. And it needed to have happened yesterday. Well, Julia is good at these kinds of hacks, and as I've learned, you can actually start with bespoke solutions and build better abstractions after the fact, so that you can then do the fancy computer science that I really wanted to do. The package is called GPUifyLoops, because I couldn't come up with a worse name and nobody else could come up with a better one. So we stuck with
it. It's macro based: in Julia you can write syntax macros that transform the written statements into similar statements, so you can insert code or remove code if you want to. Right now it targets CPUs and GPUs, and we are talking about how we get multithreading into the story, how we target more and different GPUs. There are other projects that are very similar: there's OCCA, which is where a lot of these ideas are coming from, and OpenACC in C++ does
something really similar. But basically you write a for loop, you put an @loop in front of it, which is the magic macro that makes the transformation, and you have two index statements; and now you just say, I want to launch it on the GPU, and it magically does the job. Great, fantastic. So here is the entire implementation of the @loop macro, without the error checking, which didn't fit on the screen: a couple of lines. So
everything is here and basically I'm just
manipulating the for loop so that on the
GPU it only iterates one iteration per
index, and on the CPU it iterates all of the indices, because a CPU is single threaded and a GPU is many, many ways multithreaded. Of course there's a little bit of magic hidden in the device function, because how do I know where I'm running? If you're curious how to do that, we can talk afterwards. But otherwise it's a very simple, straightforward transformation. It's written in Julia; it's a Julia function.
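From memory, using the macro looks roughly like this. This is a hypothetical sketch of the GPUifyLoops-style API just described, and the exact syntax may differ from the released package; the `(range; gpu_index)` pair supplies the CPU iteration range and the GPU index expression:

```julia
# Hypothetical sketch, not guaranteed to match the real GPUifyLoops API.
using GPUifyLoops

function double!(A)
    # On the CPU this runs over the full range; on the GPU each thread
    # handles the single index given by threadIdx().
    @loop for i in (1:length(A); threadIdx().x)
        A[i] = 2 * A[i]
    end
    return nothing
end

A = ones(8)
@launch CPU() double!(A)   # the same kernel could be launched on a GPU device
```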
Yeah, you don't need to understand the code here; I just want to show how quick it can be to write something like this. If
you know anything about GPU programming at
all, there should be a little voice in the
back of your head saying: wait a second,
how can you run a dynamic programming
language on a GPU? That shouldn't be
possible. Well, Julia can run on the GPU
because it has a lot of metaprogramming
facilities for staged programming. So I
can generate code based on a specific call
signature. It has introspection and
reflection mechanisms that allow me to do
some interesting stuff in the background.
It is built upon LLVM,
which is a common compiler infrastructure.
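As an aside, one standard facility on this path is `Base.llvmcall`, which splices LLVM IR directly into a Julia function. This snippet is our own illustration, not from the talk:

```julia
# Base.llvmcall takes the body of an LLVM function as a string:
# %0 is the first argument, and unnamed values continue numbering
# after the implicit entry-block label.
add1(x::Int32) = Base.llvmcall(
    """
    %2 = add i32 %0, 1
    ret i32 %2
    """,
    Int32, Tuple{Int32}, x)

add1(Int32(41))
```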
And so I can actually write a staged
function that generates LLVM-specific code
for one function, and do that at compile
time. And Julia is a dynamic language that
tries really hard to avoid runtime
uncertainties. This is one of the
challenges when you're getting into Julia:
understanding that when you write code
that has a lot of runtime uncertainties,
you get relatively slow performance, about
as fast as Python. But if you work with
the compiler and you avoid runtime
uncertainties, you can get very fast code
and you can run your code on the GPU.
That's basically the litmus test: if you
can run your code on the GPU, you did your
job well. And Julia
provides tools to understand the behavior
of your code. So how does Julia avoid
runtime uncertainty? I don't have the time
to go too deep into the answers; there is
actually a paper about this. Julia has a
type system that allows you to do some
sophisticated reasoning, type inference,
to figure out what your code is doing.
Multiple dispatch actually helps us quite
a lot in making it easier to devirtualize
code. And it has method specialization and
just-in-time compilation. So, looking a little
bit closer at some of these topics: this
is the entire pipeline of what happens
when you call a function, flowing through
the Julia compiler. You have tools to
introspect all of these stages on the
right hand side here, and then you have
tools to interact on the left hand side;
you can inject code back into the
compiler. The other thing is that Julia
has dynamic semantics. So at runtime you
can redefine your function, call it again,
and get the new function. And it uses
multiple dispatch. So if you look at the
absolute value call here, which of the 13
possible methods will it call? In C++ and
other programming languages this is called
a virtual function call. So in Julia, is
everything a virtual function call? No.
This is one of the important points: when
we call a function, let's say sin of x, we
look at the types of the input arguments
and first determine which methods are
applicable to our input argument. In this
case it would be the Real one down here,
because Float64 is a subtype of Real. So
we choose the right method using dispatch,
and then we specialize that method for the
signature.
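These dispatch rules can be tried directly; this is our own reconstruction of the kind of example discussed here, not the exact slide:

```julia
# Three methods of f; dispatch picks the most specific applicable one.
f(x::Int, y) = "int on x"
f(x, y::Int) = "int on y"
f(x, y::Float64) = "float on y"

f(1, "Hello")   # only the first method applies
f("a", 2.0)     # only the float-on-y method applies

# f(1, 2.0) throws a MethodError: the first and third methods both
# apply and neither is more specific than the other.
```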
The rule to remember in multiple dispatch
is: we call the most specific method,
whatever specific might mean. So take this
example, where we have a function f with
three different methods: an integer
argument can be matched on x or on y, and
a floating point argument on y. If we call
this with (1, "Hello"), we select the
method that is most specific for these
arguments, which would be method number
one here. On the other hand, if we have a
Float64 in the second position, then we
will call the second method. Now what
happens if I pass an integer in the first
position and a floating point in the
second position? Well, you get a runtime
error, because we can't decide which
method is the most specific. That's just
something to keep in mind. Method
specialization works really similarly.
When you call a method for the first
time... this method sin right now has no
specializations. Then I call it once and
Julia will insert a specialization just
for Float64 for this method; before the
call it could also have been a Float32. So
Julia specializes and compiles methods on
concrete call signatures instead of
keeping everything dynamic or everything
ambiguous. You can introspect this process
and there are several macros, like
@code_lowered or @code_typed, that will
help you understand that process. I don't
have enough time to go into detail here,
but just as a note, if you have a look at
this, the %4 means it's an assignment, so
if you reference it later, in line 5, we
refer to the %4 value. And then we can
look at the type information that Julia
infers out of that call: we're calling the
function mandel with a UInt32, and you can
see how that information propagates
through the function itself. And then if
you actually
do aggressive inlining and optimizations
and devirtualization, then in the end we
don't have calls anymore. We only have the
intrinsics that Julia provides, on which
programs are actually implemented. So this
is an unsigned-less-than integer function.
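These stages can be inspected on any small function with standard Julia tooling; `g` here is our own example:

```julia
using InteractiveUtils  # provides @code_lowered, @code_typed outside the REPL

# Inspect what the compiler does with a call, stage by stage.
g(x) = 2x + 1

@code_lowered g(1)            # lowered IR, with %-numbered SSA values
@code_typed g(1)              # after type inference and optimization

# Programmatic access to the inferred return type:
Base.return_types(g, (Int,))  # one entry per matching method
```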
So we are using type inference as an
optimization to find static or near-static
subprograms. It allows us to do aggressive
devirtualization, inlining and constant
propagation. But it raises
problems of cache invalidation. In bygone
days this used to be the case: I could
define a new function f after calling g
once, and I would still get the old result
back. That's bad. That's
counter-intuitive. That's not dynamic. So
in Julia 1.0, and I think in 0.5 and 0.6
already, we fixed that: we invalidate the
functions that have dependencies on the
function we just changed. But this can
show up as latency in your program: if you
change a lot of functions and you call
them again, we need to do a lot of work
every time. We do constant propagation, so
in this very simple example we try to
exploit as much
information as possible. And so if you
have a function f and you call the
function sin with a constant value, we
actually just return you the constant,
avoiding the computation of the sine
entirely. And that can be very important
for hot calls inside a loop.
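A minimal way to see this effect yourself (our own example):

```julia
# With a constant argument the compiler can fold the call away.
h() = sin(0.5)

h()
# On recent Julia versions, @code_typed h() shows the precomputed
# constant being returned, with no runtime call to sin left.
```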
This can sometimes go wrong, or rather,
Julia has heuristics in order to decide
when or whether these optimizations are
valuable. And so when you introspect your
code, you might see results that are not
quite what you want. Here we don't know
what the return value is: it's just a
tuple, we know it's a tuple, nothing else.
The heuristic said: do not specialize.
But the nice thing about Julia
and where we get performance is that we
can actually force specialization, and
hopefully at some point we make the
compiler smart enough that these edge
cases disappear. So I can use some tricks
to force specialization to happen, and
then I can actually infer the precise
return type of my function. Another thing
to know, when you're coming from a more
traditional object oriented programming
language, is that concrete types are not
extendable. So you can't inherit from
something like Int64; you can only subtype
abstract types. We do that because
otherwise we couldn't do a lot of
optimizations. When we look at programs,
we can never assume that you won't add
code. We are a dynamic programming
language: at any time in the runtime of
your program you can add code. And so we
don't have closed-world semantics, which
would allow us to say: hey, by the way, we
know all possible subtypes here. You might
add a new type later on. So instead,
concrete types are not extendable.
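A minimal sketch of this subtyping rule (our own example):

```julia
abstract type Animal end       # abstract types may be subtyped
struct Dog <: Animal end       # fine

# struct FastInt <: Int64 end  # error: Int64 is concrete ("final")

speak(::Animal) = "..."        # generic fallback
speak(::Dog) = "woof"          # dispatch through the abstract supertype

speak(Dog())
```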
We get a lot of the performance back. So
personally, for me, why do I like Julia?
Or why do I work on Julia? It walks like
Python, it talks like Lisp and it runs
like Fortran. That's my five-second sales
pitch.
It's very hackable and extendable.
I can poke at the internals
and I can bend them if I need to. It's
built upon LLVM, so in reality, for me as
a compiler writer, it's my favorite LLVM
front end: I can get the LLVM code that I
need to actually run. But for users that's
hopefully not a concern, if we do our job
right. And it has users in scientific
computing; in a prior life I was doing a
lot of scientific computing myself,
running models in cognitive science. And I
care about these users
because I've seen how hard it can be to
actually make progress when the tools you
have are bad. And my personal goal is to
enable scientists and engineers to
collaborate efficiently and actually make
a change. Julia is a big project and the
climate model is a big project, and there
are many people to thank. And with that, I
would like to extend you an invitation, if
you're interested: there is JuliaCon every
year, where we have a developer meetup.
Last year we were about 60 people, much
smaller than CCC. But
next year it will be in Lisbon. So come
join us if you're interested and if you
want to meet scientists who have
interesting problems and are looking for
solutions. Thank you.
applause
Herald A: Time for questions and answers,
are there any questions?
Herald H: Yeah, we've got microphones over
there. So just jump to the microphone and
ask your questions so that
everybody could hear.
Question: What did you mean when you said
that Julia talks like Lisp, and how is
that a good thing? laughter
Churavy: Well, it talks like Lisp, but it
doesn't look like Lisp. I assume that's
what you mean. It doesn't have that many
braces. But no: Lisp has a lot of powerful
metaprogramming capabilities and macros,
and we have a lot of that. If you read a
little bit about the history of Lisp, the
original intention was to write M-Lisp,
which would be Lisp with a nicer syntax.
And personally, I think Julia is that
M-Lisp: it has all these nice features,
but it doesn't have all the parentheses.
Herald A: OK. Thank you.
Question: Thanks for the talk. My question
is regarding the first part of the talk.
If I understand correctly, you are
simulating a deterministic system there.
So there's no additional noise term or
anything, right?
Ramadhan: Well, if you had infinite
precision, I think it would be
deterministic. But I think by design
turbulence itself is not deterministic.
Well, it's a chaotic system.
Question: But the discretized version
itself is deterministic. You don't have a
Monte Carlo part where you add some noise,
which might actually be justified from the
physics side. Right?
Ramadhan: Well, I mean, if you ran the
same simulation again... I think if you
ran it on the exact same machine, you
would get the same answer. So in that
sense, it is deterministic. But if you ran
it on a slightly different machine,
truncation error, like in the 16th decimal
place, could give you a completely
different answer.
Question: Sure. So the point I'm trying to
make... am I allowed to continue?
Herald H: Yes, of course. There's no one
else. Well, there is one person else. So you
can continue a few minutes if you want to.
Thanks. laughter
Question: So the point I was
trying to make is,
if you add noise, in the sense that it's a
physical system and you have noise in
there, it might actually allow you to
discretize the PDE but get a stochastic
simulation, which might be interesting
because it often can make things easier.
And also, you mentioned neural
differential equations, right? In
particular, with physical systems, if you
have discontinuities, for example, the
integration can actually be quite the
problem. And there is work, to just plug
my colleague's work, on neural controlled
differential equations, where you can
actually also build in these
discontinuities, which might also be
interesting for you guys.
Ramadhan: Maybe we should talk, because I
don't know much about that stuff; we're
kind of just starting up. I think
everything we've been doing is hopefully
continuous, but maybe we'll hit
discontinuities. I don't know. We should
talk, though.
Question: And also, the math is beautiful
and has no diseases. It's the physics that
might be diseased. I'm a mathematician, I
have to say that.
Ramadhan: I know, the physics is ugly,
trust me.
Churavy: Just quickly, we do have
stickers, and we have cookies, too; they
are in the cookie box. And I think
somebody from our community is giving a
Julia workshop, and we're trying to set up
an assembly space, and hopefully that
works out as well.
Herald H: Go on please.
Question: Also, one question for the first
part of the talk I want. I wanted to ask
if it's possible or if you are using
dynamic resolution in your climate models.
So you would maybe have a smaller grid
size near the (???) and a larger one in
the areas that are not that
interesting.
Ramadhan: Like adaptive grids? I think we
mostly do that in the vertical. Usually in
the ocean the interesting things happen
close to the surface, so we have more
resolution there. But as you go deeper,
things get less interesting, so you put
less resolution there. In general, people
have asked that before: why do you always
use constant grids? Why don't you use
these adaptive grids in your global
models? The answer I've heard, and I don't
know if it's very convincing, is that
generally there hasn't been that much
research into adaptive grids for climate
models; people who do research into them,
their funding gets cut. But the answer I
like is that a lot of the time, a lot of
the atmosphere and ocean is turbulent. So
especially if you do adaptive refinement,
then you just adapt everywhere, because
there's turbulence everywhere. And for our
simulations, some of the numerical methods
are only fast if you run them on a regular
grid; that's the reason we don't use
adaptive grids for our simulations. But in
general, adaptive grids for climate models
are interesting; it seems like there needs
to be more research in that area. So I
don't know if I answered your question,
but I kind of just ranted.
Question: You did, thanks.
Herald H: Go go ahead, please.
Question: Yeah, I have wrapped quite a bit
of legacy Fortran code in Python. And my
question is: would there be a simple path
for converting Fortran code to Julia,
preferably automatically? Do you have any
ideas about this one?
Churavy: You can do it, but your Julia
code will look like Fortran code, so you
haven't won anything. But yes, as a
starting point you can do that,
absolutely. You can also just call Fortran
from Julia and then gradually move over. I
generally don't want people to rewrite
their code, except if there's a good
reason. Starting from scratch sometimes
helps; that can be a good reason. Or if
you say: we don't have the necessary
experts to work with the old solution
anymore. But generally, if you have
Fortran code, I would just say: call the
Fortran from Julia, get it up to speed,
and then start transitioning piece by
piece. Does that make sense?
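For reference, the "call it from Julia" route mentioned here is typically done with `ccall`; below is a sketch using a C math function as a stand-in, with the usual Fortran conventions noted in comments. The Fortran names in the comment are hypothetical:

```julia
# ccall works the same way for Fortran as for C libraries.
# cbrt comes from libm, which is already loaded in the Julia process:
v = ccall(:cbrt, Float64, (Float64,), 27.0)   # cube root of 27.0

# For Fortran, compiled symbol names are typically lowercase with a
# trailing underscore, and all arguments are passed by reference:
#   ccall((:dscale_, "libexample"), Cvoid,
#         (Ref{Int32}, Ptr{Float64}), n, x)
# where dscale_ and libexample are hypothetical names.
```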
Herald H: Any more questions? No more
questions. Then we finish a bit early. Ali
Ramadhan and Valentin Churavy, thank you
very much.
applause
36C3 postroll music
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!