preroll music
Herald: Our next talk is going to be about AI and
it's going to be about proper AI.
It's not going to be about
deep learning or buzzword bingo.
It's going to be about actual psychology.
It's going to be about computational metapsychology.
And now please welcome Joscha!
applause
Joscha: Thank you.
I'm interested in understanding
how the mind works,
and I believe that the most foolproof perspective
of looking at minds is to understand
that they are systems that, if you show
patterns to them, find meaning.
And you find meaning in those in very particular
ways and this is what makes us who we are.
So the way to study and understand who we
are in my understanding is
to build models of information processing
that constitutes our minds.
Last year, around the same time, I answered
the four big questions of philosophy:
"What's the nature of reality?", "What can
be known?", "Who are we?",
"What should we do?"
So now, how can I top this?
applause
I'm going to give you the drama
that divided a planet.
One of the very, very big events
that happened in the course of the last year,
so I couldn't tell you about it before.
What color is the dress?
laughter and applause
I mean, if you do not have any
mental defects, you can clearly see it's white
and gold. Right?
[voices from audience]
Turns out most people seem to have
mental defects and say it is blue and black.
I have no idea why. Well, OK, I have an idea
why that is the case.
I guess that you got it too: it has to
do with color renormalization,
and color renormalization apparently
happens differently in different people.
So we have different wiring to renormalize
the white balance.
And it seems to work in real-world
situations in pretty much the same way,
but not necessarily for photographs,
which have only a very small fringe around them
that gives you a hint about the lighting situation.
And that's why you get these huge divergences,
which is amazing!
So what we see is that our minds cannot know
objective truths in any way, outside of mathematics.
They can generate meaning though.
How does this work?
I did robotic soccer for a while,
and there you have the situation
that you have a bunch of robots that are
situated on a playing field.
And they have a model of what goes on
in the playing field.
Physics generates data for their sensors.
They read the bits of the sensors.
And then they use them to update
the world model.
And sometimes we didn't want
to take the whole playing field along,
and the physical robots, because they are
expensive and heavy and so on.
Instead, if you just want to improve the learning
and the gameplay of the robots,
you can use simulations.
So we wrote a computer simulation of the
playing field and the physics and so on,
that generates pretty much the same data,
and put the robot mind into the simulated
robot body, and it works just as well.
That is, if you are the robot, because you
cannot know the difference if you are the robot.
You cannot know what's out there. The only
thing that you get to see is the structure
of the data at your systemic bit interface.
And then you can derive a model from this.
And this is pretty much the situation
that we are in.
That is, we are minds that are somehow computational,
that are able to find regularity in patterns,
and we seem to have access to
something that is full of regularity,
so we can make sense out of it.
Now, if you discover that you are in the same
situation as these robots,
basically you discover that you are some kind
of apparently biological robot,
that doesn't have direct access
to the world of concepts.
That has never actually seen matter
and energy and other people.
All it got to see was little bits of information,
that were transmitted through the nerves,
and the brain had to make sense of them,
by counting them in elaborate ways.
What's the best model of the world
that you can have with this?
What is the state of affairs,
what's the system that you are in?
And what are the best algorithms that you
should be using to fix your world model?
And this question is pretty old.
And I think it has been answered for the
first time by Ray Solomonoff in the 1960s.
He discovered an algorithm
that you can apply when you discover
that you are a robot,
and all you have is data.
What is the world like?
And this algorithm is basically
a combination of induction and Occam's razor.
And we can mathematically prove that we
cannot do better than Solomonoff induction.
Unfortunately, Solomonoff induction
is not quite computable.
But everything that we are going to do
is going to be some approximation
of Solomonoff induction.
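To make this concrete, here is a toy sketch of what such an approximation looks like; the "programs" and their semantics are invented for illustration (a real version would run programs on a universal machine):

```python
# Toy approximation of Solomonoff induction (illustration only): weight
# every short "program" by an Occam prior of 2^-length, keep the ones
# whose output matches the data seen so far, and predict the next bit
# by a weighted vote. Here a "program" is just a bit pattern that
# repeats forever, standing in for a real universal machine.
from itertools import product

def run(program, n):
    return [program[i % len(program)] for i in range(n)]

def predict_next(observed, max_len=8):
    votes = {0: 0.0, 1: 0.0}
    for length in range(1, max_len + 1):
        prior = 2.0 ** (-length)            # Occam's razor: shorter is likelier
        for program in product([0, 1], repeat=length):
            output = run(program, len(observed) + 1)
            if output[:-1] == observed:     # induction: keep what fits the data
                votes[output[-1]] += prior
    total = votes[0] + votes[1]
    return {bit: v / total for bit, v in votes.items()}

print(predict_next([1, 0, 1, 0, 1]))        # heavily favors 0 as the next bit
```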
So our concepts can not really refer
to the facts in the world out there.
We do not get the truth by referring
to stuff out there, in the world.
We get meaning by suitably encoding
the patterns at our systemic interface.
And AI has recently made huge progress in
encoding data at perceptual interfaces.
Deep learning is about using a stacked hierarchy
of feature detectors.
That is, we use pattern detectors and we build
them into networks that are arranged in
hundreds of layers.
And then we adjust the links
between these layers,
usually using
some kind of gradient descent.
And we can use this to classify,
for instance, images and parts of speech.
So we get to features that are more and more
complex: they start as very, very simple patterns
and then get more and more complex,
until we get to object categories.
And now these systems are able,
in image recognition tasks,
to approach performance that is very similar
to human performance.
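As an illustration of the idea (a sketch under my own assumptions, not the systems he refers to), here is a two-layer stack of feature detectors whose links are adjusted by gradient descent until it classifies XOR, a pattern no single layer of detectors can represent:

```python
# Minimal stacked feature detectors trained by gradient descent:
# layer 1 learns low-level features, layer 2 combines them into
# the target category (XOR as a toy stand-in).
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))          # links into the hidden feature layer
W2 = rng.normal(size=(4, 1))          # links into the output detector
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    h = sigmoid(X @ W1)               # low-level feature detectors
    out = sigmoid(h @ W2)             # higher-level detector built on them
    d_out = (out - y) * out * (1 - out)    # error gradient at the output
    d_h = (d_out @ W2.T) * h * (1 - h)     # backpropagated to the features
    W2 -= 0.5 * h.T @ d_out           # descend the gradient: adjust the links
    W1 -= 0.5 * X.T @ d_h

print(out.round(2).ravel())           # approaches [0, 1, 1, 0]
```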
Also, what is nice is that it seems to be somewhat
similar to what the brain seems to be doing
in visual processing.
And if you take the activation in different
levels of these networks and you
enhance this activation a little bit, what
you get is stuff that looks very psychedelic.
Which may be similar to what happens if you
put certain illegal substances into people
and enhance the activity on certain layers
of their visual processing.
If you want to classify, what you do is
filter out all the invariances in
the data:
the pose that she has, the lighting,
the dress that she has on,
her facial expression, and so on.
And then we keep only the thing that
is left after we've removed all the nuisance data.
But what if we
want to get to something else,
for instance if we want to understand poses?
It could be, for instance, that we have several
dancers and we want to understand what they
have in common.
So our best bet is not just to have a single
classification-based filtering;
instead, what we want is to take
the low-level input
and get a whole universe of features
that is interrelated.
So we have different levels of interrelations.
At the lowest level we have percepts.
On a slightly higher level we have simulations.
And on an even higher level we have a concept landscape.
How does this representation
by simulation work?
Now imagine you want to understand sound.
If you are a brain and you want to understand
sound, you need to model it.
Unfortunately, we cannot really model sound
with neurons, because sound goes up to 20 kHz,
or if you are old like me, maybe to 12 kHz.
20 kHz is what babies can do.
And neurons do not want to do 20 kHz.
That's way too fast for them.
They like something like 20 Hz.
So what do you do? You need
to make a Fourier transform.
The Fourier transform measures the amount
of energy at different frequencies.
And because you cannot do it with neurons,
you need to do it in hardware.
And it turns out this is exactly
what we are doing.
We have this cochlea, which is this snail-like
thing in our ears,
and what it does is transform the energy of
sound in different frequency intervals into
energy measurements.
And then it gives you something
like what you see here.
And this is something that the brain can model,
so we can get a neural simulator that tries
to recreate these patterns.
And if we can predict the next input from the
cochlea, then we understand the sound.
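As a sketch of that transformation (parameters invented for illustration): the Fourier transform turns a pressure wave that oscillates thousands of times per second into slowly changing energy measurements per frequency band, the kind of signal neurons can follow:

```python
# One second of "sound" made of two tones; the Fourier transform
# recovers how much energy sits at each frequency, like the cochlea.
import numpy as np

rate = 44100                                   # samples per second
t = np.arange(rate) / rate
wave = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

spectrum = np.abs(np.fft.rfft(wave))           # energy per frequency bin
freqs = np.fft.rfftfreq(len(wave), 1 / rate)
print(freqs[spectrum > spectrum.max() / 3])    # [440. 880.]
```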
Of course, if we want to understand music,
we have to go beyond understanding sound.
We have to understand the transformations
that sound can have if you play it at different pitches.
We have to arrange the sounds into sequences
that give you rhythms and so on.
And then we want to identify
some kind of musical grammar
that we can use to again control the sequencer.
So we have stacked structures
that simulate the world.
And once you've learned this model of music,
once you've learned the musical grammar,
the sequencer, and the sounds,
you can get to the structure
of the individual piece of music.
So, if you want to model the world of music,
you need the lowest level of percepts;
then we have the higher level of mental simulations,
which give the sequences of the music
and the grammars of music.
And beyond this you have the conceptual landscape
that you can use
to describe different styles of music.
And if you go up in the hierarchy,
you get to more and more abstract models,
more and more conceptual models,
and more and more analytic models.
And these are causal models at some point.
These causal models can be weakly deterministic,
basically associative models, which tell you:
if this state happens, it's quite probable
that this one comes afterwards.
Or you can get to a strongly determined model.
A strongly determined model is one which tells
you: if you are in this state
and this condition is met,
you are going to go exactly into this state.
If this condition is not met, or a different
condition is met, you are going to this state.
And this is what we call an algorithm.
Now we are in the domain of computation.
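A toy contrast between the two kinds of model (my sketch, with an invented weather example): the weakly determined, associative model only gives you probable successors, while the strongly determined one, the algorithm, gives exactly one successor per condition:

```python
import random

# Weakly deterministic: if this state happens, this one probably follows.
associations = {"clouds": [("rain", 0.7), ("sun", 0.3)]}

def weak_step(state):
    successors, weights = zip(*associations[state])
    return random.choices(successors, weights)[0]

# Strongly determined: state + condition -> exactly one next state.
transition = {("clouds", True): "rain", ("clouds", False): "sun"}

def strong_step(state, condition):
    return transition[(state, condition)]

print(weak_step("clouds"))              # "rain" about 70% of the time
print(strong_step("clouds", True))      # always "rain"
```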
Computation is slightly different from mathematics.
It's important to understand this.
For a long time people have thought that the
universe is written in mathematics.
Or that.. minds are mathematical,
or anything is mathematical.
In fact nothing is mathematical.
Mathematics is just the domain
of formal languages. It doesn't exist.
Mathematics starts with a void.
You throw in a few axioms, and if you've chosen
nice axioms, then you get infinite complexity.
Most of which is not computable.
In mathematics you can express arbitrary statements,
because it's all about formal languages.
Many of these statements will not make sense.
Many of these statements will make sense
in some way,
but you cannot test whether they make sense,
because they're not computable.
Computation is different.
Computation can exist.
It starts with an initial state.
And then you have a transition function.
You do the work.
You apply the transition function,
and you get into the next state.
Computation is always finite.
Mathematics is the kingdom of specification.
And computation is the kingdom of implementation.
It's very important to understand this difference.
All our access to mathematics of course is
because we do computation.
We can understand mathematics,
because our brain can compute
some parts of mathematics.
Very, very little of it, and only with
very constrained complexity.
But enough, so we can map
some of the infinite complexity
and noncomputability of mathematics
into computational patterns,
that we can explore.
So computation is about doing the work,
it's about executing the transition function.
Now, we've seen that mental representation
is about percepts,
mental simulations, and conceptual representations,
and these conceptual representations
give us concept spaces.
And the nice thing
about these concept spaces is
that they give us an interface
to our mental representations
that we can use to address and manipulate them.
And we can share them in cultures.
And these concepts are compositional.
We can put them together to create new concepts.
And they can be described using
high-dimensional vector spaces.
They don't do simulation
and prediction and so on,
but we can capture the regularity
in our conceptual systems.
With these vector spaces
you can do amazing things.
For instance, the vector from
"King" to "Queen"
is pretty much the same vector
as between "Man" and "Woman".
And because of these properties, because these
concept spaces are really high-dimensional
manifolds, we can do interesting
things, like machine translation,
without understanding what anything means.
That is, without doing any proper mental representation
that predicts the world.
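Here is the vector property he mentions, sketched with made-up three-dimensional toy vectors (real embeddings such as word2vec are learned from text and have hundreds of dimensions):

```python
import numpy as np

vec = {                                  # hypothetical toy embeddings
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.2, 0.1]),
    "man":   np.array([0.5, 0.8, 0.7]),
    "woman": np.array([0.5, 0.2, 0.7]),
}

# The offset king -> queen matches the offset man -> woman ...
print(vec["queen"] - vec["king"], vec["woman"] - vec["man"])

# ... so "king - man + woman" lands nearest to "queen".
target = vec["king"] - vec["man"] + vec["woman"]
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(max(vec, key=lambda w: cos(vec[w], target)))   # queen
```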
So this is a type of meta-representation
that is somewhat incomplete,
but it captures the landscape that we share
in a culture.
And then there is another type of meta-representation,
and that is linguistic protocols,
which are basically a formal grammar and vocabulary.
And we need these linguistic protocols
to transfer mental representations
between people.
And we do this by basically
scanning our mental representations,
disassembling them in some way,
or disambiguating them.
And then we use a discrete string of symbols
to get them to somebody else,
and the receiver runs an assembler
that reverses this process
and builds something that is pretty similar
to what we intended to convey.
And if you look at the progression of AI models,
it pretty much went in the opposite direction.
So AI started with linguistic protocols, which
were expressed in formal grammars.
And then it got to concept spaces, and now
it's about to address percepts.
And at some point in the near future it's going
to get better at mental simulations.
And at some point after that we get to
attention-directed and
motivationally connected systems
that make sense of the world,
that are in some sense able to address meaning.
This is what the hardware that we have can do.
What kind of hardware do we have?
That's a very interesting question.
We could start out with a question:
how difficult is it to define a brain?
We know that the brain must be
hidden somewhere in the genome.
The genome fits on a CD-ROM.
It's not that complicated.
It's simpler than Microsoft Windows. laughter
And we also know that about 2%
of the genome codes for proteins.
And maybe about 10% of the genome
has some kind of stuff
that tells you when to switch proteins on and off.
And the remainder is mostly garbage:
old viruses that are left over and have
never been properly deleted, and so on.
Because there are no real
code reviews in the genome.
So how much of this 10%,
that is 75 MB, codes for the brain?
We don't really know.
What we do know is that we share
almost all of this with mice.
Genetically speaking, a human
is a pretty big mouse,
with a few bits changed to fix some
of the genetic expressions.
And most of the stuff there is going
to code for cells and metabolism
and what your body looks like and so on.
But if you look at how much is expressed
in the brain and only in the brain,
in terms of proteins and so on,
we find that of the 2% it's
about 5%. That is, only 5% of the 2% is
expressed only in the brain.
And another 5% of the 2% is predominantly
in the brain,
that is, more in the brain than anywhere else.
Which gives you something
like a lower bound.
Which means that to encode a brain genetically,
based on the hardware that we are using,
we need something like
at least 500 kB of code.
Actually, this is a very conservative
lower bound.
It's going to be a little more, I guess.
But it sounds surprisingly little, right?
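A back-of-the-envelope check of these numbers (assumptions mine: a human genome of roughly 3.2 billion base pairs at 2 bits per base):

```python
genome_mb = 3.2e9 * 2 / 8 / 1e6     # ~800 MB: CD-ROM territory, roughly
coding_mb = genome_mb * 0.02        # ~2% codes for proteins: ~16 MB
brain_only_kb = coding_mb * 0.05 * 1000
print(genome_mb, coding_mb, brain_only_kb)
# ~800 MB, ~16 MB, ~800 kB: the same ballpark as his 500 kB lower bound
```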
But in terms of scientific theories
this is a lot.
I mean, the universe,
according to the core theory
of quantum mechanics and so on,
is like this much code.
It's like half a page of code.
That's it. That's all you need
to generate the universe.
And if you want to understand evolution,
it's like a paragraph.
It's a couple of lines you need to understand
the evolutionary process.
And there are lots and lots of details that
you get afterwards.
Because this process itself doesn't define
what the animals are going to look like,
and in a similar way,
the code of the universe doesn't tell you
what this planet is going to look like,
and what you guys are going to look like.
It's just defining the rulebook.
And in the same sense, the genome defines the rulebook
by which our brain is built.
The brain boots itself
in a developmental process,
and this booting takes some time.
There is subliminal learning, in which
initial connections are forged
and basic models of the world are built,
so we can operate in it.
And how long does this booting take?
I think it's about 80 megaseconds.
That's the time that a child is awake until
it's 2.5 years old.
By this age you understand Star Wars.
And I think that everything after
understanding Star Wars is cosmetics.
laughter and applause
You are going to be online, if you get to
reach old age, for about 1.5 gigaseconds.
And in this time I think you are not going
to get more than 5 million concepts.
Why? I don't really know.
If you look at a child:
if a child were able to form a concept,
let's say, every 5 minutes,
then by the time it's about 4 years old,
it's going to have
something like 250 thousand concepts.
So, a quarter million.
And if we extrapolate this to our lifetime,
at some point it slows down,
because we have enough concepts
to describe the world.
I think it's
less than 5 million.
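Spelling out that arithmetic (the awake fraction is my assumption; the talk is loose about it):

```python
year = 365 * 24 * 3600               # ~31.5 million seconds

print(2.5 * year / 1e6)              # ~79 megaseconds in 2.5 years:
                                     # his "80 Ms" booting ballpark
print(4 * year * 0.6 / 300)          # one concept per 5 waking minutes,
                                     # awake ~60%: ~252,000, a quarter million
print(80 * year * (2 / 3) / 1e9)     # ~1.7: his "~1.5 gigaseconds" online
```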
How much storage capacity does the brain have?
I think the estimates
are pretty divergent.
The lower bound is something like 100 GB,
and the upper bound
is something like 2.5 PB.
There are even
some higher outliers:
if you for instance think that we need all
those synaptic vesicles to store information,
maybe even more fits into this.
But the 2.5 PB is usually based
on what you need
to code the information
that is in all the neurons.
But maybe the neurons
do not really matter so much,
because if a neuron dies, it's not like the
world is changing dramatically.
The brain is very resilient
against individual neurons failing.
So the 100 GB capacity is much closer to
what you actually store in the neurons,
if you look at all the redundancy
that you need.
And I think this is much closer to the actual
ballpark figure.
Also, if you want to store
5 million concepts,
and maybe 10 times or 100 times that number
of percepts on top of this,
this is roughly the ballpark figure
that you are going to need.
So our brain
is a prediction machine.
What it does is reduce the entropy
of the environment,
to solve whatever problems you are encountering
if you don't have a feedback loop to fix
them.
So normally, if something happens, we have
some kind of feedback loop
that regulates our temperature or that makes
problems go away.
And only when this is not working
do we employ cognition.
And then we start these arbitrary
computational processes
that are facilitated by the neocortex.
And this neocortex can really
do arbitrary programs.
But it can do so
only with very limited complexity,
because, as you just saw,
it's not that complex.
The modeling of the world is very slow.
And it's something
that we see in our AI models:
to learn the basic structure of the world
takes a very long time.
To learn, basically, that we are moving in 3D
and objects are moving,
and what they look like.
Once we have this basic model,
we can get to very, very quick
understanding within this model,
basically encoding based
on the structure of the world
that we've learned.
And this is some kind of
data compression that we are doing.
We use this model, this grammar of the world,
these simulation structures that we've learned,
to encode the world very, very efficiently.
How much data compression do we get?
Well, if you look at the retina:
the retina gets data
on the order of about 10 Gb/s.
And the retina already compresses these data
and puts them into the optic nerve
at a rate of about 1 Mb/s.
This is what gets fed into the visual cortex.
And the visual cortex
does some additional compression,
and by the time it gets to layer four of the
first area of vision, V1,
we are down to something like 1 kb/s.
So if we extrapolate this, and you live
to the age of 80 years,
and you are awake for 2/3 of your lifetime,
that is, you have your eyes open for 2/3 of
your lifetime,
then the stuff that you get into your brain
via your visual perception
is going to be only 2 TB.
Only 2 TB of visual data,
throughout your whole lifetime.
That's all you are ever going to get to see.
Isn't this depressing?
laughter
So I would really like
to tell you:
choose wisely what you
are going to look at. laughter
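Checking the 2 TB figure (my assumption: reading his "1 kb/s into V1" as one kilobyte per second):

```python
year = 365 * 24 * 3600
awake_seconds = 80 * year * (2 / 3)     # eyes open 2/3 of an 80-year life
total_bytes = awake_seconds * 1000      # 1 kB/s after cortical compression
print(total_bytes / 1e12)               # ~1.7 terabytes: his "only 2 TB"
```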
OK. Let's look at this problem of neural compositionality.
Our brains have this amazing property
that they can put
mental representations together very, very quickly.
For instance, you read a page of code,
you compile it in your mind
into some kind of program that
tells you what this page is going to do.
Isn't that amazing?
And then you can forget about this,
disassemble it all, and use the
building blocks for something else.
It's like Lego.
How can you do this with neurons?
Lego bricks can do this because they have
a well-defined interface.
They have all these slots, you know,
that fit together
in well-defined ways.
How can neurons do this?
Well, neurons can maybe learn
the interfaces of other neurons.
But that's difficult, because every neuron
looks slightly different;
after all, it's some kind of biologically
grown, natural stuff.
laughter
So what you want to do is
encapsulate this
diversity of the neurons to make them predictable,
to give them a well-defined interface.
And I think that nature's solution to this
is cortical columns.
A cortical column is a circuit of
between 100 and 400 neurons.
And this circuit is some kind of neural network
that can learn stuff.
And after it has learned a particular function,
it is able to link up with these
other cortical columns.
And we have maybe 100 million of those.
Depending on how many neurons
you assume are in there,
we guess it's something like
at least 20 million and maybe
up to 100 million.
And these cortical columns can
link up like Lego bricks,
and then perform,
by transmitting information between them,
pretty much arbitrary computations.
What kind of computation?
Well... Solomonoff induction.
And they have some short-range links
to their neighbors,
which come almost for free, because,
well, they are connected to them,
they are in the direct neighborhood.
And they have some long-range connectivity,
so you can combine everything
in your cortex with everything.
So you need some kind of global switchboard,
some grid-like architecture
of long-range connections.
They are going to be more expensive,
they are going to be slower,
but they are going to be there.
So how can we optimize
what these guys are doing?
In some sense it's like an economy.
It's not the kind of system
that we often use in machine learning.
It's really an economy.
The question is: you have a fixed number of
elements,
how can you do the most valuable stuff with
them?
Fixed resources, most valuable stuff: this
problem is economics.
So you have an economy of information brokers.
Every one of these guys,
these little cortical columns,
is a very simplistic information broker.
And they trade rewards against negentropy,
against reducing entropy
in the world.
And to do this, as we just saw,
they need some kind of standardized interface.
And internally, to use this interface,
they are going to
have some kind of state machine.
And then they are going to pass messages
between each other.
And what are these messages?
Well, it's going to be hard
to discover these messages
by looking at brains,
because it's very difficult to see in brains
what they are actually doing;
you just see all these neurons.
And if we were waiting for neuroscience
to discover anything, we wouldn't even have
gradient descent or anything else.
We wouldn't have neural learning.
We wouldn't have all these advances in AI.
Jürgen Schmidhuber said that the last
big contribution of neuroscience to
artificial intelligence
was about 50 years ago.
That's depressing, and it might be
overstating the unimportance of neuroscience,
because neuroscience is very important
once you know what you are looking for.
You can then actually often find it,
and see whether you are on the right track.
But it's very difficult to use neuroscience
to understand how the brain is working,
because it's really like understanding
flight by looking at birds through a microscope.
So, what are these messages?
You are going to need messages
that tell these cortical columns
to join themselves into a structure,
and to unlink again once they're done.
You need ways that they can request each other
to perform computations for them.
You need ways they can inhibit each other
when they are linked up,
so they don't do conflicting computations.
Then they need to tell you whether
the result of the computation
that they are asked to do is probably false;
or whether it's probably true,
but you still need to wait for others
to tell you whether the details worked out;
or whether it's confirmed true, that the concept
that they stand for is actually the case.
And then you want to have learning,
to tell you how well this worked.
So you will have to announce a bounty
that tells them to link up,
a kind of reward signal
that makes them do the computation in the first place.
And then you want to have
some kind of reward signal
once you got the result as an organism,
when you reach your goal, when you made
the disturbance go away,
or whatever, when you consume the cake.
And then you will have
some kind of reward signal
that you give to everybody
that was involved in this.
And this reward signal facilitates learning,
so the difference between the announced reward
and the consumption reward is the learning signal
for these guys.
So they can learn how to play together,
and how to do the Solomonoff induction.
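Here is that message vocabulary rendered as a toy protocol (entirely a sketch of what he describes, not an implementation that exists), with the learning signal as the gap between the announced bounty and the consumed reward:

```python
from enum import Enum, auto

class Message(Enum):
    LINK = auto()             # join into a structure
    UNLINK = auto()           # detach again once done
    REQUEST = auto()          # ask another column to compute something
    INHIBIT = auto()          # suppress conflicting computations
    PROBABLY_FALSE = auto()   # result looks wrong
    PROBABLY_TRUE = auto()    # looks right, details still pending
    CONFIRMED_TRUE = auto()   # the concept is actually the case
    BOUNTY = auto()           # announced reward for linking up
    REWARD = auto()           # consumed reward once the organism succeeds

def learning_signal(announced, consumed):
    # Columns that over- or under-promised get corrected by this difference.
    return consumed - announced

print(learning_signal(announced=0.8, consumed=1.0))   # +0.2: promise exceeded
```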
Now, I've told you that Solomonoff induction
is not computable.
And it's mostly because of two things.
First of all, it needs infinite resources
to compare all the possible models.
And the other one is that we do not know
the prior probability for our Bayesian model:
we do not know
how likely unknown stuff is in the world.
So what we do instead is
set some kind of hyperparameter,
some kind of default
prior probability for concepts
that are encoded by cortical columns.
And if we set this parameter very low,
then we are going to end up with inferences
that are quite probable
for unknown things.
And then we can test those.
If we set this parameter higher, we are going
to be very, very creative,
but we end up with many, many theories
that are difficult to test,
because maybe there are
too many theories to test.
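A toy model of that hyperparameter (construction mine): candidate theories carry an estimated probability of being true, and a theory gets proposed whenever its estimate clears a bar set by the default prior. A low prior yields few, safe inferences; a high one yields a flood of creative, hard-to-test ones:

```python
import random

random.seed(1)
candidates = [random.random() for _ in range(10000)]   # plausibility estimates

def proposed(theories, default_prior):
    return [p for p in theories if p > 1 - default_prior]

safe = proposed(candidates, default_prior=0.05)
wild = proposed(candidates, default_prior=0.6)
print(len(safe), round(min(safe), 2))   # ~500 theories, all quite probable
print(len(wild), round(min(wild), 2))   # ~6000 theories, many improbable
```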
Basically, every one of these cortical columns
will now tell you,
when you ask them if they are true:
"Yes, I'm probably true,
but I still need to ask others
to work on the details."
So these others are going to get active,
and they are asked by the asking element:
"Are you going to be true?",
and they say: "Yeah, probably yes,
I just have to work out the details",
and they are going to ask even more.
So your brain is going to light up like a
Christmas tree
and do all these amazing computations,
and you see connections everywhere,
most of them wrong.
You are basically in a psychotic state
if your hyperparameter is too high.
Your brain invents more theories
than it can disprove.
Would it actually sometimes be good
to be in this state?
You bet. I think every night our brain
goes into this state.
We turn up this hyperparameter.
We dream. We get all kinds of
weird connections, and we get to see connections
that we otherwise couldn't be seeing,
because they are highly improbable.
But sometimes they hold, and we see: "Oh
my God, DNA is organized in a double helix."
And this is what we remember in the morning.
All the other stuff is deleted.
So we usually don't form long-term memories
in dreams, if everything goes well.
If you accidentally trip up your modulators,
for instance by consuming illegal substances,
or because you've just gone randomly psychotic,
you are basically entering
a dreaming state, I guess.
You get into a state
where the brain starts inventing more
concepts than it can disprove.
So you want to have a state
where this is well balanced.
And the difference between
highly creative people
and very religious people is probably
a different setting of this hyperparameter.
So I suspect that people
that are geniuses,
people like Einstein and so on,
do not simply have better neurons than others.
What they mostly have is a hyperparameter
that is very finely tuned, so they can get
a better balance than other people
in finding theories that might be true
but can still be disproven.
So inventiveness could be
a hyperparameter in the brain.
If you want to measure
the quality of the beliefs that we have,
we are going to have to have
some kind of cost function,
which is based on the motivational system.
And to identify whether a belief
is good or not, we have abstract criteria:
for instance, how well does it predict the
world, or how much does it reduce uncertainty
in the world,
or is it consistent and sparse.
And then of course utility: how much does
it help me to satisfy my needs?
And the motivational system is going
to evaluate all these things by giving a signal.
And the first kind of signal
is the possible reward if we are able to compute
the task.
And this is probably done by dopamine.
So we have very small areas in the brain,
the substantia nigra
and the ventral tegmental area,
and they produce dopamine.
And this gets fed into the lateral frontal cortex
and the frontal lobe,
which control attention
and tell you what things to do.
And if we have successfully done
what we wanted to do,
we consume the reward.
And we do this with another signal,
which is serotonin.
It's also announced by the motivational system,
from this very small area, the raphe nuclei.
And it feeds into all the areas of the brain
where learning is necessary:
a connection is strengthened
once you get to the result.
These two substances are emitted
by the motivational system.
The motivational system is a bunch of needs;
essentially, it is regulated below the cortex.
The needs are not part of your mental representations.
They are part of something
that is more primary than this.
This is what makes us go,
this is what makes us human.
This is not our rationality; this is what we want.
And the needs are physiological,
they are social, they are cognitive.
And you are pretty much born with them.
They cannot be totally adaptive,
because if they were adaptive,
we wouldn't be doing anything.
The needs are resistive.
They are pushing us against the world.
If you didn't have all these needs,
if you didn't have this motivational system,
you would just be doing what's best for you.
Which means: collapse on the ground,
be a vegetable, rot, give in to gravity.
Instead you do all these unpleasant things:
you get up in the morning,
you eat, you have sex,
you do all these crazy things.
And it's only because the
motivational system forces you to.
The motivational system
takes this bunch of matter
and makes us do all these strange things,
just so genomes get replicated and so on.
And to do this, we build up
resistance against the world.
And the motivational system
is in a sense forcing us
to do all these things by giving us needs,
and the needs have some kind
of target value and current value.
If we have a differential
between the target value and the current value,
we perceive some urgency
to do something about the need.
And when the current value
approaches the target value,
we get pleasure, which is a learning signal.
If it moves away from it,
we get a displeasure signal,
which is also a learning signal.
And we can use this to structure
our understanding of the world,
to understand what goals are and so on.
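A minimal sketch of that mechanism (details mine, in the spirit of what he describes): a need with a target value and a current value, urgency as the gap, and pleasure or displeasure as the change of the gap:

```python
class Need:
    def __init__(self, name, target, current):
        self.name, self.target, self.current = name, target, current

    def urgency(self):
        # The differential between target value and current value.
        return abs(self.target - self.current)

    def update(self, new_current):
        # Positive when the current value approaches the target (pleasure),
        # negative when it moves away (displeasure); both are learning signals.
        old_gap = self.urgency()
        self.current = new_current
        return old_gap - self.urgency()

food = Need("food", target=1.0, current=0.4)
print(food.urgency())     # 0.6: an urge to do something about it
print(food.update(0.9))   # +0.5: eating produced a pleasure signal
print(food.update(0.7))   # -0.2: drifting away again, displeasure
```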
Goals are learned. Needs are not.
To learn we need success
and failure in the world.
But to do things we need anticipated reward.
So it's dopamine that makes the brain go round.
Dopamine makes you do things.
But in order to do this in the right way,
you have to make sure
that the cells cannot
produce dopamine themselves.
If they do, they can start
to drive others to work for them.
You are going to get something like a
bureaucracy in your neocortex,
where different bosses try
to set up others to do their own bidding
and pitch them against other groups in the neocortex.
It's going to be horrible.
So you want to have some kind of central authority
that makes sure that the cells
do not produce dopamine themselves.
It's only produced in a
very small area and then given out
and passed through the system.
And after you're done with it, it's going to be gone,
so there is no hoarding of the dopamine.
And in our society the role of dopamine
is played by money.
Money is not a reward in itself.
It's in some sense a way
that you can trade against reward.
You cannot eat money.
You can take it later and get
an arbitrary reward for it.
And in some sense money is the dopamine
that makes organizations
and society, companies
and many individuals do things.
They do stuff because of money.
But money, if you compare it to dopamine,
is pretty broken,
because you can hoard it.
So you are going to have these
cortical columns in the real world,
which are individual people
or individual corporations.
They are hoarding the dopamine;
they sit on this very big pile of dopamine.
They are starving the rest
of society of the dopamine.
They don't give it away,
and they can make society do their bidding.
So for instance they can pitch a
substantial part of society
against the understanding of global warming,
because they profit off global warming,
or off technology that leads to global warming,
which is very bad for all of us. applause
So our society is a nervous system
that lies to itself.
How can we overcome this?
Actually, we don't know.
To do this we would need
to have some kind of centralized,
top-down reward motivational system.
We have this for instance in the military:
you have this system of
military rewards that you get,
and these are completely
controlled from the top.
Also within working organizations
you have this.
In corporations you have centralized rewards;
it's not like rewards flow bottom-up,
they always flow top-down.
And there was an attempt
to model society in such a way.
That was in Chile in the early 1970s:
the Allende government had the idea
to redesign the economy and
society using cybernetics.
So Allende invited a bunch of cyberneticians
to redesign the Chilean economy.
And this was meant to be the control room
where Allende and his chief economists
would be sitting,
to look at what the economy is doing.
We don't know how this would have worked out,
because we know how it ended.
In 1973 there was this big putsch in Chile,
and, among other things, this experiment ended.
Maybe it would have worked, who knows?
Nobody tried it.
So, there is something else
going on in people,
beyond the motivational system.
That is: we have social criteria for learning.
We also check if our ideas
are normatively acceptable.
And this is actually a good thing,
because individuals may shortcut
learning through communication.
Other people have learned stuff
that we don't need to learn ourselves.
We can build on this, so we can accelerate
learning by many orders of magnitude,
which makes culture possible.
And which makes almost anything possible,
because if you were on your own,
you would not find out
very much in your lifetime.
You know how they say:
everything that you do,
you do by standing on the shoulders of giants.
Or on a big pile of dwarfs;
it works either way.
laughter and applause
Social learning usually outperforms
individual learning. You can test this.
But in the case of conflict
between different social truths,
you need some way to decide whom to believe.
So you have some kind of reputation
estimate for different authorities,
and you use this to check whom you believe.
And the problem of course is that
in existing societies, in real societies,
this reputation system is going
to reflect power structures,
which may distort your beliefs systematically.
Social learning therefore leads groups
to synchronize their opinions.
And the opinions get another role:
they become an important part
of signalling which group you belong to.
So opinions start to signal
group loyalty in societies.
And people in this actual world
should optimize not for getting the best possible
opinions in terms of truth;
they should optimize
for having the best possible opinion
with respect to agreement with their peers.
If you have the same opinion
as your peers, you can signal to them
that you are part of their ingroup,
and they are going to like you.
If you don't do this, chances are
they are not going to like you.
There is rarely any benefit in life to being
in disagreement with your boss, right?
So, if you evolve an opinion-forming system
in these circumstances,
you should end up
with an opinion-forming system
that leaves you with the most useful opinion,
which is the dominant opinion in your environment.
And it turns out most people are able
to do this effortlessly.
laughter
They have an instinct that makes them adopt
the dominant opinion in their social environment.
It's amazing, right?
And if you are a nerd like me,
you don't get this.
laughter and applause
So in the world out there,
explanations piggyback on your group allegiance.
For instance, you will find that there is a
substantial group of people that believes
the minimum wage is good
for the economy and for you,
and another one that believes it's bad.
And it's pretty much aligned
with political parties.
It's not aligned with different
understandings of the economy,
because nobody understands
how the economy works.
And if you are a nerd, you try to understand
the world in terms of what is true and false.
You try to prove everything by putting it
on some kind of true-and-false level.
And if you are not a nerd,
you try to get to right and wrong;
you try to understand
whether you are in alignment
with what's objectively right
in your society, right?
So I guess that nerds are people that have
a defect in their opinion-forming system.
laughing
And usually that's maladaptive,
and under normal circumstances
nerds would mostly be filtered
out of the world,
because they don't reproduce so well,
because people don't like them so much.
laughing
And then something very strange happened.
The computer revolution came along, and
suddenly, if you argue with a computer,
it doesn't help you if you have the
normatively correct opinion; you need to
be able to understand things in terms of
true and false, right? applause
So now we have this strange situation that
the weird people that have these offensive,
strange opinions, and that really don't
mix well with the real, normal people,
get all these high-paying jobs,
and we don't understand how that is happening.
And it's because suddenly
our maladaptation is a benefit.
But out there, there is this world of
social norms, and it's made of paper walls.
There are all these things that are true
and false in a society that make
people behave.
It's like these Japanese walls there:
they made palaces out of paper, basically.
And these are walls by convention.
They exist because people agree
that this is a wall.
And if you are a hypnotist
like Donald Trump,
you can see that these are paper walls
and you can shift them.
And if you are a nerd like me,
you cannot see these paper walls.
If you pay close attention, you see that
people move, and then suddenly, mid-air,
they make a turn. Why would they do this?
There must be something
that they see there,
and this is basically a normative agreement.
And you can infer what this is,
and then you can manipulate it and understand it.
Of course you can't fix this completely,
but you can debug yourself in this regard;
still, it's something that is hard
for nerds to see.
So in some sense nerds have a superpower:
they can think straight in the presence
of others.
But often they end up in their living room
and people are upset.
laughter
Learning in a complex domain cannot
guarantee that you find the global maximum.
We know that we cannot find truth,
because we cannot recognize whether we live
on a playing field or on a
simulated playing field.
But what we can do is try to
approach a maximum.
But we don't know if that
is the global maximum.
We will always move along
some kind of belief gradient.
We will take certain elements of
our belief and then give them up
for new elements of belief, based on
thinking that this new element
of belief is better than the one
we give up.
So we always move along
some kind of gradient,
and the truth does not matter;
the gradient matters.
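The picture in miniature (landscape invented for illustration): each step trades the current belief for a neighboring one that scores better, so you climb, but only to whatever local maximum your starting point leads to:

```python
def usefulness(belief):
    # Two peaks, near 2 and near 8, with a dip between them.
    return 10 - (belief - 2) ** 2 * (belief - 8) ** 2 / 50

def climb(belief, step=0.1):
    while True:
        best = max(belief - step, belief + step, key=usefulness)
        if usefulness(best) <= usefulness(belief):
            return belief         # no neighbor looks better: a local maximum
        belief = best

print(round(climb(0.0), 1))   # ends near 2: one valley of beliefs
print(round(climb(9.0), 1))   # ends near 8: a different, neighboring valley
```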
If you think about teaching for a moment:
when I started teaching, I often thought:
okay, I understand the truth of the
subject, the students don't, so I have to
give this to them.
And at some point I realized:
oh, I changed my mind so many times
in the past, and I'm probably not going to
stop changing it in the future.
I'm always moving along a gradient,
and I keep moving along a gradient.
So I'm not moving towards truth,
I'm moving forward.
And when we teach our kids,
we should probably not think about
how to give them truth.
We should think about how to put them onto
an interesting gradient that makes them
explore the world,
the world of possible beliefs.
applause
And these possible beliefs
lead us into local minima.
This is inevitable. These are like valleys,
and sometimes these valleys are
neighbouring, and we don't understand
what the people in the neighbouring
valley are doing, unless we are willing to
retrace the steps they have taken.
And if we want to get from one valley
into the next, we will have to have some kind
of energy that moves us over the hill.
We have to have a trajectory where every
step works by finding a reason to give up a
bit of our current belief and adopt a
new belief, because it's somehow
more useful, more relevant,
more consistent, and so on.
Now the problem is that this is not
monotonic: we cannot guarantee that
we're always climbing,
because the problem is that
the beliefs themselves can change
our evaluation of the beliefs.
It could be, for instance, that you start
believing in a religion, and this religion
could tell you: if you give up the belief
in the religion, you're going to face
eternal damnation in hell.
As long as you believe in the religion,
it's going to be very expensive for you
to give up the religion, right?
If you truly believe in it.
You're now caught
in some kind of attractor.
Before you believe the religion, it is not
very dangerous, but once you've gotten
into the attractor, it's very,
very hard to get out.
So these belief attractors
are actually quite dangerous.
Not only can you get to chaotic behaviour,
where you cannot guarantee that your
current belief is better than the last one,
but you can also get into beliefs that are
almost impossible to change.
And that makes it possible to program
people to work in societies.
Social domains are structured by values.
Basically, a preference is what makes you
do things because you anticipate
pleasure or displeasure,
and values make you do things
even if you don't anticipate any pleasure.
These are virtual rewards.
They make us do things, because we believe
that there is stuff
that is more important than us.
This is what values are about.
And these values are the source
of what we would call true meaning, deeper meaning:
there is something that is more important
than us, something that we can serve.
This is what we usually perceive as a
meaningful life: one which
is in the service of values that are more
important than myself,
because after all, I'm not that important.
I'm just this machine that runs around
and tries to optimize its pleasure and
pain, which is kind of boring.
So my PI has puzzled me, my principal
investigator at the Harvard department
where I have my desk, Martin Nowak.
He said that meaning cannot exist without
God; you are either religious,
or you are a nihilist.
And this guy is the head of the
department for evolutionary dynamics.
Also, he is a Catholic. chuckling
So this really puzzled me, and I tried
to understand what he meant by this.
Typically, if you are a good atheist
like me,
you tend to attack gods that are
structured like this: religious gods
that are institutional, that are personal,
that are some kind of person.
They do care about you; they prescribe
norms, for instance: don't masturbate,
it's bad for you.
Many of these norms are very much aligned
with societal institutions, for instance:
don't question the authorities,
God wants them to be ruling above you,
and be monogamous, and so on and so on.
So they prescribe norms that do not make
a lot of sense in terms of a being that
creates worlds every now and then,
but they make sense in terms of
what you should be doing to be a
functioning member of society.
And these gods also do things like
creating worlds; they like to manifest as
burning shrubbery and so on. There are
many books that describe stories of what
these gods have allegedly done.
And it's very hard to test for all these
features, which makes these gods very
improbable for us, and makes atheists
very dissatisfied with these gods.
But then there is a different kind of god.
This is what we call the spiritual god.
This spiritual god is independent of
institutions; it still does care about you.
It's probably conscious. It might not be a
person. There are not that many stories
that you can consistently tell about it,
but you might be able to connect to it
spiritually.
Then there is a god that is even less
expensive. That is god as a transcendental
principle, and this god is simply the reason
why there is something rather than
nothing. This god is the question the
universe is the answer to; this is the
thing that gives meaning.
Everything else about it is unknowable.
This is the god of Thomas Aquinas.
The god that Thomas Aquinas discovered
is not the god of Abraham; this is not the
religious god.
It's a god that is basically a principle
that brings the universe into existence.
It's the one that gives
the universe its purpose.
And because every other property
of it is unknowable,
this god is not that expensive.
Unfortunately, it doesn't really work.
I mean, Thomas Aquinas tried to prove
God. He tried to prove a necessary god,
a god that has to exist, and
I think we can only prove a possible god.
So if you try to prove a necessary god,
this god cannot exist,
which means your god proof is going to
fail. You can only prove possible gods.
And then there is an even less expensive god.
And that's the god of Aristotle, and he said:
"If there is change in the universe,
something is going to have to change it."
There must be something that moves it
along from one state to the next.
So I would say that this is the primary
computational transition function
of the universe.
laughter and applause
And Aristotle discovered it.
It's amazing, isn't it?
We have to have this, because we
cannot be conscious in a single state.
We need to move between states
to be conscious.
We need to be processes.
So we can take our gods and sort them by
their metaphysical cost.
The 1st-degree god would be the first mover.
The 2nd-degree god is the god of purpose and meaning.
The 3rd-degree god is the spiritual god.
And the 4th-degree god is the one bound to
religious institutions, right?
So if you take this statement
from Martin Nowak,
"You cannot have meaning without God!",
I would say: yes! You need at least
a 2nd-degree god to have meaning.
So objective meaning can only exist
with a 2nd-degree god. chuckling
And subjective meaning can exist as a
function in a cognitive system, of course.
We don't need objective meaning.
So we can subjectively feel that there is
something more important than us,
and this makes us work in society and
makes us perceive that we have values
and so on, but we don't need to believe
that there is something outside of the
universe to have this.
So the 4th-degree god is the one
that is bound to religious institutions;
it requires a belief attractor, and it
enables complex norm prescriptions.
If my theory is right, then it should be
much harder for nerds to believe in
a 4th-degree god than for normal people.
And what this god does is allow you to
have state-building mind viruses.
Basically, religion is a mind virus. And
the amazing thing about these mind viruses
is that they structure behaviour
in large groups.
We have evolved to live in small groups
of a few hundred individuals, maybe something
like 150.
This is roughly the level
to which reputation works.
We can keep track of about 150 people, and
after this it gets much, much worse.
So in this system where you have
reputation, people feel responsible
for each other, and they can
keep track of each other's doings,
and society kind of sort of works.
If you want to go beyond this, you have
to write software that controls people.
And religions were the first software
that did this on a very large scale.
And in order to stay stable, they had to be
designed like operating systems,
in some sense.
They give people different roles,
like insects in a hive.
And part of these roles is even
to update the religion, but it has to be
done very carefully and centrally,
because otherwise the religion will split apart
and fall into new religions,
or be overcome by new ones.
So there is some kind of
evolutionary dynamics that goes on
with respect to religions.
And if you look at religions,
there is actually a veritable evolution
of religions.
So we have this Israelite tradition and
the Mesopotamian mythology that gave rise
to Judaism. applause
It's kind of cool, right? laughter
Also, history totally repeats itself.
roaring laughter and applause
Yeah, it totally blew my mind when
I discovered this. laughter
Of course, the real tree of programming
languages is slightly more complicated,
and the real tree of religions is slightly
more complicated.
But still, it's neat.
So if you want to immunize yourself
against mind viruses,
first of all you want to check
whether you are infected.
You should check: can I let go of my
current beliefs without feeling that
meaning departs from me, and without
feeling terrible when I let go of them?
Also, you should check: all the other
people out there that don't
share my belief, are they either stupid,
or crazy, or evil?
If you think this, chances are you are
infected by some kind of mind virus,
because they are just part
of the outgroup.
And does your god have properties that
you know but that you did not observe?
So basically, do you have a god
of 2nd or 3rd degree or higher?
In this case you also probably got a mind virus.
There is nothing wrong
with having a mind virus,
but if you want to immunize yourself
against this, people have invented
rationalism and enlightenment,
basically to act as an immunization against
mind viruses.
loud applause
And in some sense it's what the mind does
by itself, because if you want to
understand how you go wrong,
you need to have a mechanism
that discovers who you are,
some kind of auto-debugging mechanism
that makes the mind aware of itself.
And this is actually the self.
So according to Robert Kegan,
the development of our self is a process
in which we learn who we are by making
things explicit: by making processes that
are automatic visible to us, and by
conceptualizing them so we no longer
identify with them.
And it starts out with understanding
that there is only pleasure and pain.
If you are a baby, you have only
pleasure and pain, and you identify with this.
And then you turn into a toddler, and the
toddler understands that they are not
their pleasure and pain,
but they are their impulses.
And at the next level, if you grow beyond
the toddler age, you actually know that
you have goals, and that your needs and
impulses are there to serve goals; but it's
very difficult to let go of the goals
if you are a very young child.
And at some point you realize: oh, the
goals don't really matter, because
sometimes you cannot reach them, but
we have preferences: we have things that we
want to happen and things that we do not
want to happen. And then at some point
we realize that other people have
preferences, too.
And then we start to model the world
as a system where different people have
different preferences, and we have
to navigate this landscape.
And then we realize that these preferences
also relate to values, and we start
to identify with these values as members of
society.
And this is basically the stage that you
get into if you are an adult being.
And you can get to a stage beyond that,
especially if you have people around you who
have already done this. And this means
that you understand that people have
different values, and what they do
naturally flows out of them.
And these values are not necessarily worse
than yours; they are just different.
And you learn that you can hold different
sets of values in your mind at
the same time (isn't that amazing?)
and understand other people, even if
they are not part of your group.
If you get that, this is really good.
But I don't think it stops there.
You can also learn that the stuff that
you perceive is kind of incidental,
that you can turn it off and you can
manipulate it.
And at some point you can also realize
that your self is only incidental, that you
can manipulate it or turn it off.
And that you are basically some kind of
consciousness that happens to run on the brain
of some kind of person that navigates
the world to get rewards, avoid
displeasure, serve values, and so on,
but it doesn't really matter.
There is just this consciousness which
understands the world.
And this is the stage that we typically
call enlightenment.
In this stage you realize that you are not
your brain, but you are a story that
your brain tells itself.
applause
So becoming self-aware is a process of
reverse-engineering your mind.
It's a sequence of stages in which
you realize what goes on.
So isn't that amazing:
AI is a way to get to more self-awareness.
I think that is a good point to stop here.
The first talk that I gave in this series
was 2 years ago. It was about
how to build a mind.
Last year I talked about how to get from
basic computation to consciousness.
And this year we have talked about
finding meaning using AI.
I wonder where it goes next.
laughter
applause
Herald: Thank you for this amazing talk!
We now have some minutes for Q&A.
So please line up at the microphones, as
always. If you are unable to stand up
for some reason, please very, very visibly
raise your hand; we should be able to dispatch
an audio angel to your location,
so you can ask a question too.
And also, if you are locationally
disabled, that is, you are not actually in the room
but on the stream, you can use IRC
or Twitter to also ask questions.
We also have a person for that.
We will start at microphone number 2.
Q: Wow, that's me. Just a guess: what
would you guess, when can you discuss
your talk with a machine,
in how many years?
Joscha: I don't know! As a software
engineer I know that if I don't have the
specification, all bets are off until I
have the implementation. laughter
So it can be off by any order of magnitude.
I have a gut feeling, but I also know as a
software engineer that my gut feeling is
usually wrong, laughter
until I have the specification.
So the question is whether there are silver
bullets. Right now there are some things
that are not solved yet, and it could be
that they are easier to solve
than we think, but it could be that
they're harder to solve than we think.
Before I stumbled on this cortical
self-organization thing,
I thought it's going to be something like
maybe 60, 80 years, and now I think it's
way less. But again, this is a very
subjective perspective. I don't know.
Herald: Number 1, please!
Q: Yes, I wanted to ask a little bit about
metacognition. It seems that you kind of
end your story saying that it's still
reflecting on input that you get, and
kind of working with your social norms
and this and that. But Kohlberg,
for instance, talks about what he calls a
postconventional universal morality,
which is thinking about
moral laws without context, basically
stating that there is something beyond the
relative norms that we have to each other,
which would only be possible if you can do,
kind of, you know, metacognition:
thinking about your own thinking
and then modifying that thinking.
So kind of feeding back your own ideas
into your own mind and coming up with
stuff that you actually can't get by just,
well, processing external inputs.
Joscha: Mhm! I think it's very tricky.
This project of defining morality without
societies exists longer than Kant, of
course. And Kant tried to give these
internal rules, and others tried to.
I find this very difficult.
From my perspective, we are just moving
bits of rocks. And these bits of rocks, they
are on some kind of dust mote in a galaxy
out of trillions of galaxies, and how can
there be meaning?
It's very hard for me to say:
one chimpanzee species is better than
another chimpanzee species, or
a particular monkey
is better than another monkey.
This only happens
within a certain framework,
and we have to set this framework.
And I don't think that we can define this
framework outside of a context of
social norms that we have to agree on.
So objectively, I'm not sure
if we can get to ethics.
I only think that it is possible based on
some kind of framework that people
have to agree on, implicitly or explicitly.
Herald: Microphone number 4, please.
Q: Hi, thank you, it was a fascinating talk.
I have two thoughts that went through my mind.
And the first one is that the models
that you present are so convincing,
but it's kind of like you present
another metaphor for understanding the
brain, which is still something that we try
to grasp on different levels of science,
basically. And the second one is that your
definition of the nerd who walks
and doesn't see the walls kind of
reminds me of
Richard Rorty's definition of the ironist,
which is a person who knows that their
vocabulary is finite and that other people
also have a finite vocabulary, and
that obviously opens up the whole question
of meaning-making, which has been
discussed in so many
other disciplines and fields.
And I thought about Derrida's
deconstruction of ideas and thoughts, and
Butler, and then down the rabbit hole to
Nietzsche. And I was just wondering
if you could maybe
map out other connections,
where basically not AI is helping us to
understand the mind, but where
already existing huge, huge fields of
science, like cognitive science,
coming from the other end, could help us
to understand AI.
Joscha: Thank you. The traditions that you
mentioned, Rorty and Butler and so on,
are part of a completely different belief
attractor, in my current perspective.
That is, they are mostly
social constructionists.
They believe that reality, at least in the
domains of the mind and sociality,
is a social construct, part
of a social agreement.
Personally, I don't think that
this is the case.
I think that the patterns that we refer to
are mostly independent of your mind.
The norms are part of social constructs,
but, for instance, our motivational
preferences that make us adopt or
reject norms are something that builds up
resistance to the environment.
So they are probably not part
of a social agreement.
And the only thing I can invite you to do is
to try to retrace both of the different
belief attractors, try to retrace the
different paths on the landscape.
All these things that I tell you, all of
this, is of course very speculative.
These are models that seem to be logical
to me at this point in my life.
And I try to give you the arguments
why I think this is plausible. But don't
believe in them; question them, challenge
them, see if they work for you!
I'm not giving you any truth.
I'm just going to give you suitable encodings,
according to my current perspective.
Q: Thank you!
applause
Herald: The internet, please!
Signal angel: So, someone is asking,
in this belief space you're talking about,
how is it possible
to get out of local minima?
And a very related question as well:
should we teach some momentum method
to our children,
so we don't get stuck in local minima?
Joscha: I believe at some level it's not
possible to get out of a local minimum
in an absolute sense, because you only get
into some kind of meta-minimum.
But what you can do is retrace the
path that you took, whenever you discover
that somebody else has a fundamentally
different set of beliefs.
And if you realize that this person is
basically a smart person, that is not
completely insane, but has reasons to
believe in their beliefs, and they seem to
be internally consistent, it's usually
worth retracing what they
have been thinking and why.
And this means you have to understand
where their starting point was and
how they moved from their starting point
to their current point.
You might not be able to do this
accurately, and the important thing is:
even after you discover a second
valley, you haven't discovered
the landscape in between.
But the only way that we can get an idea
of the lay of the land is to try to
retrace as many paths as possible.
And if we try to teach our children, what
I think we should be doing is
to tell them how to explore
this world on their own.
It's not that we tell them: this is the
valley, it's given, it's
the truth. Instead we have to tell
them: this is the path that we took,
and these are the things that we saw
in between. And it is important not to be
completely naive when we go into this
landscape, but we also have to understand
that it's always an exploration that
never stops, and that it might change
everything that you believe now
at a later point.
So for me it's about teaching my own
children how to be explorers,
how to understand that knowledge is always
changing and is always a moving frontier.
applause
Herald: We are unfortunately out of time.
So, please once again thank Joscha!
applause
Joscha: Thank you!
applause
postroll music
subtitles created by c3subtitles.de
Join, and help us!