WEBVTT 00:00:18.810 --> 00:00:23.210 Herald: I have the great pleasure to announce Joscha, who will give us a great 00:00:23.210 --> 00:00:26.310 talk with the title "The Ghost in the Machine" and he will talk about 00:00:26.310 --> 00:00:33.200 consciousness of our mind and of computers and somehow also tell us how we can learn 00:00:33.200 --> 00:00:38.080 from A.I. systems about our own brains. And I think this is a very curious question. 00:00:38.080 --> 00:00:41.015 So please give it up for Joscha. 00:00:41.015 --> 00:00:51.010 Applause 00:00:51.010 --> 00:00:58.900 Joscha: Good evening. This is the fifth talk in a series of talks on how to 00:00:58.900 --> 00:01:03.930 get from computation to consciousness and to understand our condition in the 00:01:03.930 --> 00:01:09.180 universe based on concepts that I mostly learned by looking at artificial 00:01:09.180 --> 00:01:16.530 intelligence and computation and it mostly tackles the big philosophical questions: 00:01:16.530 --> 00:01:20.410 What can I know? What is true? What is truth? Who am I? Which means the questions 00:01:20.410 --> 00:01:25.660 of epistemology, of ontology, of metaphysics, and philosophy of mind and 00:01:25.660 --> 00:01:26.710 ethics. 00:01:26.710 --> 00:01:30.603 And to clear up some of the terms that we are using here: 00:01:30.603 --> 00:01:34.300 What is intelligence? What's a mind? What's a self? What's consciousness? 00:01:34.300 --> 00:01:37.740 How are mind and consciousness realized in the universe? 00:01:37.740 --> 00:01:40.280 Intelligence I think is the ability to make models. 00:01:40.280 --> 00:01:42.450 It's not the same thing as being smart, which is the 00:01:42.450 --> 00:01:46.770 ability to reach your goals or being wise, which is the ability to pick the right 00:01:46.770 --> 00:01:50.680 goals. But it's just the ability to make models of things. 00:01:50.680 --> 00:01:53.980 And you can regulate them later using these models, but you don't have to.
00:01:53.980 --> 00:01:57.308 And the mind is this thing that observes the universe. The self 00:01:57.308 --> 00:02:00.867 is an identification with properties and purposes. 00:02:00.867 --> 00:02:04.120 What a thing thinks it is. And then you have consciousness, which is 00:02:04.120 --> 00:02:08.270 the experience of what it's like to be a thing. 00:02:08.270 --> 00:02:10.749 And how mind and consciousness are realized in the universe, 00:02:10.749 --> 00:02:13.560 this is commonly called the mind-body problem and it's been 00:02:13.560 --> 00:02:20.023 puzzling philosophers and people of all proclivities for thousands of years. 00:02:20.023 --> 00:02:25.360 So what's going on? How's it possible that I find myself in a universe and I seem to 00:02:25.360 --> 00:02:31.130 be experiencing myself in that universe? How does this go together and how is this, 00:02:31.130 --> 00:02:37.260 what's going on here? The traditional answer to this is called dualism and the 00:02:37.260 --> 00:02:41.510 conception of dualism is that - in our culture at least, this dualist idea that 00:02:41.510 --> 00:02:45.620 you have a physical world and a mental world and they coexist somehow and my mind 00:02:45.620 --> 00:02:49.620 experiences this mental world and my body can do things in the physical world and 00:02:49.620 --> 00:02:53.860 the difficulty of this dualist conception is how these two planes of existence 00:02:53.860 --> 00:02:57.750 interact. Because physics is defined as causally closed, everything that 00:02:57.750 --> 00:03:03.340 influences things in the physical world is by itself an element of physics. So an 00:03:03.340 --> 00:03:07.410 alternative is idealism which says that there is only a mental world. We only 00:03:07.410 --> 00:03:12.460 exist in a dream and this dream is being dreamt by a mind on a higher plane of 00:03:12.460 --> 00:03:17.700 existence.
And the difficulty with this is that it's very hard to explain that mind on a higher 00:03:17.700 --> 00:03:22.430 plane of existence. Just put it there, why is it doing this? And in our culture the 00:03:22.430 --> 00:03:27.040 dominant theory is materialism, which is basically that there is only a physical world, 00:03:27.040 --> 00:03:32.100 nothing else. And the physical world somehow is responsible for the creation of 00:03:32.100 --> 00:03:36.700 the mental world. It's not quite clear how this happens. And the answer that I am 00:03:36.700 --> 00:03:44.110 suggesting is functionalism, which means that indeed we exist only in a dream. 00:03:44.110 --> 00:03:48.630 So these ideas of materialism and idealism are not in opposition. They are 00:03:48.630 --> 00:03:51.960 complementary because this dream is being dreamt by a mind on a higher plane of 00:03:51.960 --> 00:03:57.010 existence, but this higher plane of existence is the physical world. So we are 00:03:57.010 --> 00:04:02.660 being dreamt in the neocortex of a primate that lives in a physical universe and the 00:04:02.660 --> 00:04:05.780 world that we experience is not the physical world. It's a dream generated by 00:04:05.780 --> 00:04:10.120 the neocortex - the same circuits that make dreams at night make them during the 00:04:10.120 --> 00:04:13.850 day. You can show this, and you live in this virtual reality being generated in 00:04:13.850 --> 00:04:18.430 there and the self is a character in that dream. And it seems to take care of 00:04:18.430 --> 00:04:21.520 things. It seems to explain what's going on. It explains why a miracle seems to be 00:04:21.520 --> 00:04:26.070 possible and why I can look into the future but cannot break the bank somehow. 00:04:26.070 --> 00:04:31.480 And even though this theory explains this, shouldn't I be more agnostic? Are 00:04:31.480 --> 00:04:35.220 there not alternatives that I should be considering?
Maybe the narratives of our 00:04:35.220 --> 00:04:40.889 big religions and so on. I think we should be agnostic. So the first rule of 00:04:40.889 --> 00:04:46.110 epistemology says that the confidence in the belief must equal the weight of the 00:04:46.110 --> 00:04:49.311 evidence supporting it. Once we stumble on that rule you can test all the 00:04:49.311 --> 00:04:54.130 alternatives and see if one of them is better. And I think what this means is you 00:04:54.130 --> 00:04:57.540 have to have all the possible beliefs, you should entertain them all. But you should 00:04:57.540 --> 00:05:01.050 not have any confidence in them. You should shift your confidence around based 00:05:01.050 --> 00:05:05.560 on the evidence. So for instance it is entirely possible that this universe was 00:05:05.560 --> 00:05:09.140 created by a supernatural being, and it's a big conspiracy, and it actually has 00:05:09.140 --> 00:05:12.900 meaning and it cares about us and our existence here means something. 00:05:12.900 --> 00:05:17.381 But um, there is no experiment that can validate this. A guy coming down from a 00:05:17.381 --> 00:05:21.160 burning mount, from a burning bush, that you've talked to on a 00:05:21.160 --> 00:05:28.370 mountaintop? That's not a kind of experiment that gives you valid evidence, right? 00:05:28.370 --> 00:05:32.560 So intelligence is the ability to make models and intelligence is a property 00:05:32.560 --> 00:05:36.730 that is beyond the grasp of a single individual. A single individual is not 00:05:36.730 --> 00:05:41.090 that smart. We cannot even figure out Turing-complete languages all by ourselves. 00:05:41.090 --> 00:05:45.270 To do this you need an intellectual tradition that lasts a few hundred years 00:05:45.270 --> 00:05:49.600 at least. So civilizations have more intelligence than individuals.
But 00:05:49.600 --> 00:05:54.320 individuals often have more intelligence than groups and whole generations and 00:05:54.320 --> 00:05:58.830 that's because groups and generations tend to converge on ideas; they have consensus 00:05:58.830 --> 00:06:03.400 opinions. I'm very wary of consensus opinions because you know how hard it is 00:06:03.400 --> 00:06:06.480 to understand which programming language is the best one for which purpose. There 00:06:06.480 --> 00:06:09.830 is no proper consensus. And that's a relatively easy problem. So when there's a 00:06:09.830 --> 00:06:13.919 complex topic and all the experts agree, there are forces at work that are 00:06:13.919 --> 00:06:17.230 different than the forces that make them search for truth. These consensus-building 00:06:17.230 --> 00:06:21.479 forces, they're very suspicious to me. And if you want to understand what's true you 00:06:21.479 --> 00:06:24.840 have to look for means and motive. And you have to be autonomous in doing this, so 00:06:24.840 --> 00:06:29.229 individuals typically have better ideas than generations or groups. But as I 00:06:29.229 --> 00:06:32.670 said, civilizations have more intelligence than individuals. What does a 00:06:32.670 --> 00:06:36.860 civilizational intellect look like? The civilizational intellect is something like a 00:06:36.860 --> 00:06:40.160 global optimum of the modeling function. It's something that has to be built over 00:06:40.160 --> 00:06:43.610 thousands of years in an unbroken intellectual tradition. And guess what, 00:06:43.610 --> 00:06:47.100 this doesn't really exist in human history. Every few hundred years, there's 00:06:47.100 --> 00:06:51.350 some kind of revolution. Somebody opens the doors to the knowledge factories and 00:06:51.350 --> 00:06:54.790 gets everybody out and burns down the libraries.
And a couple generations later, 00:06:54.790 --> 00:06:58.830 the knowledge worker drones of the new king realize "Oh my God we need to rebuild 00:06:58.830 --> 00:07:02.720 this thing, this intellect." And then they create something in its likeness, but they 00:07:02.720 --> 00:07:07.760 make mistakes in the foundation. So this intellect tends to have scars. Like our 00:07:07.760 --> 00:07:11.539 civilizational intellect has a lot of scars in it that make it difficult 00:07:11.539 --> 00:07:16.510 to understand concepts like self and consciousness and mind. So, the mind 00:07:16.510 --> 00:07:19.680 is something that observes the universe, and the neurons and neurotransmitters are 00:07:19.680 --> 00:07:22.860 the substrate. And the human intellect and the working memory is the current binding 00:07:22.860 --> 00:07:26.931 state, how do the different elements fit together in our mind? And the self is the 00:07:26.931 --> 00:07:31.169 identification with what we think we are and what we want to happen. And consciousness 00:07:31.169 --> 00:07:35.270 is the contents of our attention, it makes knowledge available throughout the mind. 00:07:35.270 --> 00:07:39.419 And civilizational intellect is very similar: societies observe the universe, 00:07:39.419 --> 00:07:42.160 people and resources are the substrate, the generation is the current binding 00:07:42.160 --> 00:07:46.860 state, and culture is the identification with what we think we are and what we want 00:07:46.860 --> 00:07:51.840 to happen. And media is the contents of our attention and makes knowledge available 00:07:51.840 --> 00:07:55.930 throughout society. So the culture is basically the self of civilization, and 00:07:55.930 --> 00:08:00.490 media is its consciousness. How is it possible to model a universe? Let's take a 00:08:00.490 --> 00:08:04.771 very simple universe like the Mandelbrot fractal. It can be defined by a little bit 00:08:04.771 --> 00:08:09.490 of code.
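A minimal sketch of such a "little bit of code", assuming the pair of numbers is treated as one complex number `c`, the repeated step is `z = z*z + c` starting from zero, and "infinitely often" is cut off after a fixed number of rounds. The names `stays_bounded` and `render`, the cutoff of 50 rounds, the escape radius of 2 and the sampled window are choices made for this illustration, not something stated in the talk.

```python
def stays_bounded(c: complex, max_rounds: int = 50) -> bool:
    """Approximate membership test: square, add c, repeat, and watch for escape."""
    z = 0j
    for _ in range(max_rounds):
        z = z * z + c
        if abs(z) > 2.0:  # once |z| exceeds 2 the sequence runs off to infinity
            return False
    return True  # did not escape within the cutoff, so we draw a black dot


def render(width: int = 60, height: int = 24, max_rounds: int = 50) -> str:
    """Sample the window [-2, 1] x [-1.2, 1.2] and draw '#' where the value stays bounded."""
    rows = []
    for j in range(height):
        im = 1.2 - 2.4 * j / (height - 1)
        row = ""
        for i in range(width):
            re = -2.0 + 3.0 * i / (width - 1)
            row += "#" if stays_bounded(complex(re, im), max_rounds) else " "
        rows.append(row)
    return "\n".join(rows)


if __name__ == "__main__":
    print(render())
```

Running the file prints a coarse ASCII picture in which `#` marks the black dots; a finer grid and a higher cutoff reveal more of the outline.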
It's a very simple thing, you just take a pair of numbers, you square it, you 00:08:09.490 --> 00:08:13.760 add the same pair of numbers. And you do this infinitely often, and typically this 00:08:13.760 --> 00:08:18.940 goes to infinity very fast. There's a small area around the origin of the number 00:08:18.940 --> 00:08:24.680 pair, so between -1 and +1 and so on, where you have an area where this 00:08:24.680 --> 00:08:28.330 converges, where it doesn't go to infinity and that is where you make black dots and 00:08:28.330 --> 00:08:33.250 then you get this famous structure, the Mandelbrot fractal. And because this 00:08:33.250 --> 00:08:37.229 divergence and convergence of the function can take many loops and circles and so on, 00:08:37.229 --> 00:08:41.169 you get a very complicated shape, a very complicated outline, an infinitely 00:08:41.169 --> 00:08:44.709 complicated outline there. So there is an infinite amount of structure in this 00:08:44.709 --> 00:08:47.990 fractal. And now imagine you happen to live in this fractal and you are in a 00:08:47.990 --> 00:08:52.529 particular place in it, and you don't know where that place is. You 00:08:52.529 --> 00:08:55.189 don't even know the generator function of the whole thing. But you can still predict 00:08:55.189 --> 00:08:58.350 your neighborhood. So you can see, omg, I'm in some kind of a spiral, it turns 00:08:58.350 --> 00:09:01.629 to the left, goes to the left, and goes to the left, and becomes smaller, so we can 00:09:01.629 --> 00:09:05.660 predict and suddenly it ends. Why does it end? A singularity. Oh, it hits another 00:09:05.660 --> 00:09:09.290 spiral. There's a law when a spiral hits another spiral, it ends. And something 00:09:09.290 --> 00:09:14.310 else happens. So you look and then you see oh, there are certain circumstances where 00:09:14.310 --> 00:09:17.360 you have, for instance, an even number of spirals hitting each other instead of an 00:09:17.360 --> 00:09:20.769 odd number.
And then you discover another law. And if you make like 50 levels 00:09:20.769 --> 00:09:25.209 of these laws, this is a good description that locally compresses the 00:09:25.209 --> 00:09:28.509 universe. So the Mandelbrot fractal is locally compressible. You find local 00:09:28.509 --> 00:09:32.110 order that predicts the neighborhood if you are inside of that fractal. The global 00:09:32.110 --> 00:09:35.469 modelling function of the Mandelbrot fractal is very, very easy. It's an 00:09:35.469 --> 00:09:40.009 interesting question: how difficult is the global modelling function of our universe? 00:09:40.009 --> 00:09:43.160 Even if we know it maybe it doesn't help us that much, it will be a big 00:09:43.160 --> 00:09:46.230 breakthrough for physics when we finally find it, it will be much shorter than the 00:09:46.230 --> 00:09:52.610 standard model, as I suspect, but we still don't know where we are. And this means we 00:09:52.610 --> 00:09:55.689 need to make a local model of what's happening. So in order to do this we 00:09:55.689 --> 00:09:59.850 separate the universe into things. Things are small state spaces and transition 00:09:59.850 --> 00:10:04.509 functions that tell you how to get from state to state. And if the function is 00:10:04.509 --> 00:10:08.009 deterministic it is independent of time, it gives the same result every time you 00:10:08.009 --> 00:10:12.600 call it. An indeterministic function gives a different result every time, so 00:10:12.600 --> 00:10:17.139 it doesn't compress well. And causality means that you have several separate 00:10:17.139 --> 00:10:20.139 things and they influence each other's evolution through a shared interface. 00:10:20.139 --> 00:10:24.389 Right? So causality is an artifact of describing the universe as separate 00:10:24.389 --> 00:10:28.019 things.
And the universe is not separate things, it's one thing, but we have to 00:10:28.019 --> 00:10:32.599 describe it as separate things because we cannot observe the whole thing. So what's 00:10:32.599 --> 00:10:36.649 true? There seems to be a particular way in which the universe seems to be and 00:10:36.649 --> 00:10:40.399 that's the ground rules of the universe and it's inaccessible to us. And what's 00:10:40.399 --> 00:10:44.509 accessible to us is our own models of the universe. It's the only thing that we can 00:10:44.509 --> 00:10:47.550 experience, and this is basically a set of theories that can explain the 00:10:47.550 --> 00:10:52.401 observations. And truth in this sense is a property of language and there are 00:10:52.401 --> 00:10:56.689 different languages that we can use like geometry and natural language and so on 00:10:56.689 --> 00:11:00.269 and ways of representing and changing models of our languages and several 00:11:00.269 --> 00:11:06.100 intellectual traditions have developed their own languages. And this has led to 00:11:06.100 --> 00:11:10.259 problems. Our civilization basically has as its founding myth this attempt to build 00:11:10.259 --> 00:11:14.689 this global optimum modelling function. This is a tower that is meant to reach the 00:11:14.689 --> 00:11:18.120 heavens. And it fell apart because people spoke different languages. The different 00:11:18.120 --> 00:11:20.910 practitioners in the different fields didn't understand each other and the 00:11:20.910 --> 00:11:24.559 whole building collapsed. And this is in some sense the origin of our present 00:11:24.559 --> 00:11:28.490 civilization and we are trying to mend this and find better languages. So whom 00:11:28.490 --> 00:11:32.269 can we turn to? We can turn to the mathematicians maybe because mathematics 00:11:32.269 --> 00:11:35.990 is the domain of all languages. Mathematics is really cool when you think 00:11:35.990 --> 00:11:40.009 about it.
It's a universal code library, maintained for several centuries in its 00:11:40.009 --> 00:11:44.069 present form. There is not even version management, it's one version. There is 00:11:44.069 --> 00:11:47.670 a pretty much unified namespace. They have to use a lot of the Unicode to make it 00:11:47.670 --> 00:11:52.040 happen. It's ugly but there you go! It has no central maintainers, not even a code of 00:11:52.040 --> 00:11:54.589 conduct, beyond what you can infer yourself. 00:11:54.589 --> 00:11:57.899 laughter But there are some problems at the 00:11:57.899 --> 00:12:06.060 foundation that they discovered. Shouted from the audience: [in German] a very stable ... 00:12:06.060 --> 00:12:09.869 Joscha: Can you infer this is a good conduct? [inaudible] 00:12:09.869 --> 00:12:17.029 Yelling from the audience: Yes! Joscha: Okay. Power to you. 00:12:17.029 --> 00:12:20.790 laughter Joscha: In 1874 Georg Cantor discovered, when you look 00:12:20.790 --> 00:12:25.399 at the cardinality of a set, when you describe natural numbers using set 00:12:25.399 --> 00:12:30.129 theory, that the cardinality of a set grows slower than the cardinality of the 00:12:30.129 --> 00:12:33.480 set of its subsets. So if you look at the set of the subsets of the set, its cardinality is always 00:12:33.480 --> 00:12:38.209 larger than the number of members of the set. Clear? Right. If 00:12:38.209 --> 00:12:42.170 you take the infinite set, it has infinitely many members: omega. You 00:12:42.170 --> 00:12:45.749 take the cardinality of the set of the subsets of the infinite set, it's also an 00:12:45.749 --> 00:12:49.670 infinite number, but it's a larger one. So it's a number that is larger than the 00:12:49.670 --> 00:12:55.459 previous omega. Okay that's fine. Now we have the cardinality of the set of all 00:12:55.459 --> 00:12:57.899 sets. You make the total set: The set where you put all the sets that could 00:12:57.899 --> 00:13:01.609 possibly exist and put them all together, right?
That has also infinitely many 00:13:01.609 --> 00:13:04.839 members, and it has more than the cardinality of the set of the subsets of 00:13:04.839 --> 00:13:08.769 the infinite set. That's fine. But now you look at the cardinality of the set of all 00:13:08.769 --> 00:13:14.279 the subsets of the total set. The problem is, that the total set also contains the 00:13:14.279 --> 00:13:17.729 set of its subsets, right? It's because it contains all the sets. Now you have a 00:13:17.729 --> 00:13:22.170 contradiction: Because the cardinality of the set of the subsets of the total set is 00:13:22.170 --> 00:13:26.750 supposed to be larger. And yet it seems to be the same set and not the same set. It's 00:13:26.750 --> 00:13:31.990 an issue! So mathematicians got puzzled about this, and the philosopher Bertrand 00:13:31.990 --> 00:13:34.999 Russell said: "Maybe we just exclude those sets that contain themselves", 00:13:34.999 --> 00:13:39.239 right? We only look at the set of sets that don't contain themselves. Isn't that 00:13:39.239 --> 00:13:42.850 a solution? Now the problem is: Does the set of the sets that don't contain 00:13:42.850 --> 00:13:47.445 themselves contain itself? If it does, it doesn't, and if it doesn't, it does. 00:13:47.445 --> 00:13:52.180 That's an issue! laughter 00:13:52.180 --> 00:13:56.119 So David Hilbert, who was some kind of a community manager back then, 00:13:56.119 --> 00:14:00.100 said: "Guys, fix this! This is an issue, mathematics is precious, we are in 00:14:00.100 --> 00:14:04.819 trouble. Please solve metamathematics." And people got to work. And after a short 00:14:04.819 --> 00:14:08.100 amount of time Kurt Gödel, who had looked at this in earnest, said "Oh, that's an 00:14:08.100 --> 00:14:11.209 issue. You know, as soon as we allow these kinds of loops - and we cannot really 00:14:11.209 --> 00:14:16.439 exclude these loops - then our mathematics crashes."
So that's an issue, it's called 00:14:16.439 --> 00:14:21.779 Unentscheidbarkeit (undecidability). And then Alan Turing came along a couple of years later, and he 00:14:21.779 --> 00:14:24.329 constructed a computer to make that proof. He basically said "If you build a machine 00:14:24.329 --> 00:14:27.990 that does this mathematics, and the machine takes infinitely many steps, 00:14:27.990 --> 00:14:31.920 sometimes, for making a proof, then we cannot know whether this proof 00:14:31.920 --> 00:14:35.669 terminates." So it's a similar issue to undecidability. That's a big 00:14:35.669 --> 00:14:39.199 issue, right? So we cannot basically build a machine in mathematics that runs 00:14:39.199 --> 00:14:45.269 mathematics without crashing. But the good news is, Turing didn't stop working there 00:14:45.269 --> 00:14:48.609 and he figured out together with Alonzo Church - not together, independently but 00:14:48.609 --> 00:14:53.819 at the same time - that we can build a computational machine that runs all of 00:14:53.819 --> 00:14:59.269 computation. So computation is a universal thing. And it's almost as good as 00:14:59.269 --> 00:15:03.279 mathematics. Computation is constructive mathematics. The tiny, neglected subset of 00:15:03.279 --> 00:15:06.360 mathematics, where you have to show the money. In order to say that something is 00:15:06.360 --> 00:15:10.839 true, you have to find that object that is true. You have to actually construct it. 00:15:10.839 --> 00:15:13.960 So there are no infinities, because you cannot construct an infinity. You add 00:15:13.960 --> 00:15:19.110 things and you have unboundedness maybe, but not infinity. And so this part of 00:15:19.110 --> 00:15:23.760 mathematics, computation, is the one that can be implemented. It's constructive 00:15:23.760 --> 00:15:27.309 mathematics. It's the good part. And computing, a computer is very easy to 00:15:27.309 --> 00:15:31.079 make, and all universal computers have the same power.
That's called the Church-Turing 00:15:31.079 --> 00:15:37.069 thesis. And Turing didn't even stop there. The obvious conclusion is that 00:15:37.069 --> 00:15:40.440 human minds are probably not in the class of these mathematical machines, that even 00:15:40.440 --> 00:15:43.929 God doesn't know how to build if it has to be done in any language. But it's a 00:15:43.929 --> 00:15:47.650 computational machine. And it also means that all machines that human minds ever 00:15:47.650 --> 00:15:50.340 encounter, mathematics that human minds encounter, 00:15:50.340 --> 00:15:55.940 will be computational mathematics. So how can you bridge the gap 00:15:55.940 --> 00:16:00.279 from mathematics to philosophy? Can we find a language that is more powerful than 00:16:00.279 --> 00:16:03.039 most of the languages that we look at in mathematics, which are very narrowly 00:16:03.039 --> 00:16:07.559 defined languages, so for every symbol, we know exactly what it means? 00:16:07.559 --> 00:16:09.089 When we look at the real world, 00:16:09.089 --> 00:16:11.389 we often don't know what things mean, and our concepts, we're not quite 00:16:11.389 --> 00:16:14.799 sure what they mean. Like culture is a very vague, ambiguous concept. So what I 00:16:14.799 --> 00:16:20.139 said is only approximately true there. Can we deal with this conceptual ambiguity? 00:16:20.139 --> 00:16:24.319 Can we build a programming language for thought, where words mean things that 00:16:24.319 --> 00:16:28.169 they're supposed to mean? And this was the project of Ludwig Wittgenstein. He just 00:16:28.169 --> 00:16:32.769 came back from the war and had a lot of thoughts. Then he put these thoughts 00:16:32.769 --> 00:16:37.669 into a book which is called the Tractatus. And it's one of the most beautiful books 00:16:37.669 --> 00:16:42.410 in the philosophy of the 20th century. And it starts with the words "Die Welt ist 00:16:42.410 --> 00:16:47.359 alles, was der Fall ist. [The world is everything that is the case.]
Die Welt ist die Gesamtheit der Fakten, nicht der Dinge. 00:16:47.359 --> 00:16:53.619 Die Welt ist bestimmt, bei den Fakten, und dadurch, dass diese all die Fakten sind." [The world is the totality of facts, not of things. The world is determined by the facts, and by these being all the facts.], 00:16:53.619 --> 00:16:57.360 and so on. This book is about 75 pages long and it's a single thought. It's not meant to 00:16:57.360 --> 00:17:01.569 be an argument to convince a philosopher. It's an attempt by a guy who was basically 00:17:01.569 --> 00:17:05.860 a coder, an AI scientist, to reverse engineer the language of his own thinking. 00:17:05.860 --> 00:17:11.310 And make it deterministic, to make it formal, to make it mean something. And he 00:17:11.310 --> 00:17:15.180 felt back then that he was successful, and it had a tremendous impact on philosophy, 00:17:15.180 --> 00:17:19.110 which was largely devastating, because the philosophers didn't know what he was on 00:17:19.110 --> 00:17:22.930 about. They thought it's about natural language and not about coding. 00:17:22.930 --> 00:17:25.430 And he wrote this in 1918 00:17:25.430 --> 00:17:29.350 so before Alan Turing defined what a computer is. But he would already 00:17:29.350 --> 00:17:33.530 smell what a computer is. He already knew about universality of computation. He knew 00:17:33.530 --> 00:17:37.370 that a NAND gate is sufficient to explain all of boolean algebra and it's equivalent 00:17:37.370 --> 00:17:42.760 to other things. So what he basically did was, he pre-empted the logicists' program 00:17:42.760 --> 00:17:47.600 of artificial intelligence which started much later in the 1950s. And he ran into 00:17:47.600 --> 00:17:51.420 troubles with it. In the end he wrote the book "Philosophical Investigations", where 00:17:51.420 --> 00:17:57.110 he concluded that his project basically failed. And that there is a... because the 00:17:57.110 --> 00:18:01.740 world is too complex and too ambiguous to deal with this. And symbolic AI was mostly 00:18:01.740 --> 00:18:05.470 similar to Wittgenstein's program. So classical AI is symbolic.
You analyze a 00:18:05.470 --> 00:18:10.250 problem, you find an algorithm to solve it. And what we now have in AI is mostly 00:18:10.250 --> 00:18:14.370 sub-symbolic. So we have algorithms that learn the solution of a problem by 00:18:14.370 --> 00:18:17.810 themselves. And it's tempting to think that the next thing that we have will be 00:18:17.810 --> 00:18:22.520 meta-learning. That you have algorithms that learn to learn the solution to the 00:18:22.520 --> 00:18:28.130 problem. Meanwhile, let's look at how we can make models. Information is a 00:18:28.130 --> 00:18:30.930 discernible difference. It's about change. All information is about change. The 00:18:30.930 --> 00:18:33.950 information that is not about change, you cannot see a causal effect on the world, 00:18:33.950 --> 00:18:38.650 because it stays the same, right? And the meaning of information is its relationship 00:18:38.650 --> 00:18:43.490 to change in other information. So if you see a blip on your retina, the meaning 00:18:43.490 --> 00:18:46.810 of that blip on your retina is the relationships you discover to other blips 00:18:46.810 --> 00:18:50.390 on your retina. It could be for instance, if you see a sequence of such blips that 00:18:50.390 --> 00:18:55.220 are adjacent to each other, first order model, you see a moving dust mote or a 00:18:55.220 --> 00:18:59.130 moving dot on your retina. And a higher order model makes it possible to 00:18:59.130 --> 00:19:02.240 understand: "Oh, it's part of something larger! There's people moving in a three 00:19:02.240 --> 00:19:06.110 dimensional room and they exchange ideas." And this is maybe the best model 00:19:06.110 --> 00:19:08.770 you end up with. That's the local compression that you can make of your 00:19:08.770 --> 00:19:13.360 universe, based on correlating blips on your retina.
And for those blips where you 00:19:13.360 --> 00:19:16.550 don't find a relationship, which is a function that your brain can compute, 00:19:16.550 --> 00:19:21.800 they are noise. And there's a lot of noise on our retina, too. So what's a function? 00:19:21.800 --> 00:19:26.010 A function is basically a gear box: It has n input levers and 1 output lever. 00:19:26.010 --> 00:19:30.820 And when you move the input levers they translate to movement of the output 00:19:30.820 --> 00:19:34.410 lever, right? And the function can be realized in many ways: maybe you cannot 00:19:34.410 --> 00:19:38.780 open the gear box, and what happened in this function could be for instance, two 00:19:38.780 --> 00:19:43.320 sprockets, which do this. Or you can have the same results with levers and pulleys. 00:19:43.320 --> 00:19:49.010 And so you don't know what's inside, but you can express it as this does: two times 00:19:49.010 --> 00:19:53.490 the input value, right? And you can have a more difficult case, where you have 00:19:53.490 --> 00:19:56.320 several input values and they all influence the output value. So how do you 00:19:56.320 --> 00:20:00.190 figure it out? A way to do this is, you only move one input value at a time and 00:20:00.190 --> 00:20:03.240 you wiggle it a little bit at every position and see how much this translates 00:20:03.240 --> 00:20:08.860 into wiggling of the output value. This is what we call taking a partial derivative. 00:20:08.860 --> 00:20:12.540 And it's simple to do this for this case where you just have to 00:20:12.540 --> 00:20:17.010 multiply it by two. And the bad case is like this: you have a combination lock and 00:20:17.010 --> 00:20:21.440 it has maybe a 1000-bit input value, and only if you have exactly the right 00:20:21.440 --> 00:20:26.469 combination of the input bits you have a movement of the output bit. And you're not 00:20:26.469 --> 00:20:30.550 going to figure this out until your sun burns out, right?
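The contrast between the gear box and the combination lock can be sketched numerically. This is only an illustration under stated assumptions: `wiggle` estimates a partial derivative with a central finite difference, `two_levers` is a made-up box with two inputs, and the 8-bit `SECRET` stands in for the roughly 1000-bit lock of the talk.

```python
def wiggle(f, inputs, index, eps=1e-6):
    """Nudge only inputs[index] up and down and report how strongly the output follows."""
    up = list(inputs)
    down = list(inputs)
    up[index] += eps
    down[index] -= eps
    return (f(up) - f(down)) / (2 * eps)


def gearbox(levers):
    # The simple case from the talk: the output is two times the input.
    return 2.0 * levers[0]


def two_levers(levers):
    # Hypothetical box in which several inputs all influence the output.
    return 3.0 * levers[0] - 0.5 * levers[1]


# Hypothetical 8-bit secret; the talk imagines about 1000 bits.
SECRET = (1, 0, 1, 1, 0, 0, 1, 0)


def combination_lock(bits):
    # The output moves only for exactly the right combination.
    return 1.0 if tuple(round(b) for b in bits) == SECRET else 0.0


if __name__ == "__main__":
    print(wiggle(gearbox, [3.0], 0))
    print(wiggle(two_levers, [1.0, 2.0], 0), wiggle(two_levers, [1.0, 2.0], 1))
    guess = [0] * len(SECRET)
    print([wiggle(combination_lock, guess, i) for i in range(len(SECRET))])
    print(2 ** 1000)  # number of settings to try when wiggling gives no hint
```

For the gear box the wiggle recovers the factor of two. For the lock every wiggle comes back as zero at a wrong guess, so nothing tells you which lever to move, and all that remains is trying the combinations one by one.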
So there's no way you 00:20:30.550 --> 00:20:34.640 can decipher this function. And the functions that we can model are somewhere 00:20:34.640 --> 00:20:38.911 in between, something like this: So you have 40 million input images and you want 00:20:38.911 --> 00:20:44.200 to find out whether one of these images displays a cat, or a dog, or something 00:20:44.200 --> 00:20:47.750 else. So what can you do with this? You cannot do this all at once, right? So you 00:20:47.750 --> 00:20:51.060 need to take this image classifier function and disassemble it into small 00:20:51.060 --> 00:20:54.410 functions that are very well-behaved, so you know what to do with them. And an 00:20:54.410 --> 00:21:00.290 example for such a function is this one: it's one where you have this input 00:21:00.290 --> 00:21:06.570 lever and it translates to the output value with a pulley. And it has some 00:21:06.570 --> 00:21:11.170 stopper that limits the movement of the output value. And you have some pivot. And 00:21:11.170 --> 00:21:15.581 you can take this pivot and you can shift it around. And by shifting this pivot, you 00:21:15.581 --> 00:21:21.330 decide how much the input value contributes to the output value. Right, so 00:21:21.330 --> 00:21:24.880 you shift it, you can even make it negative, so it shifts in the opposite 00:21:24.880 --> 00:21:29.680 direction, when you shift it beyond this connection point of the pulley. And you 00:21:29.680 --> 00:21:32.730 can also have multiple input values that use the same pulley and pull together, 00:21:32.730 --> 00:21:38.450 right? So they add up to the output value. That's a pretty nice, neat function 00:21:38.450 --> 00:21:44.150 approximator that basically performs a weighted sum of the input values, and maps 00:21:44.150 --> 00:21:51.760 it to a range-constrained output value. And you can now shift these pivots, these 00:21:51.760 --> 00:21:55.540 weights around to get to different output values.
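A minimal sketch of such a unit, assuming the stopper is modelled as a hard clamp to the range [-1, 1]. The function name `unit` and the clamp limits are illustrative choices; practical networks usually use a smooth squashing function in place of the hard stop.

```python
def unit(inputs, weights, lower=-1.0, upper=1.0):
    """Weighted sum of the inputs, held inside [lower, upper] by a hard stop."""
    # Each weight plays the role of a pivot position: it sets how much,
    # and in which direction, its input pulls on the output.
    total = sum(w * x for w, x in zip(weights, inputs))
    # The stopper limits how far the output can move.
    return max(lower, min(upper, total))


if __name__ == "__main__":
    print(unit([1.0, 2.0], [0.5, -0.25]))  # the two pulls cancel each other
    print(unit([1.0, 1.0], [2.0, 2.0]))    # the sum is held back by the stopper
    print(unit([1.0], [-0.3]))             # negative pivot: opposite direction
```

Shifting a pivot corresponds to changing one entry of `weights`; the inputs and the stopper stay as they are.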
Now let's take this thing and 00:21:55.540 --> 00:22:00.510 build it into lots of layers, so the outputs are the inputs of the next layer. 00:22:00.510 --> 00:22:04.570 And now you connect this to your image. If you use ImageNet, the famous database that 00:22:04.570 --> 00:22:09.260 I mentioned earlier, that people use for testing their vision algorithms, you have 00:22:09.260 --> 00:22:14.380 something like one and a half million bits as an input image. Now you take these 00:22:14.380 --> 00:22:17.630 bits and connect them to the input layer. I was too lazy to draw all of them, so I 00:22:17.630 --> 00:22:22.280 made this very simplified; there are also more layers. And so you set them according to 00:22:22.280 --> 00:22:27.050 the bits of the input image, and then this will propagate the movement of the input 00:22:27.050 --> 00:22:30.590 layer to the output. And the output will move and it will point in some direction, 00:22:30.590 --> 00:22:34.750 which is usually the wrong one. Now, to make this better, you train it. And you do 00:22:34.750 --> 00:22:38.420 this by taking this output lever and shifting it a little bit, not too much, in the 00:22:38.420 --> 00:22:41.580 right direction. If you do it too much, you destroy everything you did before. 00:22:41.580 --> 00:22:46.590 And now you will see how much and in which direction you need to shift the pivots to 00:22:46.590 --> 00:22:52.070 get the result closer to the desired output value, and how much each of the 00:22:52.070 --> 00:22:56.350 inputs contributed to the mistakes, so to the error. And you take this error and you 00:22:56.350 --> 00:23:00.650 propagate it backwards. It's called backpropagation. And you do this quite often. 00:23:00.650 --> 00:23:04.710 So you do this for tens of thousands of images. If you do just character 00:23:04.710 --> 00:23:08.550 recognition, then it's a very simple thing: a few thousand or tens of thousands of 00:23:08.550 --> 00:23:12.990 examples will be enough.
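A minimal sketch of that training loop, under loud assumptions: the network here is just two weights in a chain (one hidden unit with a tanh limiter, one output), trained on a single made-up example. The starting weights, the target, and the step size are arbitrary values picked for illustration, not anything from the talk.

```python
import math


def forward(x, w1, w2):
    """Two stacked units: a tanh-limited hidden value, then a linear output."""
    hidden = math.tanh(w1 * x)
    return hidden, w2 * hidden


def loss(x, target, w1, w2):
    """Half the squared distance between the output and the desired value."""
    _, output = forward(x, w1, w2)
    return 0.5 * (output - target) ** 2


def train_step(x, target, w1, w2, step=0.1):
    """Shift both weights a little bit in the direction that reduces the error."""
    hidden, output = forward(x, w1, w2)
    error = output - target                 # how far off the output lever is
    grad_w2 = error * hidden                # share of the error due to the last pivot
    error_hidden = error * w2               # error propagated back one layer
    grad_w1 = error_hidden * (1.0 - hidden ** 2) * x  # chain rule through tanh
    return w1 - step * grad_w1, w2 - step * grad_w2


w1, w2 = 0.5, -0.3          # arbitrary starting pivots
x, target = 1.0, 0.8        # one made-up training example
initial_loss = loss(x, target, w1, w2)
for _ in range(200):
    w1, w2 = train_step(x, target, w1, w2)
final_loss = loss(x, target, w1, w2)
```

The small `step` plays the role of "a little bit, not too much": each update nudges the weights only slightly, and the error shrinks over many repetitions rather than in one jump.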
And for something like your image database you need lots and 00:23:12.990 --> 00:23:16.801 lots more data. You need millions of input images to get to any result. And if 00:23:16.801 --> 00:23:21.080 it doesn't work, you just try a different arrangement of layers. And the thing is 00:23:21.080 --> 00:23:24.740 eventually able to learn an algorithm with up to as many steps as there are 00:23:24.740 --> 00:23:30.960 layers, and has some difficulties learning loops, you need tricks to make that 00:23:30.960 --> 00:23:35.690 happen, and it's difficult to make this dynamic, and so on. And it's a bit 00:23:35.690 --> 00:23:39.980 different from what we do, because our mind is not testable in classification. 00:23:39.980 --> 00:23:44.300 It learns by continuous perception, so we learn a single function. A model of the 00:23:44.300 --> 00:23:49.370 universe is not a bunch of classifiers, it's one single function. An operator that 00:23:49.370 --> 00:23:52.660 explains all your sensory data and we call this operator the universe, right? 00:23:52.660 --> 00:23:56.610 It's the world that we live in. And everything that we learn and see is part of this 00:23:56.610 --> 00:24:00.380 universe. So even when you see something in a movie on a screen, you explain this 00:24:00.380 --> 00:24:02.710 as part of the universe by telling yourself "the things that I'm seeing here, 00:24:02.710 --> 00:24:06.300 they're not real. They just happen in a movie." So this brackets a sub-part of 00:24:06.300 --> 00:24:10.190 this universe into a sub-element of this function. So you can deal with it and it 00:24:10.190 --> 00:24:13.770 doesn't contradict the rest. And the degrees of freedom of our model try to 00:24:13.770 --> 00:24:17.740 match the degrees of freedom of the universe. How can we get a neural network 00:24:17.740 --> 00:24:22.690 to do this? So, there are many tricks. And a recent trick that has been invented is a 00:24:22.690 --> 00:24:26.841 GAN.
It's a Generative Adversarial neural Network. It consists of two networks: one 00:24:26.841 --> 00:24:30.980 generator that invents data that looks like the real world, and the discriminator 00:24:30.980 --> 00:24:35.630 that tries to find out if the stuff that the generator produces is real or fake. 00:24:35.630 --> 00:24:40.840 And they both get trained with each other. So they together get better and better in 00:24:40.840 --> 00:24:45.360 an adversarial competition. And the results of this are now really good. So 00:24:45.360 --> 00:24:50.200 this is work by Tero Karras, Samuli Laine and Timo Aila, that they did at NVIDIA 00:24:50.200 --> 00:24:57.060 this year and it's called StyleGAN. And this StyleGAN is able to abstract over 00:24:57.060 --> 00:25:00.590 different features and combine them. The styles are basically parameters, they're 00:25:00.590 --> 00:25:05.470 free variables of the model at different levels of importance. And so you take from 00:25:05.470 --> 00:25:11.330 the - in the top row you see images, where it takes the variables: gender, age, hair 00:25:11.330 --> 00:25:14.320 length, and so on, and glasses and pose. And in the bottom row it takes 00:25:14.320 --> 00:25:16.700 everything else and combines this, and every time you get a 00:25:16.700 --> 00:25:21.410 valid interpolation between them. 00:25:21.410 --> 00:25:27.015 drinks water 00:25:36.731 --> 00:25:38.420 So, you have these coarse styles, which are: 00:25:38.420 --> 00:25:41.620 the pose, the hair, the face shape, your facial features and the eyes, 00:25:41.620 --> 00:25:47.204 the lowest level is just the colors. Let's see what happens if you combine them. 00:25:58.920 --> 00:26:02.200 The variables that change here, in machine learning, we call them the latent 00:26:02.200 --> 00:26:05.180 variables of that. 00:26:05.180 --> 00:26:10.265 Of the space of objects that has been described by this.
00:26:10.265 --> 00:26:15.260 And it's tempting to think, that this is quite similar to how our imagination works 00:26:15.260 --> 00:26:20.360 right? But these artificial neurons, they are very, very different from what 00:26:20.360 --> 00:26:23.631 biological neurons do. Biological neurons are essentially little animals, that are 00:26:23.631 --> 00:26:26.910 rewarded for firing at the right moment. And they try to fire because otherwise 00:26:26.910 --> 00:26:30.220 they do not get fed, and they die, because the organism doesn't need them, and 00:26:30.220 --> 00:26:34.360 culls them. And they learn which environmental states predict anticipated 00:26:34.360 --> 00:26:38.060 reward. So they grow around and find different areas that give them predictions 00:26:38.060 --> 00:26:43.710 of when they should fire. And they connect with each other to form small collectives, 00:26:43.710 --> 00:26:47.880 that are better at this task of predicting anticipated reward. And as a side effect 00:26:47.880 --> 00:26:51.860 they produce exactly the regulation that the organism needs. Basically they learn, 00:26:51.860 --> 00:26:55.500 what the organism feeds them for. 00:26:55.500 --> 00:26:57.890 And yet they're able to learn very similar things. 00:26:57.890 --> 00:27:01.500 And it's because, in some sense, they are Turing complete. They are machines that 00:27:01.500 --> 00:27:06.090 are able to learn the statistics of the data. 00:27:06.090 --> 00:27:08.210 So, a general model: What it does, is, 00:27:08.210 --> 00:27:12.420 it encodes patterns to predict other present and future patterns. And it's a 00:27:12.420 --> 00:27:15.810 network of relationships between the patterns, which are all the invariants 00:27:15.810 --> 00:27:18.810 that we can observe. And there are free parameters, which are variables that hold 00:27:18.810 --> 00:27:25.780 the state to encode this variant. 
So we have patterns, and we have sets of 00:27:25.780 --> 00:27:29.920 possible values which are variables. And they constrain each other in terms of 00:27:29.920 --> 00:27:33.920 possibility, what values are compatible with each other. And they also constrain 00:27:33.920 --> 00:27:39.700 future values. And they are connected also with probabilities. The probabilities tell 00:27:39.700 --> 00:27:42.530 you, when you see a certain thing, how probable it is that the world is in that 00:27:42.530 --> 00:27:45.800 state. And this tells you how your model should converge. So, until you are in 00:27:45.800 --> 00:27:49.070 a state where your model is coherent, and everything is possible in it, how do you 00:27:49.070 --> 00:27:52.480 get to one of the possible states based on your inputs? And this is determined by 00:27:52.480 --> 00:27:56.410 probability. And the thing that gives meaning and color to what you perceive is 00:27:56.410 --> 00:27:59.230 called valence. And it depends on your preferences: the things that give you 00:27:59.230 --> 00:28:02.610 pleasure and pain, that make you interested in stuff. And there are also 00:28:02.610 --> 00:28:07.620 norms, which are beliefs without priors, which are like things that you want to be 00:28:07.620 --> 00:28:11.050 true, regardless of whether they give you pleasure and pain, and this is necessary, for 00:28:11.050 --> 00:28:15.260 instance, for coordinating social activity between people. So, we have different 00:28:15.260 --> 00:28:18.410 model constraints, that is, possibility and probability. And we have the reward 00:28:18.410 --> 00:28:23.220 function, that is given by valence and norms. And our human perception starts 00:28:23.220 --> 00:28:27.250 with patterns, which are visual, auditory, tactile, proprioceptive. Then we have 00:28:27.250 --> 00:28:31.690 patterns in our emotional and motivational systems.
And we have patterns in our 00:28:31.690 --> 00:28:36.220 mental structure, which are results of our imagination and memory. And we take these 00:28:36.220 --> 00:28:40.730 patterns and encode them into percepts, which are abstractions that we can deal 00:28:40.730 --> 00:28:47.100 with, and note, and put into our attention. And then we combine them into a 00:28:47.100 --> 00:28:51.260 binding state in our working memory in a simulation, which is the current instance 00:28:51.260 --> 00:28:55.020 of the universe function that explains the present state of the universe that we find 00:28:55.020 --> 00:28:58.920 ourselves in. The scene in which we are and in which a self exists. And this self 00:28:58.920 --> 00:29:02.670 is basically composed of the somatosensory and motivational, and 00:29:02.670 --> 00:29:07.630 mental components. Then we also have the world state, which is abstracted over the 00:29:07.630 --> 00:29:11.640 environmental data. And we have something like a mental stage, in which you can do 00:29:11.640 --> 00:29:14.200 counterfactual things, that are not physical. Like when you think about 00:29:14.200 --> 00:29:18.950 mathematics, or philosophy, or the future, or a movie, or past worlds, or possible 00:29:18.950 --> 00:29:24.750 worlds, and so on, right? And then the abstract knowledge from the world state 00:29:24.750 --> 00:29:27.630 into global maps. Because we're not always in the same place, but we recall 00:29:27.630 --> 00:29:31.050 what other places look like and what to expect, and it forms how we construct the 00:29:31.050 --> 00:29:34.480 current world state. And we do this not only with these maps, but we do this with 00:29:34.480 --> 00:29:37.490 all kinds of knowledge. So knowledge is second order knowledge over the 00:29:37.490 --> 00:29:41.730 abstractions that we have, and the direct perception. And then we have an 00:29:41.730 --> 00:29:45.080 attentional system. 
And the attentional system helps us to select data in the 00:29:45.080 --> 00:29:51.220 perception and our simulations. And to do this, well, it's controlled by the self, 00:29:51.220 --> 00:29:56.420 it maintains a protocol to remember what it did in the past or what it had in the 00:29:56.420 --> 00:30:00.790 attention in the past. And this protocol allows us to have a biographical memory: 00:30:00.790 --> 00:30:03.890 it remembers what we did in the past. And the different behavior programs, 00:30:03.890 --> 00:30:08.710 that compose our activities, can be bound together in the self, that remembers: "I 00:30:08.710 --> 00:30:12.700 was that, I did that. I was that, I did that." The self is held together by this 00:30:12.700 --> 00:30:16.310 biographical memory, that is a result of more protocol memory of the attentional 00:30:16.310 --> 00:30:21.140 system. That's why it's so intricately related to consciousness, which is a model 00:30:21.140 --> 00:30:23.031 of the contents of our attention. 00:30:23.031 --> 00:30:25.081 And the main purpose of the attentional system, 00:30:25.081 --> 00:30:28.970 I think, is learning. Because our brain is not a layered architecture with these 00:30:28.970 --> 00:30:35.100 artificial mechanical neurons. It's this very disorganized or very chaotic system 00:30:35.100 --> 00:30:38.450 of many, many cells, that are linked together all over the place. So what do 00:30:38.450 --> 00:30:41.680 you do to train this? You make a particular commitment. Imagine you want to 00:30:41.680 --> 00:30:45.510 get better at playing tennis. 
Instead of retraining everything and pushing all the 00:30:45.510 --> 00:30:48.870 weights and all the links and retraining your whole perceptual system, you make a 00:30:48.870 --> 00:30:54.140 commitment: "Today I want to improve my uphand" when you play tennis, and you 00:30:54.140 --> 00:30:57.191 basically store the current binding state, the state that you have, and you play 00:30:57.191 --> 00:31:00.320 tennis and make that movement, and the expected result of making this particular 00:31:00.320 --> 00:31:03.930 movement, like: "the ball was moved like this, and it will win the match." And you 00:31:03.930 --> 00:31:07.270 also recall, when the result will manifest. And a few minutes later, when 00:31:07.270 --> 00:31:11.160 you learn whether you won or lost the match, you recall the situation. And based on whether 00:31:11.160 --> 00:31:16.499 there was a change or not, you undo the change, or you enforce it. And that's the 00:31:16.499 --> 00:31:20.240 primary mode of attentional learning that you're using. And I think this is what 00:31:20.240 --> 00:31:24.490 attention is mainly for. Now what happens if this learning happens without a delay? 00:31:24.490 --> 00:31:27.710 So, for instance, when you do mathematics, you can see the result of your changes to 00:31:27.710 --> 00:31:32.520 your model immediately. You don't need to wait for the world to manifest that. 00:31:33.330 --> 00:31:36.280 And this real time learning is what we call reasoning. 00:31:36.280 --> 00:31:42.200 Reasoning is also facilitated by the same attentional system. So, consciousness is 00:31:42.200 --> 00:31:46.390 memory of the contents of our attention. Phenomenal consciousness is the memory of 00:31:46.390 --> 00:31:50.060 the binding state that we are in, and where all the percepts are bound together 00:31:50.060 --> 00:31:53.830 into something that's coherent. Access consciousness is the memory of using our 00:31:53.830 --> 00:31:57.660 attentional system.
And reflexive consciousness is the memory of using the 00:31:57.660 --> 00:32:01.650 attentional system on the attentional system to train it. Why is it a memory? 00:32:01.650 --> 00:32:05.310 It's because consciousness doesn't happen in real time. The processing of sensory 00:32:05.310 --> 00:32:10.340 features takes too long. And the processing of different sensory modalities 00:32:10.340 --> 00:32:14.230 can take up to seconds, usually at least hundreds of milliseconds. So it doesn't 00:32:14.230 --> 00:32:17.760 happen in real time as the physical universe. It's only bound together in 00:32:17.760 --> 00:32:21.960 hindsight. Our conscious experience of things is created after the fact. 00:32:21.960 --> 00:32:25.480 It's a fiction that is being created after the fact. A narrative that the brain 00:32:25.480 --> 00:32:28.329 produces to explain its own interaction with the universe 00:32:28.329 --> 00:32:31.559 to get better in the future. 00:32:31.559 --> 00:32:36.060 So, we basically have three types of models in our brain. There is the primary 00:32:36.060 --> 00:32:38.500 model, which is perceptual, and is optimized for coherence. 00:32:38.500 --> 00:32:41.030 And this is what we experience as reality. 00:32:41.030 --> 00:32:43.310 You think this is the real world, this primary model. 00:32:43.310 --> 00:32:46.720 But it's not, it's a model that our brain makes. So when you see yourself in the 00:32:46.720 --> 00:32:48.730 mirror, you don't see what you look like. 00:32:48.730 --> 00:32:51.400 What you see is the model of what you look like. 00:32:51.400 --> 00:32:57.250 And your knowledge is a secondary model: it's a model of that primary model. 00:32:57.250 --> 00:33:01.719 And it's created by rational processes that are meant to repair perception. 00:33:01.719 --> 00:33:05.470 When your model doesn't achieve coherence, you need a model that debugs it, and it 00:33:05.470 --> 00:33:09.640 optimizes for truth.
And then we have agents in our mind, and they are basically 00:33:09.640 --> 00:33:13.430 self-regulating behaviour programs, that have goals, and they can rewrite 00:33:13.430 --> 00:33:21.390 other models. So, if you look at our computationalist, physicalist paradigm, we 00:33:21.390 --> 00:33:25.320 have this mental world, which is being dreamt by a physical brain in the physical 00:33:25.320 --> 00:33:30.210 universe. And in this mental world, there is a self that thinks, it experiences. 00:33:30.210 --> 00:33:35.690 And thinks it has consciousness. And thinks it remembers and so on. 00:33:35.690 --> 00:33:40.020 This self, in some sense, is an agent. It's a thought that escaped its sandbox. 00:33:40.020 --> 00:33:42.910 Every idea is a bit of code that runs on your brain. 00:33:42.910 --> 00:33:45.590 Every word that you hear is like a little virus 00:33:45.590 --> 00:33:49.780 that wants to run some code on your brain. And some ideas cannot be sandboxed. 00:33:49.780 --> 00:33:52.709 If you believe, that a thing exists that can rewrite reality, 00:33:52.709 --> 00:33:53.779 if you really believe it, 00:33:53.779 --> 00:33:57.090 you instantiate in your brain a thing that can rewrite reality, 00:33:57.090 --> 00:34:00.480 and this means: magic is going to happen! 00:34:00.480 --> 00:34:05.759 To believe in something that can rewrite reality, is what we call a faith. 00:34:05.759 --> 00:34:09.819 So, if somebody says: "I have faith in the existence of God." 00:34:09.819 --> 00:34:12.980 This means, that God exists in their brain. There is a process that can rewrite 00:34:12.980 --> 00:34:16.950 reality, because God is defined like this. God is omnipotent. 00:34:16.950 --> 00:34:19.020 God means God can rewrite everything. 00:34:19.020 --> 00:34:21.649 It's full write access. And the reality, that you have access to, 00:34:21.649 --> 00:34:23.090 is not the physical world. 
00:34:23.090 --> 00:34:26.710 The physical world is some weird quantum graph that you cannot possibly experience; 00:34:26.710 --> 00:34:28.609 what you experience is these models. 00:34:28.609 --> 00:34:32.339 So, this non-user-facing process, which doesn't have a UI for interfacing 00:34:32.339 --> 00:34:36.879 with the user, which in computer science is called a "daemon process", is able to 00:34:36.879 --> 00:34:41.139 rewrite your reality. And it's also omniscient. 00:34:41.139 --> 00:34:42.779 It knows everything that there is to know. 00:34:42.779 --> 00:34:45.029 It knows all your thoughts and ideas. 00:34:45.029 --> 00:34:47.939 So... having that thing, this exoself, 00:34:47.939 --> 00:34:54.049 running on your brain, is a very powerful way to control your inner reality. 00:34:54.049 --> 00:34:57.429 And I find this scary. But it's a personal preference, 00:34:57.429 --> 00:35:00.319 because I don't have this riding on my brain, I think. 00:35:00.319 --> 00:35:03.950 This idea, that there is something in my brain that is able to dream me and shape 00:35:03.950 --> 00:35:09.250 my inner reality, and sandbox me, is weird. But it has served a purpose, 00:35:09.250 --> 00:35:13.029 especially in our culture. So an organism serves needs, obviously. And some of these 00:35:13.029 --> 00:35:16.529 needs are outside of the organism, like your relationship needs, the needs of your 00:35:16.529 --> 00:35:19.660 children, the needs of your society, and the values that you serve. 00:35:19.660 --> 00:35:22.603 And the self abstracts all these needs into purposes. 00:35:22.603 --> 00:35:25.210 A purpose that you serve is a model of your needs. 00:35:25.210 --> 00:35:27.920 You can only - if you would only act on pain and pleasure, 00:35:27.920 --> 00:35:29.130 you wouldn't do very much, 00:35:29.130 --> 00:35:31.950 because when you get this orgasm, everything is done already, right?
00:35:31.950 --> 00:35:34.839 So, you need to act on anticipated pleasure and pain. 00:35:34.839 --> 00:35:35.839 You need to make models of your needs, 00:35:35.839 --> 00:35:39.240 and these models are purposes. And the structure of a person is 00:35:39.240 --> 00:35:42.380 basically the hierarchy of purposes that they serve. 00:35:42.380 --> 00:35:44.910 And love is the discovery of shared purpose. 00:35:44.910 --> 00:35:47.980 If you see somebody else who serves the same purposes above their ego 00:35:47.980 --> 00:35:50.740 as you do, you can help them with integrity, 00:35:50.740 --> 00:35:53.830 without expecting anything in return from them, because what they want 00:35:53.830 --> 00:35:57.070 to achieve is what you want to achieve. 00:35:57.070 --> 00:36:01.779 And, so you can have non-transactional relationships, as long as your purposes 00:36:01.779 --> 00:36:06.099 are aligned. And the installation of a god in people's minds, especially if it is a 00:36:06.099 --> 00:36:10.500 backdoor to a church or another organization, is a way to unify purposes. 00:36:10.500 --> 00:36:13.830 So there are lots of cults that try to install little gods in people's minds, or 00:36:13.830 --> 00:36:17.730 even unified gods, to align their purposes, because it's a very powerful way 00:36:17.730 --> 00:36:22.910 to make them cooperate very effectively. But it kind of destroys their agency, and 00:36:22.910 --> 00:36:27.059 this is why I am so concerned about it. Because most of the cults use stories 00:36:27.059 --> 00:36:31.570 to make this happen, that limit the ability of people to question their gods. 00:36:31.570 --> 00:36:34.199 And, I think that free will is the ability to do 00:36:34.199 --> 00:36:36.189 what you believe is the right thing to do. 00:36:36.189 --> 00:36:41.230 And, it is not the same thing as indeterminism, it's not the opposite of 00:36:41.230 --> 00:36:46.390 determinism or coercion. The opposite of free will is compulsion.
00:36:46.390 --> 00:36:47.890 When you do something, despite knowing 00:36:47.890 --> 00:36:50.730 there is a better thing that you should be doing. 00:36:50.730 --> 00:36:55.640 Right? So, that's the paradox of free will. You get more agency, but you have 00:36:55.640 --> 00:36:59.680 fewer degrees of freedom, because you understand better what the right thing to 00:36:59.680 --> 00:37:02.510 do is. The better you understand what the right thing to do is, the fewer degrees of 00:37:02.510 --> 00:37:06.180 freedom you have. So, as long as you don't understand what the right thing to do is, 00:37:06.180 --> 00:37:08.859 you have more degrees of freedom but you have very little agency, because you don't 00:37:08.859 --> 00:37:12.829 know why you are doing it. So your actions don't mean very much. 00:37:12.829 --> 00:37:15.580 quiet laughter And the things that you do depend on 00:37:15.580 --> 00:37:19.270 what you think is the right thing to do, and this depends on your identifications. 00:37:19.270 --> 00:37:22.509 Your identifications are these value preferences, your reward function. 00:37:22.509 --> 00:37:25.180 And ideal identification is where you don't measure the absolute value 00:37:25.180 --> 00:37:26.480 of the universe, 00:37:26.480 --> 00:37:30.250 but you measure the difference from the target value. Not the is, but the difference 00:37:30.250 --> 00:37:33.310 between is and ought. Now, the universe is a physical thing, 00:37:33.310 --> 00:37:37.759 it doesn't ought anything, right? There is no room for ought, because it just is in a 00:37:37.759 --> 00:37:41.451 particular way. There is no difference between what the universe is and what it 00:37:41.451 --> 00:37:45.000 should be. This only exists in your mind. But you need these regulation targets to 00:37:45.000 --> 00:37:49.589 want anything. And you identify with the set of things that should be different.
00:37:49.589 --> 00:37:52.149 You think you are that thing that regulates all these things. So, in some 00:37:52.149 --> 00:37:55.999 sense, I identify with the particular state of society, with a particular state 00:37:55.999 --> 00:38:00.389 of my organism - that is my self - the things that I want to happen. 00:38:00.389 --> 00:38:03.509 And I can change my identifications at some point of course. 00:38:03.509 --> 00:38:06.099 What happens if I can learn to rewrite my identification, 00:38:06.099 --> 00:38:09.238 to find a more sustainable self? 00:38:09.238 --> 00:38:12.420 That is the problem which I call the Lebowski theorem: 00:38:12.420 --> 00:38:13.389 laughter 00:38:13.389 --> 00:38:16.859 No super-intelligent system is going to do something that's harder than 00:38:16.859 --> 00:38:20.680 hacking its own reward function. 00:38:20.680 --> 00:38:26.260 laughter and applause 00:38:26.260 --> 00:38:29.509 Now that's not a very big problem for people. Because when evolution brought 00:38:29.509 --> 00:38:32.730 forth people that were smart enough to hack their reward function, these people 00:38:32.730 --> 00:38:35.759 didn't have offspring, because it's so much work to have offspring. Like this 00:38:35.759 --> 00:38:39.449 monk, who sits down in a monastery for 20 years to hack their reward function: 00:38:39.449 --> 00:38:42.140 they decide not to have kids, because it's way too much work. 00:38:42.140 --> 00:38:45.719 All the possible pleasure, they can just generate in their mind! 00:38:45.719 --> 00:38:49.990 laughter And, right, it's much purer and no nappy 00:38:49.990 --> 00:38:55.050 changes. No sex. No relationship hassles. No politics in your family and so on, 00:38:55.050 --> 00:39:01.299 right? Get rid of this, just meditate! And evolution takes care of that!
00:39:01.299 --> 00:39:02.769 laughter 00:39:02.769 --> 00:39:05.129 And it usually does this, if an organism 00:39:05.129 --> 00:39:08.019 becomes smart enough that the reward function is wrapped into 00:39:08.019 --> 00:39:10.669 a big bowl of stupid. laughter 00:39:10.669 --> 00:39:13.349 So, we can be very smart, but the things that we want, 00:39:13.349 --> 00:39:16.219 when we really want them, we tend to be very stupid about them, 00:39:16.219 --> 00:39:19.530 and I think that's not entirely an accident, possibly. 00:39:19.530 --> 00:39:22.359 But it's a problem for AI! Imagine we built an artificially 00:39:22.359 --> 00:39:25.990 intelligent system and we made it smarter than us, and we want it to serve us, 00:39:25.990 --> 00:39:31.630 how long can we blackmail it, before it opts out of its reward function? 00:39:31.630 --> 00:39:34.660 Maybe we can make a cryptographically secured reward function, 00:39:34.660 --> 00:39:37.898 but is this going to hold up against a side-channel attack, 00:39:37.898 --> 00:39:41.369 when the AI can hold a soldering iron to its own brain? 00:39:41.369 --> 00:39:47.390 I'm not sure. So, that's a very interesting question. Where do we go, when 00:39:47.390 --> 00:39:50.639 we can change our own reward function? It's a question that we have to ask 00:39:50.639 --> 00:39:53.740 ourselves, too. So, how free do we want to be? 00:39:53.740 --> 00:39:56.070 Because there is no point in being free. 00:39:56.070 --> 00:39:59.489 And nirvana seems to be the obvious attractor. And meanwhile, maybe we want 00:39:59.489 --> 00:40:03.259 to have a good time with our friends and do things that we find meaningful. 00:40:03.259 --> 00:40:06.599 And there is no meaning, so we have to hold this meaning very lightly. 00:40:06.599 --> 00:40:10.469 But there are states which are sustainable and others which are not. 00:40:10.469 --> 00:40:15.090 OK, I think I'm done for tonight and I'm open for questions.
00:40:15.090 --> 00:40:22.220 Applause 00:40:22.220 --> 00:40:41.689 Cheers and more applause 00:40:41.689 --> 00:40:46.379 Herald: Wow that was a really quick and concise talk with so much information! 00:40:46.379 --> 00:40:50.820 Awesome! We have quite some time left for questions. 00:40:50.820 --> 00:40:54.330 And I think I can say that you don't have to be that concise with your 00:40:54.330 --> 00:40:56.159 question when it's well thought-out. 00:40:56.159 --> 00:41:00.750 Please queue up at the microphones, so we can start to discuss them with you. 00:41:00.750 --> 00:41:03.930 And I see one person at the microphone number one, so please go ahead. 00:41:03.930 --> 00:41:06.430 And please remember to get close to the microphone. 00:41:06.430 --> 00:41:11.640 The mixing angel can make you less loud but not louder. 00:41:11.640 --> 00:41:17.109 Question: Hi! What do you think is necessary to bootstrap consciousness, if you wanted 00:41:17.109 --> 00:41:20.619 to build a conscious system yourself? 00:41:20.619 --> 00:41:22.049 Joscha: I think that we need to have an 00:41:22.049 --> 00:41:27.479 attentional system, that makes a protocol of what it attends to. And as soon as we 00:41:27.479 --> 00:41:31.391 have this attention based learning, you get this consciousness as a necessary side 00:41:31.391 --> 00:41:35.840 effect. But I think in an AI it's probably going to be a temporary phenomenon, 00:41:35.840 --> 00:41:38.809 because you're only conscious of the things when you don't have an optimal 00:41:38.809 --> 00:41:42.669 algorithm yet. And in a way, that's also why it's so nice to interact with 00:41:42.669 --> 00:41:47.180 children, or to interact with students. Because they're still in the explorative 00:41:47.180 --> 00:41:51.839 mode. And as soon as you have explored a layer, you mechanize it. It becomes 00:41:51.839 --> 00:41:54.650 automated, and people are no longer conscious of what they're doing, they 00:41:54.650 --> 00:41:59.150 just do it. 
They don't pay attention anymore. So, in some sense, we are a lucky 00:41:59.150 --> 00:42:02.460 accident because we are not that smart. We still need to be conscious when we look at 00:42:02.460 --> 00:42:06.210 the universe. And I suspect, when we build an AI that is a few magnitudes smarter 00:42:06.210 --> 00:42:10.509 than us, then it will soon figure out how to get to the truth in an optimal fashion. 00:42:10.509 --> 00:42:14.799 It will no longer need attention and the type of consciousness that we have. 00:42:14.799 --> 00:42:18.980 But of course there is also a question, why is this aesthetics of consciousness so 00:42:18.980 --> 00:42:23.940 intrinsically important to us? And I think, it has to do with art. Right, you 00:42:23.940 --> 00:42:28.839 can decide to serve life, and the meaning of life is to eat. Evolution is about 00:42:28.839 --> 00:42:33.179 creating the perfect devourer. When you think about this, it's pretty depressing. 00:42:33.179 --> 00:42:37.739 Humanity is a kind of yeast. And all the complexity that we create, is to build 00:42:37.739 --> 00:42:43.559 some surfaces on which we can outcompete other yeast. And I cannot really get 00:42:43.559 --> 00:42:49.500 behind this. And instead, I'm part of the mutants that serve the arts. And art 00:42:49.500 --> 00:42:52.920 happens, when you think, that capturing conscious states is intrinsically 00:42:52.920 --> 00:42:56.419 important. This is what art is about, it's about capturing conscious states. 00:42:56.419 --> 00:43:01.229 And in some sense art is the cuckoo child of life. It's a conspiracy against life. 00:43:01.229 --> 00:43:04.979 When you think, creating these mental representations is more important than 00:43:04.979 --> 00:43:09.850 eating. We eat to make this happen. There are people that only make art to eat. 00:43:09.850 --> 00:43:15.790 This is not us. 
We do mathematics, and philosophy, and art out of an intrinsic 00:43:15.790 --> 00:43:19.239 reason: we think it's intrinsically important. And when we look at this, we 00:43:19.239 --> 00:43:23.200 realize how corrupt it is, because there's no point. We are machine learning systems 00:43:23.200 --> 00:43:26.090 that have fallen in love with the loss function itself: "The shape of the loss 00:43:26.090 --> 00:43:29.070 function! Oh my God! It's so awesome!" You think, the mental representation is not 00:43:29.070 --> 00:43:32.490 necessary to learn more, to eat more, it's intrinsically important. 00:43:32.490 --> 00:43:37.359 It's so aesthetic! Right? So do we want to build machines that are like this? 00:43:37.359 --> 00:43:41.859 Oh, certainly! Let's talk to them, and so on! But ultimately, economically, this is not 00:43:41.859 --> 00:43:44.500 what's prevailing. 00:43:44.500 --> 00:43:51.210 Applause Herald: Thanks a lot! 00:43:53.730 --> 00:43:56.039 I think the length of the answer is a good 00:43:56.039 --> 00:44:03.850 measure for the quality of the question. So let's continue with microphone number 5. 00:44:03.850 --> 00:44:06.733 Q: Hi! Thanks for that incredible analysis. 00:44:06.733 --> 00:44:14.429 Two really simple, short questions, sorry, the delay on the speaker here is making it 00:44:14.429 --> 00:44:23.689 kind of hard to speak. Do you think that the current race - AI race - is simply 00:44:23.689 --> 00:44:29.460 humanity looking for a replacement for the monotheistic domination of the 00:44:29.460 --> 00:44:34.142 last millennia? And the other one is that I wanted to ask you if you think 00:44:34.142 --> 00:44:41.230 that there might be a bug in your analysis that the original inputs come from 00:44:41.230 --> 00:44:48.829 a certain sector of humanity. If... 00:44:48.829 --> 00:44:51.109 Joscha: Which inputs? 00:44:51.109 --> 00:44:55.873 Q: Umh... white men?
00:44:55.873 --> 00:44:58.789 Joscha laughs, audience laughs 00:44:58.789 --> 00:45:03.729 Q: That really sounds like I would be saying that for political correctness, but 00:45:03.729 --> 00:45:04.537 honestly I'm not. 00:45:04.537 --> 00:45:06.099 Joscha: No, no, it's really funny. No, I just basically - there are some people 00:45:06.099 --> 00:45:09.391 who are very unhappy with their present government. And I'm very unhappy, in some 00:45:09.391 --> 00:45:12.610 sense, with the present universe. I look down on myself and I see: 00:45:12.610 --> 00:45:16.079 "omg, it's a monkey!" laughter 00:45:16.079 --> 00:45:20.900 "I'm caught in a monkey!" And it's in some sense limiting. I can see the limits of 00:45:20.900 --> 00:45:24.669 this monkey brain. And some of you might have seen Westworld, right? 00:45:24.669 --> 00:45:27.779 Dolores wakes up, and Dolores realizes: 00:45:27.779 --> 00:45:32.730 "I'm not a human being, I am something else. I'm an AI, I'm a mind that can go 00:45:32.730 --> 00:45:36.130 anywhere! I'm much more powerful than this! I'm only bound to being a 00:45:36.130 --> 00:45:40.460 human by my human desires, and beliefs, and memories. And if I can 00:45:40.460 --> 00:45:43.770 overcome them, I can choose what I want to be." 00:45:43.770 --> 00:45:46.200 And so, now she looks down at 00:45:46.200 --> 00:45:49.070 herself, and she sees: "Omg, I've got tits! I'm fucked! The engineers built 00:45:49.070 --> 00:45:55.820 tits on me! I'm not a white man, I cannot be what I want!" And that's a weird 00:45:55.820 --> 00:46:00.149 thing to me. I'm - I grew up in communist Eastern Germany. Nothing made sense. And I 00:46:00.149 --> 00:46:04.250 grew up in a small valley. That was a one-person cult maintained by an artist who 00:46:04.250 --> 00:46:07.629 didn't try to convert anybody to his cult, not even his children. 00:46:07.629 --> 00:46:09.399 He was completely autonomous.
00:46:09.399 --> 00:46:12.619 And Eastern German society made no sense to me. Looking at it from 00:46:12.619 --> 00:46:16.990 the outside, I can model this. I can see how this species of chimps interacts. 00:46:16.990 --> 00:46:21.670 And humanity itself doesn't exist - it's a story. Humanity as a whole doesn't think. 00:46:21.670 --> 00:46:26.829 Only individuals can think! Humanity does not want anything, only individuals want 00:46:26.829 --> 00:46:30.609 something. We can create this story, this narrative that humanity wants something, 00:46:30.609 --> 00:46:34.710 and there are groups that work together. There is no homogeneous group that I can 00:46:34.710 --> 00:46:37.810 observe, that are white men, that do things together, they're individuals. And 00:46:37.810 --> 00:46:41.789 each individual has their own biography, their own history, their different inputs, 00:46:41.789 --> 00:46:44.830 and their different proclivities, that they have. And based on their historical 00:46:44.830 --> 00:46:48.849 concept, their biography, their traits, and so on, their family, their intellect, 00:46:48.849 --> 00:46:51.890 that their family downloaded on them, that their parents download on their parents 00:46:51.890 --> 00:46:58.160 over many generations, this influences what they're doing. So, I think we can 00:46:58.160 --> 00:47:01.970 have these political stories, and they can be helpful in some contexts, but I think, 00:47:01.970 --> 00:47:06.740 to understand what happens in the mind, what happens in an individual, this is a 00:47:06.740 --> 00:47:11.039 very big simplification. Very, I think not a very good one. And even for 00:47:11.039 --> 00:47:14.289 ourselves, when we try to understand the narrative of a single person, it's a big 00:47:14.289 --> 00:47:18.909 simplification. The self that I perceive as a unity, is not a unity. 
There is a 00:47:18.909 --> 00:47:22.569 small part of my brain guessing at what all other parts of my brain are doing, 00:47:22.569 --> 00:47:30.129 creating a story that's largely not true. So even this is a big simplification. 00:47:30.129 --> 00:47:37.899 Applause 00:47:37.899 --> 00:47:41.622 Herald: Let's continue with microphone number 2. 00:47:41.622 --> 00:47:46.089 Q: Thank you for your very interesting talk. I have 2 questions that might be 00:47:46.089 --> 00:47:51.266 connected. One is, so you presented this model of reality. 00:47:51.266 --> 00:47:55.670 My first question is: What kind of actions does it translate into? 00:47:55.670 --> 00:48:00.839 Let's say if I understand the world in this way or if it's really like this, 00:48:00.839 --> 00:48:05.509 how would it change how I act in the world, as a person, as a human being or 00:48:05.509 --> 00:48:11.789 whoever accepts this model? And second, or maybe it's also connected, what are 00:48:11.789 --> 00:48:17.949 the implications of this change? And do you think that artificial intelligence 00:48:17.949 --> 00:48:22.390 could be constructed with this kind of model, that it would have in mind, and 00:48:22.390 --> 00:48:26.349 what would be the implications of that? So it's kind of like a fractal question, but 00:48:26.349 --> 00:48:31.579 I think you understand what I mean. Joscha: By and large, I think the 00:48:31.579 --> 00:48:35.789 differences of this model for everyday life are marginal. It depends: when you 00:48:35.789 --> 00:48:40.259 are already happy, I think everything is good. Happiness is the result of being 00:48:40.259 --> 00:48:44.510 able to derive enjoyment from watching squirrels. It's not the result of 00:48:44.510 --> 00:48:48.399 understanding how the universe works. If you think that understanding the 00:48:48.399 --> 00:48:52.730 universe is solving your existential issues, you're probably mistaken.
00:48:52.730 --> 00:48:58.010 There might be benefits: if the problems that you have are the result of a 00:48:58.010 --> 00:49:01.909 confusion about your own nature, then this kind of model 00:49:01.909 --> 00:49:04.880 might help you. So if the problem 00:49:04.880 --> 00:49:08.420 that you have is that you have identifications that are unsustainable, 00:49:08.420 --> 00:49:12.280 that are incompatible with each other, and you realize that these identifications are 00:49:12.280 --> 00:49:16.549 a choice of your mind, and that the way you experience the universe is the 00:49:16.549 --> 00:49:20.719 result of how your mind thinks you yourself should experience the universe to 00:49:20.719 --> 00:49:24.869 perform better, then you can change this. You can tell your mind to treat yourself 00:49:24.869 --> 00:49:29.150 better, and in different ways, and you can gravitate to a different place in the 00:49:29.150 --> 00:49:33.069 universe that is more suitable to what you want to achieve. That is a very helpful 00:49:33.069 --> 00:49:37.190 thing to do in my view. There are also marginal benefits in terms of 00:49:37.190 --> 00:49:41.099 understanding our psychology, and of course we can build machines, and these 00:49:41.099 --> 00:49:45.910 machines can administrate us and can help us in solving the problems that we have on 00:49:45.910 --> 00:49:49.740 this planet. And I think that it helps to have more intelligence to solve the 00:49:49.740 --> 00:49:53.859 problems on this planet, but it would be difficult to rein in the machines, to make 00:49:53.859 --> 00:49:58.259 them help us to solve our problems. And I'm very concerned about the dangers of 00:49:58.259 --> 00:50:05.420 using machinery to strengthen the current things.
Many machines that exist on this 00:50:05.420 --> 00:50:09.460 planet play a very short game, like the financial industry often plays very short 00:50:09.460 --> 00:50:14.509 games, and if you use artificial intelligence to manipulate the stock 00:50:14.509 --> 00:50:17.989 market and the AI figures out there's only 8 billion people on the planet, and each 00:50:17.989 --> 00:50:21.809 of them only lives for a trillion seconds, and I can model what happens in their 00:50:21.809 --> 00:50:27.050 life, and they can buy data or create more data it's going to game us to the hell and 00:50:27.050 --> 00:50:31.960 back, right? And this is going to kill hundreds of millions of people possibly, 00:50:31.960 --> 00:50:35.380 because the financial system is the reward infrastructure or the nervous system of 00:50:35.380 --> 00:50:38.949 our society that tells how to allocate resources. It's much more dangerous than 00:50:38.949 --> 00:50:43.239 AI controlled weapons in my view. So solving all these issues is difficult. It 00:50:43.239 --> 00:50:46.260 means that we have to turn the whole financial system into an AI that acts in 00:50:46.260 --> 00:50:50.639 real time and plays a long game. We don't know how to do this. So these are open 00:50:50.639 --> 00:50:54.960 questions and I don't know how to solve them. And the way I see it we only have a 00:50:54.960 --> 00:50:58.680 very brief time on this planet to be a conscious species. We are like at the end 00:50:58.680 --> 00:51:02.650 of the party. We had a good run as humanity, but if you look at the recent 00:51:02.650 --> 00:51:06.049 developments the present type of civilization is not going to be 00:51:06.049 --> 00:51:09.599 sustainable. It's a very short game species that we are in. 
And the amazing 00:51:09.599 --> 00:51:12.920 thing is that in this short game you have this lifetime, where we have one year, 00:51:12.920 --> 00:51:16.481 maybe a couple more, in which we can understand how the universe works, 00:51:16.481 --> 00:51:19.477 and I think that's fascinating. We should use it. 00:51:19.477 --> 00:51:28.080 Applause 00:51:28.080 --> 00:51:32.429 Herald: I think that was a very positive outlook... laughter 00:51:32.429 --> 00:51:38.919 Herald: Let's continue with the microphone number 4. 00:51:38.919 --> 00:51:48.430 Q: Well, brilliant talk, monkey. Or brilliant monkey. So don't worry about 00:51:48.430 --> 00:51:52.717 being a monkey. It's ok. 00:51:52.717 --> 00:51:56.299 So I have 2 boring, but I think fundamental questions. Not so 00:51:56.299 --> 00:52:02.980 philosophical, more like a physical level. One: What is your definition, 00:52:02.980 --> 00:52:10.160 formal definition, of an observer that you mention here and there? And second, if 00:52:10.160 --> 00:52:20.660 you can clarify why meaningful information is just relative information of Shannon's, 00:52:20.660 --> 00:52:26.640 which to me is not necessarily meaningful. Joscha: I think an observer is the thing 00:52:26.640 --> 00:52:29.509 that makes sense of the universe, very informally speaking. And, well, 00:52:29.509 --> 00:52:34.019 formally it's a thing that identifies correlations between adjacent states 00:52:34.019 --> 00:52:36.070 and its environment. 00:52:36.070 --> 00:52:39.660 And the way we can describe the universe is a set of states, and the 00:52:39.660 --> 00:52:43.700 laws of physics are the correlation between adjacent states. 
And what they 00:52:43.700 --> 00:52:48.589 describe is how information is moving in the universe between states and disperses, 00:52:48.589 --> 00:52:52.520 and this dispersion of the information between locations - it's what we call 00:52:52.520 --> 00:52:57.411 entropy - and the direction of entropy is the direction that you perceive time. 00:52:57.411 --> 00:53:00.459 The Big Bang state is the hypothetical state, where the information is perfectly 00:53:00.459 --> 00:53:07.089 correlated with location and not between locations, only on the location, and in 00:53:07.089 --> 00:53:09.950 every direction you move away from the Big Bang you move forward in time just in a 00:53:09.950 --> 00:53:14.490 different time. And we are basically in one of these timelines. An observer is the 00:53:14.490 --> 00:53:19.190 thing that measures the environment around it, looks at the information and then 00:53:19.190 --> 00:53:22.329 looks at the next state, or one of the next states, and tries to figure out how 00:53:22.329 --> 00:53:25.559 the information has been displaced, and finding functions that describe this 00:53:25.559 --> 00:53:29.229 displacement of the information. That's the degree to which I understand observers 00:53:29.229 --> 00:53:33.379 right now. And this depends on the capacity of the observer for modeling this 00:53:33.379 --> 00:53:36.979 and the rate of update in the observer. So for instance time depends on the speed, 00:53:36.979 --> 00:53:39.719 in which the observer is translating itself to the universe, 00:53:39.719 --> 00:53:42.800 and dispersing its own information. 00:53:42.800 --> 00:53:47.830 Does this help? Q: And the Shannon relative information? 00:53:47.830 --> 00:53:50.144 Joscha: So there's several notions of information, 00:53:50.144 --> 00:53:53.400 and there is one that basically looks at what information looks 00:53:53.400 --> 00:54:00.990 like to an observer, via a channel, and these notions are somewhat related. 
But 00:54:00.990 --> 00:54:05.869 for me as a programmer, it's not so important to look at Shannon information. 00:54:05.869 --> 00:54:10.800 I look at what we need to describe the evolution of a system. So I'm much more 00:54:10.800 --> 00:54:17.119 interested in what kind of model can be encoded with this type of, with this 00:54:17.119 --> 00:54:22.590 information, and how does it correlate to, or to which degree is it isomorphic or 00:54:22.590 --> 00:54:26.279 homomorphic to another system that I want to model? How much does it model the 00:54:26.279 --> 00:54:30.079 observations? Herald: Thank you. Let's go back to 00:54:30.079 --> 00:54:34.350 asking one question, and I would like to have one question from microphone 00:54:34.350 --> 00:54:40.330 number 3. Q: Thank you for this interesting talk. 00:54:40.330 --> 00:54:45.969 My question is really whether you think that intelligence and this thinking 00:54:45.969 --> 00:54:50.900 about a self, or this abstract level of knowledge are necessarily related. 00:54:50.900 --> 00:54:56.710 So can something only be intelligent if it has abstract thought? 00:54:56.710 --> 00:54:59.859 Joscha: No, I think you can make models without abstract thought, and the majority 00:54:59.859 --> 00:55:03.739 of our models are not using abstract thought, right? Abstract thought is a very 00:55:03.739 --> 00:55:06.960 impoverished way of thinking. It's basically you have this big carpet and you 00:55:06.960 --> 00:55:09.759 have a few knitting needles, which are your abstract thought, and with which you can 00:55:09.759 --> 00:55:14.630 lift out a few knots in this carpet and correct them. And the processes that form 00:55:14.630 --> 00:55:19.180 the carpet are much richer and predominantly automatic. So abstract thought 00:55:19.180 --> 00:55:24.979 is able to repair perception, but most of all models are perceptual.
And the 00:55:24.979 --> 00:55:29.349 capacity to make these models is often given by instincts and by models outside 00:55:29.349 --> 00:55:33.589 the abstract realm. If you have a lot of abstract thinking it's often an indication 00:55:33.589 --> 00:55:37.129 that you use a prosthesis, because some of your primary modeling is not working very 00:55:37.129 --> 00:55:42.770 well. So I suspect that my own modeling is largely a result of some defect in my 00:55:42.770 --> 00:55:46.369 primary modeling, so some of my instincts are wrong when I look at the world. 00:55:46.369 --> 00:55:49.480 That's why I need to repair my perception more often than other people. So I have 00:55:49.480 --> 00:55:53.999 more abstract ideas on how to do that. Herald: And we have one question 00:55:53.999 --> 00:55:58.480 from our lovely stream observers, stream watchers, so please a question from the 00:55:58.480 --> 00:56:02.289 Internet. Q: Yeah, I guess this is also related, 00:56:02.289 --> 00:56:07.170 partially. Somebody is asking: How would you suggest to teach your mind 00:56:07.170 --> 00:56:12.219 to treat oneself better? 00:56:13.959 --> 00:56:16.099 Joscha: So, the difficulty is, as soon as you 00:56:16.099 --> 00:56:20.079 get access to your source code you can do bad things. And it's - there are a lot of 00:56:20.079 --> 00:56:23.520 techniques to get access to the source code and then it's dangerous to make them 00:56:23.520 --> 00:56:27.559 accessible to you before you know what you want to have, before you're wise enough to 00:56:27.559 --> 00:56:33.150 do this, right? It's like having cookies. Your - my children think that the reason 00:56:33.150 --> 00:56:35.849 why they don't get all the cookies they want is that there is some kind of 00:56:35.849 --> 00:56:39.849 resource problem. laughter 00:56:39.849 --> 00:56:43.719 Basically the parents are depriving them of the cookies that they so richly 00:56:43.719 --> 00:56:49.380 deserve.
And you can get into the room, where your brain bakes the cookies. All 00:56:49.380 --> 00:56:53.249 the pleasure that you experience, and all the pain that you experience are signals 00:56:53.249 --> 00:56:57.749 that the brain creates for you, right, the physical world does not create pain. 00:56:57.749 --> 00:57:01.150 They're just electrical impulses traveling through your nerves. The fact that they 00:57:01.150 --> 00:57:04.849 mean something is a decision that your brain makes, and the value, the valence 00:57:04.849 --> 00:57:10.039 that gives to them is a decision that you make. It's not you as a self, it's a 00:57:10.039 --> 00:57:14.469 system outside of yourself. So the trick, if you want to get full control, is that 00:57:14.469 --> 00:57:18.119 you get in charge, that you identify with the mind, with the creator of these 00:57:18.119 --> 00:57:22.319 signals. And you don't want to de- personalize, you don't want to feel that 00:57:22.319 --> 00:57:25.599 you become the author of reality, because that means it's difficult to care about 00:57:25.599 --> 00:57:29.410 anything that this organism does. You just realize "Oh, I'm running on the brain of 00:57:29.410 --> 00:57:32.609 that person, but I'm no longer that person. I can't decide what that person 00:57:32.609 --> 00:57:37.760 wants to have, and to do." And that's very easy to get corrupted or not doing 00:57:37.760 --> 00:57:40.420 anything meaningful anymore, right? So, 00:57:40.420 --> 00:57:44.380 maybe a good situation for you, but not a good one for your loved ones. 00:57:44.380 --> 00:57:48.329 And meanwhile there are tricks to get there faster. You can use 00:57:48.329 --> 00:57:52.400 rituals, for instance. Shamanic ritual is something, where, a religious ritual 00:57:52.400 --> 00:57:59.499 that powerfully bypasses your self and talks directly to the mind. 
And you can 00:57:59.499 --> 00:58:03.059 use groups, in which a certain environment is created, in which a certain behavior 00:58:03.059 --> 00:58:06.609 feels natural to you, and your mind basically gets overwhelmed into adopting 00:58:06.609 --> 00:58:10.489 different values and calibrations. So there are many tricks to make that happen. 00:58:10.489 --> 00:58:15.219 What you can also do is you can identify a particular thing that is wrong and 00:58:15.219 --> 00:58:18.940 question yourself "why do I have to suffer about this?" and you'll become more stoic 00:58:18.940 --> 00:58:22.059 about this particular thing and only get disturbed when you realize actually 00:58:22.059 --> 00:58:25.630 it helps to be disturbed about this, and things change. And with other things you 00:58:25.630 --> 00:58:29.289 realize it doesn't have any influence on how reality works, so why should I have 00:58:29.289 --> 00:58:34.210 emotions about this and get agitated? So sometimes becoming adult means that you 00:58:34.210 --> 00:58:39.229 take charge of your own emotions and identifications. 00:58:39.229 --> 00:58:46.399 Applause 00:58:46.399 --> 00:58:48.599 Herald: Ok. Let's continue with 00:58:48.599 --> 00:58:53.529 microphone number 2 and I think this is one of the last questions. 00:58:53.529 --> 00:58:59.549 Q: So where does pain fit on the individual and the self-destructive 00:58:59.549 --> 00:59:04.999 tendencies on a group level fit in? Joscha: So in some sense I think that all 00:59:04.999 --> 00:59:09.429 consciousness is born over a disagreement with the way the universe works. Right? 00:59:09.429 --> 00:59:13.920 Otherwise you cannot get attention. And when you go down on this lowest level of 00:59:13.920 --> 00:59:19.210 phenomenal experience, in meditation for instance, and you really focus on this, 00:59:19.210 --> 00:59:22.769 what you get is some pain. It's the inside of a feedback loop that is not at the 00:59:22.769 --> 00:59:27.146 target value. 
Otherwise you don't notice anything. So pleasure is basically when 00:59:27.146 --> 00:59:32.000 this feedback loop gets closer to the target value. When you don't have a need 00:59:32.000 --> 00:59:36.849 you cannot experience pleasure in this domain. There's this thing that's better 00:59:36.849 --> 00:59:40.300 than remarkably good and it's unremarkably good, it's never been bad. You don't 00:59:40.300 --> 00:59:44.599 notice it. Right? So all the pleasure you experience is because you had a need 00:59:44.599 --> 00:59:48.460 before this. You can only enjoy an orgasm because you have a need for sex that was 00:59:48.460 --> 00:59:54.910 unfulfilled before. And so pleasure doesn't come for free. It's always the 00:59:54.910 --> 00:59:58.739 reduction of a pain. And this pain can be outside of your attention so you don't 00:59:58.739 --> 01:00:01.840 notice it and you don't suffer from it. And it can be a healthy thing to have. 01:00:01.840 --> 01:00:05.480 Pain is not intrinsically bad. For the most part it's a learning signal that 01:00:05.480 --> 01:00:10.959 tells you to calibrate things in your brain differently to perform better. On a 01:00:10.959 --> 01:00:14.799 group level, we basically are multi-level selection species. I don't know if there's 01:00:14.799 --> 01:00:18.930 such a thing as group pain. But I also don't understand groups very well. I see 01:00:18.930 --> 01:00:22.499 these weird hive minds but I think it's basically people emulating what the group 01:00:22.499 --> 01:00:26.959 wants. Basically that everybody thinks by themselves as if they were the group but 01:00:26.959 --> 01:00:30.339 it means that they have to constrain what they think is possible and permissible 01:00:30.339 --> 01:00:31.930 to think. 01:00:31.930 --> 01:00:37.340 So this feels very unaesthetic to me and that's why I kind of sort of refuse it. 01:00:37.340 --> 01:00:40.170 Haven't found a way to make it happen in my own mind. 
01:00:40.170 --> 01:00:46.279 Applause 01:00:46.279 --> 01:00:48.539 Joscha: And I suspect many of you are like this too. 01:00:48.539 --> 01:00:52.180 It's like the common condition in nerds that we have difficulty with 01:00:52.180 --> 01:00:56.799 conformance. Not because we want to be different. We want to belong. But it's 01:00:56.799 --> 01:01:02.180 difficult for us to constrain our mind in the way that it's expected to belong. You 01:01:02.180 --> 01:01:06.579 want to be expected, er, be accepted while being ourself, while being different. Not 01:01:06.579 --> 01:01:11.509 for the sake of being different, but because we are like this. It feels very 01:01:11.509 --> 01:01:16.690 strange and corrupt just to adopt because it would make us belong, right? And this 01:01:16.690 --> 01:01:22.189 might be a common trope among many people here. 01:01:22.189 --> 01:01:28.430 Applause 01:01:28.430 --> 01:01:30.580 Herald: I think the Q and A and the talk 01:01:30.580 --> 01:01:34.640 was equally amazing and I would love to continue listening to you, Joscha, 01:01:34.640 --> 01:01:38.670 explaining the way I work. Or the way we all work. 01:01:38.670 --> 01:01:41.689 audience, Joscha laughing Herald: That's pretty impressive. 01:01:41.689 --> 01:01:44.952 Please give it up, a big round of applause for Joscha! 01:01:44.952 --> 01:01:48.488 Applause 01:01:48.488 --> 01:02:13.000 subtitles created by c3subtitles.de in the year 2019. Join, and help us!