Transcript of Episode 72 – Joscha Bach on Minds, Machines & Magic

The following is a rough transcript which has not been revised by The Jim Rutt Show or by Joscha Bach. Please check with us before using any quotations from this transcript. Thank you.

Jim: Today’s guest is Joscha Bach. He is vice president of research at the AI Foundation. He’s previously been a research scientist at MIT and Harvard. He is the author of the book Principles of Synthetic Intelligence: Psi, that’s P-S-I, An Architecture of Motivated Cognition. And he has published many papers. I also have to add this: he has one of the most interesting tweet streams that I follow. You can follow him at @Plinz, P-L-I-N-Z. Only some of it has much directly to do with artificial intelligence or cognitive science, but it sure is entertaining. He’s certainly in my top handful of tweet streams that I enjoy following. Let’s start with understanding sort of your motivation. How did you get into this? Here’s a quote from, I think, one of your talks, or maybe it’s from a paper, I don’t remember. You said we need to understand the nature of AI to understand who we are.

Joscha: Yeah. I think that artificial intelligence, as an idea, is in some sense the missing link between philosophy and mathematics. It’s the attempt to make the execution of the processes that allow us to use language and make it refer to meaning automatic, understandable, and scalable. And this basically allows us to ground our use of languages that have meaning in machinery that we understand, in a mechanical universe. The idea that the universe is mechanical might sound limiting to many people, but it’s not a very limiting idea. It simply means that the universe is not magic, that it’s not built over symbolic correlations or symbolic causation, but over things that have no conspiracy inside of them. And the other thing is: if you think of yourself as a system that has no conspiracy inside of it, what kind of system is it? And AI is the attempt to build a testable theory of what that is.

Joscha: And if we are able to test that theory successfully, we will have built a system that in the ways in which it matters is going to be like us, which means it’s going to be a system that is able to reflect on its environment and its own role in it and make a model of that, understand that and understand its own nature as well, right? This project of artificial intelligence in my view is something like a capstone of a certain philosophical tradition or maybe of all philosophical traditions of the question of what are we in this universe? And what’s our relationship to the universe that contains us? What is the observer? All the other questions come from that. And so it’s in my view, the most important question there is.

Jim: Yeah, I have to agree. As I’ve gotten into artificial intelligence, particularly artificial general intelligence, over the last six years or so, I’ve started digging into it in some depth, and I’ve found myself being forced to ask just those sorts of questions, which kind of surprised me, right? And of course we both know that not everybody interested in AI is interested in it at this kind of a level. I mean, there’s an awful lot of narrow AI these days, which we’ll talk about later, the distinction between cognitive AI and narrow AI and artificial general intelligence and more applied things. I take it that you would consider yourself not just an AI guy, but also interested in artificial general intelligence.

Joscha: I think that artificial general intelligence is an attempt to reboot the original idea of artificial intelligence; AGI and the original AI are basically the same thing. When AI started out as a field, it was done by a number of people across disciplines. There were some cyberneticians, there were computer scientists, computer science was just getting started, some information theorists, and even psychologists involved in the whole thing. And the idea was: we now understand how computers work, we understand that everything that we understand we can express in a constructive mathematical paradigm, constructive mathematics is the part of maths that works and it happens to be the same as computation, so let’s go and teach the computers how to think. And the first generation of people that set out to do this were very optimistic. They basically thought this is going to be an extended summer project, maybe a couple of years, and then we will have made tremendous progress.

Joscha: And as it turned out, they did make tremendous progress, in hindsight. They pulled off a lot of amazing feats. It didn’t take that long to teach computers how to play decent chess, which means chess that is much better than my abilities, even before computers became superhuman at chess. Or how to get this thing to do very simple language understanding, simple planning. This didn’t take long. Almost all of the programming languages that we use today were invented, in their structure and principles, in the first couple of decades after AI started. And the effort was very much connected throughout computer science. In some sense, almost everything that didn’t work yet in computer science was AI, and when it worked, then it became something boring. AI has in some sense always been the pioneer battalion of computer science, and very, very productive as a field.

Joscha: But most of the people that worked on it realized that this optimism of building a machine that thinks in a short time, that is, for instance, within the time of a couple of grant proposals or even your entire career, that’s very daunting. It’s probably not going to succeed. So they focused on things that were going to give results within the duration of a grant proposal or the duration of a career. And these were also the things that were going to give you tenure, so AI became more and more applied, more and more narrow. And it was about improving the automation of statistics, and developing mathematics around that, and theory around that, and so on. And the philosophical project itself has only captured the attention of relatively few people in the field, as philosophical projects happen to do. I think in some sense that’s correct, and it’s the right thing to do, because philosophical projects are daunting, hard, risky, and often have only marginal benefits.

Joscha: Why not go for the thing that gives very tangible benefits right here, right now? And this is what the majority of the field did. There were also some political upheavals within the field that happened, right? When Minsky claimed that cognitive AI was in some sense the same thing as symbolic AI, I think he made the wrong bet. Minsky was somebody who was an extreme visionary, but he was also somebody who was not so interested in the visions of other people. And so he basically screamed at people that did cybernetics and that did neural networks, and actively delayed the development of dynamical systems models of cognition and of machine learning systems for more than a decade. He basically removed the funding for neural networks, and apparently also impaired the funding of cybernetics and contributed to the ending of cybernetics as a field in the US, to get more, I suspect, funding and airtime for his own approaches.

Joscha: And in some sense it’s not his fault that he could not see that he was not right, that his approaches would ultimately lead to difficulties in grounding concepts and building an understanding that goes beyond symbolic systems. But he inadvertently created a division between cognitive AI, that is, his own followers and disciples, and everybody else. And so the other people did not read Piaget anymore, and they did not think about psychology very much anymore. And this division within artificial intelligence, between people that think about cognition and psychology, and people that think about how to, for instance, process images and how to interact with an environment, this division lasts until today in many ways, even though the gap is closing more and more.

Jim: And yet, if you talk to people, even at Google, they will admit late in the afternoon or after several beers that the reason they got into the field was something like AGI.

Joscha: Absolutely. When I entered university, I only did this because I wanted to understand how the mind works, and that’s why I studied computer science and philosophy and a few other things. And it was very disappointing to me to see that philosophers were completely not into this. They mostly didn’t understand the ideas that computer science had developed during the last 50 years at all, and if so, then only in a very superficial and derisive way. And the computer scientists were not interested in philosophy. It was really depressing to me, and I was not alone in this. When I was a grad student, or not even a grad student, basically just past the equivalent of a bachelor’s, other students would ask me when they entered the field, and I was a tutor: where can I do AI here? Who’s offering real AI classes? Even though we had a very large and active AI department at our university.

Joscha: And so I decided that I had to offer AI classes, right? As a student, I started doing a seminar on building cognitive architectures and on thinking about the mind. And so I got a dozen students who started building things with me, and this is the origin of the MicroPsi architecture, by the way. This is how I got into working in academia as a student.

Jim: Oh, that’s a very interesting story. You sort of reacted against the prevailing trends in academia, where, as you said earlier, unfortunately, the funding and promotional carrots, the rewards, are focused on relatively small, incremental steps against known benchmarks. You raise the benchmark by half a percent and you have a paper you can publish, and when you have published seven papers, you get tenure, right? And that’s not the grand question of how do we make a machine that thinks sort of like a human. Are you still optimistic that we can create computer intelligences that are at human level and beyond?

Joscha: Of course. There is no obvious reason why we shouldn’t. There’s nothing magic going on, as far as we know, in the brain. I also don’t think that there are (and this might be very controversial to some) deep open philosophical questions left that have to be answered. What needs to be answered is a lot of technical details. And for a lot of these details, I think doors have opened to work on them in the last few years. Even though I don’t know how long it’s going to take and whether it’s going to happen in our lifetime, I think there’s a significant probability that we will see something that is not human-like, but in many areas superhuman-like, and in other areas good enough, in our lifetime, or maybe even in the next two years.

Jim: Yeah, it’s hard to say, it’s one of these damn questions. It could be five years, it could be a hundred years or more, right? And of course it also, as you say, depends on what you’re measuring. If you talk about every single faculty, it might be a hundred years. But if we’re talking about superhuman capability in certain domains, that could happen, or frankly already has happened, in narrow domains like image recognition under certain very constrained cases. And those windows will open larger and larger. And I will also say that one of the things that’s made me more optimistic about superhuman capacity across the board is, the more I’ve learned about cognitive science and cognitive neuroscience, frankly, how limited the human brain is: things like working memory size, the fidelity of memory, et cetera. We’re not that smart compared to what we could be.

Jim: Working memory alone is a huge bottleneck on, for instance, the practical level of recursion in language and the chunking size of concepts, et cetera. The low fidelity of our memory and its lack of persistence is certainly a major cognitive limitation. I’m not one of those people who thinks that humans are the top of the intelligence spectrum. Frankly, I believe we’re at approximately the stupidest possible general intelligence, which is not surprising. Evolution is seldom profligate with its gifts, and since we’re over the line of general intelligence, at least so it appears, we’re probably just barely over the line. And so I’m one of those who is also very hopeful that we can get not just over the line, but way over the line, at least in some very interesting dimensions, such as language understanding, the ability to read the literature of a discipline and actually make sense of it, et cetera.

Joscha: I remember that I’ve always been very disappointed in the capacity of my brain, as a child and later on. And I also felt bad about this, because at the same time I was confronted with the superstitious belief of most people that if you apply yourself, there’s basically no limit to what the brain can do, right? We could have maybe infinite memory; maybe if we would just pay attention all the time, we could read all the books and retain all the books. We could retain all the movies; maybe we could have photographic memory for everything. Maybe there is no limit to our intelligence if we really apply ourselves and meditate enough, right? And later it occurred to me that this is probably not the case. I do know many people that are much, much smarter than myself and know much more than myself, but it’s often at the expense of other things.

Joscha: It means that they tend to have a much narrower view on things, and know these things much, much more deeply, and apply their attention exclusively to these things and not to others. Of course, there are some people who are way smarter than me across the board, and I am totally in awe of them, but still, it’s just a shoddy human brain. And for this shoddy human brain, I still don’t know how much compute we need and how much ingenuity we need to replicate this shoddiness, of course, right? Even though we can see the limits of what it’s doing. To get this to run, it might be possible that it’s doable with something that you can, as an average person, already buy and put into your basement, if you really want to. Something in the order of, say, 20K of compute. Maybe that’s possible, and I’m optimistic enough to say that there is a chance that this is the case.

Jim: Yeah, it’s interesting. I’ve had numerous discussions with our mutual friend Ben Goertzel about that, right? And I think we both have come to the view that it will turn out to be a question of: what is the appropriate level of representation? My home academic discipline is evolutionary computing, and in evolutionary computing, again and again and again, it turns out to be the representation that determines whether a technique will provide traction on a problem. And so if it turns out that the actual right level of representation is indeed the neuron, then probably $20K of computers won’t do it. If it’s symbolic, you know, truly symbolic in the old-fashioned AI sense, then probably 10 computers, call it $5,000 worth, will be enough.

Jim: But if, as I suspect, it’s some hybrid between the two, kind of messy, with some very low-level stuff, some very high-level stuff, and some transducers between them, then it may be on the order of, and this is my best guess (I think Ben thinks my number is high, but he agrees that it’s a plausible ceiling), essentially the equivalent of a thousand powerful desktop computers. We’re talking on the order of a million dollars of hardware today, if we got the right level of representation. And of course, a million dollars’ worth of hardware is no barrier at all to producing something as valuable as human-level artificial intelligence.
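As a quick sanity check on the ballparks above, a minimal sketch: the desktop counts and dollar totals are the ones quoted in the conversation, and the `quotes` structure is just my illustrative bookkeeping, not anything from the discussion itself.

```python
# Back-of-envelope check: do the two hardware guesses quoted in the
# conversation imply roughly consistent per-machine prices?
# (desktops, total dollars) pairs are the conversational figures.
quotes = {
    "symbolic": (10, 5_000),          # "10 computers, call it $5,000 worth"
    "hybrid": (1_000, 1_000_000),     # "a thousand powerful desktop computers", ~$1M
}

for name, (desktops, dollars) in quotes.items():
    per_desktop = dollars / desktops
    print(f"{name}: {desktops:,} desktops implies ${per_desktop:,.0f} per desktop")
```

The implied prices ($500 versus $1,000 per machine) agree within a factor of two, which is about as consistent as spoken order-of-magnitude guesses can be expected to be.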

Joscha: If we look at a thing like, for instance, GPT-2 or 3, the models that OpenAI has recently been working on and publishing, what goes into these models is a relatively moderate cost that will still bring sweat to the foreheads of academic researchers: something like two-digit millions of dollars goes into training these models. And they are trained on a few years of almost a full take of the internet, filtered down to remove most of the most obvious crap. At its basis is the Common Crawl [00:16:19], which is a large part of what you will find in text that’s written in a given year on the internet. And then it is able to go through this enormous amount of text, which is like a very, very large library with lots of babbling inside, and extract all the meaningful correlations, or a significant part of the meaningful correlations, from it in a remarkably short time, right?

Joscha: It’s a task that is way beyond what a large group of researchers could do in their lifetime, and it’s done in the course of a few days or weeks with these computers. That’s a tremendous achievement, right? And if you imagine you would have something that is only able to process these data at a human level, across all modalities, maybe you would try to transmogrify the data so you get the highest benefit possible in a nervous system that has a capacity similar to ours. That would be a tremendous achievement, if so much data could be pre-processed and extracted in such a short amount of time. And I don’t mean that GPT-2 or 3 are human-like.

Joscha: For me, the fascinating thing is that they get so far that you have a relatively simple algorithm, you can produce embeddings over texts, and as they show recently shown images that allow you to make continuations of the texts and the images that are basically by having a type of Turing test where the audience is unable to tell better than chance whether the text that they are looking at has been written by a human or whether the image that they’re looking at has been generated by a photographing those [inaudible 00:17:55] as a resolution image.

Jim: Yeah, that is interesting. I mean, as you say, this is not the way humans do it at all. I had looked quite a bit at GPT-2 (I haven’t looked at GPT-3 yet) and I did some experiments, et cetera, and it wouldn’t have fooled me for very long, but apparently it could fool some people. It’s essentially an extraordinarily deep pattern-matching system. And is that all human language understanding is? I don’t know. It seems to me that there’s something missing in these brute-force deep learning approaches, particularly when it comes to language. Yes, it’s amazing what they’ll do. The translation algorithms that Google famously developed, GPT-2, and presumably GPT-3 and GPT-4 someday, may well be able to fool us. But do they actually understand language at the level of, say, reading every paper and every textbook in cognitive science, and then actually being able to make some inferences about what’s missing in cognitive science and what new theories or experiments are needed?

Jim: I mean, that’s the kind of thing a human could do. Not very well yet, but that’s what a professional cognitive scientist does. Could GPT-4 do that? I don’t think so.

Joscha: The thing is that GPT-2 and 3 are apparently not sentient, right? They have no model of a connected, unified universe onto which they map everything that happens in real time, in which they understand their own role, and so on. It’s not that they are far from it; it’s just that it’s not part of their task. They haven’t been asked to do that. The model has not been trained to do such a thing. And yet the capacity that this thing has is tremendous. And a lot of people will say, oh, this is not grounded and has no connection to reality. But the question is: imagine that all you had as access to the world was your own imagination, the mental space in which you can produce mental simulations, and lots and lots of books, and a way to parse them.

Joscha: And at some point you basically try to hash out the space of possibilities in which all these symbols that you’re confronting can possibly make sense and give rise to a universe that contains you. Is this something where you can prove [inaudible 00:20:10] that this is impossible? That something like GPT-2, which is really a simple text processor, cannot hash it out just based on finding one of the possible orders and patterns, after noting there are not that many? And once it has figured out the relationships between concepts, in some kind of relational space that is dynamic and produces an evolving world, can it start parsing the Wikipedia articles on physics and all the papers on physics that it reads, and understand possible solutions to the big puzzles in physics and so on, solutions that allow us to understand the automata from which our world is generated? I don’t know.

Joscha: GPT-3 seems to be an incremental extension over the GPT-2 so basically users magnitudes more data, and the learning curves are not bending yet, so it seems that you can extend this even further. And the quality of texts that GPT-3 is producing in terms of coherence. The illusion that what it writes corresponds to a self coherent universe, where concepts are consistent for at least the duration of the story. This is much, much better than GPT-2. In GPT-2 you could easily see when it’s construction fell apart, when it lost coherence. This is when the story becomes unbelievable, right? You’re willing to entertain a completely fictional story about a completely fictional universe as long as it refers to a coherent universe. It might be a fantastic universe, one that doesn’t make sense, but it’s one that works by certain rules.

Joscha: Like, take Buffy the Vampire Slayer: it’s a fictional universe. It’s a universe where everything is normal, or a sanitized version of normal. It’s a suburbia that doesn’t really exist, because it’s so perfect and stereotypical. And the only thing that’s different from this perfect, normal, stereotypical world is that every year 20% of the population dies because of an invasion of vampires, and everybody goes on with their life. And this is probably not realistic, right? We know what happens if 1% of the population dies in a given year because of COVID: everything comes to a halt. And so this is a universe that is only meant to highlight things in the stereotypes and the normal interactions, by introducing this new element and deliberately leaving everything else that would be downstream from it unchanged. And this is an interesting construction that our mind can make. It’s producing a universe that is derived from our standard universe, with all the right constraints.

Joscha: And it’s referring to this global meaning. And that it liberally only changes very few things but leaving the rest alone. And the people who understand this, understand these layers of meaning that exist. And as this would be an interesting benchmark to see how many layers of meaning can GPT-3 distinguish and construct versus GPT-2. With respect to the complexity of a brain I think it’s possible that the unit is not a neuron, but that it’s the column. It might be that this is a simplification and the actual units are somewhat orthogonal to both neurons and columns, so in some sense, you have units that are made from columns. You have some that are made from multiple columns and a couple of neurons, and you have some where the column and a couple of neurons interact in a specific way, or you could have processes inside of the neuron that play an important role that you need to understand in order to model the behavior of that thing.

Joscha: It’s similar to understanding the role of people in society: at a certain level of regularity, you don’t have people, you just have organizations. And yet, to understand the interplay of the organizations, sometimes you need to look at individual people that played a role in historical developments. And to understand the behavior of these people, sometimes you need to look into very particular things in their own mind that happened in a certain moment in history, and without that, you cannot understand it. The single granularity that we put on the model is often too coarse to make it work. But let’s just entertain the idea that columns might be the thing, because they are somewhat ubiquitous over the neocortex and they are interchangeable: you can basically cut out non-specialized columns from the brain of an infant mouse and transplant them to another area of the brain, and when they take, they will adapt to fulfilling the role of that part of the cortex.

Joscha: The column seem to be pretty general, as far as we know right now, and a column is something like 300 to 400 neurons, so we end up something like an order of a hundred million units. And if you imagine we have give or take 50 brain areas, it would mean that each of these brain areas has in the order of a million units and the units can do way more than an individual neuron can do. And even an individual neuron would probably need a three layer network to bottle it as a perception, but eventually what they do is, each of them can link up potentially with a few thousand other units. And these few thousand other units are not what they’re linked up to permanently. It’s the address space. It’s what they can talk to. And each of them has a number of states that they can have, and dynamics that they can undergo, a set of functions that can model and it’s limited what they can do, right?

Joscha: And so if you imagine that you could understand that each column is something like an adaptive agent that is doing local reinforcement learning by some policy. And that it’s wired up into a global architecture with the others, and maybe this does fit on a few larger GPU’s already in real time.

Jim: Yeah, that could be. Now I want to clarify something. In my readings of your book, and your talks, and papers, you talk about a hundred million columns. As I recall from my reading, and I actually did look it up this afternoon, there are probably more like a hundred million minicolumns and a million columns; columns seem to be made from multiple minicolumns. Which one are you talking about when you’re-

Joscha: I was talking about minicolumns and the formation of columns is also different in different brains. For instance, if you look at a mouse brain, you’ll find that a large part of the neocortex seems to model the activity of whiskers.

Jim: Okay. Yep. Makes sense.

Joscha: And so the input space of whiskers. And the columns form together into macrocolumns, in a way, which are almost like regions. And it’s not like you look into the brain and see very clear-cut cells that are… It’s just that you have groups of neurons that have more interconnectivity among each other, and you lump them together into a column. And you also find that usually there’s a glial cell around which they’re formed, but it’s not by any means a very, very clear-cut kind of architecture. And it seems to be possible that neighboring columns fuse and perform functions together, and so on, under certain circumstances. It’s a lot more messy, I think, when we look into this in detail.

Jim: So you look at the minicolumn as the closest thing to a reasonably generic unit. Okay, that’s good.

Joscha: Yes. But it’s really something very… I’m already squinting a lot, and possibly too much. Another thing that I wanted to put on my mental stack when you mentioned it is the question of whether we are the least intelligent thing that is generally intelligent in nature. I wonder why there is nothing that is obviously smarter than us in nature. Even for a monkey, it’s hard to have a brain that is larger than ours. There are brains in nature that are larger than ours, right? Whales have larger brains, elephants have larger brains. Why is it that whales and elephants are not smarter than us? Because they basically carry these brains around for free, right? They have such large bodies that they scaled up the brain with the body size.

Joscha: Not that they need a proportionally larger cortex to control a proportionally larger muscle. That is true for the body map, but only a very small part of the brain is sio-somatosensory cortex. So, what do they do with all this extra capacity? Why is it that they are not that smart? And, I suspect that if you make a system too smart, it’s very hard to control it. It could be that elephants have massive autism because, the not autistic elephants meditated themselves out of existence. They basically started to understand their role in the universe like very smart monks. And then, as a result that decided that doing office politics all day and having kids and participating in society, is just not cutting it. And, instead they just go to do something more interesting with their lives like meditate. And, these elephants didn’t have a lot of offspring. Who knows if that is the question. So, I wonder if this stuff-

Jim: That’s a crazy idea. I like that.

Joscha: It is a very crazy idea.

Jim: That’s weird. I like that. I’m going to lay that one on somebody and say, “Oh yeah, we only have autistic elephants, because elephants would otherwise be so smart that they would be too busy philosophizing to reproduce.”

Joscha: I mean, with the dolphins it’s obviously slightly different: dolphins live underwater, and you cannot really hold a pen underwater, because everything you write down will be washed out. So, there will be no life for intellectuals, because they cannot read and write under water. So, dolphins only talk about sports and celebrities and sex, and their society is not moving anywhere.

Jim: And they have three dimensions too. That’s another thing. It requires a better brain to operate in three dimensions at the speeds they operate at, right?

Joscha: But navigation is so much easier, right? Because you don’t have to solve all [inaudible 00:01:50]. You can just move where you want to.

Jim: Yeah, collisions are much less likely, right? When you’re stuck in two dimensions, collisions are constantly on your mind. In three dimensions it’s easier to dodge, right?

Joscha: Yes. You’ll probably also notice that self-driving airplanes have been completely common and standard for many, many decades now, while self-driving cars are difficult.

Jim: Yeah. That’s a good point.

Joscha: Because the navigation that you need to do on the ground, to coordinate with all the other things that are limited to two dimensions and have so many crossing paths, is just harder.

Jim: Yeah, I love that. So that’s where geometry and cognition come together, and that’s exactly the right answer. Let’s jump back a little bit to a point you passed over rapidly that I would love to dig into a little bit: your statement that, in your belief, the philosophical questions are sufficiently answered to proceed. What is your take on the nature of the mind? Are you a strict materialist?

Joscha: I suspect that there is a misunderstanding with respect to what matter means. I found that a lot of people think that matter is somehow immediately given. That we have seen atoms and touched them, and the molecules, and the earth on which we stand, and the air through which we move. And while this is experientially the case for the earth and the air, it’s not quite true for the atoms and the underlying structure, right?

Joscha: And when we look at this in more detail, it just turns out that what we mean by matter is a way to talk about information. And, what we specifically talk about is the way that we can measure change. We noticed that we can measure periodic changes in place, which we call matter, and we can see how these periodic changes in place, move between places across locations. And this is what we call momentum.

Joscha: And the description of the universe in terms of matter and momentum is what we call physics, right? So, physics is the set of functions that describes how adjacent states of the universe are correlated. And, the idea of physics is that we explore the hypothesis, that there is a causibly closed lowest layer in the whole thing. This is what foundational physics is about. What’s the causibly closed lowest layer that describes this entire universe that we are in.

Joscha: And this hypothesis, that this layer exists, that it’s discoverable to a large degree and can be described or inferred, is a very successful hypothesis. The hypothesis doesn’t have a really good contender. There is not another game in town that is quite as plausible. What would that other game look like? Typically we are confronted with the notion of idealism.

Joscha: So, instead of matter being primary, matter being the way information travels in something like a mathematical space, a set of discernible locations and trajectories that information can take between those locations, we think that the mind is possibly primary. That conscious experience is primary.

Joscha: And, subjectively, that’s true, right? The conscious experience is anchored in a now that is given here in this moment. And this now is not the same thing as any particular physical now. The physical universe is smeared out and uncertain with respect to that. It has a very vague and weird relationship to this experiential now that is immediately given. And we can make the step to say that this experiential now is the only place where something can be real and be experienced by an observer. The physical universe has nothing out there that can experience something, that can be confronted with reality, because in physics there are just automata, only unfeeling mechanisms. Everything that is real is in a dream. It’s in here with us, in this thing that we perceive that has colors and sounds and feelings and so on, right?

Joscha: So, if you make that thing primary, there still needs to be an outside that dreams us. What is the thing that dreams us, that produces the dream that we are part of, the dream in which physics takes place from our perspective, and in which we construct our ideas of physics and everything else? And that thing out there, this outside world that we cannot access, is still physics. If we are dreamt by a mind on a higher plane of existence, then it turns out that this higher plane of existence is still the skull of a primate. There’s a brain inside of it. And that higher plane of existence can be modeled with the ideas of physics. It’s not changing anything.

Jim: Yeah. But is that necessary? Or, when we talk about, let’s say, human minds emerging from brains… I’m a naive realist myself. Maybe I’m too naive, but I think there is a physics out there, and physics emerges into chemistry, and chemistry is very stable and predictable. And from chemistry we get biochemistry. And from biochemistry we get biology, and from biology we eventually get neurons. And from neurons we eventually get nervous systems. Eventually we get brains. And at some point, probably around the time of the reptiles or maybe the amphibians, we finally got mind, which is the subjective state. So mind is a new subjective state that is packaged within physics. That’s the alternative way of looking at it.

Joscha: I suspect that you don’t need to tell the story from this direction. You can spend hundreds of years sitting in your monasteries in Tibet, or wherever, and be super smart, and have all the time in the world on your hands, because you convinced the local peasantry to give you free breakfast every morning, because you are holy men. And you sit together with the other smart, holy men, and you write books and discuss psychology introspectively, at a level that Western psychology still doesn’t quite master.

Joscha: And you do understand many parts of the mind and how they interact and so on. And you never ever venture out to describe physics, for some reason, right? It’s not necessary to do so, because you do not intend to revolutionize your society and invent new means of production to make everything more efficient, because that would probably destabilize society.

Joscha: So, you leave that as it is. And society outside is something like a periodic process that you try to organize as well as you can. So it’s somewhat stable, but you don’t want to turn it into runaway processes of technological progress. So, why would you want to look into physics? You don’t need to do that, right?

Joscha: So, many cultures focused only on this mental perspective and the inner structure of perception first, and not on the outer structure that enables it. But if you’re not looking into physics, from our perspective, you are leaving money on the table. You ignore this entire scope of models that work out, that do have predictive power, tremendous predictive power, that allow you to build machinery that all these other civilizations could not build and did not even start to consider possible. They had no idea of what you can do in a mechanical universe.

Joscha: It’s possible to do that, to leave all that on the table. And our culture is a little bit weird, because our civilization is not that old. I think it just started 400 years ago. And we are mostly unaware of that civilization break, that we got out of the cults. And the Catholic civilization is one that obfuscated the area of the mind, because it made it all part of a mythology. So you didn’t have the free space to reason about psychology within Catholic society. All of this was taken up by gods interfering with the souls in the brain. People were, in some sense, discouraged from studying psychology because it might interfere with religion. And the physics that they engaged in was also somewhat crude, because access to rationality needed to be certified, so that you would not accidentally disprove religion and interfere with it, because it was in some sense an anti-rationalist system that people had to live in.

Joscha: And, what we did was we freed our rationality for the first time in thousands of years. And, this new, rational society that woke up, this enlightenment, dismissed all the stories about the mind that the Christians had ever told them and thought of them as superstition. So, we lost many concepts and we are still in the process of restoring them.

Joscha: For instance, I’m fond of saying that spirit is a word that we have now dismissed as superstitious. And it’s an old word that just means operating system for an autonomous robot. And when the word was coined, those were the only autonomous robots that existed. So there were people, there were plants, animals, cities, nations, states even, possibly ecosystems, but this was it. There were no robots that people had built.

Joscha: And, now that we have autonomous robots and they have operating systems, we understand that there is something like an operating system and that humans must have one too. And, plants must have one as well. And, obviously societies and civilizations have some kind of operating system, right? And, we understand that this operating system of society is not real. It’s virtual, it exists over the coherent interactions of individuals in society, in the same sense as the mind does not exist as a physical thing. It exists over the coherent interactions of the neurons or whatever the constituting parts are.

Jim: I think that’s a very important distinction. And from my study of complexity science, the way I will often say it is that reductionist science, let’s call it old-style science, is about the dancers, while the complexity perspective is about the dance and the dancers, right? So the things that hold together, let’s say, a business company, they’re virtual, they’re abstract. You can’t put your finger on a particle and say, this is the operating system of a business. And yet the business is real. It is a series of coordinated actions operating on signals, with boundaries and semi-permeable membranes. It has feedback loops. I think that’s key.

Jim: And you make that point as well in some of your writings, that feedback loops are absolutely critical in creating higher levels of complexity in systems. At least they appear to be. For instance, one of my critiques of our current operating system is that our society-level operating system is overly driven by the money-on-money return loop.

Jim: Everything in the business world, and frankly in many, many people’s personal worlds, is all about optimizing money-on-money return. And that has produced many of the less-than-desirable characteristics of our era. But that is a real thing. The flow of money is an information-processing modality, which ends up coordinating the behavior of actual atoms. And we’ll get down to the emergentists’ argument about what is top-down causality. But one could argue that a society organized around money-on-money return has top-down causality, and that it requires Mary and Joe to get up at seven o’clock in the morning, drive for an hour, work in a bullshit job for eight hours, and drive an hour home again. So I think this broader concept of what is real, to include complex adaptive systems, gets around this false distinction between dead matter and live systems.

Joscha: The notion of the feedback loop is very old. I suspect that every statecraft that built societies deliberately had to have this notion of feedback regulation in its understanding of nature. I find it already in Aristotle, and in our intellectual traditions throughout the times. We find this notion of the feedback loop in La Mettrie, where he describes that there must be systems of competing springs in the mind that pull and push against each other and keep it in some dynamic balance. So it’s a very classical notion, and it became the core of cybernetics and control theory. And it was a very popular paradigm for a very long time. But I also suspect that there is a little bit of traditional superstition around the first and second generations of dynamical systems theory, especially second-order cybernetics, in the sense that we are tempted to think of these dynamical systems as real.

Joscha: And I suspect that they are just models. They are not real. They are the behavior of too many parts to count, in the limit. When we describe how individual things interact, we can often track them and see low-level processes by which one system changes the evolution of another across a boundary. And if you can no longer do that, because you are looking at trillions of molecules, for instance, you will have to resort to models that look at the statistical dynamics of these too many parts to count. And some of the resulting mathematics has convergent results, and some has not. And the geometry of the world that you’re looking at, when we look at dynamical systems, is typically the stuff that is convergent, where we can make models.

Joscha: So it turns out that Newtonian mechanics is the convergent dynamics of too many parts to count, within certain ranges. It’s not real, because you cannot really make Newtonian mechanics work perfectly from individual parts. It’s only within a certain regime of many, many parts, too many parts to count, that you get something that looks a lot like Newtonian mechanics. And in a different regime, the same is true for Einsteinian mechanics, right? But these systems are not real. It’s just a level of modeling that gives you coherence. So when you are an observer and you zoom into the universe that contains you, and the many, many parts that make up yourself, you will often find layers of description where you can make a coherent model. And these are the ones that you latch onto as description layers. And then we discover that they form a hierarchy, and then we try to establish causal relationships between them. But these causal relationships are not causal relationships that exist in the physical universe.

Joscha: Causality is a model category. It’s a property of the models that we are making. So when we talk about these big conundrums, like the mind-body problem, we’re not talking about how one physical set of things, like bodies, is connected to another, possibly non-physical set of things, minds. What we have to talk about is this: on one end you have one category of model, our body map, which is dynamically arranged in space and articulated via skeletal muscles. And on the other end, you have mental states and mental processes and software states and so on. And these two disparate categories of models, how can we make them congruent?

Jim: Well, they obviously have to operate together. I was going to give an example of what reality and models really mean. We talked earlier about companies, right? They’re virtual. You can’t really put your finger on something and say, that’s the company, right? It’s essentially a standing wave of action and motion, but they’re real in the sense that they have traction in the physical world. And that’s, I think, a reasonable assessment of what is real.

Jim: As an example, think of a coal mining company. A company that digs coal out of the ground. So, in terms of traction in the real world, they dig holes, and they deliver coal to people who turn it into energy. So, there are actual things that are done in the world by this virtual thing called the coal mining company. And, obviously trying to track that at the level of atoms would be ridiculous. Even tracking it at the level of human beings would be exceedingly difficult. Maybe if you had billions of dollars, you could simulate a coal mining company at the level of individual humans.

Jim: But interestingly, at the level of abstraction of accounting, it’s quite simple, right? And people make large bets on the future of one coal mining company versus another, based on the signal of very abstract, very high-level accounting information that comes out. And the result is that company A gets smaller and company B gets bigger, based on people’s assessments of this high-level information. And yet, at the end of the day, we have to say that the mining company is real, because it is having a very significant impact on the real world.

Joscha: But, not everything that has a significant impact on the real world is real in a particular sense, right? You could say that ideas have a significant impact on the real world. And, it’s very often difficult to say what the idea actually is, because it only exists approximately across minds, right?

Joscha: If you think about a political movement, how would you say that the political movement itself is real, if the idea behind the political movement is understood by most people in different ways? But for a company it’s much, much easier, because we have a piece of software, our legal system, that defines the conditions under which the company exists, right? And so you have a criterion by which you can decide whether the company is there or not, and what state it is in. This is because we have created the substrate for the company to run on. It’s similar to what happens in our computers. We have built a deterministic system, with clear rules, that allows us to decide whether a bit is set in a gate or not, or in a register or not. And this allows us to construct extremely precise models of the behavior of the computer and preordain the behavior of the computer. It’s a very specific thing that is probably different from minds, where the state that the mind is in is still somewhat probabilistic.

Jim: Yes, I would say that is true, but it also seems, at least in many cases in nature, that emergence to higher-level structural entities is built from relatively well-defined lower-level units. And when you lack those well-defined lower-level units, it appears to be more difficult for emergent properties to come into being. So, for instance, you make the good point that the fact that our laws are relatively uniform, that they result in currency that has equal value, and that people can only be exploited so far because of the limits of the law, so that at some level they’re almost fungible, may actually be part of the mechanism that facilitates the emergence of the mining company. Perhaps in the same way that neurons, while of course they vary, there are at least a hundred different varieties, are not that different from one another, right?

Jim: They are relatively fungible units of construction, and they give evolution something to work with to produce higher-level emergences. In this case, mind, and we can take mind to be not just human minds, but minds all the way down to wherever minds first came into being on our evolutionary tree. And so I think that’s an interesting and important thought: that we typically have a level of emergence that has relative simplicity at the outer envelope of the component pieces at that level, and that those, combining, allow us to reach the next level of emergence.

Joscha: But we also know that the rules that are implemented in the world, for instance the financial system, need to be rather uniform in order to work. You don’t have large opportunities for arbitrage, right? Except where they have leaky abstractions. And yet, we both know people who got extremely wealthy by specifically looking for the fine print.

Jim: Absolutely. And none of these systems are perfect, right? Biology is subject to attack by viruses, for instance, right? A virus is not actually a biological entity. It’s essentially a flaw in biological systems that is exploited by dead chemistry, in the same sense that arbitrageurs are essentially like a virus operating on business, looking for the flaws in the system. And there will always be such flaws.

Joscha: So, there are two interesting questions. One is, how much fine print is there in the mind? That is, to which degree does the mind not emerge over the activity of neurons, for instance? To which degree is this a simplification? To which degree is it a very good abstraction that should guide all of our thinking, and to which degree not? And there are some people who feel that neurons are not the right level of description at all, and might even have superstitions connected to that. But I think it’s still worth keeping in the back of our minds that there is usually some degree of fine print involved when we make such models, and circumstances under which this is not the entire truth and more interesting things are going on.

Joscha: With respect to viruses, the coronavirus is not a life form. It’s more like a text that the cell cannot help but read. And when the cell reads this text, it’s doomed, because it cannot sandbox the idea that is contained in the text. It will have to turn it into an action. And these viruses exist in society as well, in a way, right? But it’s not as if the cell, or biological life, ever existed for long without these viruses. These viruses have been around since briefly after cells came into being, probably. So the cells already contain a lot of viruses. All the existing cells are the result of many, many interactions that they had with viruses, many of which permanently migrated into the cells that later on divided and became us, right?

Joscha: So the same thing is true for societies. A lot of the ideas that we have are the result of interactions with viruses: infection processes that interacted with the pure host ideas that had formed in the natural convergence state of a virgin mind, then took root in there and formed an immune system to make sure that competing ideas don’t take root in the same mind, and so on.

Jim: Exactly. I’ve often used the term memetic viruses for radical ideas that challenge the status quo. For instance, the scientific revolution, starting around 1700 and probably reaching its pinnacle, at least with respect to the Christian worldview that came before, with Darwin, produced some very virulent memetic viruses that were brought out by individual thinkers and collectives of thinkers. And they put a substantial hit on the preexisting status quo model of the universe. So I think that concept of virus, when broadly construed, makes a lot of sense.

Joscha: There’s of course the question of whether the virus increases the fitness of the individual or the group. And it seems to be quite obvious, if you look at society, that viral evolution, memetic evolution, does not necessarily have to lead to improvements.

Jim: Nor does… Biological ones don’t either, right?

Joscha: In the long run, I think that the things that don’t work out are going to be removed from the playing field. This is how evolution works. But you can observe a temporary breakdown of complexity. So it could be, for instance, that you have a species that spreads over a very large area of your ecosystem and is very homogeneous, and this makes it very susceptible to an infection. And then a large part of the population gets wiped out by a relatively simple attack vector. If you have more diversity across species, you have more resilience against viruses. And the same thing can happen in a society. So if everybody is using the same social media and the same news sources, then a society can have very homogeneous virus infections. And as long as the virus is adaptive, in the sense that it makes the group coordinate better, it can even convey an evolutionary advantage on the group and make the group out-compete other groups.

Joscha: So I wonder to which degree we are the result of such a viral domestication process. That we are basically living in a civilization that has out-competed other civilizations because the people in it were very susceptible to the same mind viruses. And as long as the viruses were accompanied by some kind of church, and an immune system like an inquisition, that would make sure that everybody would be susceptible to the same viruses and not to rogue viruses, right? Then the society is possibly more successful than other societies. And then, if you remove the church and you have all these superstitious people without individual epistemology, without any kind of firewall against rogue ideas that have no chance of being true when seen with bright eyes, this might create a dangerous situation where your society just falls apart, because it splinters off into random cults.

Jim: Yeah. We’re running that experiment right now, right?

Joscha: Possible. It’s also possible that everybody’s just suddenly seeing the light at the same time, right? And, this amazing thing happens that after several thousand years of human evolution, we suddenly got to the point where we have the right moral opinion about everything, that we didn’t have in the 1500s or the 1800s or the 1950s or the 1970s or 1990s. Now we see it.

Jim: Of course, everybody thinks that. They always think that they’re right. Every epoch thinks that it’s right. But I do think that the experiment we’re running, of memetic viruses everywhere, and eliminating many of the quality-control mechanisms that more autocratic regimes have, may destroy us, or may take us in a phase change to a new level. And that’s what some friends of mine and I are working on: can we get to a new level of civilization?

Jim: Well, not all of our answers will be the right ones, but there’ll be a lot more right than the status quo. In particular, learning how to operate within the limits of our ecosystem. The current status quo seems to have no brakes on it. It does not know how to stop. It keeps producing new things, whether they’re actually good for us or not.

Jim: And indeed, many of them are dangerous to the continued existence of the human race [inaudible 00:55:35] nuclear weapons. I just finished reading this weekend a very interesting book by William Perry and another fella reminding us, there are still way too many nuclear weapons out there. And, if there should be just a mistake, it could knock us back to the stone age quite easily, right? Let alone things like CRISPR, or AI risk, et cetera, which we’ll talk about in a little bit.

Jim: Anyway, that’s interesting. But let’s move back to our topic a little bit here, and let’s go all the way back in time, in fact, before Modernism really got started, right at the cusp, with René Descartes and Dualism.

Jim: Dualism seems to be a strong attractor to this very day, right? And place against that a view of… We’ll talk about consciousness here, essentially, or “the mind” or “the spirit” or whatever we want to call it. Descartes, of course, famously believed it was of a different substance than the body or energy or signals. It was literally of a different substance, though it was very unclear how it interacted with the physical universe. While someone like John Searle, whom I have found to be one of the more interesting philosophers of consciousness, argues that, “Nope, consciousness, or mind more broadly, is nothing but an emergent system from biology, very much like digestion.” And like digestion, it comes at a high energetic and a high genetic cost to keep it going. And those two seem to be the poles of our historical thinking, and yet Cartesian Dualism still seems to be with us.

Joscha: I wonder to which degree the generation after Descartes basically simplified his thinking, and that is especially apparent when you look at Occasionalism, the question of how the spheres interact: how is it possible that the mental sphere and the physical sphere can interact when the physical sphere is causally closed? Right, if the mental sphere doesn’t need to be causally closed, it’s possible that something is getting into your dream and messing with it, but how is the dream world interacting with the physical world, if the physical world doesn’t need any kind of external interaction to go on? And there must be a reason why Descartes didn’t see this as a very big problem. Also, what I find when I read his texts is that he is often smarter than people give him credit for.

Joscha: For instance, in the Meditations, he will interact with religion in the same way as somebody in, say, Communism might interact with political dogma. That is, he will make a nod to it and pretend to see no reason to doubt it, but will defend it with implausible arguments, so that everybody who is able to get to a certain point in their own thinking realizes: this is not an argument that is good enough to actually make the point that this person is highly incentivized to make. And you only need to make the next step and understand, oh, maybe Descartes was smart enough to understand this as well, and smart enough to understand that I would possibly be smart enough to understand this too. So now we have gotten this out of the way and understand why he wrote it, right? There is a reading of Descartes that is relatively straightforward, and that is that most substances are mental substances in a way.

Joscha: So res extensa is the thing that, for instance, Jeff Hawkins at Numenta is so obsessed with: this idea that everything that happens in the mind is, in some sense, a representation that maps to a certain region in the same three-dimensional space. It’s your model, right? It’s the space in which our mind models the universe. And this interpretation of Jeff Hawkins is not complete, because our mind also has a lot of content that does not refer to anything in this moving 3D space. So you could say there are two categories of mental content. One is the physics engine that our brain is generating to deal with predicting sensory data across all modalities, and all modalities will be mapped onto that physics engine, right? Everything that you hear and see and touch is mapped onto the same 3D space. And all the other things, this would be your res cogitans; this is not res extensa. So say res extensa is the physics engine that your mind is generating, and res cogitans is everything else. And now you can easily see how they interact: as software.

Jim: Yeah. I want to hop back to a comment you made earlier about the mind not being causally closed. Do we know that? Is it possible the mind is causally closed? It’s just very complex.

Joscha: The question is what kind of causation you observe in the world, and this was an idea that first occurred to me when I was a kid and was playing over Telnet. There was a class of computer games which were called MUDs. They still exist today, but most of them are now graphical, [inaudible 01:00:30] tangent adventures, and many of them were implemented in an object-oriented language that allowed you to create an arbitrary world from text. And so it was very much like a text adventure, but it was a text adventure that was dynamically evolving, and in which people could interact across many computers. So they would log into the same server, and each of them would have a virtual character, an avatar, that would play in that world. And some of the people advanced to the point where they would become wizards, and even gods.

Joscha: And the wizard, the magician, is somebody who has write access to the rules of reality. Somebody who cannot just use the surface layer of reality that a certain mechanical structure, a certain substrate, is producing, but who can go to the substrate beneath that and change the rules by which everybody else has to play. Right, this is what magic is about. And it’s also what magic in the real world is about: somebody who focuses on the way in which other people construct reality and messes directly with that layer. So the people around the witch will have a reality that is open to attacks by witchcraft, by the write access to the attention of people, to the way that people perceive their own relationship to reality and to the witch. That was the reason why the witches met a similar fate under the expansion of Christianity as the Jews did under the expansion of Fascism: it’s basically a competing system of seeing the world that the dominant new vector did not deem to be compatible with its own mode.

Joscha: So it tried to eradicate it. So, witchcraft in these games existed, right? It’s a way to make people perceive reality differently, by changing the rules by which people have to perceive reality and interact with it. And this witchcraft exists in our minds. There are ways in which we can perceive miracles and make other people perceive miracles. And it comes down to creating a mental entity that you can control in the mind of another person, that is changing the other person’s memories and perceptions. And as soon as you notice that you can edit your own memories, and you catch yourself editing your own memories, you notice that the interaction, the causality in your mind, is symbolic. There is stuff going on like: you perform a certain ritual that involves maybe sacrificing a black cat, and as a result, things in the real world change that are not obviously mechanically connected to the sacrifice of the black cat, right? It’s a completely symbolic interaction. The power of symbolic rituals can only be explained, I think, by the fact that our minds are not causally closed there.

Jim: So you’re saying that sacrificing the black cat actually does cause a change in physical reality?

Joscha: No, it must cause a change in the way that you make sense of physical reality, in the way that which you relate to physical reality. The model that you make and the actions that you perform as a result of that change, you regulate in a different way. And as a result, reality will now look different to you.

Jim: That, yes. Okay. That certainly makes sense.

Joscha: So, for instance, you could make a ritual to become, say, the CEO of a company. Imagine a person who is an employee of a company, who finds it difficult to hold on to a job, who is financially struggling and so on. They really don’t know what to do about this, and there is no way they can get out of it. They look into all the rules of reality that exist, and they look into economic theory, and they realize, “I’m a member of the working class; in fact, there’s nothing I can do about this.” Right, and then they meet a magician, and the magician says, “Look, we can do these rituals.” And a lot of people offer this magic as a service: they have this abundance of meditation and expensive retreats and so on. And they basically reprogram you into becoming, say, a glorified parasite, or an entrepreneur, or an investor.

Joscha: And the difference between an employee and an investor is not some magical ritual that has to be performed at birth, or a change in the universe, or a change in the social order. It's a change in how you relate to the world around you. If you can basically change your expectations in such a way that you consider yourself to be a very different system, you can often gravitate to a very different place in society and the economic order, right? And suddenly you have this big house and this big car, and it's not that you are necessarily working longer hours than you did before; you just interact with the outside world in a completely different way.

Jim: You’ve updated your code. I mean it happens all the time.

Joscha: And the same thing happens, say, with relationships. So you want to find the perfect partner, or you want to meet very particular people, and you perform a certain ritual. And that certainly changes the way you interact with reality, and its downstream effects also make other people interact with you in a different way, and suddenly you find yourself in a very different position in the world.

Jim: Yep. That’s true. But I’m not sure about its significance. If we assume that something like a rough distinction between hardware and software, and I understand that there’s actually many layers of software in the mind to update your code and then therefore have a different degree of traction in the world than you did before, it doesn’t strike me as particularly mysterious.

Joscha: You know, if there is only one real layer, it's the layer below quantum mechanics. Everything above that is models, and there are a lot of ways in which we can meddle with these models to get the outcome that we want.

Jim: But so far we have not found any such mechanisms that actually impact the level of physics, right? We cannot change the mass of the electron via witchcraft. We cannot change the spectral characteristics of the elements by sacrificing a black cat.

Joscha: Exactly. So there seems to be a level at which reality is causally closed, at which magic is not possible. This was the point that I was trying to make. This hypothesis, that the world is entirely subject to symbolic magic, falls apart at some point, because there seems to be a layer outside of our minds that you cannot change, where the rules do not change. And the question posed by the magicians, by the people who think that everything is a dream, is whether this is only because we have agreed with each other that certain parts of the dream are immutable, and we cannot defect from that dream, because then we would go insane; from the outside and from the inside, reality falls apart and just descends into chaos.

Jim: And of course you cannot disprove idealism, right? It's unfortunately in that area of "could be true, just seems fucking unlikely to me." And as I said, I put my flag down many years ago as a naive realist: there is a reality out there, magic doesn't work on physical reality, and it's not because we all agreed not to change it; it's because it's just a different thing. It's not a realm in which magic can apply. Magic has no category in the actual physical world. In terms of our symbol space, yes, you can believe in magic, and you might actually think about the world differently. Think about people who go to casinos and believe in luck, for instance, right? And I know many such people. And yet we all know that if you look at games of true chance with large enough N, there ain't no luck. The house always wins by a highly predictable amount.

Jim: In fact, we've even done experiments, I love this one, where they tracked the win and loss records of a group of nuns that went to a casino and a group of ex-convicts that went to a casino, and guess what? They both won and lost at exactly the same rates once N was large enough. So these mind viruses that claim they can manipulate the universe, but can't, are a specific example of what I might call malware that the human brain is very, very susceptible to. Just think of the nonsense that's loose in the world today about COVID-19, for instance. But I do believe that we can use a sharp enough knife and say: this is just not true about reality.

Joscha: So the occultist might say to you, "Jim, you've locked yourself into a reality in which you will never win the lottery, because you have made that commitment in the way in which you constitute your relationship to reality: that you can never beat the odds, that magic is not possible." And it's hard to say whether that's true or false, but when you compare the hypotheses from the outside, you can basically see which one leads to a consistent model of reality. You can, of course, always perform magic. Imagine you run a company and everybody in the company is depressed, because the numbers don't add up and they are pointing towards doom. And then you hire a consultant, and the consultant performs magic: they change the benchmark, and suddenly everything is awesome again, right? So you pay the consultant. And now the question is, what's happening to your company? Did the consultant impose a better model on your company, by which it tracks its performance in a better way and regulates itself in a better way, or did they just cheat?

Joscha: And this is the issue with magic: a lot of magic comes down to cheating. Of course you can edit your memory and your expectations and your interpretations of what happens in between, but it might also change the way you feel. For instance, even if you feel terrible, you can just imagine that you are basically a king who presides over an awesome kingdom, that tomorrow is going to be awesome again, and that this is just a very, very short intermission that does not actually mean anything. And this moment, if you look at it from this perspective, is actually quite bearable, right? From this perspective, you're probably going to be a much happier person. But of course the question is, in the long run, how well do you track reality?

Jim: Yeah, let’s say for instance, you decide that, “I am the king of infinite space and I decide I’m not going to work and I’m not going to do anything.” And then I’ll end up starving to death. Right? So at the end, reality bats last, and again, in terms of public affairs, you can claim that COVID-19 is a hoax, but that doesn’t stop the virus from doing its thing.

Joscha: Yes. Of course you can also do the opposite. For instance, since my early youth, my early teens I think, I stumbled on the same thing as Greta Thunberg did: the limits to growth, and the environmental pollution that was on a one-way trajectory, and the fact that we didn't have regulation mechanisms implemented in our civilization that could make it sustainable before it breaks down. And that seemed to be an obvious thing, right? That we are instigating dynamics that, when unchecked, will lead to the demise of our civilization, and our main defense against that is wishful thinking. And once you realize that, you get depressed, right? It's terrifying. And the same thing is also true when you look at society: you mostly focus on the things that get worse, where institutions decay and people defect from what they should be doing at all levels of responsibility, and everything is constantly breaking down and getting worse.

Joscha: And this was my dominant perspective for most of my life. And I must say the world didn't disappoint, right? There was always enough evidence to support this worldview. So I spent my life being extremely worried. I did the inverse of the king who thinks that what he sees today is just a short intermission of unlikely unpleasant things happening in a life that is overall totally glorious. I basically perceived the world as something that is pretty much miserable, where the past and the future are miserable and the present is quite bearable, but this is an exception that will surely be corrected in the near future. Right? And this is an opposite distortion that is unhealthy, I think.

Jim: On the other hand, I think your analysis is approximately true, as we talked about before: the current status quo seems to be in a runaway state where it is going to run off a cliff.

Joscha: Oh, it totally is, right? 2020 is not an aberration. It's exactly the future that we always expected would start to manifest around 2020.

Jim: Yeah and it’s starting, and it will get worse until we do something about it.

Joscha: But we could have enjoyed the time in between so much more.

Jim: That is true. All right. Well, this is interesting, but it's not quite on the main line of the topics I wanted to go through. Let's move back a little bit more to some of the specifics of cognitive architectures, the nature of cognitive processing, and particularly, I'd love to talk a little bit about what your views are on the gap between humans and other animals.

Jim: You alluded to the fact that elephants have much bigger brains than we do; whales do too, and some dolphins, killer whales I think. But we don't see elephants sitting around philosophizing. And of course, one of the theories, probably the leading theory, is that the difference is that somewhere along the line we added a new class of object into our brains: something like symbols, maybe, or a language of thought, or perhaps a more powerful form of procedural memory that allowed us, for instance, to conceptualize multi-part tools, and maybe that was exapted for language. Something in that space. What are your thoughts, from examining AI and cognitive science, about this? Only 1%, one and a half percent difference in genes between us and a chimp, and yet, seemingly, a giant gap in terms of our cognitive ability.

Joscha: So there is an experiment that would be very interesting to make, and that is: how smart can dogs be? There are obviously extreme differences in the intelligence of dog breeds, right? And typically the small dogs that people like to have in their homes tend to be quite dumb, and the dogs that we use to herd our sheep tend to be very smart. But the dogs that herd our sheep tend to be less controllable, and they're less suitable to keep around, because you need to negotiate the relationship with them at a more fundamental level. They're less domesticated and harder to domesticate. And Homo sapiens also seems to be a domesticated hominid. I sometimes wonder whether the Neanderthals were individually smarter than us, but they didn't have scalable tribes that would scale into states, into societies with numbers of individuals beyond the Dunbar number. And in order to get people to cooperate at scale, you need to domesticate them in such a way that you selectively dumb them down. We dumbed down the epistemology, so they are able to believe the same thing without proof and walk in lockstep.

Jim: It’s interesting. Though, I will point out that the confrontation between homo sapiens, sapiens, and Neanderthal happened when we were still operating below the Dunbar number. That was obviously in our forager stage, at the latest 36,000 years ago. So I’m not sure I buy it with respect to Neanderthal. On the other hand, it’s a fact, that archeologists tell me is true, that if you compare modern man to Cro-Magnon man say 12,000 years ago, at the very end of our forager days, Cro-Magnon man’s brain was 10% larger than ours.

Joscha: Exactly. So what I wonder is whether the evolutionary advantage that allowed us to displace the Neanderthals, to genocide them, which is probably what happened, was coordination.

Jim: Yup, seems reasonable. Cooperation is the human superpower.

Joscha: Yes. And it’s not just that they view cooperation in the sense that you make a choice individually to cooperate with somebody else, which is what cooperation is usually about, it’s that we do this without thinking that we do this automatically.

Jim: That’s interesting. But again, we’re really interested in the line between let’s say, humans and chimps, right, which is much bigger than between Cro-Magnon and Neanderthal. Neanderthal and chimps is gigantic too, just a little bit smaller, perhaps.

Joscha: Yes. So the main issue seems to be the length of childhood, I suspect. The length of our childhood is not so much given by social circumstances. That does play a role, but the main issue seems to be the speed of the maturation of the brain. And what you see is that in ancestral societies, it takes at least 15 years before you are able to forage more than you can eat. And in our society, this period is even longer, right? The time by which a kid can basically earn more than it needs in terms of upkeep is typically longer than 18 years in our society. So it's a very expensive period to maintain, in which you mostly do exploration instead of exploitation. And what we notice is that in this period, it's not just a decision that the individual is making to focus more on exploration. The individual is, in some sense, literally insane.

Joscha: It has an incomplete model of reality. It is an incomplete architecture. And this is not just because it has not learned enough yet; it's like the capstones are missing. The training happens layer by layer, and the human infant spends a longer time than a cat infant to learn basic spatial relationships, and contrasts, and object permanence and so on, and then spends a longer time engaging with social relationships and so on. So you'll find that a house cat can have a better model of the social reality of the family, and of the capabilities and relationships of the individuals in it, than a two-year-old baby will have, right? And that's despite the baby obviously being in some sense much smarter when it comes to spatial reasoning and so on, even at that age, and definitely in terms of using language, because most two-year-olds have some kind of language that far surpasses what a cat can do.

Joscha: And so it seems to me that our ability might be a conjunction of a slightly larger brain and an optimized architecture, but mostly more training data per day, and then we bootstrap our brains so we can make better abstractions. And that could be a very simple genetic switch. You could have a genetic switch that basically delays childhood, makes every phase of it slower, and as a result gives you smarter chimps, at the expense of a longer childhood, which means that the chimps need to have much better environmental circumstances and more benefit from exploiting those circumstances. So I suspect that moving into a temperate zone, where you have a benefit from planning ahead, so you can do agriculture and decide that if you put stuff in the ground now it might sprout, and if you keep a certain fraction of the stuff that you will not eat as seed for the next year, or for years in which you have less vegetation coming; all this planning ahead and so on is going to give a huge benefit to a long childhood, which allows you to generalize over a very, very long time span.

Jim: Interesting. And of course this is just one of these "just so" stories about evolution, so it may not be true, but one of the theories is that once we started standing on two feet, bipedalism, it produced evolution that constricted the opening of the pelvis, which limited the size of the head of the human baby. And hence, while there was seemingly something going on with rapidly increasing brain size (our brains are almost three times the size of a chimp brain, even though we have similar body size; we're a little bigger, but not anywhere close to three X), the constraint was the pelvis arrangement in the bipedal method of getting around. And hence the evolutionary adaptation was to be delivered very, very prematurely, so that we require a much longer time to fully develop our brains.

Jim: Unlike… Actually, the model animal I use in my cognitive science work is a deer, the white-tailed deer, and a white-tailed deer is fairly competent two hours after it's born, right? It can get up, it can walk around, it can find its mother, it can flee from predators, not very well, but at least a little bit, while a two-hour-old human baby can't do a damn thing, right? Because the deer pelvis allows a much larger baby relative to the size of the mother, and they have not been under evolutionary pressure for really large brains either. So maybe that is the causal factor of this very long learning process, which is interesting. It's a very interesting point you make that, however we got there, the fact that we have a very long maturation period would tell us that we have more training cases to run, more layers to build, and more abstractions. That's interesting.

Joscha: Having a large brain is super nice. And you also see that some people who are obviously super smart, like say John Laird and Stephen Wolfram, also have extraordinarily large skulls. So it seems to be possible to have a little more leeway in the human pelvis to get larger skulls out there, and sometimes it also has good results. But you also find people that have brains that are similar in size to, say, a gorilla brain, and these people are not necessarily mentally impaired in any way. They can hold down a job, they often study at university and can be reasonably smart people, right? So the size of the brain is not absolutely everything. There is a certain leeway in how you can use it, and it's probably nice to have a brain that scales up better, but I don't think that brain size by itself is the deciding factor.

Jim: What about the theory that symbols, or language per se, is the bright line?

Joscha: Yeah, I wonder about this. That's a very tempting idea, and it seems that the ability to do grammatical decomposition is something that distinguishes humans from the other apes. So for instance, elephants don't seem to be able to produce new images. They can learn to draw, but what they will apparently do, at least in the instances that I've seen so far, is not generalize. They don't make a portrait, they don't capture a new scene; they will reproduce the same image stroke by stroke, again and again. They can learn to do this, they have extremely good motor control, but there is no obvious generalization and abstraction going into the thing that they draw. It's not a symbolic depiction in the same way as we do it.

Joscha: And if you look at the gorillas that have been raised in environments where they were exposed to human-like stimuli and human-like family structures and so on, they did, in many ways, get to be more similar to humans than many people thought possible, but they also didn't do the grammatical decomposition. So when Koko draws a dog, it looks like a Jackson Pollock; it's basically an arrangement of colors that seems to be related to what she was looking at, but you don't see the decomposition of the dog into limbs and torso and head, and the arrangement of the parts in it. This has not been properly reproduced. It's tempting to think that the gorilla's lack of grammatical decomposition in visual scenes corresponds to its inability to use grammatical language.

Jim: And then maybe the grammatical language, because it's such a compression, right? If we don't have symbols, we don't have something like an at least partially recursive language, and we have to manipulate images only, which is at least one argument about what was in the brain in the prehuman era. The density, and the ability to manipulate images easily, is way less than with symbols. Symbols are tiny, right? The concept of dog, even if I don't have a word for it, even if I don't have written language, a conceptual dog is much smaller than many, many images of many, many different dogs. And so having symbols may just make the brain exponentially more effective.

Joscha: That’s an interesting question. Also, if there is a continuum between human intelligence and ape intelligence rather than a sharp cutoff?

Jim: What I’ve read, it seems to be, it’s not sharp, but there’s a big, big Gulf. As you say that you can teach Coco, to put together very simple linear sentences, but nothing at all like a recursive sentence.

Joscha: Yes. But of course, there are human beings with developmental deficits that have similar cognitive capacity; there are certain syndromes where the brain does not develop in the same way as it does for others. The question is, what exactly are the differences? Are these all pathologies? In some sense they are, because our genetic code is the same for the most part, and there are just local changes in the genetic code, or environmental conditions, that prevented development to the specification that we normally evolve to.

Joscha: And yet, if you look at the differences between human beings, I noticed as a tutor in computer science that the performance that people achieve as programmers can often be predicted very, very early on, depending on the kind of abstractions that they are making in the first few hours, when you confront them with certain ideas.

Joscha: It’s basically a hierarchy of concepts that you could see in computer science, say from variables to loops, to pointers, to functions, to closures. Each of these concepts basically requires more and more inversion and obstruction, more and more pointers that you need to keep stable in the same representation. The more abstract these concepts become, the harder it is to teach them.

Jim: Yep, that is very true. I've found the same thing in my technology career. I hired, oh, I don't know, a thousand software developers perhaps, and I got to be very, very good at it, right? I'm a pretty damn good programmer, though I know many better ones, but I do have probably a better theoretical basis in computer science than most, and I was able to recognize at a very high level, by asking relatively few questions, where they stood on this hierarchy of understanding and the ability to grasp increasingly abstract software development concepts. You're absolutely right. In an hour's conversation, I could predict, with about an 80 or 90% level of confidence, how far this person would go in their career.

Joscha: That’s [inaudible 01:26:57] most of these concepts can be taught. It’s just the amount of time that it takes to teach the concept is very variable.

Jim: Yep. Yep. Realistically, you only have a limited budget for the education period.

Joscha: Realistically, the individual also has a few decades.

Jim: Yup.

Joscha: So if you are able to learn just 5% better than somebody else, this is going to compound. And so I'm totally envious when I look at Stephen Wolfram, who understood things at 22 that I understood in my forties.
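The arithmetic behind that compounding is easy to check. A quick back-of-the-envelope sketch (the 5% figure is from the conversation; the 40-year horizon is my own assumed career length):

```python
# A 5% per-year learning advantage, compounded over a 40-year career.
advantage = 1.05 ** 40
print(round(advantage, 2))   # about 7.04: roughly a sevenfold gap, not 5%
```

Which is why a small per-year edge, sustained over decades, looks like a qualitatively different mind by the end.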

Jim: Yep. I certainly saw that in my dealings with the people at the Santa Fe Institute. In the business world, I was almost always the smartest guy in the room. At the Santa Fe Institute, I am almost always the dumbest guy in the room, or damn close.

Joscha: Yeah. These are great rooms.

Jim: Yeah. Those are great rooms. As I said, that's not quite true; in business I was definitely in the 99th percentile, while in the world of complexity science, on a good day, I'd be at the 25th percentile. And you learn so much so fast, but you also come to appreciate that there are people who just operate at a level of abstraction which I will never be capable of, but that's okay.

Joscha: Yeah. These people make me very happy.

Jim: I’m glad they exist. I like them a lot. And frankly, I spent fair amount of my effort making their life better. So they can do their work. Let’s now switch a little bit, let’s move down some in the stack and of abstraction. In your writings, particularly in the book, you talk a fair amount about cognitive architectures as an approach to, we talked about the very beginning, thinking about the brain through software in ways that may help us understand ourselves.

Jim: And, by the way, maybe do practical things, but at least understand ourselves better. Could you maybe tell our audience a little bit about what cognitive architectures are, in the sense of things like Soar, ACT-R, the PSI model, and how they differ from what we read about in the newspaper all the time, the machine learning, deep neural network approaches?

Joscha: Cognitive architectures are a tradition that mostly originated in psychology, with people who were strongly influenced by the ideas of cybernetics and AI and then decided to get real about this: to take a look at the way our mind is structured, because our mind obviously has a lot of structure, to identify the architecture of our mind, and then to identify the principles that would need to be implemented.

Joscha: I think that most people in the field of AI would agree that there are two directions that we need to look into. One is the general principles of learning and function approximation: when confronted with data, how do you efficiently build a model over the data that allows you to predict future data, act on it, and build control models? The other question is: in which particular way is this organized in the human mind, to give rise to the particular feats that humans are capable of, like learning language, interacting socially, interacting with the environment and with their bodies, and reflecting symbolically over their perceptual representations, such as feelings, and so on.

Joscha: So how to get these two perspectives together is, for me, a very interesting and challenging question. Most of the work that is being done in machine learning is not looking at architectures; the architecture is only instrumental to a certain task, which could be, for instance, text completion.

Joscha: So we think about how to organize structure into layers, and then how to stack the right number of layers together, maybe implement an algorithm that automatically searches for the right number of layers. But we can also see that the brain is not organized into layers. It's organized into regions that have very complex interconnectivity, so it's much more like a city, with a rich set of different ways of transporting information around in it.

Joscha: So there is going to be some street network that is low-level, that you can reach through your immediate neighborhood, but it's quite pedestrian, and it takes longer for the signals to cross large distances. Then there are long-range connections, like a subway. Then there is a general interconnection network that goes via the thalamus and allows information from basically every region in the neocortex to get to every other region, to route information around.

Joscha: So how that works is a very interesting question to me. You could also look at it from the perspective of training a network, in some sense, layer by layer, and as soon as you introduce a new layer, you make it a function of the existing layers. Once that thing is trained, you introduce recurrent links, so the predictions of a later layer in your architecture are going to inform the predictions of the earlier layers and become inputs to them, so they become the context in which the lower layer makes its next prediction.

Joscha: The result is that instead of getting a nice, tidy hierarchy of things, where you have an input and an output, and the input is your sensory apparatus and the output is the highest layer of your attention, it turns out to be fully interconnected, going every way, backwards and forwards. Suddenly your visual cortex is not the first stage of processing; it's just the area where you store the sorted textures.

Jim: And of course, that seems to be how the human mind is structured. When I look at deep learning, which is mostly feed-forward, though they're now adding some simple recursion, I always ask myself: what are they missing by not having these feedback loops?

Joscha: I think everybody has been aware from the start that you want to have recurrences. When you look at the original work, for instance by Hinton and Sejnowski and Ackley on Boltzmann machines, you already have a very, very general form of a model, one that understands that a model is a set of parameters that constrain each other, and each constraint is a computable function that says: if a parameter has this and this value, here is the influence it has on all the other parameters in the model.

Joscha: Then you can say that the deviation from these constraints is energy, and you try to minimize the energy. So it's very similar to a spin glass model in physics, where you try to minimize the global energy state of the system, and when you achieve that, the system converges to an interpretation of reality. And in some sense, theoretically, this works very well as a model; it's pretty close to optimal, but it's impossible to train.

Joscha: It turns out that the search space for these variables, and the constraints between them, is so dramatically large that it's basically not trainable beyond a few parameters. And so Hinton introduced a constraint on this. He said that instead of having all these lateral links between the parameters, all these hidden links, we only link them in a forward manner, and between the parameters within a layer we don't have links, in what's called a Restricted Boltzmann Machine, an RBM.
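As a sketch of what "restricted" buys you, the standard RBM energy function has only visible-to-hidden terms and biases, with no links inside either layer. The numbers below are made up for illustration; this is a textbook formula, not code from the episode or any particular library:

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    # Standard RBM energy: E(v, h) = -a.v - b.h - v.W.h
    # Note what is missing: no v-v and no h-h terms. The "restricted"
    # constraint removes all links within a layer, which is what makes
    # each machine in a stack tractable to train on its own.
    return -(a @ v) - (b @ h) - (v @ W @ h)

v = np.array([1.0, 0.0])        # visible units
h = np.array([1.0])             # hidden unit
W = np.array([[2.0], [3.0]])    # visible-to-hidden weights
a = np.array([0.5, 0.5])        # visible biases
b = np.array([1.0])             # hidden bias
print(rbm_energy(v, h, W, a, b))   # -(0.5) - (1.0) - (2.0) = -3.5
```

In the full Boltzmann machine, additional v-v and h-h weight matrices would appear in that sum, which is exactly what blows up the search space Joscha mentions.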

Joscha: And of course, suddenly this thing cannot model many things anymore. The solution to that was to string many of these RBMs together in a network, so each of them is individually trainable, even though it's limited, and overall the stack produces behavior that the individual machines could not. This eventually led, via a few steps, to our current deep learning architectures, but it's not optimal. The search space is too large.

Joscha: There are too many model states. The ideal model would be so tight that every model state corresponds to a possible state in reality. Most neural networks have many magnitudes more possible model states, which gives rise to adversarial examples and limits generative creativity, because most of the states that the system can be in do not correspond to a real-world state.

Joscha: So for me, the thing that a lot of people don't pay enough attention to is: what is the transformer model achieving? It seems to be a way to think about embeddings into a space of features on a more general basis, so it gets back to this original notion of the Boltzmann machine, in a different projection, from a different perspective. And there is this notion of attention, and self-attention, which binds features together across the dimensions into a relational graph.
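That binding mechanism can be sketched in a few lines of NumPy. This is a bare-bones, single-head version of scaled dot-product self-attention, leaving out multiple heads, positional encoding, and how the projection matrices are learned: every token scores every other token, and the resulting weight matrix is a concrete instance of the "relational graph" over the sequence.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model); each row is one token's feature vector.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # all-pairs affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    # weights[i, j] is how strongly token i binds to token j:
    # a dense relational graph across the whole sequence.
    return weights @ V, weights
```

Because every token attends to every position in a single step, a pronoun late in the sequence can bind directly to a name at the start, with nothing decaying through intermediate recurrent steps, which is the long-range coreference behavior discussed next.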

Joscha: This allows you, for instance, to generate a text in which a noun and a pronoun are associated over a very large distance. The initial part of the text mentions by name the person who performs a scientific experiment, and a later part of the text just refers to this person as the scientist, or the researcher, uses these as synonyms, and understands in some sense, or represents, that all these entities refer to the same concept in the text.

Joscha: This is something that was very hard to achieve in previous neural network implementations of language. And it's striking that this doesn't only work for text; it also works for images. So you can train this on images: you can feed it the first few lines of an image, and it's going to continue the image, which implies that internally, while predicting the next piece, it's building a representation of the entire image, all the way to the end.

Joscha: That’s a tremendous achievement. It seems to open the door to embeddings in general across all modalities. What happens if you are not just modeling the perception of a system like this, but also its decisions? Is there a difference between making a decision and predicting your decision? It’s probably the same thing, just from a different perspective.

Joscha: There are still going to be some differences in terms of the way we predict reality, because we do not predict reality just from the past; we also predict reality from the perspective of the future that we want to have, so we limit our search space to certain results that we want to have achieved in the end. This is something that the way we currently access the GPT-2 class or transformer class of models does not do, but it’s nothing that is inherent to the way these models are constructed. It’s just inherent in the way we’re currently using them.
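One naive way to add the goal-directedness Joscha says is missing from how we currently use these models is to keep a purely forward predictor but filter its rollouts by the desired end state. This is a hypothetical rejection-sampling sketch over a toy random-walk predictor; real goal-conditioned planning would be far more efficient:

```python
import random

def sample_path(step, start, n, rng):
    """Roll a forward predictor out for n steps."""
    path = [start]
    for _ in range(n):
        path.append(step(path[-1], rng))
    return path

def goal_directed(step, start, n, goal, tries=1000, seed=0):
    """Limit the search space to trajectories that end where we want:
    sample forward rollouts, keep only those whose final state satisfies
    the goal predicate."""
    rng = random.Random(seed)
    for _ in range(tries):
        path = sample_path(step, start, n, rng)
        if goal(path[-1]):
            return path
    return None  # no accepted rollout within the budget

# toy forward predictor: a +/-1 random walk
step = lambda x, rng: x + rng.choice([-1, 1])
path = goal_directed(step, 0, 10, goal=lambda x: x == 4)
```

The predictor itself is unchanged; only the way we use it becomes goal-conditioned, which mirrors his point that the limitation is in usage, not architecture.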

Jim: That’s very, very interesting about where we may be able to go in using transformer-based architectures to get the ability to do things at long range.

Joscha: As you know, the transformer architecture, if you look at it, is a very simple idea. Of course, a very smart, simple idea that is then scaled up to see how far it can go. There’s no obvious limit that we have hit yet to how far it can go, which is in some sense terrifying, because it’s so simple, and you immediately wonder what the optimizations are that we will quite inevitably discover in the next few years.

Jim: Yep. That’d be very interesting. We’ll have to keep an eye on it. We’re getting long on time here; we’ve gone over our time limit, but that’s okay. I’d like to drill down a little further into the details of some of your own work. Let’s maybe give it another 10 minutes if we can, and talk a little bit about the PSI theory. Dietrich Dörner, is that the name of the guy who came up with it? And you wrote the very interesting book on the topic.

Jim: One thing I found interesting about it was that it was connectionist, but the elements in the connectionist architecture, while they all use the same architecture, could be at varying levels of abstraction from quite high to quite low, and the system self-organizes into hierarchies and all that stuff. So anyway, could you explain how all that works?

Joscha: Dietrich Dörner is a German psychologist, a cybernetician strongly influenced by these ideas in the 1960s. There’s [inaudible 01:38:41], who became a good friend of his, and at some point their directions diverged, even though they remained friends over the years until Lem died. Lem decided that the biggest influence he could have on the development of artificial intelligence and cybernetics would be to become a philosophical science fiction author, because he would be free of the constraints of academia and of actually getting things to work, and could instead anchor ideas in the minds of people, which would have a larger influence than writing a few papers about systems that he would not be able to get to work in the next decade or two. Dörner was more optimistic. He thought that behavioral psychology, which was all the rage at the time, was not cutting it.

Joscha: Instead, we need to do cybernetic and computational psychology and just implement a model of how people work, and then we’ll be done. He originally thought that he’d be done in the late 1970s, and told his wife that they would have an awesome time on the beach after that, because their job would be finished; we would have computers that think and solve all our problems for us.

Joscha: And of course, that didn’t quite work out, but mostly on his own he reinvented or invented [inaudible 01:39:52] many of the ideas that AI was into. He started with monolithic systems that later became situated, connected to an environment, and then this environment could be changed by the system, so it became an agent architecture. Then he invented multi-agent architectures, and all the while these architectures had models of autonomous motivation in them that were based on the cybernetic idea of a feedback loop that would regulate it all.
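The cybernetic feedback loop behind that motivation model can be sketched in a few lines: a drive is a regulated variable whose deviation from a set point produces an urge. This is a loose illustration, not Dörner's actual implementation; the resource name and decay rate are made up:

```python
def urge(value, setpoint):
    """Homeostatic drive signal: the urge is the deviation of a
    regulated variable (say, energy) from its target value."""
    return setpoint - value

class Drive:
    """A minimal cybernetic feedback loop: the resource decays every
    tick, and the growing urge is the signal that motivates action."""
    def __init__(self, setpoint=1.0, decay=0.1):
        self.value = self.setpoint = setpoint
        self.decay = decay

    def tick(self, replenish=0.0):
        self.value += replenish - self.decay
        return urge(self.value, self.setpoint)

d = Drive()
signals = [d.tick() for _ in range(6)]  # urge grows as the resource decays
```

In an agent architecture, action selection would consume these urge signals, and successful actions would feed back as replenishment, closing the loop.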

Joscha: What I liked about his work was that he was extremely serious about building minds, and his heart seemed to be in the right spot, and his ideas also seemed to be in the right spot, so I started reading his ideas in the 1980s. Most psychologists ignored him because it was theoretical psychology. For a long time he held the only chair of theoretical psychology in Germany.

Joscha: There was no such thing as theoretical psychology, which basically tried to bridge between AI ideas and psychology, while he was mostly unaware of the discourse taking place in AI. He would sometimes read something about it, but always came up with his own solutions. In the first interview that I read with him, a journalist of the German magazine Spiegel, which is the equivalent of Time in the U.S., asked him why he would claim that these systems have true emotions.

Joscha: Wouldn’t everybody understand that it’s impossible for a computer to have emotions? And Dörner replied very earnestly that it really depends on your definition of emotion, and that if you have a definition of emotion that doesn’t have an extension that you can understand, you probably don’t know what you’re talking about. He went on to try to define emotion, and then tried to explain why his systems would have emotion in this sense. I agreed with him, but I thought that this notion of emotion does not capture everything that emotion captures for me when I define emotion.

Joscha: It seemed to me that there is a trajectory along which we can make this definition richer and extend it, and then implement all these missing things until we all agree, so I thought this is probably the way to go. I started reading all his stuff and then decided to systematize it and translate it into something that could actually be implemented. Because there was less [inaudible 01:42:12] than him, I took his PSI theory, [inaudible 01:42:14] this letter that psychologists love to use when they make a theory of everything, and translated it to MicroPSI.

Joscha: It was my humble attempt as a computer scientist to get some of the concepts to actually work. I spent almost a decade with that. This book, Principles of Synthetic Intelligence, is an attempt to turn PSI into an acronym for a book title, of course, and to systematize his work and make it accessible to people in cognitive science and artificial intelligence.

Joscha: So, this is what the first part of the book is doing: summarizing Dörner’s ideas and systematizing them, contrasting them with ideas that were around in the field, and comparing them to related work. The second part of the book is implementing these ideas. The third part of the book is critiquing them and explaining where I think we need to go beyond them. This is a snapshot of my understanding back then, and my thinking has since evolved a great deal in many areas and moved on. It’s not that I now think these are bad ideas. It’s just that this would be the first third of the next book that I would write, if I find the time.

Jim: Okay. Interesting. What is the status of the MicroPSI project? I saw there was a MicroPSI 1, which ran only under Windows, and then there was a MicroPSI 2 that was written in Python, but when I looked at the GitHub project it looked like there hadn’t been any updates in four or five years. Is anybody working on it? Is it being used at this point?

Joscha: Yes. So with the first MicroPSI, we always tried to be platform independent, because the people that I worked with had Macs, they had Linux, they had Windows, and we wanted to make this accessible to anyone. So, the first one was done in Java, and it was written directly as [inaudible 00:19:59]. It was using all the typical things that you would use in 2003, so lots of XML and lots of factories.

Joscha: I think that back then it was respectable software engineering, but it was very different from what people did five years later, when everything was going to be [inaudible 01:44:16] and you would put your UI in the browser. So, the way to make a platform-independent implementation of a research cognitive architecture was different when we did the second edition. And the second edition has been used in two startups.

Joscha: One, an AI planning startup, went defunct. The other one was started later by students of mine, and I am also a co-founder of it. It’s called Micropsi Industries, and it uses the MicroPSI architecture as a framework to implement control networks, mostly for industrial robots. It also does a little bit of basic research, but most of the work is done in house. So while we still host the architecture for people that want to use it, and some people use it and play around with it, the main evolution takes place within the company, for proprietary projects in Berlin. At the moment I am using MicroPSI internally for trying out certain models that I have about using spreading activation networks to produce procedural scripts that interact with a cellular automaton representation and processing. We’re also thinking about what the next edition of MicroPSI is going to be and how it’s going to be implemented. We have some ideas about this, but it’s too early to talk about them.
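For readers unfamiliar with the term, a spreading activation network can be sketched very simply: activation injected at one node flows along links to related nodes, decaying as it spreads. This is a generic illustration with made-up nodes, not MicroPSI's actual representation:

```python
def spread(graph, activation, decay=0.5, steps=1):
    """One or more rounds of spreading activation: each step, every
    node passes a decayed share of its activation to its neighbors
    (here nodes also retain their own activation)."""
    act = {n: activation.get(n, 0.0) for n in graph}
    for _ in range(steps):
        nxt = dict(act)
        for node, neighbors in graph.items():
            share = act[node] * decay / len(neighbors) if neighbors else 0.0
            for nb in neighbors:
                nxt[nb] += share
        act = nxt
    return act

graph = {"robot": ["gripper", "motor"], "gripper": ["motor"], "motor": []}
act = spread(graph, {"robot": 1.0})
print(act)  # → {'robot': 1.0, 'gripper': 0.25, 'motor': 0.25}
```

Running more steps lets activation reach indirectly connected nodes, which is what makes such networks useful for retrieving related concepts or assembling action sequences.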

Jim: Ah, okay. I was going to see if I could get you to talk about what MicroPSI 3 might look like. I’m going to throw out some ideas, which I’m also always throwing at Ben Goertzel about OpenCog, which is: I hope when you do it, you think big. So many of these cognitive architectures or AI platforms were unfortunately conceptualized to only run on a single machine or a small cluster. One of the things we now have is really cheap computation and really fast networks. If I were designing a platform for thinking machines, I would look carefully at some of the Apache large-scale, big-cluster, high-throughput platforms like Ignite, Flink, and Spark. These have two big advantages. One, they operate at massive scale. And secondly, they get the implementor out of having to write a whole lot of low-level stuff, so you can leverage your manpower on the higher levels of the actual value add, rather than trying to optimize how you move pointers and things of that sort. These guys have already figured that out.

Joscha: Exactly. I also wonder whether we should stop thinking so much in terms of a single system and think more across systems. So instead of thinking about how we can make a representation that is completely homogeneous, think about how different parts of the architecture implement general principles that allow them to learn how to interact with all the other parts. If that happens, then instead of reinventing linear algebra on your GPU, which then reinvents how to render graphics on the GPU using linear algebra, you can, for instance, use existing shader programs and graphics engines and learn how to use them instead.

Jim: Well, I think that’s right. And again, there is no free lunch. People say, “Oh yeah, well, you have Ignite, a gigantic key-value store.” Well, guess what: on distributed architectures there is no free lunch, right? Queries that have to traverse the network have very different costs than ones that don’t. So having a system that can self-organize to take advantage of the realities of its network is probably the real secret to maximizing these super powerful, gigantic-scale tools. Using them naively, you can quite easily produce degenerate queries where, yes, you have 100,000 processors, but guess what?
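Jim's point about degenerate queries can be made concrete with a toy partitioner: hashing the full key scatters related records across nodes, so a query over one entity must traverse the network, while hashing only a grouping prefix colocates them on one node. The key format and four-node cluster here are hypothetical, purely for illustration:

```python
def node_for(key, n_nodes=4):
    """Naive placement: hash the full key, scattering related records."""
    return hash(key) % n_nodes

def node_for_colocated(key, n_nodes=4):
    """Affinity placement: hash only the logical grouping prefix, so all
    records of one entity land on the same node."""
    prefix = key.split(":", 1)[0]
    return hash(prefix) % n_nodes

keys = [f"user42:order{i}" for i in range(8)]
naive = {node_for(k) for k in keys}            # usually spans several nodes
coloc = {node_for_colocated(k) for k in keys}  # always exactly one node
```

A query for all of user42's orders touches `len(coloc)` nodes under affinity placement versus `len(naive)` under naive placement, which is the difference between a local scan and a network-wide scatter-gather.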

Jim: It’s still slow, because the data is all over the place, which makes any single query inefficient. So these things are by no means panaceas. But if I were going to go down that road, I would be thinking of these very large-scale data processing architectures, rather than writing something to run on a single machine or a small cluster. Well, this has been a very interesting conversation. We went a little bit longer than I was thinking, and we didn’t get to some of the topics I had on my list, but that’s okay. I think our people will find this to be a quite interesting deep dive into the mind of Joscha Bach!

Joscha: Thank you. I really enjoyed our conversation, and I’m sure we have more topics left for a future time. Thank you very much.

Production services and audio editing by Jared Janes Consulting, Music by Tom Muller at