Transcript of Episode 25 – Gary Marcus on Rebooting AI

The following is an automated transcript that has not been reviewed or revised by The Jim Rutt Show or by Gary Marcus. Please check with us before using any quotations from this transcript. Thank you.

[00:00:00] Howdy. This is Jim Rutt and this is the Jim Rutt Show.

[00:00:11] Listeners have asked us to provide pointers to some of the resources we talk about on the show. We now have links to books and articles referenced in recent podcasts available on our Web site. We also offer full transcripts. Go to jimruttshow.com. That’s jimruttshow.com.

[00:00:32] Today’s guest is Gary Marcus, author of the recent book Rebooting A.I. I’m thrilled to be here, Jim. Great to have you.

[00:00:39] Gary is the founder and CEO of Robust A.I., a new company with the goal to make robots smart, collaborative, robust, safe, flexible and genuinely autonomous.

[00:00:50] Previously he was the founder and CEO of Geometric Intelligence, a machine learning company acquired by Uber in 2016. He’s the author of five books, including one of my faves from the past, which I had no idea was his, called Guitar Zero, which both my wife and I love. If you like music and you like cognitive science, go read Guitar Zero, but read the new one, Rebooting A.I., first.

[00:01:14] He is also a cognitive psychologist who’s published extensively in fields ranging from human and animal behavior to neuroscience, genetics, linguistics, evolutionary psychology and artificial intelligence. According to his Web site, he’s perhaps the youngest ever professor emeritus at NYU. I should also note that he attended Hampshire College, as did an earlier guest on the show, physicist Lee Smolin. There must be something interesting going on there.

[00:01:42] They taught us to speak truth to power at Hampshire College, and that’s what Lee and I have in common. We are both going after some very popular views and a lot of people don’t like that. But we speak the truth as we see it.

[00:01:54] I like that, and I can see exactly why that template holds, because both you and Lee are challenging the conventional wisdom in a major way, and neither of you is afraid to do it.

[00:02:04] I mean, it doesn’t win us a lot of friends in certain quarters. But, you know, science doesn’t progress by people being chummy with each other. It progresses by recognizing limitations and taking the next step.

[00:02:15] Absolutely. A significant part of your new book is an attempt to be more cautious and realistic about the short-term prospects for A.I. without denying its long-term potential. On what I might call the oversold side, you lay out a three-part description of what you think a big part of the problem is. You call it the A.I. chasm. Could you take us through that?

[00:02:36] Well, there are multiple parts to it. I think we could start with the gullibility gap. The gullibility gap is that we are very prone to treating small signals of intelligent behavior as if there’s more intelligence there than there really is. The classic example is Eliza, which was just a dumb keyword-matching thing that people would talk to as if it were a psychiatrist, and some people thought it really was a psychiatrist.

[00:03:01] So it would match things. You would say, I’m having trouble with my girlfriend, and it would say, well, tell me about your family. And people thought that was really smart, but it didn’t actually know what a girlfriend was. It didn’t know what a relationship was. It didn’t know what a family was. It was just parroting those things back. It might as well have been a parrot, and lots of people thought that it was smart. So they’re gullible. They see a small sample of what looks like intelligence, and our brains didn’t evolve to say, hey, that looks intelligent but it’s really just a machine, so we get suckered in. Another example of that is there was a guy who was so trusting of his Tesla after it had driven him around for a few hours, a few days or whatever it was, that he thought it’d be okay to watch Harry Potter while the car drove. And we all know what happened to that person, right? He watched Harry Potter while his car drove for him, and it ran into a tractor trailer that took a left turn on the highway, and he was killed. So, you know, it can be really dangerous to trust machines too much. But we have a strong tendency to do that.

[00:03:59] The second thing that we talk about is the illusory progress gap: you see a little bit of progress on some piece of a problem, and you think that means we’ve solved the whole problem.

[00:04:10] So you get some system that does a tiny bit of language, like it recognizes your request to turn off the lights, "Alexa, turn off the lights," and you suddenly think, wow, A.I. has solved the understanding problem. Or you get a system that can interpret a few of your requests on Google and you think that Google must understand natural language. Of course, Google doesn’t understand natural language. It understands a small fragment of natural language, but it’s very easy to fool it, and it can’t actually go and integrate all the information from the web. It can just put together a bunch of web pages that match keywords, or use synonyms for keywords, and things like that. But just because Google can do a piece of it doesn’t mean you can actually have a conversation with it. So we’re very prone to treating small steps as if they mean more than they do. The truth is that A.I. is really, really hard. We’ve been working on it for 60 years, and we don’t have machines that understand the world, that can build what I would call a discourse representation of the things that we’re talking about and interpret it. That just doesn’t exist yet. And yet people write newspaper stories as if the latest little bit of progress were huge progress. Another example was when Microsoft beat this thing called SQuAD (or you could argue about whether they did, but we’ll say for present purposes that they beat SQuAD), which was a test of underlining answers in text as you read passages. And that sounds a bit like reading. In fact, it involves a bit of reading. But just because you can underline what’s in the text doesn’t mean that you can make inferences that go beyond what’s written, even though those are obvious to any person. And so the media accounts were like, humans are going to be replaced by robots because machines can read as well as humans.
But any human, even a 9-year-old, can read things that aren’t explicitly stated and infer the connective tissue between the things that are stated and the ones that aren’t. And no writer worth their salt will tell you absolutely everything about everything, because it would be incredibly tedious. So things are left out.

[00:06:01] You don’t tell people that if you drop something, it hits the ground, because people know that if it’s dropped, it will hit the ground. But the machine doesn’t realize that if it’s not said explicitly in the text.

[00:06:11] Absolutely.
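The gap described above can be made concrete with a toy sketch. This is my own illustration, not the actual SQuAD systems: an extractive reader can only hand back text that literally appears in the passage, so a question whose answer requires an unstated inference comes up empty.

```python
# Toy "extractive" reader: it can only return spans that are literally
# in the passage, which is roughly the SQuAD setting discussed above.
passage = "Maria picked up the glass. It slipped from her hand."

def extract_span(question_keywords, passage):
    """Return the passage sentence with the most keyword overlap, or None."""
    best, best_score = None, 0
    for sentence in passage.split(". "):
        score = sum(1 for w in question_keywords if w.lower() in sentence.lower())
        if score > best_score:
            best, best_score = sentence, score
    return best

# Stated explicitly in the text, so extraction finds it:
print(extract_span(["picked", "glass"], passage))    # -> "Maria picked up the glass"
# Requires inferring that the glass presumably broke; nothing to extract:
print(extract_span(["broke", "shattered"], passage))  # -> None
```

A human reader supplies the unstated consequence automatically; the extractive system has no mechanism for it at all.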

[00:06:12] The third part of your A.I. chasm you called the robustness gap. What the robustness gap is about is that we have a lot of systems that work 60 percent of the time or 70 percent of the time, but don’t work 100 percent or even close to 100 percent of the time. And that’s fine for some things. If a system is recommending books, and you liked my Guitar Zero book but you don’t like my new book, well, you know, I’m sorry you wasted a little money, but nobody dies. But if you have a driverless car system that works 90 percent of the time, and one in ten times it goes out and has an accident, that would be a disaster. So there are many systems out there that work a little bit. I was just having an argument on Twitter with someone about a math system that’s, depending on how you count, between fifty and seventy-five percent right. Well, it’s idiotic to use a system that is 75 percent right when you’ve got a calculator that’s 100 percent right. If you’re only seventy-five percent right, you’re not really that robust. And if you’re at 50, I don’t even know why you’re pretending that you can do math at all.

[00:07:05] That makes the point well. The robustness question is obviously hugely important, and it speaks to what

[00:07:13] I think within the popular imagination is the most significant, salient application of A.I., which is self-driving cars. I looked this up, and it’s useful to keep in mind the number of miles you’d have to drive to kill somebody with a human driver, whether a drunk one, a sleeping one, or the one who just had an argument with his girlfriend: it’s 10 million miles.

[00:07:35] I think the average is one in one hundred thirty-four million. I mean, of course, it can happen your first day of driving, but I think it’s on average one in a hundred and thirty-four million. But even if it were one in 10 million, that would still be much, much more reliable than your smartest driverless car, which is probably Waymo. You know, looking at the published statistics, what those statistics say is that driverless cars need an intervention about once in every 10,000 miles. That’s, you know, many orders of magnitude away from the human safety statistics for fatalities.

[00:08:07] Yeah, I remember at the time I looked this up, the total number of miles driven by Waymo was less than 10 million. No wonder it hadn’t killed anybody.

[00:08:15] Right. You’re making a good second point, which is that we don’t even have the right number of miles to even begin to think that we’ve achieved reliability. So if you had a million accident-free miles, that still wouldn’t be anywhere near a big enough sample to establish with certainty that you were as safe as or safer than people. And I was making a separate point, which is that the reality is these cars need human help once every ten thousand miles on average. That’s the best in breed. And that’s just not good enough for what we call level 5 autonomy, where it picks you up at point A and takes you to point B the way that Uber works. So, you know, people sometimes like to make fun of Uber drivers, but Uber drivers are way safer, way more reliable, and require way less help. Human-driven Uber cars do better than, you know, your average driverless car.

[00:09:11] I did a quick lookup, and we’re both wrong about the actual human kill rate, which is 1.25 deaths per 100 million vehicle miles. That would be, what, about one in every eighty million miles, something like that.

[00:09:25] But that’s probably still a bigger number than all the miles driven by all the self-driving cars.

[00:09:32] Will you send me a link? Because I have seen other numbers.

[00:09:34] Yeah, this one’s from the web. It’s from Wikipedia, so maybe it isn’t that good, but it’s quoting the National Safety Council, using methodology that differs but is in the same ballpark.

[00:09:45] A hundred million, plus or minus, is what I usually hear.

[00:09:49] Yeah. And I doubt it. I don’t know the current number, but I wonder if the total number of miles driven by all the self-driving cars not under human intervention amounts to 100 million yet. No.
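For what it’s worth, the arithmetic the two speakers are converging on is easy to check. A quick sketch using the figures quoted above (1.25 deaths per 100 million vehicle miles, and roughly one intervention per 10,000 miles for the best driverless systems):

```python
# Human fatality rate quoted from the National Safety Council figure above.
deaths_per_100m_miles = 1.25
miles_per_death = 100_000_000 / deaths_per_100m_miles
print(f"Human drivers: one fatality per {miles_per_death:,.0f} miles")  # 80,000,000

# Best-in-breed driverless cars reportedly need a human intervention
# roughly once per 10,000 miles.
miles_per_intervention = 10_000
gap = miles_per_death / miles_per_intervention
print(f"Interventions come {gap:,.0f}x more often than human fatalities")  # 8,000x
```

So even granting the intervention-to-fatality comparison (an intervention is not a crash), the gap is about four orders of magnitude, which matches Marcus’s "many orders of magnitude" remark.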

[00:10:00] Another way you’ve stated this problem is that, in a word, current A.I. is narrow. It works for the particular tasks it’s programmed for.

[00:10:09] But when it encounters something that’s even a little bit different, it doesn’t know what to do. My friend Birdsall likes to give the example of AlphaGo: as amazing as it was,

[00:10:19] trained on a 19-by-19 board, if you ran it on an 18-by-18 board, it wouldn’t play very well. And if you asked it to play checkers, it couldn’t play at all.

[00:10:26] That’s right. We use that example of playing on a different board in Rebooting A.I. as well, and we refine it: AlphaGo doesn’t, for example, even know that there is a board with stones on it. It doesn’t actually have the ability to recognize stones in different lighting conditions the way any player would. It doesn’t know how to pick up the stones. It doesn’t understand that stones on a board are a metaphor for territory in human battles or whatever. It really just knows about that grid of that size, and that’s all it knows about. And you could retrain it to play on 18 by 18, but you’d need it to play another 30 million games.

[00:11:02] Absolutely. We’ll get to the 30 million a little later, which I consider to be a big hole in the current grand approaches.

[00:11:08] Getting back to where you were first going, you know, it seems to me that an awful lot of the difference between current A.I.s and how humans or even our dogs deal with the world is that we have context and world models.

[00:11:22] Here’s a quote from your book: Even a young child encountering a cheese grater for the first time can figure out why it has holes with sharp edges, which parts allow cheese to drop through, which parts to grasp with your fingers, and so on.

[00:11:36] But no existing A.I. can properly understand how the shape of an object is related to its function.

[00:11:42] All that’s got to be driven by world models, you know: psychology, physics, all kinds of stuff. Does that make sense to you?

[00:11:49] I mean, that’s exactly what we’re arguing: the current neural networks learn a lot about statistics, but they don’t really build a model of the world. So you can, well, you can’t have a conversation with it, but you can test the language abilities of this thing called GPT-2, which is one of the most powerful language models right now, at talktotransformer.com. And if you type in prompts of the form, here’s a sentence and then you fill in the blank, you get really bizarre answers that show that even though it knows the statistics of what’s going on, it doesn’t actually understand what’s going on. Let me see if I can quickly pull up an example of the kind of thing I was playing around with the other day. "A water bottle breaks and all the water comes out, leaving…" And, you know, what kind of answer do you expect? Like, well, if all the water is out, you should have roughly zero. But in one example it says roughly two hundred gallons of water. No water bottle in the real world leaves two hundred gallons of water after it’s broken. Or here’s another example from the same prompt, "a water bottle breaks and all the water comes out, leaving roughly," dot, dot, dot, and it says six to eight drops of beer. Well, that’s funny, but it shows that the system doesn’t actually understand that when the water comes out of the bottle, you basically have an empty bottle. And there are just millions of examples of this sort.

[00:13:03] And that’s frankly what I would expect from something like deep learning, which is essentially a very large structure for capturing statistical regularities. And that’s something different from what we might call artificial general intelligence.

[00:13:18] Maybe even before we get to artificial general intelligence, just to emphasize your last point: a first step toward artificial intelligence is that you see what’s going on in the world with your sensors, which might be cameras, or people type things in, whatever, and what you’re trying to do is build a model of what’s going on right now. So your listeners, for example, have a model that there are two people talking to each other. They may not know exactly where we are, but if you’re a listener, in your model you might, for example, think that they’re both English speakers, they’re both knowledgeable about machine learning, they’re both opinionated. So you’re building up some view of what’s going on, and then you use that in order to interpret the next thing that goes on. Well, that cumulative process, the model-building process, is one of the fundamental things that just hasn’t been solved yet, and that deep learning in particular doesn’t really solve. It evades it by having large amounts of statistics, and sometimes that tricks people, but that also leads to the lack of robustness that we’re talking about. I would argue you can’t possibly get to a general intelligence if you don’t have that process of cumulative model building as one of the core things that you do.

[00:14:24] That seems very reasonable to me.

[00:14:26] It seems reasonable to you, but it’s been very hard for me to get the field to engage with it. You have a lot of people who are working on deep learning. They’re extremely well paid. They’re excited about what they’re doing. And they’re not really familiar with the literature on cognitive psychology, a lot of which is about cognitive model building, and people just don’t really appreciate that point. So, you know, Geoff Hinton is saying you shouldn’t listen to Gary Marcus because he thinks the neural networks don’t understand; look at Google Translate, it understands. And the thing is that Google Translate does not have a deep understanding of what’s going on. It does not build up the kind of model that we’re talking about. You could feed it a passage from Harry Potter, and it might translate it by looking at the statistics of how this phrase connects to this other phrase. But if you asked it at the end what happened in this scene, it doesn’t know where to start, because it’s not building up an interpretation of this person wound up in this room and they had a magic wand or whatever it is that they did. It’s just doing something like the old Searle Chinese Room thing, matching little pieces together. It’s not constructing cumulative models of the narrative that it’s interpreting, in just the way that we were talking about. And the field does not want to engage with that issue, and so it’s a little bit delusional about how much progress has been made. It goes back to the illusory progress thing: just because you can do machine translation doesn’t mean that you have a system that can interpret a narrative.

[00:15:55] Probably the reason it seems reasonable to me is that since 2014 I’ve spent a considerable amount of time reading the cognitive science and cognitive neuroscience literature, particularly focusing on the intersection between consciousness and cognition.

[00:16:10] So I have a relatively rich understanding of these things, and I use that as my existence proof when thinking about artificial intelligence: I compare other approaches to it, so I can see where they’re falling short.

[00:16:23] As an example of people on the deep learning side who are so insistent on the deep learning umbrella: I had a chat one time with a guy named Jürgen Schmidhuber,

[00:16:32] whom I’m sure you’re familiar with. And he kept saying, just over and over, that his LSTM version of deep learning was Turing complete, and brushed aside any need for any other approach.

[00:16:43] Yeah. Sometimes I feel like people in the field understand math a lot better than they do psychology, linguistics, the cognitive sciences more generally. I mean, it’s not surprising. If you dig into those fields, you start to understand what the problems of cognition are, whereas if your approach is a purely mathematical one, you get very good at kind of moving the statistics around, but you don’t necessarily learn what needs to be solved. If you think about consciousness, or even just cognition in general, a lot of it is about constructing narratives, essentially, and internal models. And that’s mostly not on the radar. To Schmidhuber’s credit, he actually had a paper about building internal models last year. That’s an interesting paper. But by and large, the field does not engage with that set of issues and does not engage with the cognitive science literature in general. So it’s not that these people aren’t very, very bright. They’re extremely good at what they do. But the models that they make are narrow in a certain way.

[00:17:37] I think that impedes progress to some extent.

[00:17:41] And even for those focused on the math side, which many of them are, I’d say that a rich understanding of the mathematics of learning ought to make you suspicious of one-size-fits-all. One of my very favorite results is the no free lunch theorem, which basically states that there is no single best way to solve every problem or to search every database, and that any such general claim ought to be inadmissible without evidence from the domain in question.
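For reference, the theorem being described here, Wolpert and Macready’s no free lunch theorem for search and optimization, is usually stated roughly like this, where f ranges over all possible objective functions, m is the number of evaluations, d_m^y is the observed sequence of cost values, and a_1 and a_2 are any two search algorithms:

```latex
% No free lunch theorem (Wolpert & Macready), stated informally:
% summed over all objective functions f, any two algorithms a_1 and a_2
% produce the same distribution of observed cost sequences.
\sum_{f} P\left(d_m^{y} \mid f, m, a_1\right) = \sum_{f} P\left(d_m^{y} \mid f, m, a_2\right)
```

Averaged over all problems, no algorithm outperforms any other; superiority only appears once you restrict attention to a particular class of problems, which is exactly the point both speakers draw from it.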

[00:18:12] So I’ll give you my response to "this technique is the answer, period."

[00:18:16] I say such things are an empirical question. Let’s see how they work in specific domains.

[00:18:20] Right. Just to rephrase that: the no free lunch theorem really says that there is no one technique that is going to be general across all problems. You may have a specific problem for which there is a best technique, but there isn’t one best technique across all of them.

[00:18:30] It is provable, by that theorem you’re talking about, that there is not a universal technique. And so when a Schmidhuber or someone like that says, well, I’ve got this universal technique, you keep your hand on your wallet and realize it’s actually going to be good for some problems and not others. What’s interesting about human biology is that we have a lot of different mechanisms for solving many different problems that we’ve gotten over evolutionary time. Some of them are pretty good and some of them have problems, but the point is that we have many different techniques. Observational learning, for example, is not the same thing as learning language, or as, you know, calibrating what you’re going to do with your hands as you catch a tennis ball. So we actually use multiple techniques. There isn’t a one-size-fits-all, as you’re saying, and there’s a long tradition in intellectual history of people kind of overreaching in that way. So, for example, behaviorism tried to capture all of psychology in a single set of equations, and some of what it said is true. Like, we’re all motivated by reward. But the fact that we’re motivated by reward doesn’t give you any of the nuance, for example, about how we do physical reasoning to understand whether something is going to tip over.

[00:19:38] You’re absolutely right. For many people there is a tendency toward looking for one answer. That’s why I often say that once I get to know somebody of some intellectual depth in the sciences, I make a determination: do they understand the no free lunch theorem or don’t they? And I would say about 75 percent of them do not yet.

[00:19:59] Well, on a related point, I agree with you. I think it’s a good litmus test. The related point is, if you don’t know the theorem, you tend not to appreciate the value of nativism. Everybody appreciates that learning is part of how we get to be who we are and that we acquire a lot of information from culture. But one implication of the no free lunch theorem is that having innate priors for particular kinds of problems, innate knowledge about particular kinds of problems, can be really helpful.

[00:20:26] So if you were not born knowing that there were persisting objects in the world, and you just had a lot of kind of sense perception, it would take a long time to figure out from all those correlations that objects are even things to look at. And so people who really appreciate the no free lunch theorem, I think, are often much more receptive to there being some kind of nativism in biology. And I would argue that A.I. is not going to proceed until we similarly allow some innateness in there. And a lot of people don’t want it.

[00:20:53] I know. It seems very weird.

[00:20:55] They think that from very basic first principles we can solve the world. And maybe they can, but it doesn’t seem to be the way it’s been solved previously. I like the example you gave, actually, of objects.

[00:21:06] Objects are core to the work I do on the intersection between attention and consciousness. And from my digging around the literature, it seems clear to me that objects are pretty damn innate, right? You cannot look at the world without having it chopped up into objects for you, and that seems to happen at a very young age.

[00:21:24] I mean, we can’t absolutely prove that, but it seems very, very likely. And, you know, people don’t like to think that humans have innate stuff in them. There’s a strong bias. My friend Iris Berent has actually documented that empirically, with some papers that are coming out. So, as Lila Gleitman used to say, empiricism is innate: believing we don’t have anything built in is a weird innate bit of human psychology. But if I show you a picture of a baby ibex climbing down the side of a mountain a few hours after its birth, there’s just no coherent way to explain that other than to think that that baby ibex already understands that there is a three-dimensional geometry to the world that relates to the constituent parts of the things that it’s traversing. You have to assume that, or else you just can’t explain the behavior. You can’t say that in one hundred and eighty minutes of experience the baby ibex has learned all this stuff about objects and three-dimensional geometry. It just does not make any sense.

[00:22:18] It’s funny you mention the ibex. What I use as my model animal for my conscious cognition work is the white-tailed deer. And why did I choose that? I’ve been a deer hunter for 50 years. I have a fair amount of theory of mind about how deer behave from the time they’re born to the time they’re, you know, the top buck in the woods.

[00:22:38] And I find that to be a very, very useful way to think about problems of this sort. You’re exactly right: a white-tailed deer is up and running within two hours and is clearly navigating a complex, object-filled space. If it didn’t understand objects, it would be running into trees.

[00:22:54] Right. I mean, the A.I. systems that can actually navigate trees, like Skydio’s drones, have a lot of innate structure in order to help them recognize things. They don’t learn that from scratch.

[00:23:06] And we’ll come back to that later, when we talk about hybrid systems, and how perhaps sub-symbolic neural net approaches may correspond, at least approximately, to our perceptual and higher-end perceptual object recognition systems in a hybrid architecture. That’s what we’ll get to later. Another problem I see with neural nets, and I think you pointed this out as well, is the very large number of examples that these systems have so far needed to reach any reasonable level of confidence. Probably the world’s best chess players have played maybe fifty thousand to one hundred thousand games, as compared to the hundreds of millions that these self-learning A.I.s have to have.

[00:23:42] And let me give you an example from my own personal history that really piqued my interest in this small learning set size and got me hooked up with Josh Tenenbaum.

[00:23:51] Josh and I chatted a fair amount about this problem. One of my hobbies since I was a kid has been war games. I’ve been playing war games since I was 10 years old, which is fifty-five years ago, starting with the old Avalon Hill cardboard-pieces-on-maps kind of thing, and I’ve played everything up to the current state-of-the-art computerized games. Along about two thousand fourteen, right when I was getting interested in cognitive science, cognitive neuroscience and AGI, I started to learn a new game called Advanced Tactics Gold, which, for people who are old-time turn-based strategy war game players, is like the ultimate game, or damn close to it. It’s huge. It has every aspect: it’s got sort of a rudimentary politics, economics, all kinds of different military units that you can mix and match. I did just a very rough calculation and decided that it had a branching rate per turn on the order of 10 to the 16th. Compare that to checkers, which is three or four, chess maybe 30, and go maybe 250. Advanced Tactics Gold was 10 to the 16th, so a way bigger search space than these other games. And sure enough, when I played it the first time, it kicked my butt really badly. I mean, it was humiliating. The next three times, it beat me easily. Then we had two games that were pretty good struggles. And then, game 7, I beat it. I’ve never lost again, and I’ve probably played it a hundred times. So how could that be? How could I learn so quickly?

[00:25:19] I went back and looked at what my knowledge base was from having been a war game player for fifty-five years, and I estimated that I’d perhaps played twenty-five hundred games to completion across maybe two hundred titles. That’s not a lot, twenty-five hundred games. I’d also perhaps read two hundred books on military history and strategy.

[00:25:40] That seems like not a whole lot. Two hundred books is just a few seconds of download today on a fast computer, and twenty-five hundred games isn’t very many. But I was able to beat a state-of-the-art game with 10-to-the-16th branching after seven tries. How did I do that? That question has been nagging at me ever since. And it’s just a qualitatively different approach than these brute-force deep learning approaches.

[00:26:07] Well, I mean, I don’t know anything about the particulars of the system that you were playing, but I would say that human beings have a lot of concepts, like the notion of evolving an economy, or the notion of, say, splitting forces or flanking the opposing forces. And those concepts are often very helpful. Or there’s the OODA loop, you know: observe, orient, decide, and act.

[00:26:33] But in this case, it was not real time. The OODA loop applies in real-time games for sure, but not really in turn-based games.

[00:26:40] But certainly the other things definitely do. You know, it turns out that the basics of strategy and tactics happen to work in this game, and I had induced strategy and tactics from playing twenty-five hundred games and reading two hundred books, which was a relatively small amount of input from which to have produced those models and principles and applied them in a complex setting.

[00:27:05] So I don’t know the specific game that you’re talking about, but there are lots of general concepts that humans have that you can start to apply quickly, whereas most of the current best systems are fundamentally just learning about contingencies. They don’t have a lot of abstraction there. So if you’re in a game where there’s a limited number of possibilities, and it can be a pretty broad limit, and you’ve got enough data, you can kind of graph out a space and interpolate between the cases that you’ve seen, if you’re a machine. If you’re a person, you’re using concepts like, I’m going to divide the other person’s forces and flank them in this way, or maybe use the OODA loop of observe, orient, decide, act. There are lots of conceptual frameworks that you can bring to a war game, or to kind of any problem in the world. The narrower it is, the easier it is to use brute force. The more that your options proliferate, the more you need some grasp of what’s actually going on in order to perform well.

[00:28:01] That’s a fact. I refer to that in my own work as heuristic induction. You know, in a game with branching of 10 to the 16th per turn, there’s no way you can do anything like deep learning or reinforcement learning to try to learn that problem space.

[00:28:15] You have to create some heuristics. And how do you create the heuristics? By some form of induction. I wish I knew how that was done by us humans.
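A rough calculation (my own sketch, using the per-turn branching factors quoted in the conversation) shows why brute-force search or exhaustive self-play is hopeless at this scale:

```python
import math

# Approximate per-turn branching factors mentioned in the conversation.
branching = {"checkers": 4, "chess": 30, "go": 250, "Advanced Tactics Gold": 10**16}

# Count positions reachable after just five turns (b ** 5), shown as a
# power of ten so the numbers stay readable.
turns = 5
for game, b in branching.items():
    exponent = turns * math.log10(b)
    print(f"{game}: ~10^{exponent:.0f} positions after {turns} turns")
```

At roughly 10^80 positions after only five turns of the war game (about the number of atoms in the observable universe), tabulating contingencies is off the table; some form of induced heuristics is the only option.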

[00:28:23] Well, you know, the other thing that you did that a machine current machine cannot actually do is to read a book so people forget that current A.I. systems are essentially illiterate. Deep mine showed off that they quote, master go without human knowledge, which is not actually correct. It was human knowledge that we could talk about it. But they sort of made a virtue of the fact that they don’t actually know how to put in a book about go into their systems or a book about military strategy and go is very well documented. The particular war game that you’re playing, maybe there aren’t any books about it, but you read books about other things and transfer that knowledge about other things to this new game. So you can imagine, for example, deep learning system, learning how to play a first person shooter where you attack zombies. But it doesn’t learn enough about how first person shooters work in general to be able to transfer that to a first person shooter where you blow up Nazis. The system, what its learning is very superficial, very close to the bone, whereas what you’re learning as a human is about more abstract things like in a game that is a first person shooter. Where am I likely to find enemies? How am I likely to hide? How do I change weapons and so forth? Are you learning abstract ideas that you can then apply in a new environment? That doesn’t mean that there isn’t some very specific knowledge. You learn the corners of this map in this game and that doesn’t help you with that game. It has a different map, but you learn a lot of general things like how to work with maps, how to work with inventories and so forth. You can apply from one to the next. There is no current day AI system, deep learning or otherwise that can really do that can. As you said, induce the heuristics. So you induce heuristics about how to manage an inventory or how do I read a map or how do I accumulate a map from partial information or how do I duck behind an obstacle? 
You’re trying to learn these abstract skills that you can reuse, and what we lack right now is a way of building reusable skills. I'm not saying it can't be done. Some form of A.I. someday will do it, and that someday might be in 10 years or 20 years or 50 years, but it will happen. The current technology, though, doesn't give us a way to accumulate reusable skills. So instead people learn things so-called end to end. They learn the entire game from top to bottom, and there's nothing that they can carry with them to the next game over.

[00:30:42] And again, that limits you to games with relatively low branching rates. It does. At least from what I have read, and I haven't actually used reinforcement learning on a project, I came away saying, yeah, it ought to work well, but only within domains that have relatively low branching rates.

[00:31:01] Well, you know, a really telling talk that I saw the other day was by Pieter Abbeel. We were both speaking at the Rotman Institute in Toronto, and he's been one of the pioneers in deep learning for robots. And he gave a very honest, candid talk, and he said, here's the fundamental issue. We can do all kinds of crazy stuff in the lab, like the Rubik's Cube stuff that people talked about recently, which was an extension of work that he was part of the team that developed. I think it was his students, you know. So, to remind you, with that Rubik's Cube, the system didn't actually learn how to solve the cube, but it did learn how to do the manipulation to turn the faces of the cube. And that's very exciting. But you don't see stuff like that in the real world. So people build laboratory demonstrations. That Rubik's Cube had, like, sensors inside, and it's in a well lit room, or whatever. You build a robot to go into the real world and things just don't work that well.

[00:31:54] So it's OK, maybe, in a factory environment where things are very, very well controlled. But the more open ended the world is, which is related to your notion of branching factor, not identical to it but related to it, the more trouble robots have. So, you know, the best selling domestic robot of all time is still the Roomba, which doesn't manipulate objects at all. And nobody knows how to build something that could do what my housekeeper can do, like tidying up a room that might have any kind of stuff in it. Or the famous Wozniak test.

[00:32:24] You know, I thought this was a very clever alternative to the Turing test, which was to plop a robot down in a random American kitchen and tell it to make a cup of coffee. Good luck. Nothing can do that currently. Right now? Not even close. Not even close. Not even those cool robots from Boston Dynamics, right?

[00:32:40] No. I mean, Boston Dynamics doesn't really specialize in manipulation at all. They have one robot that does some manipulation, most of which has been teleoperated. The claim to fame for their robots is that they can walk around and you can kick them and they'll remain stable, which is a kind of robustness that's really important. But they're not robust and autonomous, able to make their own decisions. Let's say, like, I'm looking in my room right now and there's a fan that really should be put away. The summertime is over. And so, you know, what kind of robot is going to be able to pick up this fan? This particular one is a battery operated fan with a long stalk. And so you have to go over to it, decide which axis is convenient, and then grip it. And then you'd realize that the bottom piece is actually just a holder, and it would drop, and you'd have to compensate for that. It'd be no problem for my housekeeper if I said, could you move that? And a machine that knows how to do that literally does not exist yet.

[00:33:32] Whereas a five year old kid could do it, right?

[00:33:34] No problem. I have one, so I know for a fact that she could do that without a problem.

[00:33:38] I remember being five. In fact, my mother, the wise woman that she was, when we were out of school thought it would be interesting to teach us how to do housework, just like Tom Sawyer. At five years old you're up for learning anything, right? So we learned how to clean house, how to clean the bathroom, how to straighten up, all that stuff.

[00:33:56] Show me the robot that can whitewash a fence. I will be impressed.

[00:33:59] Exactly. Now, we touched on this in passing, but I want to hop back to it, because it gets to what to my mind is damn close to the meat of the matter of the next big step.

[00:34:08] I mean, induction is a really big one, but another one is what I think you call real language understanding. It seems like truly understanding language is a great bottleneck to further progress in the kinds of general purpose A.I.s that we've been talking about. In fact, you gave a nice example, which was that A.I. programs that could automatically synthesize the best medical literature would be a true revolution. Computers that could read as well as PhD students, but with the raw computational horsepower of Google, would revolutionize science, too. But we're not even close to that. Hell, we don't even have convincing chat bots. I tried a few, supposedly the top ones, and they're not even close. They wouldn't fool me for two minutes, let alone 20 minutes. What do you think the issues are around real language understanding, what real language understanding is, and what might we see there in the future?

[00:34:59] So we've been trying to coin the phrase deep understanding to distinguish from deep learning. So you could say that there's a shallow understanding in systems now. Right. So Alexa can understand particular requests, or you could feed into this GPT-2 system, I don't know, something about baseball, and it'll come back with baseball related terms, or if you talk to it about chairs, it'll come back with terms that are about chairs. So you could say there is a superficial understanding, but there's not a deep understanding. There's no way that the system understands that chairs are for sitting on, or that if you took away one of the legs the chair might fall over, and so it wouldn't be as good a chair to sit on. So there are a couple of things that are missing. One, we already talked about: these cumulative models of the world. That's what language is about. I build in your head a model of something, and then you kind of check your model against mine, or you elaborate it. That whole process is missing. And then common sense is missing. So one of the ways that we use this is that I can tell you, could you move that chair over there, because you know what a chair does and what it's for. So, you know, a reasonable orientation for that chair is with the flat surface up so that somebody could sit on it, and spinning it upside down, so that they would have to sit between the feet of the chair, is not a reasonable thing, because that doesn't accommodate the way that people's butts work. So you know something about how the world works, and you integrate that with a model of the things that we're talking about. And that's what language is about. And none of that's really there yet.

[00:36:24] So I'm going to hop ahead on my questions list, because you opened up something that's interesting, which is common sense, or knowledge engineering, as we used to call it in the good old fashioned A.I. days. Knowledge engineers, or the closest approximation we had, graduate students, would create representations of knowledge or common sense within a domain. And then we've had some larger scale projects like Cyc and ConceptNet.

[00:36:51] And yet none of that seems to have really gotten the kind of traction that we need. Any thoughts on why not? And is there something like the kind of brute force way to extract common sense, analogous to the way deep learning has been able to extract statistical regularities?

[00:37:07] Well, there's a whole lot of different questions there, so I may lose some, but I'll start with Cyc. Cyc is, to my knowledge, the biggest knowledge engineering project of all time, although there might be some medical taxonomies that are in some way similar in scope. And what Doug Lenat tried to do, and is in fact still trying to do, is to codify much of the world's knowledge in a machine interpretable form. I think that the specifics of how he did it maybe aren't quite what we would do today. So he did it very much in formal, logical terms, and that may or may not be flexible enough. Some things that people would emphasize now would be uncertainty, probability distributions. When I know about chairs, I don't just know some formal things about them, I know kind of the distribution of chairs that I've seen before and what their properties are. That kind of stuff, I believe, is not represented in there. As a system, one of the issues that it has is there's actually not a lot of natural language interface. It's mostly the formal logic part. And it's possible that taking its reasoning system in conjunction with a good natural language interface might actually be very useful, but that doesn't exist, and I think it limits the utility of what he's got. Generally, it's regarded as a failure.

[00:38:20] I'm not sure that's actually the right interpretation. It might be that he solved that piece of the problem, but since he hasn't solved the whole problem, it's not very commercial. And then people infer, falsely, that because he hasn't solved the whole problem and hasn't been a commercial success, there's no value there, which may not be correct. I am implying I would do it differently nowadays. I would have a lot more representation of uncertainty. And he's got pretty good arguments that you do want the power of a richer formal logic and not just a first order logic. You want to be able to represent things like, I believe that you think that such and such is going on, and that requires fairly sophisticated logic. There is no formal argument out there that he's wrong. There are a lot of people who are allergic to what he did, but none of them, so far as I know, have a serious alternative. Maybe the closest alternative is some of the work that Yejin Choi has been doing recently at the University of Washington and the Allen Institute for AI. I think it's very interesting, but it's not as stable and robust in drawing an inference as Lenat's stuff. And there's probably room for both. Ultimately, I think you want to do some learning stuff like Choi is trying to do, and you want to have the formal logical reasoning abilities that Lenat emphasizes, and you need to bring these together in some fashion.
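The kind of nesting Gary mentions, "I believe that you think that such and such is going on," can be sketched in a few lines. This is a toy illustration invented here, not Cyc's actual representation: propositions are reified as terms so that one attitude can wrap another, something plain first-order statements over ground facts cannot do.

```python
# Toy sketch (not Cyc's API): nested propositional attitudes by
# reifying propositions as terms, so a Fact's arguments can
# themselves be Facts.

from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    predicate: str
    args: tuple  # arguments may themselves be Facts

def depth(p):
    """How many layers deep the innermost claim is wrapped."""
    nested = [a for a in p.args if isinstance(a, Fact)]
    return 1 + max((depth(a) for a in nested), default=0) if nested else 1

raining = Fact("weather", ("raining",))
you_think = Fact("believes", ("you", raining))    # you think it's raining
i_believe = Fact("believes", ("me", you_think))   # I believe that you think so

print(depth(i_believe))  # 3: two attitude layers around one base fact
```

A real reasoner would of course need inference rules over these terms; the point here is only that the representation must allow propositions inside propositions.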

[00:39:33] Interesting. I've also wondered about Cyc. You know, it's been out there for 20, 30 years and I've sort of seen it out of the corner of my eye, but I've never done a deep dive into it.

[00:39:43] The best currently publicly available thing is a piece Lenat has in Forbes, I believe, on making inferences about Romeo and Juliet, which just came out earlier this year. And he actually walked me through some of it. And when it works, it's really impressive how much it can infer about why certain characters took certain actions. What's unimpressive about it is that a lot of it is sort of hand coded knowledge about Romeo and Juliet, so it doesn't really teach you how the knowledge would be acquired, and there isn't really a natural language front end to it. But once that knowledge is represented, it can make much more sophisticated inferences than anything else that's out there. And so, you know, maybe it's been miscast, at least in people's minds. What it's really good at, I think, is making logical inferences about the nature of people and actions and their interactions with one another. Nobody else really has that right now. It's not an end to end system.

[00:40:38] And what people are trying to build now are systems that kind of go from pixels to action. And if we could get those to work, that would be great. But the reality is they're incredibly fragile. We give in the book the example of the DeepMind Atari game system, a reinforcement learning system, or deep reinforcement learning system, that plays Atari games. And you read the original paper and it says it learns to break through the tunnel, to have the paddle hit the ball through the wall so it does that ricocheting thing. But that's this illusion of progress, the gullibility gap. Just because you see the machine do something and think that that's what it's doing doesn't mean that that's what the machine is actually doing. If you move the paddle up a few pixels, as Vicarious did in a really cool paper, then you see the system doesn't actually understand anything at all. It doesn't really understand that there's a ricochet technique. When the paddle is moved a few pixels from all those memorized positions, the system just makes mistakes left and right.

[00:41:29] Well, that's how reinforcement learning works, right? It just basically takes one step after the other, and then it figures out at the end, was there a payoff, and then it upvotes all those links that led to a payoff at the end. So that's what you'd expect.
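Jim's "upvote the links that led to a payoff" description can be made concrete with a tiny tabular, Monte Carlo flavor of reinforcement learning. The corridor task, the update rule, and all the numbers below are invented for illustration, not any specific DeepMind system:

```python
# Toy "upvote what led to the payoff" learner: walk a corridor of
# states 0..4, actions are step left (-1) or right (+1), and a reward
# arrives only on reaching the goal. After each winning episode,
# every (state, action) pair on the trajectory gets an upvote.

import random
random.seed(0)

GOAL, MAX_STEPS = 4, 10
# Accumulated upvotes for each (state, action) pair.
pref = {(s, a): 0.0 for s in range(GOAL) for a in (-1, +1)}

def pick(s):
    # Mostly follow the action with more upvotes; sometimes explore.
    if random.random() < 0.2 or pref[(s, -1)] == pref[(s, +1)]:
        return random.choice((-1, +1))
    return max((-1, +1), key=lambda a: pref[(s, a)])

for _ in range(200):
    s, trajectory = 0, []
    while s != GOAL and len(trajectory) < MAX_STEPS:
        a = pick(s)
        trajectory.append((s, a))
        s = max(0, min(s + a, GOAL))
    payoff = 1.0 if s == GOAL else 0.0
    for step in trajectory:  # upvote every step of a winning episode
        pref[step] += payoff

# Steps that appeared on winning paths have accumulated upvotes.
print(sum(pref.values()) > 0)
```

Note how the upvotes are tied to states of this particular corridor, which is exactly Gary's point: nothing learned here transfers to a corridor laid out differently.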

[00:41:41] And if you have enough data relative to, let's say, a particular level in a video game, then it can play that level better than a human. But that doesn't mean it can play the next level. And you can think of this demo as kind of another level: OK, in level two the paddle will be in a different place, and it's at a loss. It has to start over to learn that new level. It doesn't have, emphasizing something I said earlier, transferable knowledge about what a wall is, or the physics of that game. It just has this narrow knowledge about contingencies within this level. If you play enough times, you can get contingencies for anything. But the question is, what do you do when the world changes?

[00:42:17] If the world holds still, you can do that. If the world's small enough.

[00:42:21] That's right. And you know, you've made this point a different way, talking about branching factors and so forth. We've been talking about narrowness. If the world is potentially complex enough, then suddenly that's not the right tool for you to use. And so there's been a lot of work on that line, and maybe something will come out of it. There haven't been a lot of commercial applications so far. And whether or not people come up with a cool commercial application for it, it's really not the solution for A.I. in the open ended world. It's just not well suited to that.

[00:42:51] Let's finish off on Cyc, ConceptNet and that area. I think I asked kind of an ill formed question, and now I'll try to form it better.

[00:42:58] Do you have any thoughts on how a next step in A.I. research might be able to extract common sense knowledge from the real world in something like a symbolic form, at least loosely analogous to the way deep learning extracts statistical patterns from examples? I think that is probably possible.

[00:43:19] I think that the roots of that are in things like inductive logic programming, but we need a richer innate basis in these systems, so that they have prior knowledge about how objects work and people work. Not everything, but some, like knowing that objects exist, that they persist in time and so forth, in order to have a scaffolding such that when these incoming facts arrive, the system can do something with them. What people have tended to do has no prior knowledge, and it doesn't work very well. So there's, for example, Tom Mitchell's noble but I think unsuccessful attempt called NELL, the never ending language learner, which I guess is still running at Carnegie Mellon. And it tries to extract triples from the web.

[00:43:59] And sometimes it comes up with things like, you know, Barack Obama is the president of the United States. OK, those can be outdated. Sometimes the facts are really, really poor, because the system doesn't understand about entity disambiguation, let's say. So it comes up with facts like, Barry is a painter, and you're like, well, which Barry? It doesn't really tell me what kind of painter he is. Is he an artist? Does he do houses? And so a lot of the knowledge is kind of undifferentiated in a way that makes it useless. I like the spirit of what I think Mitchell was trying to do. Yejin Choi again is doing some stuff in this vein. I don't think we have the tools yet. I don't think it's impossible, and I think it would be a great area to try to make some advances in.
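The "which Barry?" problem Gary describes can be shown with a toy triple store. Everything here, the entity IDs and facts included, is made up for illustration, not NELL's actual output format: triples keyed by a bare string are undifferentiated, while triples keyed by disambiguated entity IDs support useful queries.

```python
# Toy illustration of entity disambiguation in a triple store.
# A subject that is just the string "Barry" conflates everyone
# named Barry; disambiguated IDs keep the two painters apart.

undifferentiated = [("Barry", "is_a", "painter")]  # which Barry?

disambiguated = [
    ("barry_smith_artist_001", "is_a", "painter"),
    ("barry_smith_artist_001", "paints", "portraits"),
    ("barry_jones_contractor_002", "is_a", "painter"),
    ("barry_jones_contractor_002", "paints", "houses"),
]

def what_does(entity, triples):
    """Return everything a given subject is recorded as painting."""
    return [o for s, p, o in triples if s == entity and p == "paints"]

# With entity IDs we can tell the artist from the house painter:
print(what_does("barry_smith_artist_001", disambiguated))  # ['portraits']
print(what_does("Barry", undifferentiated))                # []
```

The undifferentiated store can only say "a painter", which, as Gary notes, is knowledge too vague to use.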

[00:44:45] It seems to me that's fairly close to, you know, again, along with language understanding, the core of getting to the next level. Just thinking out loud here.

[00:44:53] Suppose one were to think through a project somewhat analogous to Cyc, but with everything we know today about what's happened over the last 30 years, and the knowledge we have about cognitive science and cognitive neuroscience. Do you think, considering the stakes involved in getting to high level A.I. relatively quickly, it could make sense to do a neo-Cyc, using everything we know now, starting over completely from scratch, to build this grounding basis for A.I.?

[00:45:22] I think it would be a huge deal. I think it's a five hundred million dollar project, and nobody has the appetite for it right now.

[00:45:28] I wrote down a billion. And I said, a billion? Who cares? You know, A.I. is worth trillions, right?

[00:45:34] I think it would be well worth doing. And, you know, after my robot company becomes a huge success, or maybe as a spin out of it or something like that, maybe I'll take that on.

[00:45:43] Well, I think about a billion. You know, a billion is what the Europeans wasted on their brain initiative, right?

[00:45:49] Well, the original budget was a billion euros.

[00:45:52] Yeah. And as far as I can tell, not a whole lot came out of that. You know, think of the Obama stimulus. It was 700 billion, right? Take a billion of that, put it in neo-Cyc.

[00:46:02] Well, it's also less than the investments that, say, Facebook or Google slash Alphabet make in A.I. here. But it's not to their taste.

[00:46:10] That's contrary to the religion, at least at Google.

[00:46:14] I mean, you know, Facebook is perhaps a little less monomaniacal, but still mostly dominated by the deep learning approach. I'll hop back a little bit. We've hopped around a little here, but that's part of the fun of doing a podcast, making it somewhat nonlinear. We were talking about language, and something that really hopped out from the book, which resonated with me, is that you said human thought and language are compositional, as are, by the way, most good techniques for doing computer programming. Could you tell our audience what you mean by compositionality and why that's relevant?

[00:46:45] So I'll actually take computer programming first. The way that we build complex computer systems is we build small modules, and then we build larger modules out of the smaller modules. You make sure that the small pieces work, and then you make bigger pieces out of them. So that's a form of compositionality. Sentences are the same way. We have small pieces like nouns and verbs, and we can make more complex pieces like noun phrases and verb phrases, and then we make sentences out of those, and we can make sentences subsentences of other sentences. So I can say, I like my new iPhone, and then I can say, you know that I like my new iPhone, and your friend thinks that you know that I like my new iPhone. And so you put together little pieces inside bigger pieces. That's what compositionality is about. For the most part, the current approaches don't really do that. They look at all the words in a sentence, before and after, and try to find other sentences that have similar vocabulary in them. Similar is a kind of complex notion in these systems, but they don't really look at structure in that way. So they don't come back to you and say, this clause is about somebody's intention, and this other clause is about their perspective on that intention, and we're going to put it together in order to make this statement about the world and derive this cumulative description of what's going on. That just isn't really part of the current workflow. There has been work on that kind of thing historically, but right now, because deep learning is getting better short term results, people are focusing there, and I think losing sight of compositionality, which is really the core thing we're trying to come up with.
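The pieces-inside-bigger-pieces idea can be sketched directly: clauses built from small parts, with one clause embedded inside another. The grammar below is invented for illustration and handles just this one sentence family:

```python
# Toy sketch of compositional syntax: a Clause is built from small
# parts, and a Clause's complement may itself be a Clause, so clauses
# nest inside clauses just as Gary describes.

from dataclasses import dataclass
from typing import Union

@dataclass
class Clause:
    subject: str
    verb: str
    complement: Union[str, "Clause"]  # a clause can contain a clause

def render(c):
    """Compute the whole sentence from its parts, recursively."""
    if isinstance(c.complement, str):
        inner = c.complement
    else:
        inner = "that " + render(c.complement)
    return f"{c.subject} {c.verb} {inner}"

base = Clause("I", "like", "my new iPhone")
nested = Clause("you", "know", base)
doubly = Clause("your friend", "thinks", nested)

print(render(doubly))
# your friend thinks that you know that I like my new iPhone
```

Because the same three small pieces compose freely, you get the whole family of nestings for free, which is exactly what a bag-of-similar-vocabulary approach does not give you.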

[00:48:18] And to make a distinction for the audience, there's historically been a divide between symbolic A.I. and sub-symbolic A.I.

[00:48:26] Of which deep learning and neural nets are the primary example. And historically, symbolic A.I. has been compositional, or at least in theory could be, while sub-symbolic A.I. mostly has not been compositional.

[00:48:39] The only thing I would add to that is that the distinction itself is confused. Every neural network that I know of has, for example, output nodes that are in fact symbols. And so nobody's ever been that clear to me about what sub-symbolic really means. But it is true that they try to make do without certain kinds of representations, like operations over variables in structured representations, that any programming language from assembly language to Python takes for granted. And I think that, in a way, they're kind of tying their hands behind their backs, taking away one of the most valuable discoveries in the history of humanity, which is how to write computer programs compositionally.

[00:49:15] And that was relatively recent. I'm old enough to remember when people wrote, and I did some myself, these horrifying fifteen hundred line Fortran programs that were just ugly bags of spaghetti, right? Yeah.

[00:49:27] I used to write in BASIC, with all these GOTO lines, and it was completely not modular. It was a mess. It was hard to debug, and it was hard for someone else to understand your code if they looked at it later, because it was not sufficiently modular. It's not pretty. Even those programs, though, you have to say, were compositional in the sense that even a line of BASIC code, IF A equals B, is still a symbolic system. It wasn't a very well structured symbolic system, but there are symbols at the level of your basic operations, and even BASIC or Fortran or whatever is completely symbolic in that sense. And deep learning systems, for the most part, don't allow that. And you have people like Geoff Hinton trying to say, don't go there, don't look at that kind of stuff, it's evil, it's old fashioned, don't touch it. But, you know, essentially all of the world's computer programs are built on this kind of stuff. And the more sophisticated stuff, anything that any reputable software engineer would write now, is compositional at a very high level. You build modules on top of modules, and you have inheritance and all those kinds of things. It's very much from the symbol manipulating tradition.

[00:50:31] Before we hop back to how human language is compositional, let me do a little sidebar on something I've stumbled across recently. I'm not sure I understood it well enough to say this accurately, so I'll lay it out and get your comments. Some of the newer forms of neural networks, particularly graph neural nets and the closely related message passing neural nets, seem to have some aspects of compositionality to them, or at least they act upon compositional components.

[00:51:02] They're better. So the graph networks have a graph structure, which is straight from symbol manipulation, and I think that that's a prerequisite. Essentially it's saying that knowledge is represented as things that look like a network or a tree, a graph in technical terms. And that allows you, for example, to formally specify the relationships between things. You can say that Molly is the father of Gary, or something like that. Or mother of Gary, excuse me. So you can make specific claims about relations, as opposed to plain neural networks, where you just feed in a sentence, say, character by character. So I think that's a step in the right direction. I think there's still too little in general formal reasoning over those systems, and not a great way of representing abstractions as opposed to specific facts. But I think it's a step in the right direction.
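A bare-bones version of the idea can be sketched as one message passing step over an explicit graph of labeled relations. This is purely illustrative; real graph networks use learned transforms on vector-valued node states, not the fixed scalar sum below, and the entities and numbers here are invented:

```python
# Minimal message passing over an explicit relational graph: each
# node's new state adds up the states arriving from its neighbors
# along labeled edges. Real graph neural nets replace this fixed sum
# with learned functions, but the graph structure is the same.

edges = [("Molly", "mother_of", "Gary"),
         ("Gary", "author_of", "RebootingAI")]

state = {"Molly": 1.0, "Gary": 2.0, "RebootingAI": 4.0}

def message_pass(state, edges):
    incoming = {n: 0.0 for n in state}
    for src, _rel, dst in edges:  # messages flow along each edge
        incoming[dst] += state[src]
    # new state = old state + aggregated incoming messages
    return {n: state[n] + incoming[n] for n in state}

print(message_pass(state, edges))
# {'Molly': 1.0, 'Gary': 3.0, 'RebootingAI': 6.0}
```

The point Gary makes survives even in this toy: the relations are explicit, symbol-like structure, and the network computes over that structure rather than over an undifferentiated character stream.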

[00:51:52] Let's hop back to human language and its compositionality. There you go.

[00:51:57] Well, so, I mean, I already gave you one example of compositionality, the sentence about, you know, I like an iPhone. So "an iPhone" is a noun phrase, "iPhone" is a noun. We're putting those together to pick out a particular iPhone, and then we're using "like" as a verb, and we make a verb phrase, "like an iPhone." Then we add a subject to it: I like an iPhone. And then we can make that whole clause part of another sentence. So I can say, you know that I like an iPhone. And so "you" is a noun, "know" is a verb, and then we have this whole clause that represents an idea, a proposition, as some people would call it, of I like an iPhone. And we have a whole set of verbs that take what we call propositional attitudes. So you know that I like an iPhone, or you deny that I like an iPhone, or you doubt that I like an iPhone. You have all of these different possibilities. Compositionality is about the whole being computed from the parts. So once I know how verbs like "doubt" work, and I know how nouns like "iPhone" work, I can put together the whole sentence once I know all of those pieces, and I can derive an interpretation of something in terms of how its pieces work. Which is actually exactly how computer compilers and interpreters work: they make a composition of the components that tells them what to do.
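The compiler analogy at the end can be shown in miniature: an interpreter derives the meaning of a whole expression from the meanings of its sub-expressions. The toy grammar below, two operators over nested tuples, is invented here just to make the "whole computed from the parts" point concrete:

```python
# Toy interpreter: the meaning (value) of an expression is computed
# from the meanings of its parts, recursively, which is the essence
# of compositionality in compilers and interpreters.

def evaluate(expr):
    if isinstance(expr, int):  # a leaf means itself
        return expr
    op, left, right = expr     # a node combines its parts' meanings
    lval, rval = evaluate(left), evaluate(right)
    return lval + rval if op == "+" else lval * rval

# (2 + 3) * 4 as a nested structure of parts
tree = ("*", ("+", 2, 3), 4)
print(evaluate(tree))  # 20
```

Once you know how "+" and "*" work and what the leaves mean, every nesting of them is interpretable for free, just as knowing "doubt" and "iPhone" lets you interpret sentences you have never seen.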

[00:53:15] I like that phrase, what was it, the whole being computed from the parts? That's right. Yeah, I'd bet that should be people's number one takeaway from what we're talking about here. Let's pop up a level. We've talked about a lot of details in and around deep learning and alternative approaches. More broadly, what are some other approaches to A.I. that may not be getting enough attention, in your opinion?

[00:53:39] Well, I think all the action right now is actually in hybrid models, and some people are actually building them, but not enough people are talking about why they're important as opposed to deep learning alone. So I again worry that Geoff Hinton has too much influence on the field. He's the godfather of deep learning, and he's been going around saying that we shouldn't build hybrid models, that we should just use deep learning, that symbol manipulation is old fashioned. But if you look at what people actually do when they want to get something done, they actually do build hybrid models. AlphaGo is an example of that. It uses deep learning to recognize patterns, and then it uses tree search, Monte Carlo tree search in particular, which is a straight symbolic operation, to actually search the space of possibilities: I go there, you go there, let's look at a bunch of different possibilities and add it up. That's a symbolic computer program of the classic sort, and they combine that with deep learning. So that's a hybrid model. Josh Tenenbaum, our mutual friend, works on hybrid models. He's got one on vision recently, with a bunch of people at IBM, where you use deep learning to recognize parts of images, but then you do a lot of reasoning about what those parts of the images are and how they relate to each other. He uses a lot of symbolic operations in order to do that. So that half of the house, so to speak, looks like computer programming, and the front end of the house, if you will, looks like deep learning. Another example is this Rubik's Cube thing that got so much press. The part that actually does the solving of the cube, namely figuring out where I should turn, you know, the blue face should rotate 90 degrees, that stuff is actually done by a symbolic system. And then the mapping between what you see and the physical joint forces you should be applying is done by the deep learning system.
So that's a hybrid system. There's a new Facebook web search thing; it's a hybrid system. I think that's where the real action is going to be, in trying to figure out theoretically sound and intimate ways of bringing together these traditions. It's not a binary proposition. You know, Hinton presents it as if it's either the new stuff or the old stuff. What we really want is even newer stuff that combines some of the deep learning stuff, and also probabilistic programming, Bayesian systems, a whole lot of statistical techniques, I won't say which ones, but some of those, with the more classical knowledge representation stuff, the other half of today's conversation. That's where the action is right now. I think it's just getting started, but that's where I expect the winners to come from.
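The division of labor in a hybrid system can be shown with a deliberately simple cousin of AlphaGo's design: symbolic tree search over moves, plus a pattern-recognition evaluator at the leaves. AlphaGo itself uses Monte Carlo tree search with a learned neural evaluator; the toy below substitutes exhaustive negamax search and a hand-stubbed heuristic on a take-one-or-two Nim game, purely to show where each half fits:

```python
# Hybrid-in-miniature: symbolic search (enumerate moves, recurse)
# plus an evaluation function standing in for the "neural" half.
# Game: a pile of stones, take 1 or 2 per turn, taking the last wins.

def learned_eval(pile):
    """Stub for the pattern-recognition half: a cheap guess of how
    good this position is for the player to move (1 good, -1 bad)."""
    return 1 if pile % 3 != 0 else -1

def search(pile, depth):
    """Symbolic half: look ahead, flipping perspective each ply, and
    fall back on the evaluator when the depth budget runs out."""
    if pile == 0:
        return -1, None  # the previous player took the last stone
    if depth == 0:
        return learned_eval(pile), None
    best_value, best_move = -2, None
    for move in (1, 2):
        if move <= pile:
            value = -search(pile - move, depth - 1)[0]
            if value > best_value:
                best_value, best_move = value, move
    return best_value, best_move

value, move = search(7, depth=4)
print(value, move)  # 1 1  (take 1, leaving the opponent a losing 6)
```

Neither half suffices alone: the evaluator by itself never looks ahead, and the search by itself has no opinion about unexplored leaves. Gluing them together is the hybrid pattern Gary is describing.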

[00:56:12] How about things like self-driving cars? I would sort of expect them to have a hybrid nature.

[00:56:17] A lot of them do. But that doesn't come out in the media. What comes out in the media is, if there's a neural network involved anywhere, the system is called a neural network system. You could do the opposite. You could say, if there are any symbols in the system, it's a symbolic system. And that's silly. But there's a huge bias in how these things are reported. Similarly, OpenAI, when they talked about the Rubik's system, said a pair of neural networks learns to do such and such. And you have to kind of read the fine print and realize it's not just the neural networks that are doing the work here; there's also this classic 20 year old symbolic algorithm that's kind of at the core of the so-called solving part of it. So there's a lot of hype around neural networks, and people present their systems that way even when there's other stuff going on.

[00:57:02] Yeah.

[00:57:02] Finally, I had a conversation six months ago with one of the smartest people in the world, I would say, not in the field of A.I., but in a field not too far from A.I. And he had somehow extracted the idea that the only thing happening in A.I. these days was deep learning. I was fairly shocked at the spread of that notion.

[00:57:21] It's a widespread misunderstanding, in part because the hype machine for deep learning is so powerful within various corporations that have a lot at stake. And OpenAI, you know, is actually no longer a nonprofit, or no longer fully a nonprofit, so that fits under that umbrella. Zack Lipton called it weapons-grade PR. So you have a lot of weapons-grade PR behind neural networks. And it's a nice story to tell the media: hey, these are like brains and they work great. But they're not really that much like brains, and it turns out you need other systems that kind of prop them up and so forth. But that's a more complicated story, and people aren't as receptive to it.

[00:57:55] What about evolutionary approaches? That happens to be something dear to my heart. You know, my deepest grounding in a scientific field is evolutionary computing.

[00:58:03] And as early as 2001, I did some work with evolutionary neural nets and other evolutionary approaches to building game playing agents. What’s going on in evolutionary A.I. these days?

[00:58:14] I think it has a lot of potential, but it hasn't succeeded yet. And my take on why is that everybody's trying to recapitulate the period between the beginning of life and getting to, I don't know, dog cognition. And what they want is not dog cognition, they really want human cognition. But starting from a pre-bacterium isn't going to get you there. So you start with systems that are total blank slates. They don't have the equivalent of a genetic basis of a vertebrate brain plan, for example. It took close to a billion years of evolution to get a vertebrate brain plan, a very hard-won evolutionary struggle, if I can anthropomorphize a little bit, and then things pick up pace from there. But the systems that people build are at, like, the bacterial or pre-bacterial level. So you have a grad student work on it for a year, and at the end they don't have that much to show for it, because they've basically started from zero. You know, the most interesting thing, in my view, and it's a little bit speciesist, it is, but the most interesting thing that happened in evolution in the last few hundred million years is the evolution of people, because people have such a different niche from other creatures, and they have these amazing means of cultural transmission that there are only little dim reflections of in other primates. There's really something special. People have this language thing. But the whole thing didn't take that long to evolve, maximally seven million years, and maybe as little as a few hundred thousand. Why did it happen so fast? Because the primate brain plan on which the human brain plan evolved was already incredibly sophisticated. It already had color vision. It already had very sophisticated obstacle avoidance, very sophisticated social cognition, maybe not as sophisticated as ours, but still pretty sophisticated. And so if you evolve from a primate brain plan, you get all kinds of interesting things happening.
If you evolve from a bacterium, it’s just going to take a long time. It’s not impossible. But, you know, you need to have a lot of kind of lucky selections or maybe not lucky, you know, natural selection, you know, very fit selection. You need a lot of them to get from point A to point B, which is why it took, you know, so many hundreds of millions of years, actually took longer than you think.

[01:00:23] Life’s at least three point five billion years old, and the vertebrate lines didn’t really get rolling until the Cambrian explosion, a mere five hundred and fifty million years ago. So the single-celled and the colonial species were basically doing their thing for almost three billion years.

[01:00:42] And most of the work is recapitulating that 3 billion, and most of the interest would be in recapitulating

[01:00:47] the more recent five hundred million. That goes to your earlier suggestion and hint that maybe the answer is building these structures from which to let AIs work, these databases of what is known about the world. Why do they have to learn all that stuff the hard way, right?

[01:01:03] Yeah. So I wrote a book about this called The Birth of the Mind, which was about how a small number of genes builds a complex brain. And part of the basic takeaway was that, let’s say, the vertebrate brain plan is a very complex library of self-assembling subroutines, and you need the library in order to get going. The reason that people can build a Web site in a day now is because you have huge libraries to draw on and lots of subroutines. And that’s why we were able to evolve, or not we, but natural selection was able to evolve people relatively quickly: from a primate basis there are a lot of subroutines and libraries in the genetic code that are really useful. Whereas if you look at bacteria, there’s just not that much of a library. There’s some stuff for metabolism, but there’s not a lot of stuff there for cognition. And so if you’re working from that, you have to reinvent a lot.

[01:01:48] I think there’s a good hint there on ways to move faster. One last area I want to ask your thoughts about is the old field of cognitive architectures, you know, things like SOAR and ACT-R, etc. Is anything happening in those kinds of fields that might be relevant to the next big step?

[01:02:06] I mean, there are a lot of people still working on it. I think John Laird is maybe one of the biggest. He’s at Michigan, and his stuff is worth looking at. I think that the general issue is, first of all, those things were mostly built originally for cognitive psychology rather than A.I., and the discussion right now is mostly around A.I. Another issue, in psychology, is that they might be true insofar as they go, but they don’t go all that far, so they’re compatible with a lot of different things. I think that’s OK, but people are expecting that if I choose SOAR or ACT-R or whatever, that automatically gives me the answer to cognition, whereas it’s more like a notation that could allow you to build a model of cognition rather than the full model of cognition. In terms of A.I., they’re not, as far as I know, used all that much in practice right now. I think that the intuitions behind what those people are trying to do are good, though, and I think that there are lessons to be learned from looking at how they go about problems like: how do you train a person to fly an airplane or learn algebra? What are the steps that a person goes through? We don’t necessarily want our A.I.s to recapitulate those, but knowing something about what people do in the process of learning these higher-order skills might be very helpful toward A.I., even if, you know, there’s no off-the-shelf library code that you want to plug into your natural language understanding system.

[01:03:29] So, maybe to put words in your mouth: there may not be anything ready to use today with the existing models, but it’s probably useful for people to know about them as we think about the future.

[01:03:39] I think it’s 100 percent useful to know about them. It goes back to something else that we’ve been talking about, which is this kind of intellectual narrowness of just knowing, you know, the math of gradient descent, versus understanding the broader set of problems that people have tried to approach in cognition, in order to recognize limitations and recognize avenues of attack and so forth. And really, Rebooting A.I. is a cognitive scientist’s, or pair of cognitive scientists’, perspective on what you might do to try to move forward beyond the statistical techniques that have been well mastered but aren’t sufficient.

[01:04:14] We’ll come back to that at the very end, but I’m going to hop now to something else you talked about: some of the dangers from the data mining approach. I think these might be your words, I’m looking at my notes: data mining, if not done with great care and thoughtfulness, can rebuild obsolete social biases. You give an example using Google image search: when we search for professor, only about 10 percent of the top-ranked images were women, perhaps reflecting Hollywood’s portrayal of college life, but out of touch with current reality, where much closer to 50 percent of professors are women.

[01:04:47] It’s a really endemic problem, and people think it’s easy to solve, and it’s not. So the famous version was that some African-Americans were labeled by Google as gorillas. That was in 2015, and Google got terrible press out of it, and they quickly solved that problem so it won’t happen anymore. But there are just more and more versions of that problem. There’s currently, I think, no possibility of systematically solving it. So there are lots and lots of variations on it. Your listeners can go home and try things like uncle and niece, for example. You’ll probably find that they’re mostly white, even though, you know, white people are, I guess, a minority of the world’s population. But they’re better represented in the dataset, and the system doesn’t know the difference between what’s represented in the dataset and what’s out there in the world, so it doesn’t capture it. And every time one of these things is fixed (there were cases like this in 2012, they were fixed in ’13 or whatever, then 2015, ’17, ’19, and so forth), problems just keep popping up. It’s like whack-a-mole, and people put band-aids, if I’m mixing my metaphors, sorry, people put band-aids on them. They fix one of these problems, but there are hundreds or thousands of potential instantiations nobody notices, and they just keep happening. And it’s exactly about perpetuating existing statistics rather than understanding what’s going on. If you think about what a good economist like Steven Levitt of Freakonomics fame does: he takes a bunch of bad data from a bunch of bad studies and figures out how to decouple them, deconfound them, in order to derive a sensible conclusion. And ultimately we need A.I. to do that. So it gets a bunch of bad samples; it needs to understand what’s at stake, what’s the history here, and compensate for prior history and so forth, and come up with something that’s in line with our values or objectives or whatever.
And if it doesn’t understand our values and what we’re trying to get at, it won’t work. So suppose you took as your dataset, this would be hypothetical, the proportion of orchestra-caliber musicians that were ballet dancers in 1910, and you discovered zero. So you put in your system a weight saying you should actually penalize people for being ballet dancers, because no ballet dancers were orchestra-level musicians in nineteen hundred. So your system builds in this bias, and then humans discover later that that’s partly because there was bias on the part of the people who were choosing musicians. They didn’t want to have women in there; they thought women weren’t qualified. So then we moved to blind auditions, and suddenly there are lots of women in orchestras. But you have this historical data, and you have some data-dredging machine that doesn’t understand the difference between the historical data and the recent data, doesn’t understand that blind auditions changed things, and it just puts it all together, globs all the data, and in that way it perpetuates the bias.
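Gary’s orchestra example comes down to a simple statistical failure mode: pooling data from before and after a regime change reproduces the old bias. Here is a minimal Python sketch of that point; the groups, years, and counts are all invented for illustration and are not from the book.

```python
# Toy illustration of the orchestra example: a system that "globs all the
# data together" perpetuates a bias that a regime change (blind auditions)
# has already corrected. Every number here is invented.

def admission_rate(records, group):
    """Fraction of records in `group` that were admitted."""
    hits = [r["admitted"] for r in records if r["group"] == group]
    return sum(hits) / len(hits)

def make(n, year, group, admitted):
    """Build n identical records."""
    return [{"year": year, "group": group, "admitted": admitted}] * n

# Before blind auditions: dancers are never admitted (biased judges).
historical = (make(50, 1910, "dancer", False)
              + make(25, 1910, "other", True)
              + make(25, 1910, "other", False))

# After blind auditions: admission rates equalize at 50 percent.
recent = (make(25, 2000, "dancer", True) + make(25, 2000, "dancer", False)
          + make(25, 2000, "other", True) + make(25, 2000, "other", False))

pooled = historical + recent
print(admission_rate(pooled, "dancer"))  # 0.25: pooled data still penalizes dancers
print(admission_rate(recent, "dancer"))  # 0.5: post-regime-change data does not
```

A data-dredging system that weights all rows equally lands on the 0.25 figure; a system that understood that blind auditions changed the data-generating process would discard or down-weight the 1910 rows.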

[01:07:40] Very interestingly, in that long discussion about data mining, you used the word understand about 12 times, which tells us that if we want our A.I.s to solve these problems for us, they have to understand, which they don’t today.

[01:07:54] And the defense of the deep learning people is to say it’s hard to define understanding, which would be like saying that since you can’t quite define pornography (you know the famous quote that I’m alluding to), we shouldn’t have any policies about it. That’s silly. So it is hard to define understanding. We tried to get at it by example, and primarily we gave lots of examples, like: understanding is being able to read a children’s story and make inferences about who did what to whom, where, when, and why. Understanding is being able to answer those journalist’s questions when you’re confronted with a narrative in an article, or things like that.

[01:08:28] I just had an odd thought: perhaps the rejection of the concept of understanding from the perspective of the neural net folks is at least analogous to the rejection of internal process by the behaviorists.

[01:08:40] It’s very, very similar. There’s a similar instinct behind them. So, you know, both the behaviorists and their latter-day reincarnation, the neural network people, wanted to get everything from data. They don’t want to talk a lot about mental representation. Behaviorists didn’t want to talk about it at all, and neural network people mostly don’t. So they actually come from a very similar part of intellectual space, both trying to define everything mathematically and kind of missing the importance of knowledge representation and innateness and so forth.

[01:09:10] That makes sense to me. One final bit on this section. I love this point that you made, which I had never thought of.

[01:09:17] But it’s going to become more and more true: the heavy dependence of contemporary A.I. on training sets can also lead to a pernicious echo chamber effect, in which a system ends up being trained on data that it generated itself earlier. I have noticed that if you use Google to look for odd things, which I do all day, every day, more and more of the search spam is obviously generated by rather stupid A.I.s. Right? And anything mining the web is going to be mining this A.I.-generated garbage, and over time that’s going to get better and better, and more and more of the content on the net is going to be generated. And then it’s processing, as you say, its own generated data, or at least data generated by relatives of itself, other A.I.s using similar toolkits. I don’t know what bad effects could come from that, but it seems like that’s not a good idea.

[01:10:10] Sucking your own fumes never could be.

[01:10:12] Yeah, exactly. And I’d bet that’s actually true; it has to be happening right now.

[01:10:16] It is. The particular example I think we gave, unless this was in the article that we wrote rather than the book, was about translation. And so, you know, some translations are done by Google Translate, particularly for languages where not a lot of people are writing in Wikipedia, and then that stuff gets fed back into Google Translate. It crystallizes its own mistakes some of the time, I would imagine.

[01:10:38] That’s right. You did have that example in the book. All right.
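The translation feedback loop Gary describes can be caricatured in a few lines of Python. This is a deliberately toy simulation, not how any real translation system works: the "model" is a lookup table, and "retraining" is a majority vote over a corpus that machine output keeps flooding.

```python
# Toy echo-chamber simulation: a translation system retrained on a corpus
# increasingly filled with its own output crystallizes its one mistake.
# The words and counts are invented for illustration.

TRUTH = {"chat": "cat", "chien": "dog", "oiseau": "bird"}

# The model starts out mostly right, with one error ("oiseau" -> "bat").
model = {"chat": "cat", "chien": "dog", "oiseau": "bat"}

# Human-written corpus: ten correct translation pairs per word.
corpus = [(src, TRUTH[src]) for src in TRUTH for _ in range(10)]

for generation in range(5):
    # Machine translations flood the web and join the training data.
    corpus += [(src, model[src]) for src in TRUTH for _ in range(20)]
    # "Retrain": adopt the most frequent translation seen for each word.
    model = {}
    for src in TRUTH:
        candidates = [tgt for s, tgt in corpus if s == src]
        model[src] = max(set(candidates), key=candidates.count)

print(model["oiseau"])  # "bat": the error outvotes the human data and sticks
```

Even in the first generation, the 20 machine-generated "bat" pairs outvote the 10 human "bird" pairs, and every later generation only reinforces the mistake, which is the crystallization effect in miniature.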

[01:10:41] I’m going to head for the homestretch here and talk a little bit about artificial general intelligence. When I looked at the page for your company, Robust A.I., it said: help us make robots smart, collaborative, robust, safe, flexible and genuinely autonomous. That sounds an awful lot like AGI to me.

[01:11:01] Yeah. I mean, if you had AGI in a bottle, it would do all of that. If you don’t have AGI in a bottle, and nobody does, then you can try to take steps towards it. We’re not promising that we’re going to deliver AGI anytime soon. Of course, we’d like to position ourselves to be able to help in that discovery. I think each of those criteria is a step towards AGI, or, you know, a metric that could measure progress towards AGI. We’re trying to find commercial cases that are steps along the way. We can’t promise that we’re going to solve all of it in a day. But if we can make robots that are noticeably more flexible than the robots that we have now, and noticeably more reliable, then that’s obviously going to really expand the scope of application of robotics. Right now, robots have to be in cages, or they have to be in very carefully defined environments. You have situations like the Tesla Model 3: part of the big problem there was that Musk overestimated how easy it would be to get robots to work in production on complex assembly lines. When things aren’t exactly the way they were in your blueprint, when you have different kinds of objects in different places and so forth, current systems just aren’t that good at it. To really get a full-service version of Rosie the Robot might take us close to AGI, but maybe there’s some intermediate point that’s not Rosie the Robot, but that’s at least safe in a domestic environment and a lot better than what we’ve got now. Similarly, you could think about package delivery. Right now people are building all of these kinds of four-wheel cargo carriers to bring something to your street, but they don’t bring it up the stairs. Right? And so that’s not really what people want. They haven’t been paying FedEx for the last 30 years in order to have to go out to the street to collect the package when the delivery guy is there. They want FedEx to bring it to the door.
And building a robot to do that would, I think, be full AGI, but it would require a lot of reliability, a lot of autonomy, a lot of flexibility of the sorts that we’re talking about.

[01:13:09] Yeah. Having formerly been a guy in the domain name business for one brief period of my career, I always register domain names I think are indicative of the future, and one of the ones I grabbed about a year and a half ago is proto-AGI. The things you’re talking about there strike me as exactly proto-AGI, and I think that’s going to be a very hot area for companies in the coming few years. Certainly hope so. I think you might well be in the right place. But to bump it up a little: you have to have thought about AGI. What are your thoughts about the timeframe to really getting AGI?

[01:13:41] It’s hard to know for sure. I mean, I always think about how in the early 90s nobody had any idea how big the internet would be, and they had no idea that social networking would change national politics and all kinds of stuff. So it’s hard to really predict the future. The example we give in the book is from Blade Runner: they have androids that are fully able to blend in with human beings, and at one point they stop at a payphone. And this is sort of astonishingly anachronistic, because, you know, cell phones basically became widespread in the year 2000, and these robots are a ways off. So predicting is very hard. But what we can say for sure, I think, is that no commercial system is remotely close to natural language understanding, which is certainly a prerequisite of AGI. No commercial system is remotely close to the kind of flexible, dynamic reasoning that people can do. No current commercial system is close to being able to transfer between different tasks the way that you would from one war game to another. These things don’t exist yet. They may exist in somebody’s lab, but they’re not widely known. And so it’s going to be a while; that much we can be sure of. It’s not happening next week. It’s probably not happening in five to 10 years. It might happen in 20 years, it’s really hard to project, and it might happen in 100, because the problems are really, really hard. Nobody has a bead on how to develop enough understanding of the world in machine-interpretable form right now, for example. There are a lot of problems that have to be solved. So, you know, my guess is somewhere between 20 and 100 years, maybe closer to 50, and it’s a wide confidence interval. People always want, you know, a number, like Ray Kurzweil loves to say, like 2031. Anybody who gives you a precise number is bullshitting you. It has to be a confidence interval, meaning within these boundaries with some likelihood. We can’t do better than that.

[01:15:33] The way I satirize Ray’s estimate is I say: yeah, February 26 at eleven thirty p.m. in 2042, we will achieve AGI, right on Fifth Street in Dallas.

[01:15:45] Right. I mean, that hyper-precision makes it compelling to many people, but it’s actually a clear sign that it’s bullshit.

[01:15:51] Yeah, it’s all about the error bars, right? How about with you? If it’s somewhere between 15 and one hundred and I had to put down a small bet, I’d say 40 to 50, something like that. But who the hell knows. Giant error bar.

[01:16:03] Well, it couldn’t happen tomorrow, and that’s the one thing you really can say. It’s just not happening tomorrow.

[01:16:08] Five years is probably the shortest possible period, but I shouldn’t say shortest possible. Anyway, let’s not speculate any further. We know it’s uncertain. Not tomorrow, but probably within the lifetime of people who are already alive, which is interesting. Which brings us to the topic that we have to talk about at least a little bit, even though I’m bored to tears with it, which is AGI safety. What are your thoughts about that?

[01:16:32] The line we have in the book is: don’t worry about killer robots, at least anytime soon. First of all, they’ve never shown any interest in our affairs or in attacking us. So worry about bad actors misusing A.I., but don’t worry about the robots rising up. And second, if they do come and attack, lock your door, and if that doesn’t work, climb a tree, because robots aren’t good at opening doors and they can’t climb trees. They’re really not that bright right now. So we don’t have anything in the near term to worry about.

[01:16:59] All right. I think we will wrap it up on that. We had lost some time with technical difficulties, and I had about three or four more questions, but that’s all right. This has been extraordinarily interesting, extraordinarily useful.

[01:17:10] And I would strongly encourage people that are interested in questions like this to go out and get Gary’s book. Gary had a coauthor. Who was your coauthor? Let’s give him credit.

[01:17:19] My coauthor is Ernie Davis. He’s a computer scientist at NYU. And ninety five percent of the really cool examples in the book are his. All right.

[01:17:26] So, Ernie and Gary’s book, Rebooting A.I. If you find this at all interesting, go read it. Thank you very much, Gary, for a wonderful conversation. Thank you very much. Production services and audio editing by Jared Janes Consulting. Music by Tom Muller at modernspacemusic.com.