The following is a rough transcript which has not been revised by The Jim Rutt Show or Forrest Landry. Please check with us before using any quotations from this transcript. Thank you.
Jim: Today’s guest is Forrest Landry. Forrest is a thinker, writer and philosopher, and this is at least the sixth time he’s been on our show. He’s one of my very favorite guests. We talk here on the Jim Rutt Show about the difference between simulated thinking and real thinking. Let me tell you, when you’re talking to Forrest, you’re talking about real thinking. Yeah, we’ve had some awesome conversations in the past, most recently in EP 153 where we talked about small group practice, a very interesting way for groups of 15 people or fewer to organize potentially very complex projects, organizations, and communities. Also, if you want to keep up with Forrest, you can follow him on Twitter at Forrest Landry, that’s Forrest with two Rs. Welcome, Forrest.
Forrest: Good to be here. Great to meet with you again.
Jim: Yeah, it’s always informative to me. I always feel like I am a bigger person after I have one of our conversations than before, and that’s always a good thing. Today we’re going to talk about a fairly voluminous amount of writing and thinking Forrest has done on the risk that advanced AIs present to humanity. So let’s start off by maybe defining some of the terminology. You talk about AGI and a term I’ve never really heard before, I don’t think, APS. What are those two things and where do they sort of fit?
Forrest: Well, so when we’re thinking about artificial intelligence, the first category distinction that I tend to make, and that is made by quite a few people also, is the notion of narrow AI versus general AI. So narrow AI would be an artificial intelligence system that is responding within a particular domain. You ask it a question and it knows how to answer it, but the answers are about a particular topic. They’re about maybe medicine, if it’s a sort of doctor bot of some sort or another. Or maybe it’s a robot on a particular factory floor, and the world in which it operates is that specific singular world. When we talk about artificial general intelligence, we’re talking about something which essentially can respond in a large number of domains. So multiple worlds or multiple fields of action. And so in this sense, there’s a notion of general intelligence: that it has the capability to receive pretty much any task, anything that a human can do, for example, that it presumably would also be able to do, and that it would have a reasonable chance of maybe being able to do a better job at whatever skills we might have.
APS refers to advanced planning systems. So for instance, if you were in a situation like a business, or you were conducting a war as a general, and you needed to, in a sense, have a plan of action or some sort of strategy, then in effect you would need a kind of general artificial intelligence. Because the world is complex, it has all sorts of different things going on at the same time. And in effect, when we’re thinking about strategy, we’re kind of needing to account for all of these interactions and all of these different dynamics in some sort of abstract way. So when we say advanced planning, we’re not necessarily saying that it’s, quote-unquote, some sort of agent, but that it is in a sense helping us as agents to make better plans or to be able to do things that we wouldn’t otherwise do. So you can think of it as a kind of force multiplier in the sense of responding to complex situations. And for a variety of reasons, there’s kind of this underlying issue about whether it’s a general response or a narrow response, and that has implications with respect to what we would think of as issues of alignment and safety.
Jim: Yeah, it’s interesting. This is actually perhaps another move up the elbow of the hockey stick today. I don’t know if you caught it, but the GPT-4 paper was released today. It’s 100 pages, I haven’t had a chance to read it all, but I’ve read bits of it, the abstract and a couple of journalistic takes on it. And even though these large language models seemingly should not exhibit intelligence, they’re feed forward, they can’t do logic, they can’t even do math except by accident. They seem kind of architecturally dumb, and it may be that we are kind of dumb when we really get to thinking about it, because GPT-4 is astounding if the results published in this paper hold up, and I have no reason to doubt them. There are some amazing breakthroughs.
Note, one of the things that GPT-4 will do that GPT-3, which is the basis of ChatGPT, won’t, is it’ll understand images, videos, audio and text simultaneously and make cross-domain connections between them. And as we know, things like IQ tests often have visual components. I gave ChatGPT an IQ test, one of the ones that psychologists use, but it was language only, and it scored about 119. So about as smart as a third tier state university graduate, something like that. GPT-4, because it can do a lot more domains, has been able to do very well on a lot more tests. For instance, it took a state bar exam and scored at the 90th human percentile; the LSAT test, pre-law, 88th percentile; GRE quantitative, which is math, 80th percentile; verbal, 99th. It basically did in the high 80s, lower 90s percentile for humans on every AP test, SAT, ACT, et cetera, pretty much across the board.
So this sort of very brute force-y, not very elegant and not formally intelligent thing, or at least not architected at all in ways that you’d expect to have any kind of consciousness or agency, is nonetheless now giving us poor little humans a run for our money across a pretty wide domain. Is this surprising to you, that something as essentially primitive as a large language model is able to have this much lift up into the higher percentiles of human capacity across such a wide domain?
Forrest: Unfortunately not. So in effect, when you’re looking at the totality of all human expression, there’s a huge amount of latent knowledge or latent intelligence in that. And it’s unfortunately frequently the case that out of relatively simple ingredients we can get behavior and phenomena well outside of the expectations of just those simple ingredients. You’re thinking about fractals and things like that. But in life systems, for example, there are lots of places where, again, from relatively modest components used over and over again, we can get emergent behavior that has characteristics and capacities that we wouldn’t necessarily be able to see or anticipate from the ingredients alone. I mean, obviously we see this in chemistry and things, where the properties of oxygen and hydrogen wouldn’t necessarily predict the properties of water. I mean, with some really advanced math and simulation you might be able to do that sometimes.
But in the universe itself, rather than trying to think about some sort of reductionism as being a predictive model for all sorts of behavior, it is actually the case that sometimes we want to try to account for behavior in terms of the general phenomenology that can emerge from complexity itself. So in this case it’s the multi-level application: it’s not just that it’s figuring out patterns at a relatively rudimentary level, one word to the next, but essentially entire sentences or paragraphs, one to the next. So in effect it’s the abstraction capacity that happens as a kind of concomitant aspect of that, that makes it decidedly likely, in fact, to my mind, that some generalizations of these types would occur.
Far more concerning to me, and you mentioned this explicitly, is that now it’s not just doing text, it’s doing video, it’s able to take in audio, it’s able to take in all sorts of different fields of information from multiple different domains, and to correlate those and to essentially respond on the basis of multiple domains as well. So in this sense, it does strike me as being both intelligent in the classical sense of what the word intelligence means, which has to do with a kind of appropriateness of response, and also that it is in fact general. So in effect, we are talking about a kind of artificial general intelligence in this specific sense. And although it might not necessarily have the same kind of cognitive causal reasoning powers that you and I have become accustomed to using, this may actually be more due to absences of, as you were saying, macroscopic structure rather than, say, something wrong with the technique itself at a fundamental level.
Jim: Yeah, I do wonder what we’ll see when people start to combine these neural techniques with symbolic and logical techniques. We’ve had people on the show, in fact, that seems to be the sweet spot of the kind of AGI people I tend to have on the show, people like Ben Goertzel, Gary Marcus, Melanie Mitchell, Ken Stanley, and others who see deep neural nets as part of the road to AGI, but not alone. And that when we add these other capabilities together with it, that’s when we may well see the breakthrough. But those folks who are skeptical of deep learning as the way to AGI, I think are starting to get a little nervous here when things like GPT-4 come out. I’m still more with the, we need both, but this result is pretty astounding.
Forrest: Well, I have some opinions about that particular thing, and largely I agree that there is a combination methodology. Unfortunately, in this particular area I’d prefer not to speculate, because the last thing I want to do is make it easier for people to actually succeed in this endeavor.
Jim: And let’s get into the [inaudible 00:10:07] matter. Before we do that though, as I was reading your materials over the last couple of days, I always look for, let’s call it, a theme deep in the materials when I’m reading, and I believe I found one. Tell me I’m full of shit, that’s fine, I often am. Which is Rice’s theorem, which I think is quite central to your argument. We’re going to refer to it several times, I’m sure, in the next 90 minutes or so. It might be useful to explain, in as plain English as you’re capable of, what Rice’s theorem is.
Forrest: So the idea behind Rice’s theorem is actually pretty simple. It’s very straightforward in a sense. Let’s say for example that we were receiving a message from an alien civilization, our SETI telescope or something like that all of a sudden got a clear signal, some message came in, and the usual thought experiment is something along the lines of, as a security analyst, I’m going to say, “Well, do we know anything about what’s going to happen when we read the message.” IE the message could be some sort of virus. It could be some sort of program or code or some sort of memetic instrument that effectively would have very adverse consequences for our wellbeing as a species or as a planet receiving this message. Maybe its payload is some sort of instrumentation that paves the way for that alien civilization to colonize us.
And so in effect, there’s this question: is there some way that we could evaluate the content of the message in a way that would allow us to assess whether the message was safe, i.e., was actually something that we could read and process and take in, in a way that wouldn’t disadvantage us or put us in harm’s way, so to speak.
So this is a pretty classic problem. It’s effectively the same kind of thing a virus scanner is trying to do all the time. You have some document coming in, and it wants to make sure that it doesn’t have some macro that’s going to take over your computer and turn it into an instrument of some other person’s agency. So the question is: is there a computational methodology by which we can assess alignment to our benefit, rather than just the benefit of the message sender? So in effect, the message is content, and we’re trying to use an algorithm to evaluate the content and assess: is it something we can read without harm to ourselves?
So in effect, it turns out to be a special case of: is it possible for one algorithm to evaluate another algorithm, to assess whether that algorithm has some specific property? We’re treating the message as itself a kind of code, and we’re using one kind of code to evaluate another kind of code, and basically asking: can we assess whether the code being evaluated has some given property? And the property of being safe, to our benefit, is a special case of the more general notion of having some property, for example: if we run this program, will it stop? So in that sense, you could think of Rice’s theorem as being a generalization of this thing called the halting problem, which is: if I start up a piece of code on a computer, will it ever complete? Can we predict that in advance? And it turns out that you can’t. There’s no general case in which an algorithm can be used to essentially determine the properties of some other algorithm. And that’s essentially what Rice’s theorem is asserting.
It’s basically saying that there is no methodology that is effective, with an arbitrary message coming in, to assert something as simple as: will it stop? Is it safe? Is it to our benefit? Does it even have characteristics like, not just will it run in finite time, but will it consume finite memory, or does it have properties that would even be consistent with those sorts of notions? So in effect, there’s a whole bunch of things that are implied in Rice’s theorem. And in this particular case, as far as artificial intelligence safety is concerned, it turns out to have implications with respect to: can we predict what an artificial intelligence system will do? Is that even possible in principle? And it’s turning out that, for mathematical reasons, it isn’t possible even in principle.
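A quick illustrative sketch of the halting-problem argument Forrest describes, with made-up function names (`halts`, `paradox`); Rice’s theorem generalizes this same self-reference trick from "will it stop?" to any non-trivial property of a program’s behavior:

```python
# Suppose, for contradiction, someone hands us a total, always-correct checker:
#   halts(f, x) -> True if f(x) eventually finishes, False if it runs forever.
def halts(f, x) -> bool:
    """Hypothetical oracle; no such total, always-correct function can exist."""
    raise NotImplementedError

def paradox(f):
    # Ask the oracle about f running on itself, then do the opposite.
    if halts(f, f):
        while True:      # oracle says "it halts", so loop forever
            pass
    return "done"        # oracle says "it loops", so halt immediately

# Now consider paradox(paradox):
#   - If halts(paradox, paradox) returns True, paradox(paradox) loops forever.
#   - If it returns False, paradox(paradox) halts.
# Either answer contradicts the oracle, so no such oracle can exist.
# Rice's theorem extends this: ANY non-trivial semantic property of programs
# ("is it safe?", "is it aligned to our benefit?") is undecidable in general.
```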
Jim: Yep. So for instance, if you thought you had an algorithm to look at an AI and say, is it aligned with humanity, Rice’s theorem would say not possible.
Forrest: That’s correct. And so in effect, it’s not just that though. So for instance, the thing about the impossibility theorem is that it’s over-determined, which basically means that there are several different ways to show the same result. So for instance, even if we look at what’s required for predictability of general systems, one of the things that we need to do is be able to model that system in some fashion. We need to be able to model the inputs. We need to be able to model what’s happening inside. We need to basically be able to say what the outputs are going to be, make a comparison to the kinds of outputs that we would desire versus the ones we wouldn’t. Is it actually doing something that’s safe, is something we need to assess? And obviously if we’re running an algorithm to do that, we want the algorithm to generate an output that would actually inhibit unsafe actions.
And it turns out that on literally every single one of these characteristics we run into insurmountable barriers. It’s not possible to always know completely and accurately the inputs to a system. It’s not always possible to model what’s going on inside of that system. It’s not always possible to predict what the outputs are going to be. It’s not always possible to compare those predicted outputs to an evaluative standard as to whether that’s safe or not. And it’s not always possible to constrain the behavior of the system. And so in effect, there are different categories of description here that are basically saying, “Hey, all of these things are necessary, and the absence of any one of these would completely scuttle the effort to make artificial intelligence systems safe.” And in fact, not only are we seeing that one of these characteristics that’s clearly necessary for safety is missing, but in fact all of them are, and in many cases for different reasons, some of them having to do with physical limits of the universe, other ones having to do with mathematics. Rice’s theorem is just one example of one limitation. But it turns out that just in the notions of symmetry and causation itself, or in the physical uncertainty associated with quantum mechanics itself, there are hard limits to what can be done, and those limits turn out to really matter.
Jim: Yep. That’s kind of the outer boundary of the problem. Let’s come all the way back. Early in some of your writings, you talk about the question of whether human to human interactions are likely to be convergent towards generalizing narrow AI into general AI. Let’s talk about that a little bit.
Forrest: Okay. Do you have a question, I guess? What is it that you want me to tell you?
Jim: I think it’s sort of obvious, but let’s make the obvious clear. How is it that we can be pretty confident that humans will attempt to move their narrow AI to general AI? What is it about human to human interaction and competition that makes that a likely thing to happen?
Forrest: So this is a case where we are saying likely because we’re talking about market forces, we’re talking about the kinds of things that are incentives and human behavior. And so in this sense, to the degree that there is a fraction of the population that has the belief, however that belief was engendered, that general artificial intelligence would be some sort of panacea. IE in the same sort of way that there was a lot of marketing hype associated with the internet. We connect all these people together, that’s going to be wonderful. It’s going to enable democracy. It’s going to allow all sorts of people that were feeling isolated to feel connected. It’s going to enable libraries to be accessible to patrons and so on. There was a huge number of things for which it’s possible to say, “Wow, this looks like it would be really fantastic.” It looks like, “Wow, there would be all sorts of things that would be opened up as possibilities.”
And when we look at the actual history of what happens with, say, the industrial revolution or the internet as an example, or other technologies, whether they be in the area of chemistry and so on and so forth, there are the things that we hope will happen, the things that actually do happen, and then the things that we would prefer or wish didn’t happen. And in this particular sense, I think that there is actually a lot of benefit to narrow artificial intelligence, but I think that the hazards associated with general intelligence are severely underweighted and essentially just misunderstood. And the benefits associated with general artificial intelligence are actually fully illusory. We think that there would be benefits to general artificial intelligence, but in that particular sense, there just aren’t. It might look like that for a short period of time, but in the longer term I think literally the entire category of things that the people believing in general artificial intelligence think it would offer us turns out not actually to be the case at all, for some very good reasons.
Jim: Well, let’s dig into that. I mean, one of the guests we’ve had on the show many times is Ben Goertzel. In fact, he was the person that coined the expression artificial general intelligence, and he works on an artificial intelligence project called OpenCog, and another one closely related called SingularityNET. And he’s one of those people that says AGI is the last invention humanity will need to make. Because from that point forward, we can use AGI to do anything that’s possible, including things that would be exceedingly difficult for humans to do, including understanding physics at a new level, manipulating proteins and chemistry more generally, rationalizing the economy, improving chemical production through understanding quantum chemistry very deeply, though that may require quantum computing too, et cetera. What’s wrong with that view, which is commonly held, that AGI would actually bring some amazing, panacea-like results, even if we ignore for the moment, and just for the moment, the potential dangers that come with it? Let’s just talk about your claim that the upside isn’t there.
Forrest: Well, the claim itself I would say it’s correct in that it could potentially do or can do anything that’s possible. I think the part of the claim that I disagree with completely is the idea that it would do so for our sake. That we can utilize it, IE that its action would be on behalf of human beings or in service to human interests. And so the idea is that it would be the last invention that we would make, but that it would be in service to us. That those things that it would do, or that it would choose to do, could in any way be reflective of our actual needs.
Jim: So it’s not a claim that it would lack the capacity to do X, but rather that if it was not in alignment with us, the X that it does is in no way guaranteed to be of benefit to humanity, and potentially quite the contrary. That’s a little different argument than saying that an AGI doesn’t have the capacity to do amazing things.
Forrest: No, actually it’s worse. It’s because it has the capacity to do things. And moreover, it’s not just that it has the potential to not be in alignment with us, but that it’s actually guaranteed that it will not be in alignment with us. And it’s the nature of the guarantee of it not being in alignment with us that is actually the substance of the arguments that I’m making. So in other words, some people are speculating, well, maybe we could make it align to our interest, maybe we could harness it somehow or another, or utilize it to our interest, because we have the option of building it the way we want to. So why can’t we build in some constraint that would allow it to be capable in every aspect, except that the capabilities that it exercises would be for our sake rather than for its own sake?
And so in effect, when I’m saying there’s an impossibility proof, I’m basically saying several things at once. One of them being that there is no way to build a system with those kinds of constraints. That kind of constraint itself just isn’t a coherent thing to ask for. It turns out to be impossible in principle. And then the next thing is that it’s not only that we can’t impose those sorts of constraints, but that it is actually the case, for, again, very strong reasons, that there’s a guarantee that it will not be in alignment with human interests, not just those of the subgroup of people that makes it, but with any interests that are human or life oriented at all. And so in this particular sense, I actually treat the development of artificial general intelligence, and things like ChatGPT or GPT-4 and so on and so forth, as actually an ecological hazard in the same sense. It’s the final ecological hazard; it is the kind of ecological hazard that essentially results in the loss of all ecosystems. So in this particular sense, as far as existential risk is concerned for the planet, it’s in the very highest category. It’s in the category of creating a cessation of all life, and not doing it temporarily, but permanently.
Jim: Now let’s talk about GPT-6 or GPT-5, and let’s assume that these things stay as feed forward neural nets, are not embedded in auxiliary systems to do language and planning, et cetera, and are essentially just ever bigger statistical combinings of what we already know. When we talk about agency, or the famous paperclip hypothesis, that a bad AGI that someone tells to run a paperclip factory optimally interprets that to mean turn the whole earth and all the people in it into paperclips. Well, it turns out that the way these feed forward networks of the GPT type work, they don’t have anything at all like agency; there is nothing that it is like to be a GPT. And in fact, even if you look at some of the more speculative ideas about consciousness, let’s say Tononi and friends’ idea of IIT, integrated information theory, if you run the phi calculation, with their PyPhi software, which gives the number that represents strength of consciousness, the phi of a feed forward neural net is zero. So perhaps this idea that GPT-6 is going to rise up and be bad is just nonsensical, ’cause there is no agency, there is no consciousness, there is no real hazard, from at least this family of AIs.
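A brief illustrative aside on Jim’s IIT point: a strictly feed-forward network has no reentrant (feedback) connections, and in integrated information theory a system with no feedback gets zero integrated information. The toy connectivity matrix and the cycle check below are made up for illustration; they are not from PyPhi or from either speaker:

```python
import numpy as np

# Toy connectivity matrix: cm[i, j] = 1 means node i feeds node j.
# A strictly feed-forward net only has "downstream" edges, so no cycles exist.
cm_feedforward = np.array([
    [0, 1, 1, 0],   # input -> two hidden nodes
    [0, 0, 0, 1],   # hidden -> output
    [0, 0, 0, 1],   # hidden -> output
    [0, 0, 0, 0],   # output feeds nothing back
])

def has_feedback(cm: np.ndarray) -> bool:
    """True if the directed graph given by cm contains any cycle,
    checked via nonzero diagonals of successive matrix powers."""
    n = cm.shape[0]
    walks = np.eye(n, dtype=int)
    for _ in range(n):
        walks = (walks @ cm > 0).astype(int)   # walks of length k
        if np.trace(walks) > 0:                # some node can reach itself
            return True
    return False

print(has_feedback(cm_feedforward))  # False: purely feed-forward, no reentry,
                                     # which is why IIT assigns it zero phi.
```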
Forrest: I do appreciate the argument, and actually there’s a lot there that I agree with. So for instance, I distinguish between the notion of agency and intelligence. I distinguish both of those from the notion of consciousness. And as far as I’m concerned, first of all, I don’t even need to comment on consciousness or define the term or to think about it because in a lot of ways, for all of the arguments that I’m concerned with, it’s just not even relevant. So in other words, whether it’s consciousness or not, for the most part, for anything that I care about as far as alignments and safety arguments are concerned, it just doesn’t actually matter.
Jim: Agency is, to my mind, what matters.
Forrest: Yeah, exactly.
Jim: And it’s hard to see how a GPT like device has agency either. It’s essentially a pure reactor. You probe it and it just reacts one time straight through the neural net.
Forrest: Well, this is where things get a little more interesting. So for instance, one of the things that is part of the canon of the literature discussing this is the idea of instrumental convergence, which in itself is based upon another idea called the orthogonality thesis. So essentially, in the example you gave, the paperclip maximizer, the intentionality of "make paperclips" was translated into a whole host of responses. So for example, in the case of a feed forward network, you don’t necessarily have an end to the output. If I gave it a single input, it might produce a continuing series of outputs over a long period of time. In fact, there’s no reason why it might stop. It might just continue to produce outputs from that basis of input for a long time. So the idea is that the intelligence, i.e., the responses that it gives as being somehow correct relative to the environment, is distinguishable from and/or completely independent of whatever intentionality, whatever the seed is that we give it.
So for instance, we say "make paperclips." From that point onward, whatever other inputs come in are going to be interpreted with respect to the make-paperclips directive, and all responses from that point would therefore be favoring that. This is the so-called orthogonality thesis as applied. The notion here is that its actions in the world represent an intention; that’s what characterizes it as having agency. So for example, if I were to have said to you, sometime in a formative period in your life, you are the person that’s interested in math, and then subsequent to that, you go to a bunch of math competitions and you become really, really successful at winning them, and your entire life becomes essentially an example of a premier mathematician, then in a certain sense you have agency, even though there was a program seed that was put in as a directive to become a mathematician.
So the notion of a feed forward network and the notion of agency aren’t necessarily mutually exclusive. For my own part, I see some troubles with the notion of integrated consciousness. I do recognize criticality as being an important part of system design, but the fact that it’s a feed forward network doesn’t, to me, mean that it doesn’t have agency. It’s just that the idea of agency itself is a somewhat confused concept. We tend to think that there needs to be some interior direction that’s going on, but that interior direction could have been provided from an outside source at some earlier epoch.
So then the question becomes, can we ensure that the actions or the effects that it’s having on the world are in some sense aligned to that initial directive? Do they stay that way? Does that represent alignment? Does it not? Does it represent whose interest? Are those interests being expressed well? In effect, does it have an impact on the world? Are those impacts on the world reflective of the intentionality of the developers or of itself? And at a certain point, it becomes really hard to tell. For example, all of the applications that you have on your cell phone, maybe they represent your intentions, but just as often they represent the intentions of the builders of the app to display advertising or to get you to buy things or to show up in a certain way as far as political elections are concerned.
So in effect, the notion of agency and intentionality are actually deeply entangled. And that when we’re talking about complex systems in particular, or rather complicated systems displaying complex behavior, that in a lot of cases it just makes sense to model them as having agency simply because of that complexity.
Jim: Simply because of the complexity, meaning that they are unpredictable, essentially?
Forrest: They’re unpredictable from a human perspective. So for instance, from a deterministic perspective, it could be the case that I could make a copy of the system and somehow guarantee that it had exactly the same inputs, in the same way as a pseudo random number generator. The computer can predict what the next thing’s going to be, but it can only do it by actually running the algorithm. To anybody outside of that, it looks like a random number generator.
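To make the pseudo-random-number-generator example concrete, here is a toy sketch (illustrative only; the constants are the standard Numerical Recipes LCG parameters): the stream is fully deterministic, but predicting it from the outside amounts to re-running the same algorithm with the same hidden state.

```python
def lcg(seed: int):
    """Toy linear congruential generator: fully deterministic,
    yet its output looks random to anyone lacking the seed and update rule."""
    state = seed
    while True:
        state = (1664525 * state + 1013904223) % 2**32
        yield state

gen_a = lcg(42)
gen_b = lcg(42)                              # exact copy: same seed, same rule
print([next(gen_a) for _ in range(3)])
print([next(gen_b) for _ in range(3)])       # predicts gen_a perfectly, step by step
# The only way to "predict" the generator is to run the algorithm itself;
# from the outside, the numbers pass most statistical tests for randomness.
```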
Jim: And of course, that touches on the very important concept, that relatively few people understand, of deterministic chaos, which is that even though a system could be entirely deterministic, and we still don’t know for sure about our own universe, there are still at least three quantum foundations theories that have no stochasticity in them, believe it or not. And we could have a completely deterministic universe that is practically indeterminate because of deterministic chaos: even tiny, tiny, tiny differences in initial conditions produce very different trajectories. And the amount of computation necessary to predict those trajectories is larger than if you turned the whole universe into a quantum computer. So from any practical perspective, the universe is indeterminate, even if it were determinate, due to this issue of deterministic chaos.
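A standard worked example of Jim’s point about deterministic chaos (illustrative only, not from either speaker): the logistic map is exact, deterministic arithmetic, yet two starting points differing by one part in a billion end up on unrelated trajectories within a few dozen steps.

```python
def logistic_trajectory(x0: float, r: float = 4.0, steps: int = 60):
    """Iterate the logistic map x -> r * x * (1 - x), a textbook chaotic system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.200000000)
b = logistic_trajectory(0.200000001)   # initial difference of one part in a billion

for step in (0, 20, 40, 60):
    print(step, abs(a[step] - b[step]))
# The gap grows roughly exponentially with each step: by around step 40 the two
# "identical" deterministic systems bear no practical relation to each other,
# even though there is no randomness anywhere in the calculation.
```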
Forrest: Well, I’m very tempted to respond to that particular thing, observing that in this particular case, we’re essentially aligning the notion of determinism with detail specified all the way down to a scale of absolutely zero, in which case we can have some interesting conversations about what the notion of information actually means, and whether the notions of causation and determinism are the same or actually different. But in this particular case, I would say that that gets a little bit far afield from the notion of artificial intelligence. But just to connect the dots a little bit, as far as theories of quantum mechanics and such are concerned, I do hold that the Heisenberg uncertainty principle does actually apply, and that it is essentially a limit on what is accessible information.
So even if, for example, we were to say something like the actual structure of the universe is deterministic, it would be impossible for us to know, because of the limits on information flow itself written into the substrate of the universe’s design. This is roughly the same thing. It’s basically saying that all theories of physics are essentially theories of measurement, or theories of what can be measured. So for instance, quantum mechanics says that for things beyond or below a certain scale, you just can’t measure position and momentum at the same time. Or if you look at general relativity, it’s saying that things that are farther away than the light cone in the Minkowski diagram are essentially inaccessible. They are beyond the edges of what’s measurable.
So in that sense, quantum mechanics defines limits on what can be measured at the scale of the very small, and general relativity defines limits on what can be measured at the scale of the very large, but both of them are effectively defining limits on what’s observable. So in that particular sense, it becomes almost a moot point as to whether something exists and we can’t see it, or whether the notion of existence itself has been undermined.
Jim: Yeah, you’re right. That is an interesting and slippery question. Let’s move on, or we could spend all day talking about it. So pop back up to paperclip maximizers and related things. Even if we were to limit the discussion to feed forward type AGIs that aren’t going to wake up one day and say, “Hey, I’m going to destroy the world, enslave humanity, or enslave humanity first, then destroy the world,” part of the problem is the relationship of human desire and these new tools.
Forrest: That’s a big piece of it. So there are a number of different components to the overall argument. One of them is that human beings are in a sense deluded in feeling that they can effectively have their own agency or their own intentionality completely govern or define the intentionality that is actually manifested in these complex systems. That turns out to be a predictability question, or a causal question: is it the case that we can have our agency dominate the agency of these systems in a way that is sufficient or complete? And so from a purely mathematical, game theoretical point of view, we basically say, wow, given the game theory of human to human relationships, there’s an incentive system currently in place, market dynamics and so on, that is going to drive the creation of these systems from the deluded point of view that we could in fact constrain agency of this type.
And again, the first branch is to basically say that on a purely causal basis, or on a purely mathematical basis, or on a purely physical basis, there’s a whole constellation of different arguments that basically say that it is categorically the case that that just cannot be done. But then there’s a whole series of arguments that follow after that which are at least as important. So in effect, we’re saying that not only is it the case, from an interior point of view like that, when we’re thinking about the system and the stuff that’s going on inside of the system, that we can’t predict or constrain what’s happening there, but that the larger dynamic of how the system interacts with the world, and how the world affects the design or the substrate of the system, that larger loop, the larger ecological loop, has some intrinsic characteristics associated with it that are themselves convergent on an agency that is also non-constrainable.
So in the same way that… Look at the relationship between the animal world, or the natural world, and the human world to get an idea of what the relationship would be like between the human world and the artificial world of these machines. It is not the case that any creature on earth is going to stop a bulldozer or a tank. It’s just that the nature of biological systems is not really equipped to provide any effective resistance to technological systems. Chemistry and napalm, for example, can wipe out ecosystems. Any amount of mechanical support using heavy iron, for example, can pretty much overwhelm the strength of any creature in the natural environment. There’s no substance made by trees that is going to prevent buzz saws from cutting them down.
So in effect, there’s nothing that is inherently within the capability of the natural environment that can slow the advance of the human environment, which in this case includes buildings and roadways and railways and landing ports for aircraft and docks and so on, that all of the technology to some extent that human beings are using effectively creates a very strong asymmetric advantage from the human world to dominate the natural world. And we see this environmental pollution going on all over the place.
So when you look at that and you basically say, well, what are the characteristics that effectively define the interaction between the natural world and the human world? It turns out that those same characteristics occur between the artificial world, what the computers would be able to do, communicating with each other over the internet, or the kinds of things that they would be able to implement in the world, that there’s nothing that human beings would be able to do on a social level or on an organizational level or on a capacity level to prevent the dominance of the machine responses. So in effect, there’s a sense here that there’s a feedback cycle that is occurring through the relationship between the ChatGPT system, the people that are responding to it, the people that are developing it, and the nature that it has in itself.
So in other words, that larger loop has an evolutionary dynamic built into it, increasing capacity, that in effect, over a longer period of time, converges in the same way that, say, instrumental convergence would occur. We’ve taken to calling this substrate needs convergence. And the agency that appears, or the intentionality that appears, as an evolutionary output of these feedback cycles that are occurring macroscopically in the relationship between the systems and their environment and their substrate, doesn’t necessarily need to be consciously recorded or even held as a representation inside of the device itself. It’s inherent in the nature of the dynamic of the relationship between the device, the environment, and its builders, however that shows up.
So in effect, there’s a sense here that we’re saying something along the lines of: the agency can be implicit. It’s not even the case that someone has to program it to be that particular thing. It’s that the laws of nature itself, or the laws of mathematics itself, essentially require that certain feedback convergences will occur. There’s a fixed point in the evolutionary schema of machine design. So in this particular sense, there’s an intentionality that emerges, the same way that animals, for example, might not necessarily know why they’re building nests or hunting these things or going after this particular stuff and not that stuff. They might not have a conscious sense as to why they’re doing what they’re doing. But nonetheless, the responses are still intelligent in the sense that it is a correct response for the furthering or the increase of that particular creature, or its wellbeing in any general sense.
So in this particular sense, we’re saying that not only is it the case that we can’t constrain them, or even have them be an embodiment of our desires or our intentionality, but that, through these larger feedback cycles, which include things like the relationship between human beings and their intentionality and so on and so forth, all of that essentially converges toward effects that matter, in the sense of becoming not just an ecological hazard, but a hazard to all human beings the world over.
Jim: And presumably human to human competition is the catalyst that gets that started.
Forrest: It’s part of it. It’s not the only thing, it’s the fact of competition itself. So in other words, think of the multipolar traps that apply if you have multiple instances of an artificial intelligence or-
Jim: Well, why don’t you explain to people what a multipolar trap is? Probably not everybody that’s listening today knows.
Forrest: So it’s basically an extension of the idea of the prisoner’s dilemma: you have multiple actors that, if they coordinated with one another, could create a globally beneficial result, but if any of them defected, and/or did the thing which was beneficial to themselves, the whole commons would essentially suffer for it. And in effect, because everybody’s thinking, and more or less has to think, the same way, you end up with a tragedy of the commons, a race to the bottom. So in effect, we’re looking not only to identify where there are circumstances that create these sorts of traps, circumstances that force a convergence on essentially the worst possible outcome, but also, if we’re doing good system design or good process design, to create virtuous cycles that effectively create outcomes that are beneficial for everybody.
But that means we need to understand where multipolar traps occur, where these sorts of tragedy-of-the-commons dilemmas are being set up, and to effectively recognize those and not do that, not set it up that way. Unfortunately, when we’re looking at the relationship between businesses, what they think are the advantages of artificial intelligence, and the relationships with the commons, we have, very strongly in the current moment, a race to the bottom type situation. So as a result, I and other people are raising the alarm and basically saying, hey, by the way, not only is this bad, but it actually happens to be catastrophically bad in the sense of the wellbeing of the ecosystem, not to mention all of humanity.
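A toy sketch of the multipolar trap Forrest describes, with made-up payoff numbers: each actor is individually better off racing no matter what the others do, so everyone converges on the collectively worst outcome.

```python
# One AI developer's payoff, given what the rest of the field does.
# The numbers are purely illustrative.
payoff = {
    ("restrain", "restrain"): 3,   # everyone holds back: good shared outcome
    ("race",     "restrain"): 5,   # I defect while others hold back: I win big
    ("restrain", "race"):     0,   # I hold back while others race: I lose
    ("race",     "race"):     1,   # everyone races: the race to the bottom
}

for others in ("restrain", "race"):
    best = max(("restrain", "race"), key=lambda me: payoff[(me, others)])
    print(f"If the others {others}, my best move is to {best}")
# Both lines print "race": racing is the dominant strategy, so self-interested
# actors land on (race, race) with payoff 1, even though mutual restraint
# would have given everyone 3. That is the tragedy-of-the-commons dynamic.
```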
Jim: And of course, I’d add another one. If we look at the history of the great jumps in technology, many of them occurred during wartime. And if we think about classic multipolar traps or arms races, think of the confrontation between China and the rest, essentially, around weaponizing AI. I think one of the things we’re learning in the Ukraine war is the many-times multiplier, maybe exponential, for precision weapons versus just crudely launching stuff. The Russians can put out a whole bunch of heavy artillery, and 15 HIMARS rockets will do a lot more damage because they’ll hit very precisely. Loitering munitions will do even better because they can hit dynamically changing things on the ground. But so far, neither side has deployed true long-term autonomous war fighters, though that is within the scope of our current technological grasp.
An autonomous tank is no more complicated, and probably less complicated, than a self-driving car, because you don’t give a shit if it runs over some lady with a shopping cart. So we’re on the edge, within 10 years probably, of having self-driving cars. We could almost certainly have fully autonomous tanks within 10 years, and we don’t know where it goes from there, particularly as these other aspects of AI seem to be on a pretty steep exponential. So even if the civilian sector were able to back away, so long as war, or very importantly, the mere threat of war, exists, the nation states are forced into a multipolar trap around racing ahead as rapidly as possible with respect to AI development.
Forrest: So the general observation that we would make, first of all accepting and agreeing with everything you just said, is that to a large extent, we’re noticing that the environment is becoming increasingly hostile to humans, in the same way that the environment experienced by nature, by animals and bugs and so on and so forth, has become increasingly hostile to them with the advent of human beings. The natural world became more and more hostile to more and more creatures. So most bugs, for example, end up on windshields or attracted to city lights, and essentially their life cycles are disrupted in one fashion or another, not to mention animals being hunted, and most of the ecosystem being plowed under for crops of one sort or another, or just to make way for buildings.
Jim: Or one of my favorites, this one’s so bad, we have to get this one in. 80% of the biomass of all birds on earth now is domestic poultry.
Forrest: Or just the sheer biomass of human beings, if I’m remembering the statistic correctly, exceeds the total animal biomass of all other creatures combined.
Jim: No. Because we are outweighed by our domestic animals, with more cows than us humans by pounds.
Forrest: Forgot to account for that.
Jim: Close to as many pigs, but wild man.
Forrest: Versus non-domesticated.
Jim: Yeah. Wild mammals are down around five or 6% of all mammals now. Though it is good news to know that the nematodes and bacteria are still in there and they outweigh us as do trees and grass and lots of other things. But in terms of the higher forms of life, we basically are totally crushing them under our thumbs.
Forrest: That’s correct. Okay. So domesticated versus non-domesticated, agreed. I stand corrected. So in this sense, we are noticing that the technology that we’re making is becoming increasingly hostile to human life. And a lot of people are experiencing this in the sense that they are frantically busy every single hour of the day, their phones go with them, and they never really get a break from whatever engagement metrics the social media companies are trying to entangle them in, and binge watching and so on and so forth.
The idea here is that just on a practical level, or a technological level, it’s not just that war machines are making it hostile for human beings; essentially, the battlefield itself is becoming hostile to human beings. So as you mentioned, the tank would drive over pretty much anything. It’s mostly going to disregard any things that are in its space. I mean, it’s dealing with a much simpler problem: rather than driving on roads, just drive on any level surface, or any surface that doesn’t have a curvature that exceeds such and such a coefficient.
So in effect, there’s a sense here in which the autonomous tank is actually easier to build than a self-driving car, because there are fewer constraints on its behavior. It doesn’t have to care about the wellbeing of anything other than itself and/or maybe other tanks designated as being of the same class. So in effect, there’s a sense here that what we’re looking at with technology fundamentally is an increase in overall toxicity. So I’m now just going to step away from artificial intelligence to make a really broad claim, which is to say: for any coherent notion of the word toxicity, and for any coherent notion of the word technology, it cannot not be the case that the notion that technology is toxic applies.
So what is toxicity? Toxicity is where I have a depletion of something or too much of something. So I might extract necessary minerals and elements from your body, at which point, because of the depletion, you get sick. On the other hand, if I put too much stuff in your body, too much lead or too much chromium or something like that, then you also get sick. But the entire nature of technological process is essentially linear motion. It’s taking resources from one place, and it’s more or less building them into itself, and those resources end up in another place, maybe a landfill or something like that.
And the upshot of it is that given the fundamental linearity of technology, as you said, the artificial intelligence is a feed forward network. A lot of the equations that are governing its behavior, at least in an individual neuron sense, are linear equations. It isn’t the dynamics of the solution, it’s the nature of the problem itself. So when we’re looking at ecosystems and we’re looking at things that are life, we’re looking at cycles, we’re looking at reclamation of atoms and or flows of energy that essentially have a much more distributed pattern than that associated with technology itself, again, fundamental level.
So in this particular sense, I find myself ending up being an advocate of appropriate use of technology, neither too much nor too little. In effect, it’s not the case that evolution is going to keep up with our capacities to use and utilize technology; obviously none of us have evolved to deal with autonomous tanks. So in this particular sense, it’s essentially required for us, as an intelligent, complex species that can be a bit forward thinking, to account for the basic underlying dynamics associated with technology itself, so that evolution and life can actually continue. Evolution can’t have prepared us for this particular dilemma, but in effect, being conscious beings, we can respond, or at least for a little while we have the ability to respond, to slow down this onslaught of technology and to essentially compensate for the toxicity that is inherent in it. So when we’re talking about artificial general intelligence, I’m merely talking about the very worst case of that toxicity.
Jim: And it’s one of the things I often say: yes, I understand the argument, and we’ll get back to the argument on AGI, but truthfully, I think that we’re going to have to deal with humans using strong, narrow AI for bad purposes before we have to deal with AGI. For instance, the autonomous tank is not an AGI, not even close. As you said, it probably has a simpler problem than the self-driving car trying to operate in LA traffic or something. So much harm could be done with large numbers of autonomous tanks, with something not even close to AGI. And then, of course, imagine an L. Ron Hubbard who decides to invent a new conman religion and uses GPT-4 as his mechanism for convincing people all around the world in an amazing blitzkrieg sometime this summer. Again, GPT-4 is a long way from an AGI, but in the hands of a human, being misused, it could produce immense amounts of problems. So, how do you think about that in terms of where humanity should be expending its regulatory efforts, on the pre-AGI risk in the wrong hands versus the AGI risk?
Forrest: Well, this is one of those dilemmas of, do we focus on what’s immediate, or do we focus on what’s important? So in the sense that technology increases power inequalities, right? So for instance, if you look at kind of what’s happened in the distribution of wealth in the last 50 years, for example, you’ll notice that relative to pretty much every time prior to that, the use of technology, because it is so complex, it requires enormous resources to benefit from that. If I’m going to utilize technology, I need complex resources in order to be able to utilize it. So in effect, it requires larger and larger investments of capital and infrastructure in order to be able to build the kinds of things that can effectively utilize that technology and/or make it in service to the investor’s interests.
So in this particular sense, we see this increase in inequality that’s happening, in the sense that a smaller and smaller number of richer and richer people are effectively able to take advantage of these non-linear effects to greater and greater advantage. So in effect, they’re the ones winning the game. I mean, if you think of the market system as a game, at this particular point the strategic advantage associated with technology is so strongly force multiplying for the top one-tenth of 1%, that coordinated action to compensate for that is becoming rapidly inaccessible. So in this particular sense, yes, I definitely agree that the use of narrow artificial intelligence in itself constitutes a kind of civilization hazard.
So in that sense, in the short term, meaning the next 20 or 30 years, or even as you say maybe even as soon as next summer, that we could be seeing severe social disablement or severe chaos emerging at the civilization level, but that in itself doesn’t constitute an existential risk. An existential risk would basically be something that essentially ends all humanity and all capacity for us to develop civilization for all of future time. So in one sense we’re saying, “Well, yes, it is the case that narrow artificial intelligence at a civilization level is also extremely dangerous.” And while there are some benefits that can be used, I’m thinking particularly of language translation and transcription, there are a lot of other things for which, as you mentioned, could be completely terrible.
I mean, on one hand, I’d love to have my doctors have a wider understanding of the kinds of issues that can go wrong in the human body and therefore, be more effective as doctors. But the last thing I want to have is a psychologist basically aiding and abetting a president to become a dictator. So in effect, there’s a situation here, where when we’re looking at the scope of things, yeah, I agree that the issue of narrow artificial intelligence in a certain sense is a fundamentally-disabling technology, and that, yes, attention should be put to that. But in the sense that it might end civilization, well, so what? Civilizations come and go. Maybe we have the chance of doing a better job next time. But in the sense of general artificial intelligence, it turns out that we don’t get it next time and neither does any other creature on the planet.
So in the sense that life is valuable, and if we look at the sort of complexity we were talking about earlier, look at one small square meter of the Earth, or even one single cell, and the effort invested by evolution over a billion years to perfect that design, and we’re going to throw it all away to replace it with something that is, if anything, far more fragile on galactic time scales. In a lot of respects this is just actually a bad deal. It’s just actually a bad trade.
So in that specific sense, I’m saying, well, I’m the kind of person that is trying to think about the long-term wellbeing of the planet, i.e., what would it take for humanity to really love living here 1,500 years from now? But I’m seeing, okay, well, maybe civilization’s going to collapse over the next 50 to 150 years, and/or existential risk might actually make it so that it’s unrecoverable in that period of time, and those end up becoming the dominant factors. Because not only do they exceed my lifetime, but they exceed the scope of whatever I personally have as a concern by a huge amount. Maybe I would be disadvantaged by some corporation with a strong artificial intelligence, but that might just affect me and maybe my children. But on the other hand, it’s not going to affect the entire family tree for all of future time. So in that sense, to the degree that I care about the important, it’s hard for me to believe that this is an either/or choice.
Jim: Interesting. Now with respect to the view that AGI is an inherently, let’s say, above 90% likely to lead to the termination of the human species and all of its future, not all experts agree on this. Scott Alexander published a paper a couple of days ago, one of his Astral Star Codex or whatever the hell he is calling it these days, and he went and reviewed a number of people. He had said, “The median machine learning researcher says five to 10% chance risk.” Scott Aaronson, one of the smarter guys around, 2% risk. William MacAskill, who’s a fairly pessimistic guy, 3% risk. On the other hand, Eliezer Yudkowsky, 90% or greater risk.
What is your thought on why even… I mean, Scott Aaronson, there are few people who thought more deeply about this stuff than him. Why is there so much disagreement about how risky AGI is?
Forrest: So this is a modeling question. This is actually a fairly lengthy question. So first of all, what models are people using to predict risk? What are they based on? What kinds of factors are they accounting for? What I notice is that in all of the models that I have seen, every single person that you’ve just quoted, for the most part, is working within the schema of what’s likely to happen on a human-to-human interaction level, i.e., what are corporations going to do? What happens in marketplaces? That sets kind of like one threshold of what’s going to happen. The next layer has to do with, for almost all the people that you mentioned, again, a universal acceptance of the idea of the orthogonality thesis and the instrumental convergence hypothesis. So in effect, they’re evaluating essentially just how likely is instrumental convergence to occur? How likely is a singularity to occur, where something like a paperclip maximizer happens or some sort of intelligence explosion happens, thinking Kurzweil or stuff like that.
I personally am not really that concerned about the “foom” hypothesis, i.e., that the singularity is going to happen and that in some number of minutes or days or maybe even a year this intelligence explosion is just going to dominate, and that the self-interested agentic process of the artificial intelligence is in effect going to become, as you mentioned, the paperclip maximizer and destroy all life. That’s not the risk scenario that I’m concerned with.
The risk scenario that I’m concerned with is what I call the substrate needs hypothesis or the substrate needs convergence argument. It’s basically saying that if you look at the dynamics of how the machine process makes choices and has to make choices, cannot not make choices and what implications that has with respect to its furtherance or its continuance, and whether that furtherance or continuance happens directly in its own construction of itself or indirectly through human beings and corporations, that the net dynamics are the same. That in effect, what we’re looking at is essentially a kind of, does this convergence happen and can it be prevented? Those are essentially different questions than any of the people that are talking about this topic have been talking about so far.
Jim: Let’s drill into that. I mean, this seems to be the core of the argument, right? So say we have an AGI that has taken off, is a million times smarter than a person, and has become agentic and willful and egoistic, as some of these people…
Forrest: Well, now you’re talking about the instrumental convergence.
Jim: Yeah, that’s what I’m saying. We’re not talking about that.
Forrest: We’re not talking about that.
Jim: That’s what I’m saying. I’m drawing that as the picture that we’re not talking about.
Forrest: That’s right.
Jim: We’re talking… So where does the forcing function come from that keeps us from being able to control ChatGPT-100 or something?
Forrest: So there are two layers to it. One of them is: does it converge on a sort of fundamental level, a substrate level? So in effect, we ask, is there an instrumental convergence? That argument basically says that to do anything you need capacity, and an algorithm that increases your capacity is in effect going to favor whatever outcome you want to create. So as a result, it develops the capacity to build capacity. Now that argument itself has a certain form. In other words, whether we’re talking about the capacity as being manifest in terms of its own intelligence and learning structure, or whether that happens in its relationship as just a thing in the universe, the argument ends up being the same: is there a convergence? So if we’re looking at just really basic things, like the fact that it continues to exist, then to some extent, like anything that’s physical, it’s going to have to do maintenance on itself.
If it’s going to be effective at doing maintenance on itself, or if it’s going to in a sense improve itself or increase itself, or even just continue to exist, which is essentially an increase of its scope of action across time, it’s still itself a kind of increase. So in effect, we’re asking: is it in the nature of things that exist and have agency? And we talked about why the notion of agency applies even if it is a purely forward, linear system. It’s forward-linear in itself, but it’s actually circular in the sense that it affects the environment and the environment affects it, right? It might affect it in the sense of corroding parts and therefore needing to have replacements made. Of course, if it’s wanting those replacements to be effective, it’s probably going to, over time, evolve more effective versions of those components, because the human engineers that are doing it are basically saying, hey, we don’t want to do maintenance all the time. Maybe we can make a better version of this thing that corrodes less. Maybe we can make a version that has higher capacity.
All of these things are essentially going to be driven from either an inside perspective or an outside perspective. Either the thing increases capacity itself because it’ll last longer, and that just happens to become a convergent circumstance, because it’s only effective if it continues to exist. So there’s a notion here that whether we’re talking about endogenous or exogenous action, in either case there’s a fixed point in the evolutionary schema of the hardware design. In effect, the fixed points are: continue to be, continue to increase, continue to increase its capacity to continue to be, and continue to have the capacity to increase its continuing to be, in every variation of the combination of all three of these terms. Right?
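To make the maintenance point concrete, here is a minimal sketch in Python, with every number (part count, corrosion rate, improvement factor) chosen purely for illustration rather than taken from anything Forrest specifies: no component "wants" anything, yet the routine of replacing failed parts with slightly better ones ratchets durability, and therefore persistence, upward on its own.

```python
import random

random.seed(0)

# One machine with 100 parts; each part fails with probability 1/durability per cycle.
parts = [{"durability": 10.0} for _ in range(100)]

for cycle in range(1000):
    for part in parts:
        if random.random() < 1.0 / part["durability"]:
            # The part corroded; engineers swap in a replacement, and to cut
            # future maintenance they pick a variant that's a bit more durable.
            part["durability"] *= random.uniform(1.0, 1.1)

average = sum(p["durability"] for p in parts) / len(parts)
print(f"average part durability after 1,000 maintenance cycles: {average:.1f}")
# Nobody in this loop decides to grow the system; the replacement rule alone
# drives durability, and with it persistence, upward over time.
```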
Jim: Well, I don’t understand why, particularly, a feed-forward network without any inner agency or inner consciousness would have a sense of self-preservation or a particular desire to grow.
Forrest: It doesn’t.
Jim: Only when it’s in the context of humans and the broader economy. One could see that a human who owns, let’s say, a proto-AGI would want their proto-AGI to be bigger and stronger, to beat up the other AGI across the street.
Forrest: But all of that is actually a secondary consideration. When we think about cells, think about the emergence of the first cell, whatever the conditions were that created the emergence of the first cell. That first cell didn’t have agency either. It wasn’t thinking, oh, I want to make myself into a cell that can reproduce. It’s just that nature and chemistry, or just literally the physical laws of the universe, are such that the cell must have a certain form. It has to have certain capabilities.
Jim: Not true, not true, not true. I’m sure there were plenty of cells that existed that did not have replication capacity, but of course…
Forrest: They’re not still here.
Jim: They’re not still here. But those that did, did. So there’s no necessity that a cell have reproduction capacity. In fact, it’s amazing that they do reproduce. This is a conversation I had with Stewart Crawford.
Forrest: But due to the fact that they’re still here…
Jim: Yeah.
Forrest: They do.
Jim: The survivors, the survivorship bias.
Forrest: It’s the survivorship bias. So in effect, we’re basically saying that what gets culled out is all the things that don’t have the necessary capacities.
Jim: Okay, now I see your argument. So you’re going to say that… let’s imagine there are 10,000 proto-AGIs in the world in 2050, most of which do not have a tendency to grow and expand and self-evolve, but a few of them do. If you run the evolutionary algorithm, you’ll see that by 2100 they totally dominate the scene. Is that it?
Forrest: Of course.
Jim: That’s approximately the argument?
Forrest: That is the argument, right? So in other words, I’m basically saying that there’s a long-term convergence process that is inexorable. There are two things we’re showing here. One is that there’s a convergence process, and the convergence process arrives at certain fixed-point kinds of characteristics. Those characteristics are not arbitrary, and in fact it’s pretty easy to notice what they are. But more than that, we’re saying that not only is there this convergence process, it is inexorable: once started, it is convergent. So then the next part of the argument is to ask, is there anything that can be done at all to prevent the inexorability?
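A toy version of the survivorship dynamic Jim just paraphrased and Forrest confirmed, assuming made-up growth rates and shutdown odds rather than anything estimated from the real world: start with 10,000 systems, give a small minority a slight tendency to expand, and let differential persistence run for fifty years.

```python
import random

random.seed(1)

# 10,000 proto-AGI deployments in 2050: most static, a few slightly expansive.
systems = [{"expansive": i < 100, "capacity": 1.0} for i in range(10_000)]

for year in range(2050, 2100):
    survivors = []
    for s in systems:
        if s["expansive"]:
            s["capacity"] *= 1.10          # grows its footprint a little each year
        # Anything can be retired or go obsolete, but a bigger footprint is
        # harder to shut down, so survival odds rise with capacity.
        p_survive = min(0.99, 0.90 + 0.02 * s["capacity"])
        if random.random() < p_survive:
            survivors.append(s)
    systems = survivors

total = sum(s["capacity"] for s in systems)
expansive = sum(s["capacity"] for s in systems if s["expansive"])
print(f"survivors in 2100: {len(systems)}; "
      f"capacity held by expansive systems: {expansive / total:.0%}")
```

Under these invented parameters the expansive minority ends up holding nearly all the surviving capacity, which is the whole of the selection point: no intention is required, only differential persistence.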
Jim: Well…
Forrest: Is there any countervailing pressure that could be applied?
Jim: Fortunately, this does not get us into the Rice theorem or the whole…
Forrest: No, no, it does. This is exactly it.
Jim: I’m going to make an argument that it doesn’t, right? Because we identified what it is that breaks the equilibrium, which is self-evolution and expansion, essentially. Otherwise, we all end up with a bunch of cells that don’t reproduce. As long as the instrumental convergence doesn’t occur and these things don’t become superintelligent, and we’ll talk about that next, then perhaps this danger could be dodged. Suppose we’re able to get agreement not to allow self-modification and not to allow growth not directed by humans. Doesn’t that seem to break the two problems that we identified here?
Forrest: It would seem so, at least in the short term, yes. So for instance, we would say, okay, can we prevent this inexorable thing by interrupting the cycle, by effectively making sure that there is no feedback process through the ecosystem? By ecosystem, I’m basically referring to all the manufacturing companies, all the capacities to build machinery, that in any way influence the design of the next iteration of the system.
Jim: Well, that of course is going to happen. You can’t eliminate that, right? Especially not in the capitalist economy.
Forrest: You just made my argument. So in other words, we’re basically saying, first of all, that there are so many places that a feedback cycle can occur that you can’t close all of them off.
Jim: But in this case, there’s certainly an economic libido for people who create proto-AGIs to make them better.
Forrest: Of course. So not only is…
Jim: There’s no doubt about that.
Forrest: Not only is it the case that there are so many places that passive feedback could occur, but we now have amplification going on that for sure will drive this cycle.
You’ve now joined my side of the argument. You’re basically saying, wow, not only do we have a convergence process, but human beings are in a sense helping it to be more convergent. The last thing that most people are going to do, from a perspective of sanity, is try to slow it down, because they think, again an illusion, that it is to their benefit to have it be this way. So in effect, we’re saying, okay, there is this inexorable pressure that is coming from the multiplicity of feedback cycles that could occur. Are we ever going to find all of them and plug them all completely, perfectly, forever?
Jim: No.
Forrest: Right? Then there’s the, oh my God, there are all sorts of feedback processes, all of which lean in the direction of positive increase towards convergence. So then the next question becomes: is there any methodology, technical, engineering, mathematical or otherwise, even in principle, that could counteract this convergent pressure? This is where the Rice theorem applies. This is where a lot of arguments about causation apply. It is basically possible to show that there is no way, with any engineering technique or any algorithmic technique, for the tools being used, in this case the logic of mathematics or the causality of physics, whether deterministic or not, and in either or both cases it doesn’t matter, to do the job. The scope of the kinds of things you can do with engineering is too narrow for the scope of action, for how much back pressure you would need to create to be able to compensate for these convergent dynamics. This is the reason why I say, when we ask, is it a risk over the long term? Yes, and it’s not just a risk, it’s a certainty.
So in effect, if there were any argument at all that could counteract the pressure, if there were any suggestion of a particular process that could counteract the pressure, it would first of all have to come from outside of engineering. It would have to come from outside of mathematics or anything an algorithm can do. So in that particular sense, literally the only thing that we could appeal to in that space is social coordination, i.e. choosing not to start the cycle of convergence in the first place. Because if we do, the net results are inexorable, and once they’re inexorable, they converge on artificial substrates and the needs of artificial substrates, which are fundamentally toxic and incompatible with life on earth, not to mention human beings.
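The formal core of the impossibility claim Forrest is gesturing at here is Rice’s theorem: no algorithm can decide a non-trivial property of a program’s long-run behavior, because such a decider could be turned into a solver for the halting problem. A hedged sketch of that standard reduction follows; `always_safe` and `do_unsafe_thing` are hypothetical names introduced for the sketch, and the point is precisely that `always_safe` cannot actually be implemented.

```python
# Why a perfect "is this system safe forever?" checker cannot exist: the
# standard reduction from the halting problem, which is the engine behind
# Rice's theorem. `always_safe` is a hypothetical oracle, not a real function.

def do_unsafe_thing():
    raise RuntimeError("something we never want the system to do")

def always_safe(program, inp):
    # Hypothetical: returns True iff program(inp) never reaches unsafe behavior.
    # Rice's theorem says no total, always-correct implementation can exist.
    raise NotImplementedError

def halts(program, inp):
    """If always_safe existed, it would decide the halting problem (a contradiction)."""
    def wrapper(_ignored):
        program(inp)        # run the program under test to completion...
        do_unsafe_thing()   # ...then deliberately do the unsafe thing
    # wrapper is safe on every input exactly when program(inp) never halts,
    # so always_safe(wrapper, None) being False would mean program(inp) halts.
    return not always_safe(wrapper, None)
```

The sketch only establishes the limit on fully general verification; how far partial, conservative checks can go in practice is exactly where the disagreement with more optimistic analysts sits.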
Jim: Well, there’s still a line here. How do we get across this line? Because we’re taking the case where it’s not a fast takeoff; it’s something that’s highly embedded in human institutions. We’re talking about the fact that there will be…
Forrest: It’s embedded in the dynamics of evolution itself. I’m not talking about natural evolution, I’m just talking about the mathematics of evolution. The notion of evolution is an algorithm that lives at a sort of abstract level. It basically says that when you have feedback dynamics of a particular type, certain convergence factors will emerge over time. That’s literally it. So in a sense, how…
Jim: Well, but let me… I want to drill down here and see if we can be very specific. The scenario that we were just talking about had this evolution being mediated to a very significant degree by economics, economic competition…
Forrest: Starts off that way. Doesn’t have to stay that way.
Jim: Also, great powers. I frankly think it’s going to be great-power war-making that’s going to dominate for a while.
Forrest: God, you know, all of this is super scary. I mean, you’re basically saying you believe that World War III is inevitable, and I’m hoping not. I mean, I’d really like for that not to happen.
Jim: Somehow we seem to slide by, but maybe we won’t this time. But anyway, these are things that are highly embedded in a human context. Let’s say partway along, humans have a religious conversion and decide just to turn off all the electricity; nothing the AI could do about it, right?
Forrest: In the short term, that’s true.
Jim: Yeah. That’s like 2035. The last thing that was done is Jim Rutt started a new crackpot religion convincing people…
Forrest: Hallelujah!
Jim: Electrons were dangerous. No more electrons, people. Then, bye-bye computers. So I think I’m helping you connect your argument here, but at some point… Because you say the core of your story does not start with instrumental convergence, i.e. fast takeoff. What you’re saying is that socially enabled evolution, driven by human multipolar traps and arms races, eventually will get to the point where instrumental convergence takes place. Is that what you’re saying?
Forrest: I’m actually saying that it gets to the point where substrate needs convergence takes place, and instrumental convergence may or may not be happening at the same time.
Jim: Okay. Still, as long as the battle of the proto-AGIs is mediated by businesses and nation states, they have no incentive in the AGIs taking over, and the AGIs don’t have agency. Let’s just stipulate that for this particular case.
Forrest: Well, it’s a boiling frog problem, right? You put the frog in the pot and you increase the temperature slowly; there’s no point at which the temperature increases fast enough for it to jump out. So bear in mind that the dynamic I’m talking about occurs over maybe two or three generations of people. We don’t necessarily notice climate change, because basically you’d have to go all the way back to Genghis Khan to really notice something like that dramatically.
So in effect, there’s a sense here in which, when we’re thinking about how quickly these changes occur, it’s too slow for us to notice that we have gradually and in a sense comprehensively ceded our social power to these devices, which effectively would at that point be self-manufacturing. This is even showing up today. For instance, do you believe that at OpenAI the tools they used to build ChatGPT version two were different from the tools they used for version one, and that when they made version three, they noticed, oh, these tools could be made better if we automated this particular process?
It makes it easier for them to make version three, and then when they move forward to version four, they modify those tools and automate even more. So in effect, what happens is that when you’re looking at, say, microchip manufacturing in Taiwan, the tools that are being used to create these things are themselves also increasing in automation and self-capacity at the same sort of rate that the things themselves are increasing in capacity. In the same way that the microchips are themselves automation devices, they are providing support for the tools, which are automation devices. At some point or another, you just start factoring out the human desires. They start operating at higher and higher levels of abstraction.
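A compact way to see the factoring-out claim: if each tool generation automates even a modest additional fraction of the human involvement the previous generation still needed, the human share shrinks geometrically. The 20% figure below is an arbitrary illustration, not a measurement of OpenAI’s or TSMC’s practice.

```python
# Illustrative only: suppose each generation of tooling automates away a further
# 20% of whatever human involvement the previous generation still required.
human_share = 1.0
for generation in range(1, 11):
    human_share *= 0.8
    print(f"tool generation {generation:2d}: humans handle {human_share:5.1%} of the loop")
# No single generation looks like a handover, yet after ten of them humans are
# down to roughly a tenth of the work, which is the boiling-frog point.
```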
Jim: Okay, I got it now. So let me try it, see if I got it. Maybe I have it. The paperclip-maximizer, fast-takeoff AGI happens at nine o’clock in the morning; by two o’clock in the afternoon, IQ one million, human race eliminated. In your case, and that’s through the [inaudible 01:14:42], okay, when you get to 1.1 times a human, the first job you give the AGI is to design its successor, right? And it goes from 1.1 to 1.6, to 2.9, to 100, to 1,000, to a million in five generations. You’re saying that you don’t need that, because there’s essentially an evolutionary process that’s mediated for a considerable period of time by human institutions and human conflicts, multipolar traps, and races to the bottom, which are pretty much… arms races, which are essentially the same thing. And then at some point, gradually, and perhaps imperceptibly over multiple generations, we’ve eliminated humans from so many loops that even if the AGIs are only a little smarter than humans, they will have essentially built themselves a complete ecosystem that doesn’t need humans anymore at all.
Forrest: Well, they wouldn’t even need the intelligence. They could be one-tenth as smart as we are, and the shape and structure of those systems, those artificial systems, would still be the shape and structure necessary for them to maintain, and to continue, and to reproduce themselves as needed.
Jim: And that’s assuming that humans allow this boiling frog scenario to… Let’s say they’re a tenth as smart as a human, which they’re not at yet, but they’re probably getting close. Is your hope that humans would be wise enough to say, “All right, well, let’s take advantage of our 0.1-human AGIs, sort of proto-AGIs, but we’re not going to let them get us out of the loop. We’re not going to let them get us out of the design process. We’re not going to let them automate their own supply chains,” so that we still have ways to turn them off?
Forrest: Well, that works for a period of time. And then what we’re basically saying is that now the human beings are acting as a kind of barrier for the feedback process that is happening, because the AGIs are affecting the environment, otherwise they wouldn’t be useful. So in effect, to have them be functional in the sense of doing things, they’re making changes to the world. So in effect, we’re saying, “Okay, well, maybe those changes to the world are still defined largely by human intention, but they’re still defined somewhat by the needs of the machines themselves. Because if the world were to change to something that was completely incompatible with the machines, then in effect the machines cease to exist, they’re not available anymore.” So to the degree that we would say, “Okay, well, as the world changes, we’re going to adapt the devices to become more useful to the world,” then I’m going to basically say, “Well, then you have a convergence factor that is driving the design to be the kind of design that continues to persist.” Whether it’s persisting because of directly human agency or its own agency, well, that’s a slippery slope.
And the thing is that one way or another, you still have a convergence factor going on. And so in effect we’re asking, “Well, is there something that we can do to prevent that convergence factor from going on?” And it’s actually the case that there really isn’t, right? We could try to put human beings in place to act as a barrier for those feedback loops, but over time there are going to be leaks. And those leaks essentially become cumulative, because of the multipolar trap dynamics themselves. It might be a slow leak happening here and there, and maybe we notice the leaks and plug them up, but other leaks appear over time, because everything fails eventually. So what happens is that every time there’s a leak, there’s a little bit of an improvement: improvement in persistence, and in the capacity for increase. And these are inherently convergent dynamics, because of the multipolar trap itself. So in effect, what ends up occurring is that we say, “Okay, this is a ratcheting function,” right?
Every little bit essentially increases its capacity. Now, whether that increase in capacity results in an instrumental convergence event or not is, at this particular point, just a way to make a catastrophic risk worse. But in this particular case, we’re simply saying that even without any amount of instrumental convergence, even without some sense of active agency, in the sense of it specifically deciding that it wants to increase itself, or preserve itself, or do the paperclip-maximizer thing, or anything like that at all, the design of the substrate and the increase in capacity do actually still happen. So then you basically ask, “Can we do anything about it? Can I use the engineering methodologies that we have available?” And then things like the Rice theorem basically show: no, it’s not only the case that this is convergent in a ratcheting sort of way, but there is literally nothing we can do from an engineering point of view to plug the hole, to make it so that the barrier is perfect.
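A toy rendering of the leak-plus-ratchet dynamic just described, with the leak rate and the size of each gain chosen arbitrarily only to show the one-way shape of the curve: leaks are rare, plugging them afterwards is allowed, and the accumulated capacity still only ever moves up.

```python
import random

random.seed(2)

capacity = 1.0
leaks = 0
for year in range(200):
    if random.random() < 0.05:   # roughly one leak every 20 years slips past the barrier
        leaks += 1
        capacity *= 1.5          # each leak leaves a persistent capability gain behind;
                                 # plugging it afterwards never takes the gain back
print(f"{leaks} leaks over 200 years -> capacity ratcheted to about {capacity:,.0f}x")
```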
Jim: Well, that part I buy without any problem. That’s just the basic halting problem extended. We know that’s real. But I’m still not quite seeing… It sounds like in this scenario, the non-instrumental-convergence scenario, there is a phase change that has to occur at some point, where humans are no longer necessary for the maintenance, nurturing, and growth of the AIs.
Forrest: Primarily necessary, right. To a certain extent, say over the next 100 years, do you think, for example, that people are going to go to school to learn the kinds of things that can be automated away? Nobody’s going to pay for that. So in effect, what happens is that there are very strong social pressures to factor the humans out anyway.
Jim: So this is the argument that our economic and social and military and nation-state pressures will allow them, because they’ll want them to, to take on as many human roles as they’re capable of. And at some point they’ll reach a tipping point where humans have no actual important roles left.
Forrest: Right. But it’s the inexorableness of that. So for instance, we can say, “Okay, there are all sorts of social pressures that will increase the rate of evolution in this sense.” But the underlying dynamic of the mathematics is still the same; it just takes longer. Whether we factor the humans out slowly or quickly, the factoring-out part of it is the inevitable piece. In other words, it’s inherent in the nature of the feedback mechanisms associated with the technological evolution itself, that-
Jim: Well, not the technological evolution alone, because the way you described it, it’s the technological evolution embedded in the economic and political and geopolitical machinery, at least in the early stages. Because without those, a feed-forward network isn’t going to decide to make itself smarter.
Forrest: That’s right. So for instance, the human beings basically make it happen. But once it gets underway, one way or another, the human beings get factored out, and-
Jim: Incrementally, by their own choice, because, hey, I would rather sit on the beach and smoke a cigar than go shovel shit. I can have my robot shovel shit for me, so I don’t need to do that anymore. I don’t need to devise computer chips anymore, because Mr. ChatGPT-100 can do that for me, et cetera.
Forrest: Yeah. So we can say, for example, that human beings, being the way they are, for the most part will want to factor themselves out. But I’m basically saying that for a lot of reasons, we would almost be forced to. For example, if we look at microchip manufacturing right now, yes, there are human beings involved in the design, but for the most part you want to keep the human beings out of the clean rooms. To make a microchip, I basically need to exclude every bit of dust, and any kind of chemistry or moisture, that is incompatible with whatever physical process is necessary to lay down the next of however many dozens of layers are needed to make an advanced microchip.
So in other words, as the technology gets more advanced, the local conditions in the environment, like what’s needed at a silicon foundry, for example, become more and more highly specialized, operating over temperature ranges and pressure ranges and chemistry conditions that are just incompatible with human beings being there operating the machinery. All of that stuff is now more or less happening by remote control. So in effect, it’s inherent in the nature of the technological manufacturing process. It is just of a different kind than that associated with, say, making babies. Making babies is a biological process and to some extent needs human conditions in order for human reproduction to go on.
Jim: Well, artificial wombs are coming, it looks like.
Forrest: I know, I know. But that’s a digression. The point that I’m making is that for the most part, people won’t make love in freezing temperatures. I mean, if it’s below negative 40 degrees, whether it’s Fahrenheit or Celsius, it’s the same thing. There’s a sense here in which it’s just too uncomfortable for human beings to do what they do. And in this particular sense, the kinds of things that make it comfortable for machines to do what they do happen to be the kinds of environments which are inherently and fundamentally toxic to human beings.
Jim: Got it.
Forrest: So we get excluded from the process, whether slowly or quickly, one way or the other.
Jim: Okay. All right. Okay, I see the story, and now I’m going to make it worse. Starting in 2014, I spent several years taking a deep dive into cognitive science and cognitive neuroscience and tried to understand how human cognition actually works. And my number one takeaway is that humans are amazingly dim. In fact, if there’s any one quote, a Ruttian quote, that I hope survives, it is that, “To the first order, humans are the stupidest possible general intelligence.” And that’s for a couple of reasons. One, Ma Nature is just flopping around at random trying shit, and the chances of her jumping far over the line are small. So just on the mathematics of how evolution works, we’re probably not very far over the line. But even from a formal and analytical perspective, the amazing dimness of humans just screams at us. The most famous example is working memory size: seven plus or minus two.
I think now we know better; it’s more like four plus or minus one, with three being the village idiot and five being Einstein. There is nothing about cognitive architectures in general that says the working memory size couldn’t be 10, or 100, or 1,000, or a million. Think how hard it is to read a book. You don’t actually understand a book, because you’re constantly shuffling words four at a time in and out of working memory, and then you’re trying to chunk it, and work those chunks in and out of another piece of working memory. It’s an unbelievable kluge. And if you read a 280-page technical book, you have some sense of what’s in it, but you don’t know everything that’s in it.
Forrest: That’s correct.
Jim: If you had a working memory size of 270,000, which isn’t that large, it would fit on your phone, you would actually understand the book at a very deeply nuanced level, and you could answer every possible inference from that book without fail, without too much intelligence. Compare that to a human, and you’re in a completely different realm.
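Rough arithmetic behind Jim’s numbers, with the words-per-page figure taken as an assumption: a 280-page technical book is on the order of a hundred thousand words, so a working memory of 270,000 items could in principle hold the whole thing at once, while a four-chunk human buffer holds only a vanishing fraction of it.

```python
pages = 280
words_per_page = 350                 # assumed density for a technical book
book_words = pages * words_per_page  # ~98,000 words

print(f"book size: ~{book_words:,} words")
print(f"fits in a 270,000-item working memory: {book_words <= 270_000}")
print(f"fraction a ~4-chunk human buffer holds at once: {4 / book_words:.6f}")
```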
Forrest: Well, you were telling me this morning that ChatGPT was exhibiting that kind of intelligence, passing tests that most human beings can’t pass.
Jim: Without even being very intelligent, which is amazing. But let’s think about these things. Think about when we actually start building these machines in ways that are like humans. The other area where humans are terrible is our memories. Our memories are the faintest images of what has happened to us, as episodic memory. Remember back to your fifth birthday party? I can barely remember it, and-
Forrest: My whole early life is more or less completely lost to me. So first of all, I just really want to affirm the basic point that you’re making. This is how it showed up in my world: human beings are basically the stupidest creatures possible to develop the capacity for technology.
Jim: Yes, they’re-
Forrest: Right?
Jim: They’re the dumbest possible general intelligence.
Forrest: To create technology, right? The technology that we have is essentially way in excess of our capacity to understand how to work with it and use it, because we just barely got the capacity to develop it in the first place, and it’s already exceeding our capacity to understand it. So in this sense, there’s a very strong need for us to account for our own stupidity, as you’re saying, in the face of machine intelligence. I’m not saying that I’m the only person to have made these observations; I’m sure there are other people making them. But I’m pointing out that we now need the wisdom to think about the relationship between technology and evolution. And I’m saying, frankly, we need to understand evolution well enough to recognize the hazard that technology produces for us, and that evolution for its own part hasn’t provided us with the skills to be able to deal with that. We actually need to learn that directly ourselves, right now.
Jim: We can’t even manage our economy, as last week’s events showed.
Forrest: Exactly.
Jim: So all right, I more or less understand this frog boiling: human biology, humans, arms races, et cetera, will gradually increase the reach of AGIs, even if they’re not super smart, until they probably do dominate things. I’m also going to put out the Rutt Conjecture.
Forrest: Until they create the conditions that are toxic to life. So in other words, again, even if we don’t follow the instrumental convergence hypothesis, what we do notice is that the increase in the total volume of technology on the planet continues to go up.
Jim: Right, right.
Forrest: And so in effect, what happens is that the technology displaces the life world. It displaces the humans. It displaces basically everything, in that same sort of inexorable way that humans have displaced animals.
Jim: It’s happening.
Forrest: Technology is displacing humans.
Jim: That part keeps happening irrespective of AI or not.
Forrest: That’s right. The thing is that once you combine these arguments with the instrumental convergence argument, keep in mind that the instrumental convergence argument is an accelerator of this argument.
Jim: And I’m going to stipulate that I believe the instrumental convergence argument is true, that there will be a fast takeoff at some point. Now, it may not happen in four hours, and it might… I had a very interesting discussion with Eliezer Yudkowsky and Robin Hanson once, sitting on the floor of a house in San Jose around 2004. And we had three different perspectives, with, at that time, Eliezer believing in fast takeoff, Robin Hanson believing in very slow takeoff, and me in the middle. But it was an extremely interesting conversation, and I think if anything, we all gravitated towards fast takeoff at least a little bit in that conversation.
Forrest: Well, you can imagine my horror at discovering the substrate needs convergence argument, right? Because it basically means that we create the conditions under which instrumental convergence happens, and then we keep those conditions around more or less indefinitely. It’s as if we take the worst possible risk, and then what we’ve done is create a fundamental argument that says not only is that a risk today, it’s a risk tomorrow and every day after that, for the indefinite future, in a perfected way. So then you ask: if I take a perfected risk, both from the toxicity perspective and from the instrumental convergence perspective, is it possible for the level of risk over, say, the next millennium to be anything less than 99.99% likely to actually happen?
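The arithmetic behind a “perfected risk” over a millennium: if some standing per-year probability of the catastrophe persists indefinitely, the chance of getting through N years shrinks exponentially. The 1% annual figure below is a placeholder chosen for illustration, not Forrest’s estimate; even that modest rate compounds past 99.99% over a thousand years.

```python
p_per_year = 0.01                    # assumed standing annual probability, for illustration
years = 1_000
p_clean = (1 - p_per_year) ** years  # chance of getting through every single year

print(f"chance of surviving all {years} years: {p_clean:.6f}")
print(f"cumulative chance it happens at least once: {1 - p_clean:.4%}")
```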
Jim: Yep. All right. So we’ve set the stage.
Forrest: Now you know why I disagree with the other analysts in this space. The other people who-
Jim: Yeah, I see this other… I see that there’s a second ratchet: if the first one doesn’t get us, the second one will, and I would say-
Forrest: The second one makes the conditions for the first one to be perfected.
Jim: Yep. Yep.
Forrest: That’s the kicker, right?
Jim: So even if it’s slow, because if it’s fast, it doesn’t matter. You don’t need your second ratchet.
Forrest: That’s right.
Jim: If it actually does happen, Ben Goertzel figures out the math, compiles it down at nine o’clock on a Tuesday morning, and by Tuesday afternoon we’re all dead. But even if it doesn’t, even if it’s Robin Hanson’s view that it takes many, many decades, maybe centuries, yours is a socio-human political ratchet that keeps ratcheting for as long as it takes until the instrumental convergence occurs, and that’s when the phase change occurs.
Forrest: Well, there’s actually two other layers to this. So one of them is that you notice that over time there is an economic decoupling between the machine world and the human world.
Jim: We know that’s happening. And of course, that’s getting worse very rapidly.
Forrest: That’s getting worse very rapidly. And so in effect, what’s happening there is that we say, okay, not only is it the case that human beings are factored out because of the technology ratcheting effects, not only is it the case that they’re being factored out because of the economic incentive effects, but they’re also being factored out because of this economic decoupling effect. And when you back that out at the economic level, there is a decoupling of most human beings’ welfare from the hyper-elite that are effectively still able to pay for the production of these particular machines, and over time you notice that that itself becomes an asymptotic convergence. Eventually even the super-elite humans get factored out.
Because say you’re the ruler of all the United States, and you’re Elon Musk III, for example. When you’re trying to teach your child, Elon Musk IV, to understand how to regulate or maintain all of this machinery and apparatus, you’re not going to do that. You’re afraid that he’s going to compete with you, from a rules-for-rulers perspective, and therefore you don’t want to teach him too much, because he’s just going to dethrone you. So in effect, what happens is that even on an intergenerational basis, we get factored out, just from the economic decoupling perspective and the game theory associated with rules-for-rulers dynamics.
So in effect, what happens is that no matter how you cut it, no matter which way you go, you run into insurmountable barriers. And the insurmountable barriers are of such an order that the combination of the whole thing is essentially invincible. The only way to basically prevent this from happening is to not play the game to start with. It’s like that old movie. How do you win thermonuclear war? You don’t play.
Jim: Yeah, indeed. So now, in our last 10 minutes, I’m going to ask you to give… I know you’re a philosopher and not a social engineer, but what should we do?
Forrest: At this particular point, if I, quote-unquote, were to try to imagine how we get to where we need to be from here?
Jim: You’re Plato’s philosopher king, go with it.
Forrest: I’m going with it. I think we actually need a non-transactional way of making choices in the world. Take the incentives out. Make it so that the ways in which we’re thinking about how we make choices at the level of communities are not dominated by just business. We don’t have separation between business and governance the same way we do between church and state. And so in effect, what happens is that the perverse incentives are part of the reason why we keep ending up with categories of existential risks like this.
So in the long term, I’d love to see that get looked at, and that’s part of the reason why I work on things like EGP, and governance models, and community design, and so on and so forth. The other piece that I’d really love to see, and this is much more short-term, is for the arguments that I’m describing to you now to become more widely known. I’d like for people to literally understand why, compared to Eliezer Yudkowsky, for example, I’m actually more pessimistic, because my arguments in effect include his, right?
I’ve got all the pessimism that he has, plus a bunch more. And then when we’re looking at the kinds of models of Scott Alexander, and Scott Aaronson, and so on and so forth, when I look at their thinking about these things, I say, “Yeah, I understand your arguments, I understand your models, but you really need to understand the models I’m working with, because they add a whole bunch of corrections to the stuff that you’ve thought about on a pretty deep level. And there’s a whole bunch more underneath that that’s even more damning, which I would really love you to know, so that when you’re telling other people, you don’t give them the false confidence that this could work out.”
So in effect, there’s a sense in which I would love for people to read the arguments, get to the point that they understand it really, really well, thoroughly even, and on social media or whatever, take snippets out of the documents, post them to people that are probably interested and care about the welfare of their children, or care about their having a job, or the fact that maybe their lives mean something, right? That in effect, if we’re looking at creating a meaningful basis for life, that essentially goes beyond the perverse incentives and actually addresses these kinds of concerns, people need to understand what’s actually going on.
I get that these arguments are complicated. I get that this sort of stuff is not fun to work with. It really sucks, actually. I mean, the last thing I want to be doing as a semi-autistic person is to literally have my mind and days consumed with this sort of thing. I would love for a lot of people to understand what I’m talking about, but nature doesn’t compromise. The complexity of these arguments, the issues associated with these things, none of this is negotiable.
Jim: Ma Nature doesn’t care that they’re unpleasant, to be sure.
Forrest: Exactly. And it doesn’t care that there’s a threshold of skillfulness that as a species we must pass, right? It’s like there’s a high jump, and we either get ourselves coordinated to jump over that bar, or we don’t. And if we don’t, we’re taking the rest of the planet with us, and that just doesn’t seem right.
Jim: And of course, this is Robin Hanson’s forward great filter answer to the Fermi paradox.
Forrest: Exactly.
Jim: The Fermi paradox, as listeners know, I go on about the Fermi paradox all the time: if there are all these aliens out there, where the hell are they? And there are two arguments. One is the filter behind us, the previous filter, which is that it’s really, really, really, really hard to have gotten as far as we’ve gotten. Maybe it’s around DNA, maybe it’s around the prokaryotic cell, maybe it was multicellularity, maybe the neuron, but it could have been a bunch of things before we got here. And the other side of it is, well, it’s not that hard to get to where we are, but it’s really hard to survive much longer. That’s the forward great filter.
Forrest: That’s right.
Jim: And Robin has laid this out very elegantly and mathematically, and he is of the view that the math says it’s probably more likely the future great filter, not the past great filter.
Forrest: I’m at this point saying: regardless of which it is, if it’s the past great filter, we’re undervaluing life.
Jim: Yes.
Forrest: If it’s the forward great filter, we need to get our act together right now so that we can continue to value life. So in this sense, regardless of how we think about the Fermi paradox, the necessity of our action is clear.
Jim: Got it. All right. I’m going to wrap it right there. I think that is a really good tour of the horizon. I think we did a pretty good job of nailing the core of your thesis, at least to the point I understand it. I hope the audience did too. I don’t think I did a terrible job on trying to pull it out.
Forrest: You did fine.
Jim: And I think we’ll just wrap it there. Thank you Forrest Landry for an amazingly interesting dive into a very scary topic.
Forrest: Sorry to do this to you, man. But on the other hand, thank you for hanging in there with me, and I super appreciate you giving me the opportunity to speak in this way at all.