Transcript of EP 215 – Cody Moser on Inequality and Innovation

Jim: Today’s guest is Cody Moser. Cody is a PhD student at the University of California, Merced, and works with Paul Smaldino studying innovation and collective intelligence. Welcome, Cody.

Cody: Thanks, Jim. Really excited to be on here.

Jim: Yes, this will be fun. While I was doing my prep work I had forgotten, and I discovered that Cody and I actually were co-authors in a three-part roundtable on Twitter. I think it was immodestly titled something like Saving Twitter. So if you want to see us, it’s kind of like four-year-olds in parallel play the way it was set up. We didn’t really interact with each other, but the editors kind of orchestrated it, sort of mildly humorous. As always it, as well as the base paper we’re going to be talking about will be available on the episode page at jimruttshow.com. So go check it out.

And that brings us to the next point. The launchpad, at least for this discussion, no telling where it may actually go is a paper Cody recently wrote called Innovation Facilitating Networks Create Inequality. Now as we’ll get into, the paper’s about a shitload more than that, but that’s the title. And it was published in the Proceedings of the Royal Society B: Biological Sciences. And again, check out the link at the episode page. It’s a very readable article. You don’t need to be a rocket scientist to read it, but if you’re a total idiot, don’t waste your time. Just listen to the podcast. So let’s start with why did you do this work?

Cody: So it starts with a paper that Paul and I did a few months ago titled Maintaining Transient Diversity Is a General Principle for Collective Problem Solving. And in that paper we looked across a large number of models where we found that any property that seems to increase the diversity of a population in a collective problem-solving task, say from an NK landscape to one of these multi-armed bandit tasks, seems to increase the population’s ability to solve that task. But we kind of get dangerously close in that paper to presenting something akin to a free lunch. Where we say, “Ah, you should just do this all the-”

Jim: I was about to interject. I said, “Warning, no free lunch theorem.”

Cody: No, that’s precisely right. And we were like, “It looks like we’re not even positing any trade-offs here.” So the idea behind this paper was to say, okay, it looks like group performance perhaps gets scaffolded in some of these environments, but what’s actually happening on the agent level? So instead of just looking at how well the group performs at the task, we looked at the distribution of agent performance in the task, and that more or less led to the title that we have.

Jim: Yeah, interesting. And essentially it is an agent-based modeling approach and you focused on a specific task which I’d never heard of. Why don’t you describe the task to the degree that it’s useful for the audience?

Cody: Yeah, totally. So the original task was made by Maxime Derex and Robert Boyd in 2016. And it was actually an in-person experiment. So the way that it works is that you get groups of people together and they have a set of six potions and they can combine these potions in many different ways and they end up getting more potions if they make the correct kind of combination. Now, unbeknownst to them, there’s kind of an initial pathway… There’s a path dependency built into these combinations where if you kind of mix the first six together in a certain way, you’ll discover this purple potion, or if you mix them together a different way, you’ll discover this orange potion. But the thing is that you can use subsequent potions in those next few combinations. And the idea is that most people in the real-life task did not go back to that original set of potions.

So what they did in the experiment was they either had people in a fully connected group where they could all mix these potions with each other and see what each other were doing, or in a partially connected group where they separated them and then they could kind of spy on the other group every once in a while. And what they found is that this group that was fully connected basically never found that you could discover this super potion if you figured out the solutions to both of these pathways and combined them. But the partially connected groups more than half the time did.

Jim: I just want to get a little bit more clear about exactly how this works. So who gets presented the six initial potions and do all groups get the same six? Is it individuals or is it groups that get presented with the potions?

Cody: Yeah, great question. So it’s individuals.

Jim: So every individual on the network gets… And do they all get the same six starting potions or is it some kind of random mix from a larger set?

Cody: They all get the exact same set of potions. So really what’s dependent is who you’re learning from in terms of the combinations they made. So it’s basically taking social learning of other people’s successes and trying to integrate that into your own behavior.

Jim: Okay. I’m going to be a little pedantic here because I want to make it clear how this thing works again, because of no free lunch theorem. I was going to argue later that, okay, it works for this. What makes you think it has any generality, right? But those of you who listen to the show, I referenced the no free lunch theorem fairly often, but those of you who haven’t heard it recently, it’s basically a statement which basically says there is in principle no universal way to solve all problems. Every set of problems will have a problem specific way to solve it. So anyone that ever tells you this is the magic answer to all problems, just say no free lunch theorem. It’s the equivalent of the ban on perpetual motion machines, as far as I’m concerned. It’s a very good filter for nonsense.

Cody: No, precisely. I think that was David Wolpert.

Jim: Yeah, Dave Wolpert. Anyway, no free lunch theorem. So everybody gets six. Now, next question. There is an implicit law of physics, or at least law of chemistry behind these things. Is this law of chemistry the same in every trial?

Cody: Yes, that’s right-

Jim: In other words that if I mix A, D, and E, then I get something useful. Is that always true in all trials? So there’s no exploration of trying to understand the laws of chemistry, for instance?

Cody: No, precisely. So that’s held constant across all the simulations. And so pretty much what we’re doing is we’re altering the network structure and the individual agent behaviors, such as their propensity to share solutions with one another to see… In our metric what we’re looking at is how quickly they can solve this task basically. And they solve it by finding the super potion.

Jim: A couple of, again, more low-level questions here before we get to the bigger thing. So just so we all understand, everybody gets the six initial potions, the laws of chemistry are fixed, so you make the right combination and every trial you’ll get the same result, which is good. Now, so I get one, I’m an agent. How many trials do I have? What are my rules of engaging with these six potions? And then what affordances do I have on the network? Can I ask my neighbors? Can we trade? What is the affordance architect… What do I do as an agent and what are my affordances on the network?

Cody: So what they’re doing here is the potions are formed in triads. So you have to take three potions to see if you can get a different potion. And what you’re doing is you’re picking one of the people that you’re connected to in the network, one of your partners at random, and you’re selecting two potions or one potion, and they’re selecting one potion or two potions so that you both combine three. And in that combination, matching your knowledge and their knowledge is how you test to see if you get a new potion.

Jim: So I’m an agent, I get selected by the running algorithm and I’m told to either pick one or two. And then submit it at random to some connection within my group, and then the God in the machine decides whether that’s a good potion or not. And if it’s not a good potion, it tells me, “Tilt.” If it is a good potion, it says, “Good potion.” Are there only two good potions or are there more than two? You referenced two.

Cody: Yeah, I think there’s nine. So the two that I was talking about at the beginning set the initial trajectory. So you can discover either we call them A trajectory or the B trajectory potion, which means if you get the one on A trajectory, you’re biased to use that again and again rather than going back to the original set.

Jim: Okay, so now Mr. Agent, I have my six little bottles here, I do a reaction and we discover a new potion. Does that get added to my set or do I have to knock one out of my set? Does my set now go to seven?

Cody: So the set is additive. So you take the new potion, but you’re more likely to use it because it has kind of a higher utility to you than the previous ones do.
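The trial mechanics described in the last few exchanges can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the `RECIPES` table and potion names are made up (the conversation mentions about nine discoverable potions but does not list them), the sketch assumes both partners keep a successful product, and it picks potions uniformly instead of favoring higher-utility ones.

```python
import random

# Hypothetical recipe table: the real recipes are not given in the conversation.
RECIPES = {
    frozenset({"a1", "a2", "a3"}): "A1",  # stands in for the first "A trajectory" potion
    frozenset({"b1", "b2", "b3"}): "B1",  # stands in for the first "B trajectory" potion
}

def try_combination(inventories, neighbors, agent, rng):
    """One trial: the agent and a random partner jointly contribute three potions."""
    partner = rng.choice(sorted(neighbors[agent]))
    own_count = rng.choice([1, 2])  # the agent gives one or two, the partner the rest
    mine = rng.sample(sorted(inventories[agent]), own_count)
    theirs = rng.sample(sorted(inventories[partner]), 3 - own_count)
    triad = frozenset(mine + theirs)
    product = RECIPES.get(triad)  # None when the triad is not a known recipe
    if product is not None:
        # The set is additive: nothing is removed when a new potion is found.
        # Assumption: both participants keep the product.
        inventories[agent].add(product)
        inventories[partner].add(product)
    return product
```

A fuller model would weight the choice of potions by their scores so that newer potions are used more often, which is the source of the path dependency Cody describes.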

Jim: You’d think, maybe. One could easily imagine a problem where that was not true. In evolutionary computation, which is my home academic field, there are famously all kinds of interesting trick problems that you try to fuck with evolutionary algorithms. So you could easily have dead ends where a potion was actually useless for instance, and that would be an interesting thing to experiment with. So now a couple more nerdy questions before we get to the meat of the matter. When I’m connected to somebody to see if we can find jointly a reaction that works, what is the nature of the boundary? Is this something that is stipulated in the experimental design, essentially a series of membranes and everybody’s stuck in one? Talk to me about how you generate membranes, how you vary them in size and how an agent happens to become embedded in a membrane.

Cody: Yeah, so basically what we’re doing is we’re generating random network structures. So some of these are, you define say 100 agents and you define some level of connectivity. So if it’s 0.5 or half, then half the agents which could be connected to each other are connected to each other. So usually those boundaries are static. You’re kind of stuck with your partners across the simulations. In some cases we do change it up where individuals after a certain number of time steps can switch partners, in which case one of the edges in the network is changed from someone you have now to someone you weren’t connected to. And we also used a few real-world networks.
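A random network with a given connectivity, plus the occasional partner switch, might look something like this. It is a sketch under assumptions: the function names are invented here, each possible pair is linked independently with probability `connectivity` (the Erdős-Rényi style Cody names later in the conversation), and a switch replaces one existing tie with one new tie chosen uniformly at random.

```python
import itertools
import random

def random_network(n_agents, connectivity, rng):
    """Link each possible pair of agents with probability `connectivity`."""
    neighbors = {i: set() for i in range(n_agents)}
    for i, j in itertools.combinations(range(n_agents), 2):
        if rng.random() < connectivity:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors

def switch_partner(neighbors, agent, rng):
    """Swap one of the agent's ties for a tie to someone it was not connected to."""
    current = sorted(neighbors[agent])
    strangers = sorted(set(neighbors) - neighbors[agent] - {agent})
    if not current or not strangers:
        return  # nothing to swap
    old, new = rng.choice(current), rng.choice(strangers)
    neighbors[agent].discard(old)
    neighbors[old].discard(agent)
    neighbors[agent].add(new)
    neighbors[new].add(agent)
```

With 100 agents and a connectivity of 0.5, roughly half of the 4,950 possible pairs end up linked, and a switch leaves the total number of ties unchanged.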

Jim: Caveman networks, I remember, was one of them. These are networks, not actually membranes, which are somewhat different actually in their design. So basically there’s all these algorithms for generating networks, we’ve all played with them. Additive networks, algorithms to generate small world networks, fully connected networks, etc. So we’re really talking about network connectivity from an agent-centric perspective. So when it’s my turn to do an experiment, the God in the machine basically finds one of my connections and says, “All right, we’re going to randomly connect you.” Okay, that’s actually simpler than dealing with membronics, but I would say probably a little bit less interesting. I think you’d find slightly different results if you defined membranes of different scales and sizes and stuck people in them, because that way… Regular listeners know one of the things I’m fascinated with is origins of life.

And I often use it as a metaphor and it is certainly thought, but it is not known, but it is one of the leading theories, and it could be wrong. The so-called membrane-first theory on the origin of life that because when you think about origin of life, it’s a combinatorics problem where you have to find an autocatalytic set, but you also have to defeat dilution at the same time. So you imagine Darwin’s warm little pond that’s building this cool set of chemicals, then it fucking rains and everything gets diluted. So, “Oh, shit.”

Perhaps membranes were the magic trick. And there’s some reason to believe that certain classes of fatty acids have this very interesting effect that they interact with water in such a way that one half of them point outward and the other half point inward, and they adhere to each other as well. So they will form a membrane. So then you have concentration inside. So anyway, thought for a future experiment, use membronics rather than network generation. But now I think we understand sort of the structure of the analysis. What happened next when you actually started doing this stuff?

Cody: We found a few things, some known. For example, one of the things which scaffolded performance in this task was if the network was less connected, they seem to do better at it. And in a sense the reason for that is that different parts of the network are able to think about different parts of the problem set. If you think about it, if everyone is fully connected to each other, they’re all sharing the same information, they kind of have this inbuilt bias where they don’t tend to do too much in terms of explore-exploit dynamics. They’re just totally exploiting whatever information they have, but they’re not able to look around too much.

Another thing we found is that larger populations outperformed smaller ones except for when you actually look at the amount of work done by the agents themselves. So in this case, what we’re looking at is the time until completion in the task, but if you look at the overall number of combinations such that say in one time step 100 agents will make 100 combinations and 10 agents will only make 10. If you simply just scale that number to the number of time steps that the group has, so say 10 agents have to make 10 steps to do 100 combinations, it takes about the same amount of work in all these different populations to complete the task.

Jim: So basically there’s nothing magic about scale other than the fact that you’d get more trials.

Cody: That’s correct, yeah. And this was somewhat surprising to me if only because even when you’re looking at things like connectivity. So let’s say a network of 100 agents with a point-five connectivity, that means that you’re connected to half of the individuals you could potentially be connected to. In 10 individuals, that number of point-five is going to be much smaller in terms of the overall individuals. The average degree of the larger networks is still going to be higher than in the smaller networks, but nevertheless, there’s some compensatory effect where the networks are just performing the same regardless.
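As a worked version of that degree arithmetic, assume each possible tie is present independently with probability p, as in the random networks described earlier. The symbols p (connectivity), n (population size), and k (an agent's degree) are notation introduced here, not taken from the paper. The expected degree is then:

```latex
\mathbb{E}[k] = p\,(n - 1), \qquad
n = 100,\ p = 0.5 \;\Rightarrow\; \mathbb{E}[k] = 49.5, \qquad
n = 10,\ p = 0.5 \;\Rightarrow\; \mathbb{E}[k] = 4.5
```

So at the same connectivity, an agent in the larger network has about eleven times as many partners, which is why the similar amount of total work is the surprising part.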

Jim: That’s quite interesting. And whether that’s fundamental or just a happenstance of the particular experiment, we probably don’t actually know. Though, I suppose you could vary the laws of physics and see if that changed that result. For instance, if you added some additional reactions or added some dead-end traps or something. Is there any other form of information sharing other than an attempted experiment with a randomly connected neighbor, or any equivalent of publishing? Is there a Royal Society Journal in the agent world?

Cody: No. No. So the information’s not necessarily ledgered, but if I do find something, I have the ability to give it to all of my neighbors. So that means that in a highly connected network, say 0.75, if I find something, about 75% of the network then gets that information as a consequence of the thing that I discovered.

Jim: Okay, that’s interesting. So that’s an institutional structure. We don’t have IP regulations in this computer world. Unlike the real world. No trade secrets, no people trying to hoard information for their personal benefit, which is a major institutional design rather unlike the real world.

Cody: One thing we do do is we lower the likelihood that one shares though. So going back to this IP thing, as you decrease your propensity to share, and so make it less likely that you give this potion to any of your neighbors, you also seem to increase performance there.

Jim: That is very interesting.

Cody: Yeah, I think in effect what you’re doing is you’re lowering the connectivity after the fact. So you might have 100 neighbors. That’s 100 individuals where you could draw information from when you’re making that initial trade, but not necessarily on the other end of things 100 people you’re going to share that information with when you find a successful one.
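One way to picture the sharing rule is the sketch below. It is only an illustration: `share_prob` and the function name are hypothetical, and it assumes each neighbor is offered the potion independently, which the conversation does not specify (sharing could equally be all-or-nothing per discovery).

```python
import random

def share_discovery(inventories, neighbors, discoverer, potion, share_prob, rng):
    """Give a newly found potion to each neighbor with probability `share_prob`."""
    receivers = []
    for other in sorted(neighbors[discoverer]):
        if rng.random() < share_prob:
            inventories[other].add(potion)
            receivers.append(other)
    return receivers
```

With `share_prob` at 1.0 every neighbor receives the potion; lowering it thins out the outgoing flow of information while leaving the incoming partner pool untouched, which matches the "lowering the connectivity after the fact" intuition.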

Jim: And that is quite interesting and seems somewhat counterintuitive, it would seem, and again, things like open source ideas, open source science publishing, et cetera, says, “More is better. Everybody should have access to everything.” Did you get any insights on what mechanisms could be driving more rapid success on the potion task when the sharing of intermediate results was lessened?

Cody: Yeah, so if you look, and this goes back to one of the measures that we use. We use a Gini coefficient to measure payoffs in the task. So each potion has some kind of score that’s related to how high it is in the kind of item space. So one that has a lot of points, took a lot of different potions to discover it. And what we find is that when you decrease the likelihood of information sharing, the Gini coefficient goes really high, meaning that there’s a really high inequality in the network in these scores. So essentially what you’re doing is you’re increasing the diversity of solutions that individuals have.

Jim: And maybe the set of potions they have on their table becomes more diverse. And of course this is actually a problem in evolutionary computing: the big flatlands where everything is the same and nothing much ever happens. And again, this is an origin of life problem also. The big pond is too diverse, too mixed up, and the low probability reactions are more likely to happen in the smaller membranes, particularly if you have a whole bunch of them. So this is the parallel search but not fully connected.

Cody: No, precisely. Yeah. I think about this specifically in terms of this idea of transient diversity that it seems to really go back, for example, to Fisher’s fundamental theorem that the strength of selection is proportional to the amount of variance in the population. This just seems like another kind of reformulation of that exact same thing.

Jim: Yeah. We find in evolutionary computing all the time that you have to artificially insert diversity back in, because a lot of naive approaches ended up eliminating diversity. Because for even tiny differentials in fitness will quickly force out over a few generations slightly less fit, instantaneously less fit. And yet there may be pathways in those slightly less fit entities that would actually far exceed the guy who’s ahead a little bit right now.

This is the famous problem of being stuck on local maxima, as opposed to having the tools that you might eventually find by recombination to get something at a higher level local maxima or maybe even find the global maxima. So that may be part of the mechanism that’s going on too. Now let’s turn a little bit to this idea of inequality. When I first saw the headline, I thought you’d be trying to argue about the equivalent of socioeconomic inequality and you sort of come back and sort of wave hands a little bit on how this might be similar. But what you’re really talking about is inequality of probability of discovering the final potion or any… Do you get points for each potion you discover? What is your metric of inequality, first? And then let’s talk a little bit about what you found.

Cody: So the metric for inequality is based off of all the potion scores that you have. So basically you sum those and you could say, “Oh, someone’s more wealthy because they have a bunch of really good potions and then someone’s less wealthy because they only have a few poor potions.” And we just computed the Gini coefficient on that, which is kind of standard econometrics measurement that’s used to assess the inequality in the population.
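For readers who have not computed one, a Gini coefficient over summed potion scores can be sketched as below. This uses the standard rank-weighted formula; the example scores in the usage note are invented for illustration and are not the paper's values.

```python
def gini(scores):
    """Gini coefficient of non-negative scores: 0 means everyone holds the same."""
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = sum_i (2i - n - 1) * x_i / (n * sum(x)),
    # with x sorted ascending and ranks i running from 1 to n.
    weighted = sum((2 * rank - n - 1) * x for rank, x in enumerate(values, start=1))
    return weighted / (n * total)
```

For example, four agents each holding 5 points give a Gini of 0, while one agent holding all 10 points among four gives 0.75; with a finite population the maximum is (n - 1) / n, so it approaches but never reaches 1.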

Jim: I’m just curious, what was the Gini coefficient? Give some examples. The US Gini coefficient is what? Let’s look it up here on the Google, 4.47. Any memory of what kind of ranges you were in, in your little play world here?

Cody: Yeah, so in our case we usually scaled it between zero and one. So at zero everyone’s kind of flat. Everyone has the exact same things, but of course at one when it’s much higher it looks pretty bad and usually we would get somewhere around like 0.6 or 0.7, and that was in cases when the network was really not connected. Essentially you have these really high performing nodes that are essentially holding on to most of that, I don’t want to say wealth, but most of the good solutions in the population.

Jim: And that was the point I wanted to make, which is you actually weren’t talking about wealth at all. You were talking about points for having found magic spells, about finding the answers to problems. Now it is true that in our society those things are somewhat strongly correlated, but they don’t need to be. One could imagine different forms of how reward is spread, that the discoverer only gets a tiny little taste perhaps. In your world you’re assuming the discoverer or a holder of an object gets the full reward factor. Call it a highly individualistic late stage capitalist perspective on things. And I think it’s just worth pointing out to everybody that happens to be the world we live in at the moment, but that definition of society was not brought down from Mount Sinai by Moses. It just happens to be where we are today and it’s well within our power to change it if we were so inclined.

So it’s a very interesting result, but one that we should be careful not to overly focus on with respect to its outcomes. What was the other word? You had outcomes, which is inequality of what you have in your stack, but then you had some other word for the socioeconomic equivalent. The two don’t have to be strongly linked, though in our current world they are moderately strongly linked. Though of course the most important thing in our current world is luck.

In fact, in many, many, many ways, where did you happen to be born? Did you get born in a neighborhood with good schools? Did you get born to rich parents? Did you get born to parents who were not crazy? When you’re in your business world, did your first boss hate people that wore green ties and fired you because he hated green ties? All kinds of weird shit happens in life. I often tell people that in my own experience, at least 60% of success was luck. Though of course you can’t take advantage of luck unless you also use skill, so the two are intimately connected. But anyway, that’s an aside. When you made a network less connected, you found more inequality in each person’s value of their portfolio essentially. Did it impact the total steps necessary to find the solution one way or the other?

Cody: It did. Yeah. So essentially when you had less equal networks, you had higher performance in the task. And in fact a group’s performance in the task seems to be proportional to the inequality or vice versa. This is in effect because, if you think about what’s kind of scaffolding performance, if you lower connectivity, what’s happening is you’re introducing a heterogeneity of edges. So you have a lot of people who are maybe connected to a bunch of others, some people aren’t connected to very many others, and so that heterogeneity is also just going to be a heterogeneity of the information that’s transmitted through the network.

Jim: Now let’s dig in on one more detail question. So when you have a less connected network, were all the agents of the same degree or was there variance of degree amongst the agents?

Cody: There is variance in degree among the agents, and so we look to see where the highest performing agents were. We use things like the degree centrality, which is who has the highest degree. And what we found is that the agents who tended to find the final potion, which is worth a whole lot, tended to be the most central in terms of both degree centrality and betweenness centrality-

Jim: And also their own personal degree presumably, right?

Cody: That’s right.
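The two centrality measures mentioned here can be computed on a small network as sketched below. Degree centrality is the share of other agents one is tied to; betweenness counts how many shortest paths between other pairs run through an agent. The betweenness function is a plain implementation of Brandes' algorithm for unweighted, undirected graphs, returning unnormalized counts; it is offered as an illustration and is not necessarily how the paper computed its measures.

```python
from collections import deque

def degree_centrality(neighbors):
    """Fraction of the other agents each agent is directly tied to."""
    n = len(neighbors)
    return {v: len(nb) / (n - 1) for v, nb in neighbors.items()}

def betweenness_centrality(neighbors):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in neighbors}
    for source in neighbors:
        order = []                            # nodes in order of discovery
        preds = {v: [] for v in neighbors}    # shortest-path predecessors
        paths = {v: 0 for v in neighbors}     # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:                          # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in neighbors}
        while order:                          # accumulate back from the farthest nodes
            w = order.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}
```

On a star network with one hub and four leaves, the hub has degree centrality 1.0 and sits on all six shortest paths between leaf pairs, while each leaf has betweenness 0.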

Jim: So you have what is my degree, how many connections do I have? And then various measures of centrality. So the number of connections I have is “are my parents rich,” as an analog. Centrality is “did my father go to the Harvard Business School.” For the other measures of centrality, I don’t have immediate metaphors that come to mind, but I suppose there are some. Now, did you try an experiment where you searched through different levels of connectivity, but you kept each agent at the same degree? Did you try that experiment?

Cody: No, no, I didn’t do that.

Jim: I’d be damn interested in seeing what that did. Where everybody had the same degree, but we varied the degree that way you’d see is the connectedness by itself significant or is it the heterogeneity of degree that is significant?

Cody: Yeah, yeah, I love that.

Jim: I think that’s a very important question, I would be very interested in for sure. So that’s interesting. So it’s either the degree, or the heterogeneity of degree, or both, and we don’t know the answer that produces the superior outcome on the task.
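One simple way to set up the experiment Jim is proposing is a network where every agent has exactly the same degree, with that degree varied across runs. The sketch below uses a ring lattice, which is an assumption made here for simplicity: it removes degree heterogeneity, but it also produces longer path lengths than a random network, so a random regular graph would be a cleaner comparison.

```python
def ring_lattice(n_agents, degree):
    """Tie every agent to its degree/2 nearest neighbors on each side of a ring."""
    if degree % 2 or not 0 < degree < n_agents:
        raise ValueError("degree must be even, positive, and smaller than n_agents")
    neighbors = {i: set() for i in range(n_agents)}
    for i in range(n_agents):
        for step in range(1, degree // 2 + 1):
            j = (i + step) % n_agents
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors
```

Sweeping `degree` over 2, 4, 6 and so on at a fixed population size would vary connectedness while keeping every agent's degree identical.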

Cody: I’m inclined to say it might be the heterogeneity just because going back to this idea that the amount of work in the network done at each population size is about the same. The idea that a population of 10 agents and one of a hundred agents, holding connectivity constant, they’re going to have different degrees, but it seems like they’re doing the same amount of work to complete the task.

Jim: Okay, that’s interesting. Yeah, so that’s at least a pretty strong guess that, I’d call it a conservation law, that the total number of trades is going to be approximately conserved to reach a solution. That would indicate that it’s probably heterogeneity and not degree. Probably, but you can be fooled as we know. There’s no guarantee that correlation equals causation, especially in an n-of-one type situation. In this case, we could call the universal laws the n of one. You mentioned in passing, and this caused my eyebrow to go up, the experiment of randomly adding connections. Again, in sort of our naive way of thinking about the world, that probably ought to be a good thing, but it turned out to have no effect at all, it sounds like, or damn close to it.

Cody: So in this experiment, for some reason, randomly adding connections didn’t do much, and at first I thought it was because we use what you call Erdős-Rényi random networks, where the average path length, so the distance between you and another agent, is actually quite low. It’s usually less than two, meaning that you’re within two steps of any other agent in the network no matter what. So if you start changing connections, that doesn’t really do much to the overall network structure. But then we started looking into it in these other networks called connected caveman networks, and these are pretty cool. So it’s like you take these fully connected groups, you sever one tie within each of these little networks, and then you connect them to another network that’s near you.

So what you get is you get these rings of networks, these cliques that they call them that are all sort of connected to each other but have still a lot of individuals within their groups that they’re connected to. And even when we started to change the random link alteration there, we didn’t get too many effects. That might be due honestly to the fact that this is random link alteration. So what they’re doing is they’re not selecting out someone who’s maybe a high performer, which could be something else where instead of just picking someone at random, you look at a say, friends of friends effect and see who’s doing the best in terms of people you’re near.

Jim: So yeah, who do you respond to? The person with the high Twitter follower count, as a cynical equivalent. Now that’s interesting. As I think about it, caveman networks are at least approaching my previous concept of membronics, because they are essentially connected to each other, only thinly connected elsewhere, which is essentially the idea of membranes and protocols. So that’s interesting that you didn’t see any real effect even there, where you would think that propagation would help. Particularly if you had the rule of publishing to your connections, because if you’re in a caveman network when you publish, everybody’s publishing to a small group, and relatively little of it diffuses out widely into the world because of the network structure. If I got the description correctly, while in, let’s say, a short path length network, everybody, or not everybody, but most people get it pretty damn quickly. And then with subsequent reactions, it propagates to everybody pretty quickly. And in a small world network, it would be in between presumably.

Cody: Yeah, no, exactly.

Jim: If you had a path length of 3, 4, 5, 6, something like that kind of thing, you might see in a small world network with a few thousand agents, then you would have a different cascade and none of that made much difference. That’s interesting.

Cody: Yeah, that’s right. Honestly, I think that there could be another paper there, people exploring different ways to alter this kind of random link alteration to see what’s going on. One thing I would like to talk about is the connected caveman networks, because they did extraordinarily well at this task. Something that you can do with them is you can alter both the size of the clique and the number of cliques. So you could hold population size constant, say at 24, and you can have either four units with six individuals in the units or six units with four individuals in the units. What we did is we played with that back and forth to see, holding population size constant, which of these outperformed the other. And what we found is that what you really want in these kinds of tasks is many small groups, rather than a few larger groups.
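A connected caveman network of the kind described can be built roughly as follows. This is a sketch following the usual construction (fully connected cliques arranged in a ring, with one within-clique tie in each clique rewired to the neighboring clique); the exact rewiring used in the paper is not stated in the conversation, so the choice of which tie to move is an assumption.

```python
import itertools

def connected_caveman(n_cliques, clique_size):
    """Ring of fully connected cliques, each with one tie rewired to the previous clique."""
    if n_cliques < 2 or clique_size < 2:
        raise ValueError("need at least two cliques of at least two agents each")
    n_agents = n_cliques * clique_size
    neighbors = {i: set() for i in range(n_agents)}
    for start in range(0, n_agents, clique_size):
        # Fully connect the members of this clique.
        members = range(start, start + clique_size)
        for i, j in itertools.combinations(members, 2):
            neighbors[i].add(j)
            neighbors[j].add(i)
    for start in range(0, n_agents, clique_size):
        # Sever one within-clique tie and attach that agent to the previous clique.
        neighbors[start].discard(start + 1)
        neighbors[start + 1].discard(start)
        bridge = (start - 1) % n_agents
        neighbors[start].add(bridge)
        neighbors[bridge].add(start)
    return neighbors
```

Holding the population at 24, `connected_caveman(4, 6)` gives four groups of six with 60 ties, and `connected_caveman(6, 4)` gives six groups of four with 36 ties, which are the two configurations Cody compares.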

Jim: I’m going to guess that there are limits at both ends of that. And if you explored a range, a group of one isn’t going to be very useful. Duh, especially if you’re not connected to anybody else, you’re just sitting in the dark. So I would imagine that if you did a parameter sweep, you would find some range where more is better and some range where less is better. Did you guys do anything like that? Did you find some sweet spot in your artificial universe?

Cody: No. That’s precisely right. So actually yeah, as you get too low in terms of numbers in the group, you approach what’s called a ring network where basically everyone… It’s just a line basically where everyone has two partners and at that point the performance drops off again. It seems that what really helps these connected caveman networks is they’re able to both do a little bit of internal exploitation, in that whatever information they have they’re sharing with each other and they’re not venturing too far outside of that. But then also do a little bit of exploration by basically bridging that gap to the other group that’s near them.

Jim: Yep, that’s about what I would’ve expected. I would expect, by the no free lunch theorem, that the optimal topology would vary based on the problem. Again, another paper might be to find three or four other problems that are sort of vaguely analogous to the potion task problem, and consider just the caveman network. And do some parameter sweeps on caveman network designs with respect to size and n, and then see how they do on these different problems. And I’m going to hypothesize that other than by bad luck, probably you’re going to find different optimal solutions.

Cody: 100%. I’ve been thinking about this a lot in terms of Ross Ashby’s kind of good regulator theorem. That essentially a good regulator of some kind of problem has to have a model of what that problem’s doing. So if you think about it, it’s like the network structure should in some way converge onto the problem that it’s presented with.

Jim: Or it would be nice if it did, though very seldom are there mechanisms to make that occur. Think about our institutions. It’s like the problem of scientific research, which you are intimately familiar with. It’s in an institution called a university, which has an institution called a department, which then has labs. And then has poor graduate student serfs, and those things don’t move around very much. I’m actually on the governance board for a couple of research institutes, including one embedded in a very large STEM university, and one of our hair-pulling exercises is that making anything change structurally is essentially impossible. A real world takeaway from this insight you just had is that we are probably grossly sub-optimized in our research, let’s just keep it at scientific research, by the fact that we have this extraordinarily stereotypical hierarchical institutional structure. And we use the same structure with probably very similar degree structure for biology, physics, chemical engineering, et cetera, when they probably have very different attributes, which is kind of interesting.

Of course we have workarounds like in physics, we now see these massive papers with 600 authors and such. Where the physicists probably… Because they’re so arrogant, they think they know the answer to everything. The thing is they’re smart enough that they often do. They have broken out of the boxes more than any of the other disciplines when they were the first to adopt arXiv, for instance, and publish their work before waiting for the scholarly journals. They’re the first who really went nuts on large-N collaborations worldwide, fuck the institutional boundaries. It might just be a takeaway that we should think about that more generally in the world of scientific research that falls into an area of increasing interest of mine called meta-science.

How do we think about the science of science, so that we can do science better. As a globe, we’re spending something like $400 billion a year on science or closely related to science, and I bet you a cheeseburger and a half that we’re probably no better than 20% effective at spending that money in terms of solving the problems that would actually move society and our knowledge of the universe forward. What else did you discover that was interesting?

Cody: Yeah, so in addition to some of the findings that as performance increases, inequality does as well. That we find that basically the agents that are driving the inequality tend to be more central in the network. We’ve done a few other things, so I had another paper with my colleague Jesse Milzman using the same model. Where instead of looking just at the connected caveman networks and the random networks, we looked at these core periphery networks. And essentially the way that these work is that you have one group that is this tightly clustered core.

So their connectivity is quite high, and then you have a periphery which is quite low in terms of their connectivity both to each other and to themselves. And what we find there is that those networks tend to do really, really well at this problem. Essentially what we think is happening there is that the networks are somehow being able to have their lunch and eat it too almost or have their cake and eat it too, in the sense that one part of the network is able to just exploit whatever information they get. While the periphery is able to just explore the range of options available to them.
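The core-periphery idea can be sketched in a few lines of Python. This is an assumed toy version, not the networks from the paper with Jesse Milzman: the function names `core_periphery` and `mean_degree` are hypothetical, the core is taken to be fully connected, and each peripheral agent is given exactly one link into the core and none to other peripheral agents.

```python
from itertools import combinations


def core_periphery(core_size, periphery_size):
    """Toy core-periphery network as {node: set_of_neighbours}.

    Assumptions for illustration: nodes 0..core_size-1 form a fully
    connected core; each remaining node hangs off a single core node.
    """
    adj = {v: set() for v in range(core_size + periphery_size)}
    # Densely connected core: every core node is linked to every other.
    for u, v in combinations(range(core_size), 2):
        adj[u].add(v)
        adj[v].add(u)
    # Sparse periphery: one link each, spread evenly across the core.
    for i in range(periphery_size):
        p = core_size + i
        c = i % core_size
        adj[p].add(c)
        adj[c].add(p)
    return adj


def mean_degree(adj, nodes):
    """Average number of partners over the given nodes."""
    nodes = list(nodes)
    return sum(len(adj[v]) for v in nodes) / len(nodes)


net = core_periphery(core_size=5, periphery_size=10)
print("core mean degree:", mean_degree(net, range(5)))
print("periphery mean degree:", mean_degree(net, range(5, 15)))
```

In this toy version the core has many more partners per agent than the periphery, which is the asymmetry the exploit-versus-explore reading relies on.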

Jim: Of course, there’s another classic design parameters, exploration versus exploitation. Talk a little bit about how from within the machinery of your world that might work.

Cody: So in the case of exploration exploitation, you might think of exploitation as a case where you’re on one of these pathways with these potions and you’re just using exactly what you know. So the network in that case, it’s like, “Cool, I just found this really vital potion, I’m going to mix it with the ones I already know.” Keep doing that, keep doing that, and in the meantime, no one’s really going back in the pathway and using the older ones in different kinds of combinations. So you would think about this kind of explorer dynamic as a network having individuals, who are not necessarily doing what everyone else is doing. And in our case we call these loser nodes, because basically what they are is they’re individuals who, while everyone else is increasing in every single performance metric, they seem to be just stable. It looks like they’re not doing anything, but if you actually think about the group structure and what’s necessary to solve the problem, they tend to be really important because on a second order, what they’re doing is they’re giving new information to this core to be able to exploit.

Jim: Yeah, that’s interesting. So let’s think about that in terms of real world analogies since part of the fun of this is to say, “All right, what might this mean?” Losers, our current socioeconomic system is certainly set up to manufacture losers in large quantity. Is that actually a good design for total system efficiency?

Cody: I think about this specifically in terms of meta-science, so being able to have people work a little bit on the fringes and not necessarily embedded in the literature on some kind of specific problem case. Someone who comes to mind is Freeman Dyson for example, who for his entire career he called himself a heretic. That’s essentially what he was doing. He was trying to be that guy on the edge.

Jim: Yeah, he was a bad man and the amount of great ideas he had from just saying, “Fuck all.” He didn’t even have a PhD famously and he didn’t give a fuck what anybody thought. And he was sufficiently charming that he could find people to give him money personally, often outside the institutional structure so they couldn’t discipline him very much. He’s a great example. I’ll give you an example from when I attended the annual Meta-Science conference, worldwide Meta-science conference in May, and there was an amazing number of great ideas floating around. Let me run one by you that I think is very much like this, and I actually came up with the number based on back-of-the-envelope calculations of what it would cost. Imagine that there were some governance mechanism which we will leave to the student to design, because it’s too fucking hard for an old fart like me, that allocated 1,000 lifetime endowed chairs per year to 25 year olds. Could you imagine what that would do to science?

Cody: Oh, man.

Jim: And of course it would be competitive. They wouldn’t give them out randomly, but there’d be some governance mechanism that tried to optimally endow 1,000 25 year olds each year with no bias towards the domain or the discipline or anything, other than whatever this governance algorithm was. And you would suddenly have a whole bunch of people who had lifetime endowed chairs that could work on anything fucking they wanted to.

Cody: Yeah, that sounds like the dream, you would have people just in full explore mode their entire careers just not having to worry about necessarily impressing people.

Jim: Or can you imagine getting out of the grind of trying to get tenure. You’re in the grind now trying to get your PhD, but let me tell you young man, that ain’t nothing compared to the grind of being an assistant professor trying to get tenure, at least at an elite university. That’s a fairly horrible seven years in most people’s lives, especially if you’re at a very top tier university where the success rate is 30% or 25% or something. Those 1,000 people at least would have a wonderful situation, but 1,000 isn’t very many out of the total number of PhDs per cohort. That’s an actually interesting number. How many science PhDs are there per cohort? Let’s see what other thoughts might come from your head about possible, even wacko extrapolations from what you learned in this experiment?

Cody: Yeah, one thing I’ve been thinking about lately is specifically what this kind of inequality means. We were talking earlier about inequality of outcomes, or we’re talking about wealth here, and again, it kind of all goes back for me to institutional design. And essentially what kind of institutions do we want? We talk a little bit in the paper about this idea of agent negative and agent positive visions of institutional design. Where an agent positive one is one essentially where you just expect that you can improve things by finding the right people. An agent negative one is where essentially what you do is you can improve things by improving the structure of the institutions, and I feel like surely there has to be a blend between these two things, though. So this question about roles and fits, it’s one that kind of deeply interests me and I think is something that moving forward in terms of setting this up for interdisciplinary research could be a really fruitful avenue.

Jim: Let’s explore on that a little bit. Of course, this is an age-old question. People talk about history this way. Is it the great man theory of history? “History would’ve been very different without Napoleon or Hitler or Einstein.” And then there’s others that say, “No, it’s actually broad forces and if it wasn’t Napoleon, it’d been somebody else. And if Einstein hadn’t figured out relativity, somebody else would’ve.” Any insights from this experiment that give you any relative weights on individual versus institutional structure?

Cody: The really good example I think of is Isaac Newton. This guy was obviously a genius, but he himself said, “I stand on the shoulders of giants.” And that seems to be him saying something about this agent negative perspective. Personally, I think the fact that essentially what we get are these genius agents, even though the agents themselves are pretty dumb, they don’t have necessarily different cognitive strategies or anything like that, but what you get is high performers in the network based on their structure, tells us that there is something to this kind of agent negative view about mechanism design and institutions.

Jim: That seems reasonable. That does seem reasonable, particularly that you discovered that the multiple measures of network centrality seemed to be correlated with higher outcomes for the individuals and that there was designs that resulted in significant inequality that were actually the superior performers, which is also quite interesting. I suppose you could explore that further by having the agents have different cognitive strategies, like you could have people that just always did random, others that use some slightly smarter algorithm, etc. That would be kind of fun, then you’d be able to actually see where some of these trade-offs are.

Of course, unfortunately we can’t come to any crisp answers, because we all know what the answer will be. It’s both. And the no free lunch theorem will tell us, not only is it both, but it’ll depend on the nature of the problem you’re trying to solve. So if you want to build an atomic bomb, get the smartest motherfuckers you can find and lock them on the top of a desert mesa with armed guards and electrified fences around them for two and a half years. Try to figure out why the obesity epidemic is happening in the whole world. I have no fucking clue what the structure there is, but whatever we’re trying ain’t working because nobody’s figured it out despite the fact that it’s probably worth $5 trillion a year to do so, something like that. Other speculations or thoughts or extrapolations that your work or related work you’ve done have caused you to ponder?

Cody: Kind of what you’re talking about here. How do we know what the problem actually looks like? My advisor and I have another paper on this concept of generative entrenchment and how it might play into organizational design. So generative entrenchment is basically early path dependencies set in an organism. So for example, in order to grow hands, first you have to grow a spine, and then you have to grow the arms, and you have to grow the wrist and everything else that goes with it. And we started asking to what extent is this reflected in human organizations? Is there a process of generative entrenchment that explains for example, why the C-suite looks so similar across different corporations and things like that?

And I’m curious to what extent holding everything else constant, there’s expectations for certain roles. Let’s say I’m a CFO or something, I might not go to this weird job unless it explicitly tells me what I’m going to be doing. But we’re curious to what extent perhaps institutions themselves reflect the problem space that they’re in. Can you almost understand the construction of the problem based on what the institution looks like? And this kind of goes back to, say for example, like your research in evolutionary neural networks. You can almost look at the way that some of these feedforward networks get designed in terms of the problems that they have to solve, and you get things like emergent modularity to match modular-like tasks.

Jim: And just at a more prosaic level, the US army and the US Marine Corps have quite different… The on-paper structures are very similar, but the actual way their networks work are quite different. The Marine Corps pushes way more authority down in their hierarchy, while the army is much more command and control. Now, I’ll say that our US army is much less command and control than let’s say the Russian army or the Chinese army, but compared to the Marine Corps, it’s the difference between General Motors in 1955 and a bunch of hippies at Haight-Ashbury. Of course, the nature of the problems they have to solve are different. The Army’s job is to engage a big approximate peer foe in a years-long grudge fest and grind the other guy down, while the Marines is to take a beach and operate quickly, rapidly in a one-off situation and prevail until the army can show up. So they have different tasks, and so they ended up producing quite different doctrines, I think is quite interesting. The nature of the problem then ought to impact the solution space.

Cody: We thought about this a lot in terms of what we call information architecture. So just like the structure of the internet, the idea that the Chinese internet is not precisely the same as the one that we’re using in the United States. There, kind of the goal is centralization of information and kind of getting people into rapid consensus. For many reasons they were able to respond to Covid much more quickly, for example, than the United States was. But what that potentially leads to is you stumbling into a suboptimal solution too early. Whereas in our case, what we get is we get more exploration, but as a consequence we have things like polarization built into our system.

Jim: That’s a good one, and that’s a design trade-off. One of my favorite, in fact, the only philosophy book that I keep in my working office typically is Karl Popper’s, The Open Society and Its Enemies. And he wrote this in I think the late forties, early fifties where he was pondering what’s going to happen between the democracies and the totalitarians. And he came up with the hopeful, and turned out to be right so far, hypothesis that all the turbulence and conflict and polarization and disorder and slow ability to make crisp and accurate decisions of an open society will nonetheless persevere against a much brisker. I mean, when Xi says, “Hop,” they hop. In the US, Biden says, “Hop” and 42% of the people say, “You’re an alien, a reptile alien.”

And yet Popper makes the argument that because the heterodox view is allowed to incubate and that… I’m going to mix some horrible metaphors here, the rock and roll I would argue would’ve died if it wasn’t for garage bands, despite the fact that 99.9% of garage bands suck. So tolerance of the fringes is indispensable again to finding the higher maxima. Xi can say, “Hop.” And everybody hops, and they hop on a stool and they’re 18 inches off the ground. A discordant society like America, let’s say, can find its way to discover things like the internet and such. So that’s again, a trade-off in terms of what you’re trying to accomplish. Xi wants everybody to hop, the west says, “Ah, we don’t need everybody to hop. I’d rather have lots of people with discordant ideas and they fight with each other and all that shit but at the end of the day, we end up finding the big important ideas more rapidly on average.” Something like that.

Cody: Yeah, totally.

Jim: Now just for fun, let’s revisit this. I know you wrote about it in the little tri-part article that we’re in. Why don’t you recap, and maybe you’ve thought more about it since… I know I’ve thought more about what I said, and I just made some shit up. What implications do you think these things, and other things that you know from your work, have for things like the design of our platforms on the internet?

Cody: Yeah, something I really worry about is the centralizing role of recommendation algorithms. That essentially what it’s doing is it’s the search process for information online. It’s good perhaps that I can get the news immediately, but I wonder sometimes if the idea that things are getting closer and closer online might actually be horrible for the way that we digest information. I remember I read something about Facebook back in like 2012, its path length was like five. So the distance from you to the furthest person from you on Facebook was about five friends. You had to go through five people to get to that person. Now it’s less than three.

Potentially that’s because there are super frienders who are serving as certain hubs in the network and things like that. But I think regardless, it’s really bad that we’re all digesting information in the same way. And I wonder at times if that’s essentially what polarization is due to. It’s that basically we’re all exposed to the same set of information. You have to compress it. So the quickest way to compress it is just like picking one or the other. So you see random issues becoming polarized. I think about when Will Smith slapped Chris Rock, that was immediately polarizing. Everyone knew about it in the world, but unnecessarily so.
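The path-length notion in this turn, the distance from you to the furthest person from you, can be computed with a breadth-first search. The Python sketch below is illustrative only: the ten-person ring and the single added hub are made-up toy data, not Facebook figures, and `eccentricity` is simply the standard graph-theory name for "distance to the furthest reachable node". It shows how one highly connected hub, like the super frienders mentioned above, can shrink that distance.

```python
from collections import deque


def eccentricity(adj, source):
    """Number of hops from `source` to the furthest node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first visit is the shortest route in BFS
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


# Toy network: ten people in a ring, each knowing only two neighbours.
n = 10
ring = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

# Same ring plus one hypothetical hub (node 10) who is friends with everyone.
with_hub = {v: nbrs | {n} for v, nbrs in ring.items()}
with_hub[n] = set(range(n))

print("furthest person without hub:", eccentricity(ring, 0))
print("furthest person with hub:", eccentricity(with_hub, 0))
```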

Jim: Yeah, that is interesting that our politics… One of my favorites of what the fuck, how did this ever happen is how to teach reading is a politically polarized issue. There’s two basic approaches. One’s the whole word method and the other’s phonics. That’s a gross oversimplification. And somehow whole word became, or whole language became associated with the Democrats and phonics with the Republicans. Now, how the fuck could something as obtuse as that end up being politics? And I think you’re onto something here. This actually fits in with another Ruttian hypothesis, unproven, sports fans. So don’t bet on this one unless you think Rutt is a good hypothecator, which he probably is. Which is the fundamental thing driving us crazy is not necessarily misinformation, disinformation, hate or anything else. It’s the fact that we’re just receiving too many inbound messages each day. Hunter-gatherers, which is who we really are under our fancy clothes and $25 haircuts, how many things did they have to deal with that were new each day? Not many.

So our brains weren’t adapted to that. And even in the mass culture days, it was all the same. You’ve seen one Beverly Hillbillies episode, you’ve seen them all pretty much. Now, if you open yourself up to the valve of the world, boom, boom, boom, boom. There are people who are receiving 3,000 interrupts per day. And by hypothesis, no fucking way you can deal with 3,000. A very simple heuristic is A, B. What bucket do I put it in? Is this team red or is this team blue? A very natural adaptive strategy for gross interrupt overload is to become exceedingly simplistic in your heuristic. If you only had to deal with 10 a day, which by the way, I think is the optimal number. There’s some people who are designing smart agents to wrap ourselves around a semipermeable membrane so we can still interact with the internet, but at the rate which we choose. And I have hypothesized that at least initially I would like to set my membrane parameter to 10 in-bounds per day from people I don’t know.

So I am not allowed to read more than 10 news stories, Tweets, Facebook posts, or anything per day, and maybe only allowed to make three. It’s a way to reduce the cognitive load from inputs that I allegedly… As you know, I go on social media sabbatical six months of every year, and I’m almost at the end of my current six-month sabbatical. I’ve been sniffing around a little bit because I do towards late in my sabbatical, see what’s going on. And just as I predicted, it always is. Same shit, different day. If you were frozen in a vault and then thawed six months later and thrown into Facebook and told nothing else about the world, you might well guess only a couple of days had went by or a week or something because it’s the same shit.

It’s the same tribe, even though the actual details of each atom are different. So anyway, it’s a long way to say, “Yeah, I agree that too much connectivity, but it may just be too many fucking messages.” Particularly when they’re highly mixed that we’re seeing all kinds of messages we don’t even know how to think about, could easily cause us to collapse into a very simplistic tribalism in the world.

Cody: Yeah, totally. I think about this in terms of what I call the parochial pyramid. It goes back to this idea that Nassim Taleb used to talk about where he’s like, “Oh, among my family, I’m basically a communist. In my neighborhood, I’m a socialist.” And then as it scales up, basically he cares less and less about the situation. And I think about this that sometimes it seems like that pyramid has been upturned, where we seem to weigh the most, these global issues way more than the local ones. So it’s like perhaps tribalism was always there, but what it was directed at was it was directed at your city council. It was directed at your union meeting. So it didn’t bubble up into this kind of huge national issue where now everyone’s having to go on one side or the other in order to squeeze through.

Jim: Yep. I think that’s absolutely interesting and true. Now, I think about it that tribalism makes sense when you’re at the Dunbar number. 150 people trying to figure out how to govern ourselves and avoid starving to death by turning the wrong way down a canyon while we’re following the antelope herd. Tribalism at the, let’s say at the size of the United States, that’s kind of nuts. In fact, in our Game B work, which is what my other project’s working on, radical social change, we believe that society should be reinvented from the Dunbar number up, and that is one of our membranes. We have a whole theory of membronics of all different sorts that interpenetrate each other, but the real payoff membrane is the one of about 150 that we are evolved for. And also by the way, it has the great side effect that we have proven for 200,000 years that you can manage a group of 150 without any fucking computers.

People always say, “Well, how would your Dunbar community… What kind of software would you use to run it?” And I go, “Hopefully none. We might have computers running our solar farm or our hydroponics or something, but in terms of how we govern ourselves, why do we need any stinking computers?” We’ve proven that humans can manage at that level without any stinking computers, and that may be a way to think about this. And also it would ground people, if they’re actually personally involved in the governance of their life at the Dunbar number, they’re going to have way less concern that somebody is or is not wearing a burqa in fucking Iran. Why do I give three fucks what somebody does in Iran? And in reality, I couldn’t care less. Minding other people’s business about things that are so far away they have no impact on me whatsoever strikes me as nuts.

On the other hand, there are a few things that do have a big impact on me. Well, all of us, which are the collective action problems that do occur at the global level. Of course, classically it’s using the atmosphere as a dump for greenhouse gases. Anyone that does that anywhere in the world is impacting me. So part of the art and science of this hierarchy of membranes is you still do have to have an outer global membrane, but it only has a very short writ. Just things that actually have global implications, like climate change being the most obvious one. I would say the stupid situation of war, it’s another one. The fact that war still exists really shows us how stupid humans are. Talk about a zero negative, sub gross negative sum, at least since 1870 when the technology got good enough.

Used to be you could win war, you could go steal enough shit to more than pay for it. But since 1870 war has gotten so expensive and so destructive, it’s a priori, idiotic thing for humans to engage in. And yet we still do. But there’s a multipolar trap problem that if I’m prepared to go to war, you need to be prepared to go to war so that I don’t go to war against you and steal all your women and shit. And in reality, it costs me more to steal your women and shit than they’re worth. But nonetheless, if I make that bad action, you’re forced to respond to defend. So it is a global level problem to break the multipolar trap of war between nation states. So there’s another example of something that needs to be in that top level bubble, but most things don’t.

I would say even things that we think of as these crazy polarizing issues like abortion. Your community of 150 ought to say whether abortion’s allowed or not. At the level of 150, the nearest one is probably a hundred yards away. So you don’t like the rule here, move to the next one. “Truthfully, if we decide abortions are bad over here, why should you give three fucks? It’s none of your business, we’ve chosen this rule.” I use this example sometimes just to be as hyperbolic as possible. Two communities side by side, one that forbids abortion for any reason whatsoever, and then one that makes it mandatory. No births are allowed. And I would suggest that if you want to take membryonic on X, both of those ought to be allowed. If the people in them through their governance process have said, “This is our rule. You want to be a member of this membrane, that’s the rule. You may think it’s idiotic, but that’s our rule. And if you don’t like it, fucking leave.”

And I think there is a way… And then it’s no longer anybody’s business what people choose to do in their membrane. The other one is, okay, I can imagine 150, a Dunbar group in which rigid, traditional matrimony is the only way that you are allowed to have sex. And then the one next door, free love dude, whatever. Both are perfectly legit. I will bet on which one will be more successful. But time will sort that out. And it’s really, as long as there’s voice and exit, as long as people can easily leave and they have some input into the governance model and they can easily leave, why not let 8 billion divided by 150, what’s that come out to, something like 55 million membranes, and let them each have an awful lot of setting their parameters locally. And then this gets away from this problem you’re talking about that, oh yeah, something happened in Iran. I need to have a team A, team B reaction to it. What the fuck? Why are you even thinking about that?

Cody: Yeah, that’s fascinating. And I like this invocation of Dunbar. It’s almost as if this compression problem of dealing with so much information, we’ve started to treat national politics as a Dunbar process when it’s not a Dunbar process. And it seems like what you’re trying to get at with Game B is to actually make these issues into, I guess you could call them Dunbar solvable processes. I wonder how much this dovetails, especially with the work of Elinor Ostrom on polycentricity and overlapping jurisdictions of different ideas and allowing people to kind of opt in and out of these things.

Jim: In fact, in our Game B structure, at least as it’s evolving, we have exactly that. So for instance, a number of the local Dunbars could be members of a watershed Dunbar, not Dunbar, but a container which holds the river as a commons, which we all agree to. And Elinor Ostrom’s governance modality for commons, I would say, are not perfect, but they’re damn good. A damn good place to start. So they negotiate amongst all the Dunbars to accord, as we’re starting to call this within this membrane. And all these members here agree to agree to this outer accord about this one thing about how we’re going to treat the river. We’re not going to dump shit in the river, and we’re not going to catch more than X amount of fish in the river, and we’re not going to allow silt to go into the river and four or five other things. But that’s the only thing the accord’s about.

We don’t have to agree about gun control or abortion or whether we should cover our heads when we go into church or not. It’s totally irrelevant at the level of the watershed. While the things that impact the watershed are of the essence of the watershed, so that each membrane has a commons within it and it has an accord for the governance of its commons.

Cody: Going back to this kind of structure of the model of the connected caveman, but also some of the things you’re talking about Popper and hypocrisy. It seems like what we’re trying to do in each of these places is introduce basically a better selection mechanism into the governance structure so that things aren’t necessarily so sticky so that they’re better adapted to local problems. It’s almost as if what democracy was, was it provided a means for selection, but perhaps as it’s sitting right now, as a legislative democracy is imperfect and there could be a better design out there to-

Jim: In fact, if you look at the repute in which Congress is held, which is like 9% of people have a positive view of Congress, it scores lower than cable TV operators. If you can imagine. Comcast. Well, American people think Comcast is better than Congress. So that would seem to me to imply that there is much room for institutional improvement. All righty, this has been a very interesting conversation. I would certainly encourage you all to read Cody’s paper if you are interested. And as always, you can get it at jimruttshow.com on the episode page. So thank you very much, Cody, for a right interesting conversation.

Cody: Thanks again for having me.

Jim: It was great. We’ll have you back sometime.

Cody: Yeah, perfect. Thank you, Jim.