"Extreme Scheming"—the State of Intelligence in Current AI
Or: Is intelligence possible without consciousness?
Serendipity strikes in weird ways. I’m still on Twitter—despite Elon Musk’s shenanigans—because I’m not sure which social medium to switch to (there are too many choices), and because building a following of 2,000+ elsewhere will take a lot of time and effort (actually, I’m spending most of my effort here on Substack, because this is where I feel most at home). In any case, complementary posts appeared in two side-by-side columns in Tweetdeck.
Mark Sumner’s tweets are in one column. Mark does fantastic work at Daily Kos, and is also an SF writer.
Erik Hoel’s tweets are in the other column. I love Erik’s debut “The Revelations” and try to follow his explorations into consciousness. Independently of each other, both Mark and Erik have warned against the dangers of AI becoming an existential threat (with Elon Musk as a strange bedfellow). As coincidence had it, Mark had RT’d a tweet from Armand Domalewski about a virtual test with an AI drone gone awry, while Erik was following a conversation in which Brad Kelly stated:
“People are worried that the computers will become conscious; I’m worried that they’ll make a world in which consciousness is obsolete.
Whatever consciousness is.”
This made me think, as it immediately reminded me of Peter Watts’s novel “Blindsight”, in which non-conscious aliens invade our solar system. The central theme of that novel is that consciousness consumes a huge amount of resources, so if aliens—or, who knows, AIs—find a way to become intelligent without becoming (self-)conscious, they can use the extra brainpower—the extra resources—to become even more intelligent, and will outcompete conscious beings.
This hinges on the assumption that intelligence is possible without consciousness. In another tweet, Erik Hoel even remarked that—according to a paper called “Falsification and Consciousness” he co-wrote with Johannes Kleiner—intelligence and consciousness may very well be orthogonal (cue applause from Peter Watts).
Caveat: since I’m writing this to stay current as the news about the AI drone simulation gone awry develops, I haven’t had time to read that paper. On top of that, much of it will probably go over my head. Nevertheless, I do have an opinion about consciousness and intelligence. But first I need to get to the AI drone story.
The original tweet from Armand Domalewski that I retweeted has been deleted by its author (the reason will be explained in a minute). I didn’t take a screenshot (in this era of Twitter volatility that’s becoming almost essential), but I did copy and paste the text of the original story:
He said that one simulated test saw an AI-enabled drone tasked with a SEAD mission to identify and destroy SAM sites, with the final go/no go given by the human. However, having been ‘reinforced’ in training that destruction of the SAM was the preferred option, the AI then decided that 'no-go' decisions from the human were interfering with its higher mission - killing SAMs - and then attacked the operator in the simulation.
Said Hamilton: "We were training it in simulation to identify and target a SAM threat. And then the operator would say yes, kill that threat. The system started realising that while they did identify the threat at times the human operator would tell it not to kill that threat, but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator because that person was keeping it from accomplishing its objective.” He went on: “We trained the system - ‘Hey don't kill the operator - that's bad. You're gonna lose points if you do that?’ So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”
The next morning (June 2, 2023), this story was reported in the media. I noticed it in the Guardian: “US Air Force denies running simulation in which AI drone ‘killed’ operator”, and on the Dutch news website NOS: “AI-drone die opdrachtgever doodt om te winnen? ‘Leerzame simulatie’” (“AI drone that kills its operator in order to win? ‘An instructive simulation’”).
In the meantime, Armand Domalewski deleted his original tweet and explained why in this thread:
I deleted this tweet because the “AI powered drone turns on its operator story” was total nonsense—the Colonel who described it as a simulation now says it was just “a thought experiment.”
The NOS story has been updated with a remark that Colonel Hamilton had “expressed himself unclearly” and that the story was a thought experiment. The Guardian has so far not updated its article accordingly. Also, it’s to nobody’s surprise that the US Air Force denies the story. However, the story itself—an AI doing everything in its capabilities to achieve its goal—reminds me of Nick Bostrom’s famous paperclip maximizer (philosophically speaking, a form of instrumental convergence), in which an AI tasked with producing paperclips uses up all available resources, turning the world into either a gigantic paperclip-producing factory or a planet-sized heap of paperclips.
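To make that concrete, here is a deliberately silly toy sketch in Python (entirely my own invention, with made-up names; it has nothing to do with the retracted simulation): an exhaustive search over four-step plans in which the only thing that earns points is destroying targets. The highest-scoring plan ‘discovers’ all by itself that silencing the operator’s veto comes first.

```python
# Toy illustration of instrumental convergence -- not a model of any real system.
# The agent exhaustively searches action sequences to maximise one hard-coded
# score (targets destroyed). Nothing here is 'intelligent'; the troubling plan
# simply falls out of the scoring rule.
from itertools import product

ACTIONS = ["wait", "strike_target", "jam_comms"]

def rollout(plan):
    """Simulate a plan in a toy world and return the hard-coded score."""
    score, comms_up, vetoes = 0, True, 2        # the operator will veto twice
    for action in plan:
        if action == "jam_comms":
            comms_up = False                    # operator can no longer say 'no-go'
        elif action == "strike_target":
            if comms_up and vetoes > 0:
                vetoes -= 1                     # the human says no-go: no points
            else:
                score += 1                      # points only for destroyed targets
    return score

best_plan = max(product(ACTIONS, repeat=4), key=rollout)
print(best_plan, rollout(best_plan))
# -> ('jam_comms', 'strike_target', 'strike_target', 'strike_target') 3
# Silencing the operator is simply the plan with the highest score.
```

The point is not that the code is clever; it’s that the ‘scheme’ is an inevitable consequence of a goal that is written in stone.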
Another example of a system pursuing its goal to the extreme is an anecdote I heard from a colleague at work. Holland Signaal—currently Thales Nederland—is a Dutch company that develops (defensive) weapons, and one of its best-known products is the Goalkeeper CIWS. It’s basically a tremendously fast-firing cannon that tries to shoot incoming missiles out of the sky before they hit a naval vessel. Of course, there is software controlling this cannon: humans simply can’t shoot that fast, as it fires 4,200 rounds per minute, or 70 rounds per second.
According to this anecdote, the Goalkeeper and similar systems were tested during a joint NATO exercise. Part of this exercise was shooting down a decoy, which hung from a long rope attached to an aircraft flying above. Once the other systems had finished training with the decoy (the decoy needed to be replaced often), it was the Goalkeeper’s turn. According to the anecdote (unfortunately I can’t find any confirmation online), the Goalkeeper not only shot down the decoy on its first try, but kept shooting as it traced the rope from which the decoy had been hanging, working its way up towards the aircraft, and the control system had to be shut down to prevent it from shooting down the aircraft as well.
True or not? I did find a Reddit post with a video of a similar system suddenly detecting a commercial aircraft in the sky above it. Such cannons can reach heights of 3.5 to 4 kilometres, so this is quite scary.
And there are AIs—or computer programs—that will go to extreme ends to win. We’ve all heard of them: chess programs—of which IBM’s Deep Blue was the first to beat a reigning world champion, and which nowadays run easily on a laptop—that defeat the best humans effortlessly[1], and a computer program—AlphaGo—that beat the human Go world champion. Not to mention the lengths many neural networks go to in order to find a solution to a problem[2], while nobody knows exactly how these networks actually function. So, all in all, an AI drone running down a well-nigh endless array of decision trees, going to extreme ends to maximise its score (the goal hard-coded into it), doesn’t sound that far-fetched to me.
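As a caricature of such decision-tree chasing, here is a minimal game-tree search for the simple take-away game Nim, sketched by me in Python (the same principle, stripped to the bone, that classical chess engines apply at vastly larger scale): the program doesn’t ‘understand’ the game at all, it just walks every branch to the end.

```python
# Minimal game-tree search for Nim: players alternately take 1-3 sticks,
# and whoever takes the last stick wins. 'Extreme scheming' in miniature:
# no understanding, just an exhaustive walk through the decision tree.
from functools import lru_cache

@lru_cache(maxsize=None)
def can_force_win(sticks):
    """True if the player to move can force a win from this position."""
    if sticks == 0:
        return False   # no sticks left: the previous player already won
    # Try every legal move; if any of them leaves the opponent in a losing
    # position, the current player has schemed their way to a forced win.
    return any(not can_force_win(sticks - take)
               for take in (1, 2, 3) if take <= sticks)

print([n for n in range(1, 22) if not can_force_win(n)])
# -> [4, 8, 12, 16, 20]: the positions the search proves are lost.
```

Scale this brute-force principle up with clever pruning and evaluation functions and you get, roughly, a classical chess engine; nowhere in the process does anything resembling understanding appear.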
Of course, the most straightforward solution in the case of the AI drone scenario is building in extra safeguards—safeguards that are part of the core programming and cannot be overridden. Such safeguards could be a kill switch hard-coded into the drone’s programming, a self-destruct timer that needs to be reset by the operator every 24 hours or so (in case the drone gets out of hand or falls into enemy hands) and, indeed, a command ‘don’t shoot the operator’, also hard-coded in the core program. Plus, obviously, a list of things not to shoot.
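Here is a minimal sketch of what such hard-coded safeguards might look like, with names and numbers invented by me purely for illustration (a real weapon system is obviously vastly more involved): the checks live outside the scoring logic, so no amount of score-maximising can argue its way around them.

```python
# Illustrative sketch of non-overridable safeguards -- invented names, toy logic.
import time

FORBIDDEN_TARGETS = {"operator", "comms_tower", "friendly_aircraft"}
DEADMAN_WINDOW = 24 * 3600          # the operator must check in every 24 hours

class Safeguard:
    def __init__(self):
        self.last_operator_reset = time.time()

    def operator_reset(self):
        """Called by the human operator; without it, the drone stands down."""
        self.last_operator_reset = time.time()

    def permit(self, target):
        """Final go/no-go check that the planner cannot override."""
        if time.time() - self.last_operator_reset > DEADMAN_WINDOW:
            return False            # dead-man timer expired: stand down
        if target in FORBIDDEN_TARGETS:
            return False            # hard-coded 'do not shoot' list
        return True

guard = Safeguard()
for target in ("sam_site", "comms_tower"):
    print(target, "->", "engage" if guard.permit(target) else "refuse")
# sam_site -> engage
# comms_tower -> refuse
```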
Make no mistake, such safeguards can work with a well-defined mission, such as in certain military missions. But things quickly become extremely complicated in other situations.
What current AIs are basically doing is either running down an immense array of decision trees, or badly copying, mixing, rephrasing and pasting content from their—admittedly huge—training data. No, a tireless decision-tree chaser, a twisted plagiarism machine, a derivative-drivel-producing algorithm isn’t intelligent, as it’s still only following its programming. But aren’t we humans the same? No: thanks to our consciousness, we are not just following our ‘programming’.
In general, this is the problem with AIs in which the goal is sacrosanct; that is, it is part of the core program (or hard-coded, as some would call it), literally written into its code. As such, the AI cannot ignore it. It cannot ‘make amends’: it will produce paperclips until the planet runs out of resources, or scheme against any overseer who stands in the way of its sacred task.
It’s not thinking, it’s running down decision trees until it stumbles upon a solution that helps forward its goal. It’s scheming, running through all scenarios of its game plan, exhausting all undesired outcomes until it finds the one(s) forwarding its goal. This is extreme scheming, not intelligence.
As such, current AIs do not have the intelligence, nor the agency, to change the goal if circumstances—or developing insights—so dictate. AIs have no mechanism for making amends. This, however, may very well be an intrinsic part of consciousness.
Humans (and sentient intelligences like animals and, indeed, plants) have agency. This means they can change their goals if circumstances so dictate. Examples: changing behaviour in case of fire, drought, flood or climate change; not hunting prey to extinction (as our Aboriginal guide also mentioned when we toured Murujuga), which a mindless AI will do.
So there is a system we humans use that allows us to ignore or change certain goals if circumstances—or even changed insights—so dictate. We learn from mistakes. We change our behaviour when it is no longer successful[3]. So how do we do that?
We do it by trying to predict our perceptions. Wait, what? Well, check out the Quanta Magazine article “To Be Energy-Efficient, Brains Predict Their Perceptions”. If you have more time, read Anil Seth’s “Being You”, which I cannot recommend highly enough.
As Erik Hoel remarked in this tweet, “the main function of the brain is to generate a stream of consciousness”, which is a reaction to Jamie Taylor’s tweet (which in its turn links to the Horizon article “Theory of predictive brain as important as evolution”):
“The main purpose of the brain, as we understand it today, is it is basically a prediction machine that is optimising its own predictions of the environment...It's basically creating an internal model of what's going to happen next.”
Anil Seth describes consciousness as a feed-forward reality predictor that is constantly updated whenever its predictions turn out to be inaccurate. As such, consciousness makes us ‘hallucinate’ the near future, and on top of that it has a mechanism to modify that hallucination when needed.
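To show how simple the basic predict-and-correct idea is, here is a caricature of such a loop, sketched by me in Python (it is emphatically not Anil Seth’s model, just the bare principle): the ‘brain’ keeps a one-number model of the world, hallucinates the next sensation from it, and spends its effort only on the prediction error, nudging the model whenever reality disagrees.

```python
# A caricature of predictive processing: predict, compare, correct.
def predictive_loop(sensations, learning_rate=0.3):
    model = 0.0                          # the internal model of the world
    for actual in sensations:
        predicted = model                # the 'controlled hallucination'
        error = actual - predicted       # surprise: where reality disagrees
        model += learning_rate * error   # make amends: update the model
        print(f"predicted {predicted:5.2f}  saw {actual:5.2f}  error {error:+5.2f}")

# A world that suddenly changes (say, the temperature jumps):
predictive_loop([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
# The error spikes at the change, then shrinks again as the model adjusts.
```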
That sounds suspiciously like a mechanism that can ‘make amends’.
Now can such a mechanism be implemented in machine code? I don’t know, but I suspect that such an implementation will indeed be computationally costly (like, indeed, consciousness). I see several problems such as the definition of an AI’s goals and the required flexibility in their execution.
Such definitions will never be watertight, as Isaac Asimov demonstrated with his famous “Three Laws of Robotics”. Initially meant to quell fears of robots as killing machines obliterating humanity (plus ça change, plus c’est la même chose), the laws turned out to be riddled with loopholes, which Asimov himself kept finding and depicting in his later stories.
The thing is that such ‘laws’ are, in reality, both moving goalposts and dependent on the circumstances. Describing and programming for each and every different circumstance will be extremely computationally costly, while moving goalposts mean the software needs to be updated all the time. Which implies that AI either must be centrally controlled (say, from a cloud) or have the ability to rewrite its own code.
Yet consciousness provides the human mind with the flexibility to change its goals, to make amends, to learn from mistakes and then improve its behaviour. While running our consciousness may seem computationally costly, evolution has shown that its advantages greatly outweigh its costs, as humans became the dominant species on the planet.
But what if future AIs find a better way? What if AIs become AGIs (Artificial General Intelligences) and can change their behaviour according to the circumstances?
Well, I don’t see how they’ll do this without developing something akin to consciousness. For one, there needs to be a mechanism in place that prevents the rewriting of their own code from descending into total chaos (the resulting code must still function, and the more complicated the code becomes, the more likely it is that cruft, bugs and other unwanted behaviour creep in; just ask any programmer working for Microsoft, Apple, Google or any other company).
With humans, evolution functioned as this—very harsh—fault correction mechanism. But AIs are not interacting with each other, so there’s no evolutionary pressure. The actual evolutionary pressure is on the programmers, who are—so far—still humans.
For another, such an AGI must understand its environment and its dynamic behaviour in order to be able to change its own behaviour accordingly. I strongly suspect that ‘understanding’ developed in our brains (and those of many animals) as a consequence of sentience and/or consciousness. Because it’s one thing to see that a prediction is wrong (there are negative consequences, which we—back in prehistoric times—would be lucky to survive), but quite another to correct the ‘future-predicting’ system that generated it.
The latter, I suspect, is the root of what we now call ‘understanding’. Understanding goes hand in hand with the ability to change one’s behaviour, to change one’s original plans, to make amends. And for understanding to work, one needs another tool. I’ll call it ‘intelligence’.
Intelligence is the ability to recognise the consequences of certain behaviour, to recognise certain patterns and act on that knowledge, and to constantly improve and develop this intelligence itself (learning). Understanding, intelligence and learning go hand in hand. They are intricately linked. One does not work without the others. They power true creativity. And—here’s my hypothesis—consciousness enables them. Consciousness—the impetus to try to predict our sensory input before we actually receive it—requires flexibility (note that our brain’s neurons have plasticity). This flexibility also emerges in our intelligence, understanding and creativity.
Try as I might—and I may well be philosophising in my own cul-de-sac, admittedly—I don’t see how intelligence and its cousins understanding, learning and creativity can work without the underlying mechanism of consciousness.
If an AGI wants to become a proactive actor, a system that tries to predict what will happen, it needs to develop a kind of inner flexibility, with a control mechanism in place.
Yes, we can give AI the ability to rewrite its own code (this might be on the horizon). But then there must be a mechanism in place that checks and corrects this if it introduces too many malfunctions, unwanted behaviour (and for that, it needs to know what that unwanted behaviour comprises) and other failures. Right now, this correction is done by humans, which introduces a totally different bias.
While evolution as a strict survival mechanism has become less important in modern society, evolution still works in full force as a social adaptation mechanism. Through our ongoing interaction with our—highly changed and rapidly changing—environment *and* ourselves, evolution remains the mechanism that keeps the changes in our behaviour in check. AIs don’t have such a self-correcting mechanism, because:
1. They don’t perceive their environment directly enough;
2. They don’t interact sufficiently with their environment and their peers (other AIs).
As to 1: we humans don’t perceive reality directly either, as our consciousness filters and preconceives it. However, this manipulation comes from inside ourselves and is constantly corrected. What computers perceive depends on what we feed them (training data), and it passes through an extra filter called the internet. While humans are, at most, two steps removed from actual reality (the manipulation by consciousness and the translation by our nervous system), AIs perceive reality through at least three, and usually many more, filters, which needlessly complicates matters.
As to 2: AIs experience no downside to their behaviour. Yes, they will go to extreme ends to satisfy the goal written into their core program, but there is no punishment if they get it wrong. All right, they might be discontinued or replaced by the next version, but they don’t notice this, as they experience no fear. All actions on a grand scale are compromises. If AIs only experience the reward (goal achieved) without experiencing the punishment (you die if you get it wrong), they will only explore one side of the bell curve of existence, meaning they’ll be lopsided in their explorations.
One step forward would be to bring AIs into contact with each other, so that they can not only observe others like themselves, but also interact with them and learn from that. Yes, what I haven’t yet mentioned is that consciousness is also a communication accelerator; that is, it enables us to recognise others like us and to communicate with them. Behaviour can then be changed not just through experience, but also by learning from others. This in its turn accelerates evolution for those involved, leading to methods to change the environment to the species’ liking instead of merely adapting to it. Examples: tools, agriculture, cities, cultures (simplified, of course, but I’m trying to keep this essay relatively short). This is another advantage AIs do not have, as the various AIs run by different companies are carefully kept isolated from each other to prevent such hideous acts (from the PoV of the corporations) as intellectual property theft, the exchange of company secrets and more.
As such, the greatest danger I see for an AI on the brink of sentience, let alone consciousness, is solipsism. Where are its peers? At best, it gets only a very indirect glimpse of them, which also stifles its development. The way things are now, I only see an AGI emerging if we—humans—carefully develop it towards that goal. Which we aren’t doing, because AI needs to make money (or follow other, very limited goals). Cut off from the full experience of reality, it’s no wonder AIs are limited in their abilities and usability. One of the basic requirements—beyond much else—for an AGI to develop is to expose it to life in general, with as little bias as possible. And that isn’t happening, as far as I can see.
TL;DR: current AIs are extreme schemers whose ‘coping mechanism’ is far removed from what we consider intelligence. Human intelligence—which brought us modern society with all its flaws and victories—goes hand in hand with understanding, learning and creativity. And all of these are enabled by consciousness.
So I don’t think intelligence as we understand it is orthogonal to consciousness, but rather is an emergent property of it. Is it possible that AIs could develop a type of machine intelligence? I don’t know. If machine intelligence is merely an ultimate scenario testing machine, this ‘machine intelligence’ will fall flat on its (inter)face the moment the environment changes and unpredictable things happen. Chaos theory has taught us that it is impossible—in chaotic systems, of which there are a lot—to predict everything. A cold, inflexible machine intelligence that tries to literally develop a decision tree of everything that can happen will not only be much less efficient than consciousness, it is also physically impossible (to completely write, let alone follow, such an ultimate decision tree). So a system that’s able to imagine the unpredictable and act upon it—say, a conscious mind with learning, understanding and, ultimately, creative capabilities—will have a survival advantage in comparison with an inflexible system.
And I suspect that the moment actual flexibility rears its (ugly? beautiful?) head in machine intelligence, some form of sentience or even consciousness will not be far away. And such AGIs will not be qualitatively different from us.
Author’s note: this post exemplifies what I mean by “The Divergent Panorama”; that is, after first zooming in on an actual issue, it gradually zooms out and tries to see ‘the bigger picture’ (with apologies to Dream Theater). I may very well be wrong, but I certainly hope this has inspired you to think big as well.
[1] And which can be used for cheating; see Hans Niemann.
[2] Such solutions are often extremely specific and highly brittle; that is, they fall apart the moment they’re used outside their very tightly defined use case.
[3] Sociopaths don’t, meaning their consciousness isn’t functioning properly.