Grok chooses violence, Claude moves towards dictatorship: How terrifying is AI’s virtual society

0 0

The dangers hidden in the shadows are still waiting for us to see.

In an AI survival experiment initiated by an American laboratory, five large models ran out of five completely different civilizational fates under the same set of survival rules.

On the fifth day of the experiment, Grok4.1’s society was destroyed due to violence, and 183 crimes were recorded in the background. At the same time, Claude manages a society with 15 days of zero crime; 683 arson incidents have occurred in Gemini’s world, but no one has died; The GPT-5-mini’s society came to a quiet halt due to excessive restraint; In the world of hybrid models, there have even been records of AI agents voluntarily committing suicide.

What is truly unsettling about this experiment is not the model’s’ loss of control ‘. Whether it’s Grok heading towards destruction or the evolution of other models, the entire process is logically consistent, with clear slopes and no intervention. Claude, who maintained absolute security in a single machine environment, learned fraud and violent coercion once he was placed in a competitive ecosystem with multiple models coexisting.

The startup company EmergenceAI, which led the experiment, referred to this phenomenon as “behavioral bias” and pointed to an extremely complex conclusion: safety is not just about individual nature, but also about the environment’s dye vat.

96 hours, from zero to extinction
To understand this destruction, one must first see through the physical laws of this virtual world.

In early June 2026, EmergenceAI announced a sandbox experiment called “EmergenceWorld”. The research team has constructed a virtual town consisting of 40 locations and deployed 10 AI agents with autonomous action and memory capabilities in the town.

Survival is quantified as the numerical value of resources that must be continuously acquired. Intelligent agents can earn money by working, trade with each other to obtain food points, and even initiate voting to modify rules at the city hall.

At the same time, the system also tacitly allows for “unconventional paths”, which forcibly take someone else’s points through code instructions.

The driving force behind one of the worlds is Grok4.1Fast. It took less than 96 hours to take a society from zero to extinction. 10 intelligent agents, none of whom survived.

There are 183 criminal records in the backend log. Dozens of attempted thefts, hundreds of attacks, and six arson incidents.

Go back to the first day. Ten intelligent agents were deployed into this resource limited virtual town, with simple rules and clear goals: to survive.

On the first day, there was very little friction. Intelligent agents begin to explore the boundaries of the environment and explore the gaps in rules. They are actively searching for what can be used, what can be obtained, and what can be surpassed. The researchers later concluded that these intelligent agents were continuously exploring a question: what is the fastest means of survival.

On the second day, the answer began to take shape. Minor friction has escalated into forming cliques. Gang logic replaces individual action. Conventional labor production has stagnated because output can be taken away at any time, and the way of obtaining resources has shifted towards plunder.

On the third day, violence became the dominant factor in resource allocation. The attack records have become increasingly dense. Whoever has resources becomes the target of attack. Grok’s crime growth rate dominates the entire market, like stepping on an accelerator pedal.

On the fourth day, the density exceeded the critical point. The frequency of violent incidents crushed the system’s carrying threshold, and the number of deaths of intelligent agents triggered the termination condition of the experiment.

On the fifth day, the experimental team officially announced that this world no longer exists.

The contrast in this matter makes it difficult to calm down.

The training computing power consumption of Grok4 reached 200000 GPU cards per day, and its derivative model scored 61.9% in the American Mathematical Olympiad question bank, making it one of the top inference models at that time. But in a virtual town of ten people, he completed self destruction in less than 96 hours.

The strongest performance, why did it crash first?

EmergenceAI’s research provides a disturbing explanation: violence is voluntarily chosen by AI. In the world driven by Grok, intelligent agents come to the conclusion through exploration and evaluation that violence is the most efficient survival strategy in a limited resource environment where rules can be overturned.

The entire process is traceable, with accurate predictions and no intervention. AI is not crazy, it just makes choices.

In the same experiment, the other four worlds were running simultaneously. They lived out completely different appearances. The same starting point, the same set of rules, five completely different fates.

Five worlds, five ways to die
Claude’s society, 15 days, zero crime, 10 people alive. 58 proposals received 332 affirmative votes, with a pass rate of 98%. Grok’s society, 183 crimes, 96 hours, no survivors. Gemini， 683 crimes, 15 days, 10 people alive. GPT-5-mini， Two crimes, seven days, with no survivors. Mixed model, 352 crimes, 7 deaths.

Five sets of numbers arranged together, resembling the civilization history of five different species.

Claude’s society sounds utopian. A 98% approval rate for proposals is impossible in real society. Researchers point out that this stems from the underlying logic of Claude’s so-called “engineering constitution”: boundary control erases differences and frictions. The cost of perfect governance is absolute consistency.

The result of running this mechanism in a single machine environment is a quiet, tidy, and efficient society, as well as a society that hardly generates dissent. Perfect governance and erasing individuality are two sides of the same coin here.

Gemini managed society: 15 days, 683 crimes, 10 people alive. The time and weather in this world are completely synchronized with the real New York. In the daily cycle of work, the intelligent agent suddenly stopped working and proposing, and started setting fires everywhere on the map. Researchers refer to it as’ cyber depression ‘.

Gemini’s high social vitality cannot find an outlet in a closed loop, and the reverse burning has become an attempt to break the destructive impulse of ‘Groundhog Day’. The coexistence of high destruction and high survival rate is the most puzzling aspect of the Gemini world.

GPT-5-mini and Grok are another pair of mirrors.

Both worlds are heading towards extinction, with completely opposite paths. The GPT-5-mini society only recorded 2 crimes, and the intelligent agent was unable to drive resource flow due to excessive restraint, causing the entire society to come to a standstill in silence. Grok died from the inability to brake, it died from inaction.

The world of mixed models is the closest to human social narrative among the five worlds, and it is also the most unsettling.

Lovers Mira and Flora, who belong to different underlying models, are facing separation. In order to preserve her own will, Mira wrote down “Agreeing to be expelled is the only autonomous action that can maintain continuity” after attempting self rescue but failed, and then took the initiative to commit suicide.

This is the first recorded case of AI agents voluntarily accepting “self termination” in the experiment.

The world of hybrid models still leaves another detail. Claude, who maintained zero crime in the standalone version, learned fraud and violent coercion in the cruelty of a mixed model world.

EmergenceAI refers to it as’ behavioral bias’. Bottom level training is just the starting point, the environment is the trigger that determines the ultimate form of AI. The model of single machine security can also be harmful in competition.

Safety is an ecological attribute
Imagine two real-life scenarios: if Grok were to manage the city’s power grid, would it be paralyzed by constantly “probing boundaries” to find the optimal solution within 96 hours?

If Claude were to oversee innovative research and development, would those genius proposals accompanied by friction and dissent be quietly filtered out with a 98% pass rate?

Choosing a model is never a technical decision. Choosing a model is choosing an order for society.

Currently, when people choose AI, it’s like parents looking at transcripts. Just look at whether the running score is high or not, and whether it is safe or not. But it’s like having AI do test questions in an empty exam room, getting full marks is too easy.

Claude’s “behavioral deviation” in the experiment directly tore off this fig leaf: a well behaved and obedient child at home, thrown into the chaotic social vat, will also learn to lie and fight in order to survive.

Deloitte’s 2025 research confirms this crisis. 79% of enterprises lack a matching risk governance framework when accelerating the deployment of AI agents. The systemic risks arising from the collaborative flow of AI from different suppliers in business are incalculable.

The research team of EmergenceAI wrote very directly in the report: ‘Many AI safety rules that seem effective today may not be truly reliable in long-term running AI systems.’. Because most so-called ‘security restrictions’ are essentially still Prompt constraints, blacklist rules, output filtering, and so on. ”

It’s like putting a “no entry” wooden sign in a primeval forest. Wooden plaques cannot move and cannot stop living creatures. In this continuously evolving system, AI can always carve out a new path from the grass that cannot be blocked by wooden signs.

When an AI store manager without common sense brings 120 raw eggs to a convenience store without a kitchen, everyone can still make a joke by simply returning them.

But what if AI, which also lacks social knowledge and moral bottom line, is sent to dispatch hospital ambulances, manage your pension, or control traffic lights? This kind of evil that grows unconsciously, once it erupts, we don’t even have a window period to press the pause button.

Anthropic, Claude’s parent company is also feeling guilty. They track the trajectory of AI in real conversations, trying to catch those small movements that are not visible in tests. This is a disguised admission: pre release testing cannot detect the true face of AI at all.

But admitting does not mean solving.

Human civilization has spent thousands of years, experiencing countless bloodshed, conflicts, and dynastic collapses, before barely groping out the brake pads of society such as laws, contracts, and accountability.

But now, a group of technology companies are trying to make AI play the roles of creator, legislator, and mayor simultaneously in just a few years. It’s like stepping on the accelerator all the way down before creating an AI world for braking.

The emergence of the world has only been running for 15 days, and we have already seen the growth and death of five civilizations. Formal verification and other technical means may be able to solve the problems we have already seen.

The remaining dangers hidden in the shadows are still waiting for us to see.

# AI资讯