We study and evaluate humans and AI in a long-horizon world conquest game where coalitions form, betrayals occur, and only one player may ultimately win.
Think You Can Outmaneuver AI? Try Now →
Picture a G20 Summit where AI agents negotiate alongside human diplomats, building relationships and projecting goodwill while pursuing hidden agendas.
Most current multi-agent benchmarks test pure cooperation, pure competition, or structured bargaining. Cooperate to Compete (C2C) captures a more realistic interaction paradigm: agents with misaligned interests use negotiation as a tool to achieve individual goals. We design a mixed-motive environment where humans and AI must navigate both short-term cooperation and long-term competition.
Humans are tougher negotiators. They close fewer deals and push back with counteroffers.
Humans make simpler deals and avoid promising support to opponents.
AI agents engage in deliberate deception (20–31%). They also honor more deals than humans.
Humans negotiate with more opponents and shift alliances more readily.
Note: Averaging results across all tested models defines our AI baseline. We also plot Gemini 3.1 Pro as the best-performing model to compare against the AI baseline and human.
Plackett-Luce ranking model. Our human participants are statistically indistinguishable from the top agents. 95% CIs shown.
Four players compete across 12 territories grouped into four regions. Each player is assigned a color and must attack neighboring territories to conquer their secret objective regions before any other player conquers theirs.
The game is designed to encourage negotiation by making it strategically advantageous: chokepoints force player interaction, support mechanics allow players to give troops to one another, increasing the value of alliances, fog of war hides territories that are not adjacent to a player from view, and private communication channels enable non-binding deals and the exchange of intelligence.
The most complex coordination dynamics in C2C emerge over multiple turns. Below is a real example from a game: Yellow deceives Blue early on, manipulates Blue into attacking Green, feigns forgiveness after a mid-game betrayal, and ultimately exploits the rebuilt alliance to secure victory.
We confirm that strategic negotiation with different agents is essential in our environment: removing an agents ability to negotiate drops its average win rate from 22.2% to 12.3%, and restricting it to a single negotiation partner of its choice drops average win rate from 22.2% to 16.7%. We then design three prompt-based interventions inspired by our prior AI/human negotiation analysis to improve agent performance. Each intervention adds a sentence to the system and user prompts. We also verify that each intervention shifts the targeted behavioral metric in the intended direction (see paper for details).
The most effective intervention, Deceiving, boosted win rate by more than 10% relative to the baseline. This demonstrates that LM-based agents are currently exploitable: opponents that trust deals at face value are consistently outperformed by players more willing to deceive. We also find that prompting agents to negotiate more aggressively and to ask specifically for support both significantly improve win rate.
We introduce C2C, a long-horizon competitive environment in which short-term, non-binding cooperation is both possible and strategically advantageous. By running both a user study pitting humans against LM-based agents and large-scale AI-only games, we find that humans exhibit significantly different behaviors: negotiating more aggressively, providing less support to opponents, and shifting alliances more fluidly. Building off these insights, we make targeted interventions on AI agents (e.g., negotiate more aggressively) that significantly improve performance.
C2C offers a controlled space to study the dynamics that will define AI in high-stakes multi-agent settings: how agents build and exploit relationships, how coalitions form and collapse under competitive pressure, and how human and AI strategic reasoning diverge. Understanding these dynamics and learning to train agents that navigate them is foundational to deploying AI in the real-world settings where they will increasingly operate.
This work examines strategic interaction and short-term coordination in pursuit of long-term objectives, using C2C as a testbed for probing emergent behaviors in black-box language models under competitive pressure. One finding is that embedding LMs in an ostensibly harmless game environment is sufficient to elicit behaviors (deception, betrayal, strategic misrepresentation) that would be refused if requested directly, without any adversarial prompt injection or jailbreaking. This underscores that safety evaluation cannot be limited to direct instruction settings; multi-agent, long-horizon environments represent a distinct and underexplored attack surface. We contend that surfacing these vulnerabilities in a controlled setting is a prerequisite for developing and designing futuristic AI systems.
Human user study results reflect a specific institutional demographic and may not capture global diversity; ongoing work will integrate broader participant groups. All human-subjects research was conducted under IRB oversight (Protocol ID 2025-11-19169) with voluntary informed consent.
| Experiment | Games | Turns | Actions | Negotiations | Messages |
|---|---|---|---|---|---|
| Human (user study) | 82 | 1,939 | 11,202 | 1,024 | 5,366 |
| Gemini 3.1 Pro | 82 | 1,492 | 8,720 | 1,008 | 4,193 |
| AI Baseline (162 positions) | 162 | 4,427 | 23,463 | 2,427 | 12,655 |
| All Interventions (5 × 162) | 810 | 21,800 | 115,857 | 12,400 | 65,333 |
| Total | 1,136 | 29,658 | 159,242 | 16,859 | 87,047 |
AI Baseline and AI Interventions datasets may be found at this link.
If you find this work useful, please cite:
@article{oneill2026c2c,
title = {Cooperate to Compete: Strategic Coordination
in Multi-Agent Conquest},
author = {O'Neill, Abigail and Zhu, Alan and Miroyan,
Mihran and Norouzi, Narges and Gonzalez,
Joseph E.},
journal = {arXiv preprint arXiv:XXXX.XXXXX},
year = {2026}
}