CivAgent Series (I): Posing the Question – Why Does AI Orchestration Need History?

“The establishment of any institution necessarily arose from the needs and intentions of its time. We cannot judge the institutions of earlier eras by the standards of later generations.” – Qian Mu, “China’s Political Gains and Losses Across the Dynasties” (中国历代政治得失)

Series navigation: I: Posing the Question · II: Six Orchestration Modes · III: The Tang Three-Department System · IV: The Ming Dual-Track System · V: Athenian Democracy · VI: The Persian Satrap System · VII: Theory and Implementation

Today I open-sourced CivAgent – a project that maps 57 classic polities from human history onto AI multi-agent collaboration architectures. This seven-part series systematically investigates a severely underestimated question: In an era of rapidly advancing large language models and multi-agent systems, what unique, irreplaceable intellectual contributions can history – especially institutional history and comparative politics – make to AI system design?

My answer: history provides a “library of organizational architecture design patterns” that has been validated by millennia of human experience, and this pattern library has direct, actionable relevance to contemporary AI multi-agent orchestration.

I. The Architectural Dilemma of Multi-Agent Systems

1.1 The “Bipolarization” Trap in MAS Research

A long-standing core challenge in Multi-Agent Systems (MAS) research is: How should we design the communication topology and decision protocols among agents?

Wooldridge and Jennings (1995) argued in their seminal survey that the core problems of MAS are coordination, cooperation, and conflict resolution^[1]. Thirty years later, these problems have yet to be solved satisfactorily. Dorri et al. (2018), in a more recent survey, classified MAS architectures as centralized, distributed, or hybrid, yet conceded that most existing research assumes either a fully centralized coordinator or a fully peer-to-peer distributed structure^[2].

This “bipolarization” is a serious oversimplification. Real-world organizational architectures – whether in human societies or biological systems – almost never occupy either extreme. A beehive has a queen, yet worker bees possess a high degree of autonomy; a military has a chain of command, yet front-line commanders retain the authority to make field decisions; a multinational corporation has headquarters, yet regional subsidiaries hold local decision-making power. Between the two poles lies a vast, underexplored design space.
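To make the size of that design space concrete, here is a minimal Python sketch that builds the two poles and one point in between as sets of communication links. The agent names, the `hybrid` layout (a hub that talks to one lead per group), and the function names are all illustrative assumptions, not part of CivAgent.

```python
from itertools import combinations

def star(agents, hub):
    """Fully centralized: every agent talks only to the hub."""
    return {frozenset((hub, a)) for a in agents if a != hub}

def mesh(agents):
    """Fully peer-to-peer: every pair of agents is connected."""
    return {frozenset(pair) for pair in combinations(agents, 2)}

def hybrid(groups, hub):
    """Hypothetical middle ground: the hub talks to one lead per group,
    and members of the same group talk to each other freely."""
    links = set()
    for members in groups:
        lead = members[0]
        links.add(frozenset((hub, lead)))
        links |= mesh(members)
    return links

agents = ["hq", "a1", "a2", "a3", "b1", "b2", "b3"]
groups = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]

star_links = star(agents, "hq")
mesh_links = mesh(agents)
hybrid_links = hybrid(groups, "hq")

print(len(star_links), len(hybrid_links), len(mesh_links))  # 6 8 21
```

Even with seven agents, the hybrid layout is only one of many link sets that sit between the 6-link star and the 21-link mesh, which is the space the rest of this series tries to map.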

1.2 New Variables in the LLM Era

This “architecture selection” problem has become even more acute in the age of LLM agents. In traditional MAS research, the capability boundaries of agents were relatively clear – a path-planning agent could never suddenly perform natural language processing. But LLM agents are inherently “omni-capable”: Claude can both write code and review code; GPT can both translate and reason; Gemini can both analyze images and generate front-end code.

When we have multiple AI models available for orchestration – Claude, GPT, Gemini, DeepSeek, Kimi, Qwen, and others – the question is no longer “who can do what,” but rather “who should do what, how should information flow among them, and who has the authority to make final decisions.”

These questions – role assignment, information flow, decision authority, review mechanisms, fault tolerance – are precisely the questions that human political institutions have been answering for millennia.

1.3 Why Not Just Use Organization Theory?

A reasonable objection: since management science and organization theory have studied organizational structure for decades, why return to ancient political institutions?

Three reasons:

First, temporal span. Management science primarily studies organizational forms since the Industrial Revolution (roughly 200 years). Institutional history, however, covers over 6,000 years of institutional experimentation, from Sumerian city-states (c. 4500 BC) to the European Union (1993–present). A longer temporal span means more diverse environmental conditions and more extreme stress tests.

Second, failure samples. Management research suffers from severe survivorship bias – we primarily study successful companies, while failed ones are quickly forgotten. But history naturally contains an abundance of failure cases, and these failures are extensively documented. The Qin dynasty’s collapse within two generations, Poland’s partition and extinction, the Taiping Heavenly Kingdom’s disintegration – each failure is a precious “postmortem report,” recording the conditions under which a particular organizational architecture collapses.

Third, extreme conditions. The operating environment of political institutions is far harsher than that of business organizations – war, famine, rebellion, foreign invasion, plague. Institutional performance under such extreme conditions is rarely addressed in business school case studies. Yet AI systems may equally face extreme conditions – massive concurrency, cascading failures, adversarial attacks – making the lessons from political institutions that endured extreme conditions especially valuable.

II. Institutional History as a “Library of Architecture Design Patterns”

2.1 Qian Mu’s Theory of Institutional Evolution: The Longitudinal Perspective

In his China’s Political Gains and Losses Across the Dynasties (《中国历代政治得失》, 1952), Qian Mu systematically analyzed the political institutions of five dynasties: Han, Tang, Song, Ming, and Qing^[3]. This slim volume (under 100,000 characters) was my entry point into institutional history and the most fundamental inspiration for CivAgent.

Qian Mu’s methodology was not to simply evaluate institutions as good or bad, but to pursue questions at three levels:

What problem did this institutional system solve? (Design intent)
What new problems did it reveal in operation? (Side effects)
How did the next generation of institutions respond to those new problems? (Iterative evolution)

He identified an “iterative pattern” running throughout Chinese institutional history:

First iteration: From Qin-Han to Tang

The Three Excellencies and Nine Ministers (三公九卿) system established by Qin Shi Huang was the first mature central bureaucratic system in Chinese history. The Chancellor administered all officials, the Censor-in-Chief handled oversight, and the Grand Commandant managed military affairs – three top-level nodes with a clear division of labor. The problem, however, was that the Chancellor held excessive power. From Xiao He and Cao Can in the early Han onward, the Chancellor could effectively sideline the Emperor. Emperor Wu of Han responded by creating the “Inner Court” (内朝), an informal decision-making circle centered on the Master of Documents (尚书令) that bypassed the Chancellor’s formal Outer Court. In systems terms this was a bypass: an informal channel used to route around a bottleneck in the formal system.

By the Tang dynasty, this bypass was formally institutionalized as the Three Departments and Six Ministries system (see Part III: The Tang Three-Department System).

Second iteration: From Tang to Song

What was the problem with the Tang Three Departments system? The Chancellery’s (门下省) veto power was too strong, reducing decision-making efficiency. By the late Tang, the Secretariat and Chancellery merged into a single office, and the checks and balances of the three departments were effectively hollowed out.

The Song dynasty took a different approach: rather than restoring the independence of the three departments, it dispersed power from the “institutional” level to the “functional” level. The Secretariat-Chancellery handled administration, the Bureau of Military Affairs managed military matters, and the Three Fiscal Commissions controlled finance. Professor Deng Xiaonan, in her Ancestral Regulations (《祖宗之法》), analyzed in depth the design logic of this “Two Offices and Three Commissions” system^[4]: the core concern was not efficiency but stability. The arrangement ensured that no single institution simultaneously controlled administrative, military, and fiscal power, which is analogous to separation of duties and the Principle of Least Privilege in security engineering.
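The Song constraint can be read as an invariant over permission grants. The sketch below, in Python, checks that no single agent holds all three powers; the power names, the agent names, and the `concentration_violations` helper are hypothetical labels chosen for illustration.

```python
# The three powers the Song design kept apart (labels are illustrative).
POWERS = {"administration", "military", "finance"}

def concentration_violations(grants):
    """Return the agents that hold every power in POWERS at once."""
    return sorted(name for name, held in grants.items() if POWERS <= set(held))

# Song-style split: one power per institution.
song_style = {
    "secretariat_chancellery": {"administration"},
    "bureau_of_military_affairs": {"military"},
    "three_fiscal_commissions": {"finance"},
}

# Counterexample: a single chancellor-like agent holding everything.
single_chancellor = {
    "chancellor": {"administration", "military", "finance"},
}

print(concentration_violations(song_style))         # []
print(concentration_violations(single_chancellor))  # ['chancellor']
```

A check like this could run whenever roles are assigned to agents, so that a configuration which quietly re-concentrates power is rejected before it is deployed.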

Third iteration: From Song to Ming

The Ming founder Zhu Yuanzhang (Emperor Taizu) made a radical decision: abolish the Chancellorship entirely and have the Emperor manage the Six Ministries directly. The result was catastrophic information overload for the throne. The Grand Secretariat was later created as a remedy, forming a distinctive dual-track system (see Part IV: The Ming Dual-Track System).

Fourth iteration: The Qing Grand Council

The Yongzheng Emperor created the Grand Council (军机处) – an extremely small (3–6 persons), extremely fast (memorials processed the same day), and extremely secretive decision-making body. It was essentially a fast path that let high-priority matters bypass the usual checks and balances, analogous to a priority class under quality of service (QoS) in network engineering. The cost was that everything depended on the Emperor’s personal ability^[5].
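A fast path of this kind is easy to sketch as a router in front of an agent pipeline. In the Python sketch below, the `Task` type, the `draft`, `review`, and `execute` stages, and the `urgent` flag are all hypothetical stand-ins; real stages would call models rather than format strings.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    urgent: bool = False

# Placeholder stages that only record the order in which they ran.
def draft(task):
    return f"draft({task.name})"

def review(text):
    return f"review({text})"

def execute(text):
    return f"execute({text})"

def handle(task, trace):
    """Route urgent tasks around the review stage; send the rest through it."""
    if task.urgent:
        trace.append("fast_path")
        return execute(draft(task))        # no review step on this branch
    trace.append("full_pipeline")
    return execute(review(draft(task)))

trace = []
print(handle(Task("border alert", urgent=True), trace))  # execute(draft(border alert))
print(handle(Task("tax reform"), trace))                 # execute(review(draft(tax reform)))
print(trace)                                             # ['fast_path', 'full_pipeline']
```

The urgent branch is shorter precisely because it drops the review call, which mirrors the trade-off described above: speed is bought by removing the step that would catch a bad decision.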

Figure: Institutional iteration cycle, the evolutionary path from Qin to Qing

2.2 The Cross-Civilizational Perspective of Comparative Politics: Horizontal Comparison

If Qian Mu provided the longitudinal evolutionary perspective within Chinese civilization, then comparative politics provides a cross-civilizational horizontal comparative perspective.

Eisenstadt, in The Political Systems of Empires (1963), systematically compared the political structures of major empires throughout human history^[6]. His core finding was that different empires, when confronting similar fundamental problems, developed functionally equivalent yet structurally divergent institutional solutions due to differences in cultural traditions, geographic conditions, and economic models.

Perry Anderson, in Lineages of the Absolutist State (1974), further argued^[7]: behind superficially similar “centralization” lie entirely different logics of power operation. The French absolute monarch was still constrained by the nobility’s seigneurial rights and parliamentary traditions, while the Chinese emperor faced a civil bureaucracy produced by the examination system – two fundamentally different forms of countervailing power.

2.3 Fukuyama’s Three Dimensions → Three Trade-offs of Agent Orchestration

Francis Fukuyama, in The Origins of Political Order (2011), proposed three dimensions for understanding any political system^[8]:

| Political Dimension | AI Orchestration Dimension | Metrics | Cost of Extremes |
| --- | --- | --- | --- |
| State capacity | Execution efficiency | Task completion latency, throughput | Single point of failure, no error correction (Qin collapsed in two generations) |
| Rule of law | Error prevention | Output quality, consistency, safety | Decision paralysis, excessive redundancy (Song’s bloated bureaucracy and military) |
| Democratic accountability | Breadth of participation | Information aggregation, diversity | Consensus cost too high (Poland’s partition and extinction) |

These three form an “impossible triangle” – precisely the core tension that Huntington repeatedly emphasized in Political Order in Changing Societies^[9]. CivAgent’s 57 polities represent 57 different answers that human history has given to this “impossible triangle.”
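The tension among the three dimensions can be illustrated with a deliberately crude cost model. In the Python sketch below, the `Design` type, its two knobs, and the step-counting formula are assumptions made purely for illustration; they are not measurements and not CivAgent’s scoring method.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Design:
    name: str
    review_stages: int  # sequential checks before execution (error prevention)
    voters: int         # agents whose input is aggregated (breadth of participation)

    def steps(self) -> int:
        # Toy latency proxy (an assumption): one execution step,
        # one step per review stage, one extra round per additional voter.
        return 1 + self.review_stages + (self.voters - 1)

designs = [
    Design("single decider", review_stages=0, voters=1),
    Design("layered review", review_stages=3, voters=1),
    Design("broad assembly", review_stages=0, voters=7),
]

for d in designs:
    print(d.name, d.steps())
# single decider 1
# layered review 4
# broad assembly 7

fastest = min(designs, key=Design.steps)
print(fastest.name)  # single decider
```

Under this toy model, every review stage or extra voter adds latency, so the fastest design is also the one with no error correction and no participation, which is the first row of the table in miniature.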

Figure: Fukuyama’s impossible triangle of state capacity, rule of law, and democratic accountability

Next: CivAgent Series (II): A Typology of Six Orchestration Modes – distilling 6 core orchestration modes from 57 polities, in dialogue with Mintzberg’s organization theory and MAS research.

Project repository: github.com/LeoLin990405/CivAgent

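References

[1] Wooldridge, M. & Jennings, N. R. (1995). “Intelligent Agents: Theory and Practice.” The Knowledge Engineering Review, 10(2), 115–152.

[2] Dorri, A., Kanhere, S. S., & Jurdak, R. (2018). “Multi-Agent Systems: A Survey.” IEEE Access, 6, 28573–28593.

[3] Qian Mu (1952). 《中国历代政治得失》 (China’s Political Gains and Losses Across the Dynasties). Taipei: 东大图书.

[4] Deng Xiaonan (2006). 《祖宗之法：北宋前期政治述略》 (Ancestral Regulations). Beijing: 三联书店.

[5] Dai Yi (ed.) (1984). 《简明清史》 (2 vols.). Beijing: 人民出版社.

[6] Eisenstadt, S. N. (1963). The Political Systems of Empires: The Rise and Fall of the Historical Bureaucratic Societies. New York: Free Press.

[7] Anderson, P. (1974). Lineages of the Absolutist State. London: New Left Books.

[8] Fukuyama, F. (2011). The Origins of Political Order: From Prehuman Times to the French Revolution. New York: Farrar, Straus and Giroux.

[9] Huntington, S. P. (1968). Political Order in Changing Societies. New Haven: Yale University Press.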