Multiagent Systems
Showing new listings for Friday, 18 April 2025
- [1] arXiv:2504.12714 [pdf, html, other]
Title: Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination
Comments: Accepted to CogSci 2025, In-review for ICML 2025
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Zero-shot coordination (ZSC), the ability to adapt to a new partner in a cooperative task, is a critical component of human-compatible AI. While prior work has focused on training agents to cooperate on a single task, these specialized models do not generalize to new tasks, even if they are highly similar. Here, we study how reinforcement learning on a distribution of environments with a single partner enables learning general cooperative skills that support ZSC with many new partners on many new problems. We introduce two Jax-based, procedural generators that create billions of solvable coordination challenges. We develop a new paradigm called Cross-Environment Cooperation (CEC), and show that it outperforms competitive baselines quantitatively and qualitatively when collaborating with real people. Our findings suggest that learning to collaborate across many unique scenarios encourages agents to develop general norms, which prove effective for collaboration with different partners. Together, our results suggest a new route toward designing generalist cooperative agents capable of interacting with humans without requiring human data.
- [2] arXiv:2504.12735 [pdf, html, other]
Title: The Athenian Academy: A Seven-Layer Architecture Model for Multi-Agent Systems
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)
This paper proposes the "Academy of Athens" multi-agent seven-layer framework, aimed at systematically addressing challenges in multi-agent systems (MAS) within artificial intelligence (AI) art creation, such as collaboration efficiency, role allocation, environmental adaptation, and task parallelism. The framework divides MAS into seven layers: multi-agent collaboration, single-agent multi-role playing, single-agent multi-scene traversal, single-agent multi-capability incarnation, different single agents using the same large model to achieve the same target agent, single-agent using different large models to achieve the same target agent, and multi-agent synthesis of the same target agent. Through experimental validation in art creation, the framework demonstrates its unique advantages in task collaboration, cross-scene adaptation, and model fusion. This paper further discusses current challenges such as collaboration mechanism optimization, model stability, and system security, proposing future exploration through technologies like meta-learning and federated learning. The framework provides a structured methodology for multi-agent collaboration in AI art creation and promotes innovative applications in the art field.
- [3] arXiv:2504.12777 [pdf, html, other]
Title: Multi-Agent Reinforcement Learning Simulation for Environmental Policy Synthesis
Comments: Published in AAMAS'25 Blue Sky Ideas Track
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)
Climate policy development faces significant challenges due to deep uncertainty, complex system dynamics, and competing stakeholder interests. Climate simulation methods, such as Earth System Models, have become valuable tools for policy exploration. However, their typical use is for evaluating potential policies, rather than directly synthesizing them. The problem can be inverted to optimize for policy pathways, but traditional optimization approaches often struggle with non-linear dynamics, heterogeneous agents, and comprehensive uncertainty quantification. We propose a framework for augmenting climate simulations with Multi-Agent Reinforcement Learning (MARL) to address these limitations. We identify key challenges at the interface between climate simulations and the application of MARL in the context of policy synthesis, including reward definition, scalability with increasing agents and state spaces, uncertainty propagation across linked systems, and solution validation. Additionally, we discuss challenges in making MARL-derived solutions interpretable and useful for policy-makers. Our framework provides a foundation for more sophisticated climate policy exploration while acknowledging important limitations and areas for future research.
- [4] arXiv:2504.12961 [pdf, html, other]
Title: QLLM: Do We Really Need a Mixing Network for Credit Assignment in Multi-Agent Reinforcement Learning?
Comments: 9 pages, 7 figures
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)
Credit assignment has remained a fundamental challenge in multi-agent reinforcement learning (MARL). Previous studies have primarily addressed this issue through value decomposition methods under the centralized training with decentralized execution paradigm, where neural networks are utilized to approximate the nonlinear relationship between individual Q-values and the global Q-value. Although these approaches have achieved considerable success in various benchmark tasks, they still suffer from several limitations, including imprecise attribution of contributions, limited interpretability, and poor scalability in high-dimensional state spaces. To address these challenges, we propose a novel algorithm, QLLM, which facilitates the automatic construction of credit assignment functions using large language models (LLMs). Specifically, the concept of TFCAF is introduced, wherein the credit allocation process is represented as a direct and expressive nonlinear functional formulation. A custom-designed coder-evaluator framework is further employed to guide the generation, verification, and refinement of executable code by LLMs, significantly mitigating issues such as hallucination and shallow reasoning during inference. Extensive experiments conducted on several standard MARL benchmarks demonstrate that the proposed method consistently outperforms existing state-of-the-art baselines. Moreover, QLLM exhibits strong generalization capability and maintains compatibility with a wide range of MARL algorithms that utilize mixing networks, positioning it as a promising and versatile solution for complex multi-agent scenarios.
New submissions (showing 4 of 4 entries)
- [5] arXiv:2504.12345 (cross-list from cs.CL) [pdf, html, other]
Title: Reimagining Urban Science: Scaling Causal Inference with Large Language Models
Authors: Yutong Xia, Ao Qu, Yunhan Zheng, Yihong Tang, Dingyi Zhuang, Yuxuan Liang, Cathy Wu, Roger Zimmermann, Jinhua Zhao
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY); Multiagent Systems (cs.MA)
Urban causal research is essential for understanding the complex dynamics of cities and informing evidence-based policies. However, it is challenged by the inefficiency and bias of hypothesis generation, barriers to multimodal data complexity, and the methodological fragility of causal experimentation. Recent advances in large language models (LLMs) present an opportunity to rethink how urban causal analysis is conducted. This Perspective examines current urban causal research by analyzing taxonomies that categorize research topics, data sources, and methodological approaches to identify structural gaps. We then introduce an LLM-driven conceptual framework, AutoUrbanCI, composed of four distinct modular agents responsible for hypothesis generation, data engineering, experiment design and execution, and results interpretation with policy recommendations. We propose evaluation criteria for rigor and transparency and reflect on implications for human-AI collaboration, equity, and accountability. We call for a new research agenda that embraces AI-augmented workflows not as replacements for human expertise but as tools to broaden participation, improve reproducibility, and unlock more inclusive forms of urban causal reasoning.
- [6] arXiv:2504.12612 (cross-list from cs.AI) [pdf, html, other]
Title: The Chronicles of Foundation AI for Forensics of Multi-Agent Provenance
Subjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Multiagent Systems (cs.MA)
Provenance is the chronology of things, resonating with the fundamental pursuit to uncover origins, trace connections, and situate entities within the flow of space and time. As artificial intelligence advances towards autonomous agents capable of interactive collaboration on complex tasks, the provenance of generated content becomes entangled in the interplay of collective creation, where contributions are continuously revised, extended or overwritten. In a multi-agent generative chain, content undergoes successive transformations, often leaving little, if any, trace of prior contributions. In this study, we investigate the problem of tracking multi-agent provenance across the temporal dimension of generation. We propose a chronological system for post hoc attribution of generative history from content alone, without reliance on internal memory states or external meta-information. At its core lies the notion of symbolic chronicles, representing signed and time-stamped records, in a form analogous to the chain of custody in forensic science. The system operates through a feedback loop, whereby each generative timestep updates the chronicle of prior interactions and synchronises it with the synthetic content in the very act of generation. This research seeks to develop an accountable form of collaborative artificial intelligence within evolving cyber ecosystems.
- [7] arXiv:2504.12733 (cross-list from cs.CR) [pdf, other]
Title: Adversary-Augmented Simulation for Fairness Evaluation and Defense in Hyperledger Fabric
Comments: 20 pages, 14 figures. arXiv admin note: text overlap with arXiv:2403.14342
Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)
This paper presents an adversary model and a simulation framework specifically tailored for analyzing attacks on distributed systems composed of multiple distributed protocols, with a focus on assessing the security of blockchain networks. Our model classifies and constrains adversarial actions based on the assumptions of the target protocols, defined by failure models, communication models, and the fault tolerance thresholds of Byzantine Fault Tolerant (BFT) protocols. The goal is to study not only the intended effects of adversarial strategies but also their unintended side effects on critical system properties. We apply this framework to analyze fairness properties in a Hyperledger Fabric (HF) blockchain network. Our focus is on novel fairness attacks that involve coordinated adversarial actions across various HF services. Simulations show that even a constrained adversary can violate fairness with respect to specific clients (client fairness) and impact related guarantees (order fairness), which relate the reception order of transactions to their final order in the blockchain. This paper significantly extends our previous work by introducing and evaluating a mitigation mechanism specifically designed to counter transaction reordering attacks. We implement and integrate this defense into our simulation environment, demonstrating its effectiveness under diverse conditions.
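The abstract describes order fairness as relating the reception order of transactions to their final order in the blockchain. As a rough illustration only, one simple way to quantify a mismatch between the two orders is to count pairwise inversions. The function name `order_inversions` and the transaction identifiers are hypothetical; this is not the paper's fairness definition or its simulation framework.

```python
from itertools import combinations


def order_inversions(reception_order, final_order):
    """Count transaction pairs whose relative order differs between
    reception and the final ledger (hypothetical metric, for illustration)."""
    # Position of each transaction in the final (committed) order.
    position = {tx: i for i, tx in enumerate(final_order)}
    # Only transactions that were both received and committed are compared.
    common = [tx for tx in reception_order if tx in position]
    # A pair (a, b) received as a-before-b is inverted if b is committed first.
    return sum(1 for a, b in combinations(common, 2) if position[a] > position[b])


if __name__ == "__main__":
    received = ["t1", "t2", "t3"]
    print(order_inversions(received, ["t1", "t2", "t3"]))
    print(order_inversions(received, ["t3", "t1", "t2"]))
```

A count of zero means the final order preserves the reception order for every compared pair; larger counts indicate more reordering under this simplified measure.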
Cross submissions (showing 3 of 3 entries)
- [8] arXiv:2501.07744 (replaced) [pdf, html, other]
Title: CBS with Continuous-Time Revisit
Subjects: Multiagent Systems (cs.MA)
Multi-Agent Path Finding in Continuous Time (MAPF_R) extends the classical MAPF problem by allowing agents to operate in continuous time. Conflict-Based Search with Continuous Time (CCBS) is a foundational algorithm for solving MAPF_R optimally. In this paper, we revisit the theoretical claims of CCBS and show the algorithm is incomplete, due to an uncountably infinite state space created by continuous wait durations. Through theoretical analysis and counter-examples, we examine the inherent challenges of extending existing MAPF solvers to address MAPF_R while preserving optimality guarantees. By restricting waiting duration to fixed amounts, we identify a related sub-problem on graphs, MAPF_R-DT, which we show is optimally solvable, including by CCBS. It remains an open question whether similar models exist for MAPF_R-CT, a generalised version of MAPF_R-DT that allows arbitrary wait times, and MAPF_R-CS, which further allows arbitrary movements in continuous space.
- [9] arXiv:2311.10176 (replaced) [pdf, html, other]
Title: Scalable Multi-Robot Motion Planning Using Guidance-Informed Hypergraphs
Comments: This work has been submitted for review
Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA)
In this work, we propose a method for multiple mobile robot motion planning that efficiently plans for robot teams up to an order of magnitude larger than existing state-of-the-art methods in congested settings with narrow passages in the environment. We achieve this improvement in scalability by adapting the state-of-the-art Decomposable State Space Hypergraph (DaSH) planning framework to expand the set of problems it can support to include those without a highly structured planning space and those with kinodynamic constraints. We accomplish this by exploiting guidance about a problem's structure to limit exploration of the planning space and through modifying DaSH's conflict resolution scheme. This guidance captures when coordination between robots is necessary, allowing us to decompose the intractably large multi-robot search space while limiting risk of inter-robot conflicts by composing relevant robot groups together while planning.
- [10] arXiv:2410.12544 (replaced) [pdf, html, other]
Title: Nash equilibria in scalar discrete-time linear quadratic games
Comments: Updated based on the reviews from ECC25. Camera ready version
Subjects: Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
An open problem in linear quadratic (LQ) games has been characterizing the Nash equilibria. This problem has renewed relevance given the surge of work on understanding the convergence of learning algorithms in dynamic games. This paper investigates scalar discrete-time infinite-horizon LQ games with two agents. Even in this arguably simple setting, there are no results for finding all Nash equilibria. By analyzing the best response map, we formulate a polynomial system of equations characterizing the linear feedback Nash equilibria. This enables us to bring in tools from algebraic geometry, particularly the Gröbner basis, to study the roots of this polynomial system. Consequently, we can not only compute all Nash equilibria numerically, but we can also characterize their number with explicit conditions. For instance, we prove that the LQ games under consideration admit at most three Nash equilibria. We further provide sufficient conditions for the existence of at most two Nash equilibria and sufficient conditions for the uniqueness of the Nash equilibrium. Our numerical experiments demonstrate the tightness of our bounds and showcase the increased complexity in settings with more than two agents.
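To make the best response map concrete, here is a minimal sketch under assumed textbook conventions, not the paper's formulation: dynamics x' = a*x + b1*u1 + b2*u2, stage cost q_i*x^2 + r_i*u_i^2 for agent i, and linear feedback u_i = -k_i*x. With the other agent's gain fixed, each agent faces a scalar LQR problem whose Riccati equation has a closed-form positive root. The sketch alternates best responses, which can locate at most one equilibrium; it is not the Gröbner-basis method the abstract describes for finding all of them. All function and parameter names are illustrative.

```python
import math


def best_response(a_eff, b, q, r):
    """Scalar LQR gain and cost coefficient for x' = a_eff*x + b*u
    with stage cost q*x^2 + r*u^2 (assumes b != 0, q > 0, r > 0)."""
    # Positive root of b^2 p^2 + (r(1 - a_eff^2) - q b^2) p - q r = 0,
    # which is the scalar discrete-time Riccati equation rearranged.
    c = r * (1.0 - a_eff ** 2) - q * b ** 2
    p = (-c + math.sqrt(c ** 2 + 4.0 * b ** 2 * q * r)) / (2.0 * b ** 2)
    k = a_eff * b * p / (r + b ** 2 * p)
    return k, p


def iterate_best_responses(a, b1, b2, q1, q2, r1, r2, iters=200):
    """Alternate best responses from k1 = k2 = 0 (convergence not guaranteed)."""
    k1 = k2 = 0.0
    for _ in range(iters):
        # Each agent sees the dynamics shifted by the other's feedback.
        k1, _ = best_response(a - b2 * k2, b1, q1, r1)
        k2, _ = best_response(a - b1 * k1, b2, q2, r2)
    return k1, k2


if __name__ == "__main__":
    k1, k2 = iterate_best_responses(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
    # At a fixed point, each gain is a best response to the other.
    residual = abs(k1 - best_response(1.0 - k2, 1.0, 1.0, 1.0)[0])
    print(k1, k2, residual)
```

A pair of gains with zero residual for both agents is a linear feedback Nash equilibrium candidate under these assumptions; enumerating every such pair requires solving the coupled polynomial system rather than iterating.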
- [11] arXiv:2503.24047 (replaced) [pdf, html, other]
Title: Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents
Comments: 34 pages, 10 figures
Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
As scientific research becomes increasingly complex, innovative tools are needed to manage vast data, facilitate interdisciplinary collaboration, and accelerate discovery. Large language models (LLMs) are now evolving into LLM-based scientific agents that automate critical tasks, ranging from hypothesis generation and experiment design to data analysis and simulation. Unlike general-purpose LLMs, these specialized agents integrate domain-specific knowledge, advanced tool sets, and robust validation mechanisms, enabling them to handle complex data types, ensure reproducibility, and drive scientific breakthroughs. This survey provides a focused review of the architectures, design, benchmarks, applications, and ethical considerations surrounding LLM-based scientific agents. We highlight why they differ from general agents and the ways in which they advance research across various scientific fields. By examining their development and challenges, this survey offers a comprehensive roadmap for researchers and practitioners to harness these agents for more efficient, reliable, and ethically sound scientific discovery.