Governance boundary: what should MetaGPT-generated agents be allowed to do? #2072

@cschanhniem

Context

MetaGPT generates multi-agent systems that execute code, modify files, and interact with APIs. But there's no governance layer between "MetaGPT designs an agent" and "that agent executes in a user's repository."

This matters because MetaGPT is a library — users embed it in their workflows. When a MetaGPT-generated agent runs, it has the same permissions as the process that spawned it.

The gap

Currently, the model is:

User request → MetaGPT creates agents → Agents execute tools → Files modified

There's no step where the user or repo can say: "Generated agents may write to output/ but not to src/", or "Agents need approval before sending API requests."

Proposal: Agent capability declaration + policy boundary

Two complementary mechanisms:

1. Agent-generated capability manifest

When MetaGPT generates an agent, it should produce a capability declaration alongside the agent definition:

# meta-generated agent output
agent_capabilities = {
    "role": "Architect",
    "capabilities": ["filesystem:write:output/*", "api:read"],
    "require_approval_for": ["filesystem:write:src/*", "shell:execute"]
}
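As a rough illustration of how such a manifest could be consumed, here is a minimal sketch that matches an action string against the declared patterns. The `check_action` helper, the three result strings, and the deny-by-default behaviour are assumptions made for this example, not part of MetaGPT.

```python
from fnmatch import fnmatchcase

# Manifest in the shape proposed above.
agent_capabilities = {
    "role": "Architect",
    "capabilities": ["filesystem:write:output/*", "api:read"],
    "require_approval_for": ["filesystem:write:src/*", "shell:execute"],
}


def check_action(manifest: dict, action: str) -> str:
    """Hypothetical helper: return 'require_approval', 'allow', or 'deny'."""
    # Approval patterns are checked first so they win over a broader allow pattern.
    if any(fnmatchcase(action, p) for p in manifest["require_approval_for"]):
        return "require_approval"
    if any(fnmatchcase(action, p) for p in manifest["capabilities"]):
        return "allow"
    # Assumption of this sketch: anything not declared is denied.
    return "deny"


for action in ("filesystem:write:output/design.md",
               "filesystem:write:src/main.py",
               "api:write"):
    print(action, "->", check_action(agent_capabilities, action))
```

Checking the approval list before the allow list is one possible precedence; the proposal itself does not specify how overlapping patterns should be resolved.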

2. Repo-level policy file (META.yml or AGENTOWNERS.yml)

A governance file at the repo root that gates what ANY agent (MetaGPT-generated or otherwise) may do:

# AGENTOWNERS.yml
agents:
  - path: "src/**"
    permissions: require_approval
  - path: "output/**"
    permissions: allow
  - path: "tests/**"
    permissions: allow
  - path: "*.md"
    permissions: allow
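A sketch of how these rules might be evaluated for a given path, assuming first-match-wins ordering and a default of `deny` when no rule matches (neither is specified in the proposal). The rules are written as Python data to keep the example dependency-free, and `evaluate` is a hypothetical name. Note that `fnmatchcase` lets `*` match across `/`, so `*.md` here also matches Markdown files in subdirectories, which differs from gitignore-style globbing.

```python
from fnmatch import fnmatchcase

# The AGENTOWNERS.yml rules above, expressed as Python data.
RULES = [
    {"path": "src/**", "permissions": "require_approval"},
    {"path": "output/**", "permissions": "allow"},
    {"path": "tests/**", "permissions": "allow"},
    {"path": "*.md", "permissions": "allow"},
]


def evaluate(path: str, rules=RULES, default: str = "deny") -> str:
    """Hypothetical evaluator: return the permission of the first matching rule."""
    for rule in rules:
        if fnmatchcase(path, rule["path"]):
            return rule["permissions"]
    # Assumption of this sketch: paths with no matching rule are denied.
    return default


for path in ("src/app/main.py", "output/report.txt", "README.md", "setup.py"):
    print(path, "->", evaluate(path))
```

Because the first match wins, `src/README.md` resolves to `require_approval` rather than `allow`, so rule order in the file would be significant under this reading.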

Why this matters for MetaGPT specifically

As far as I know, MetaGPT is the only multi-agent framework where agents design and spawn other agents. This creates a transitive trust problem: if MetaGPT generates an agent that itself generates code, the user needs governance at every layer, not just at the entry point.

Without a policy boundary, MetaGPT-generated agents inherit the full permissions of the process. That exposes the same class of vulnerability as an unguarded eval(), which has already surfaced in other frameworks (e.g., CVE-2026-2275).

Next steps

  • Discuss whether capability declaration should be part of the core agent model or a plugin
  • Consider integration with the AGENTS.md proposal (Add AGENTS.md — guide AI coding assistants working in this repo #2045) — AGENTS.md for instructions, AGENTOWNERS for permissions
  • Explore deterministic permission checking (no LLM gatekeeping — binary pass/fail) as the enforcement model
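To make the deterministic enforcement idea concrete, here is a minimal sketch of a guard around a file-write tool: the decision comes from a plain function, no LLM is consulted, and the write either happens or raises. All names (`guarded_write`, `decide`, `approve`, `write`) are hypothetical and not taken from MetaGPT; the in-memory `store` stands in for the filesystem.

```python
from typing import Callable, Dict


def guarded_write(path: str, data: str,
                  decide: Callable[[str], str],
                  approve: Callable[[str], bool],
                  write: Callable[[str, str], None]) -> None:
    """Perform the write only if policy allows it; otherwise raise PermissionError."""
    decision = decide(path)
    if decision == "allow":
        write(path, data)
    elif decision == "require_approval" and approve(path):
        # approve() is where a human or CI check would be consulted.
        write(path, data)
    else:
        raise PermissionError(f"write to {path!r} blocked by policy ({decision})")


def decide(path: str) -> str:
    # Toy policy for the example: output/ is allowed, src/ needs approval.
    if path.startswith("output/"):
        return "allow"
    if path.startswith("src/"):
        return "require_approval"
    return "deny"


store: Dict[str, str] = {}  # in-memory stand-in for the filesystem
guarded_write("output/a.txt", "hi", decide,
              approve=lambda p: False, write=store.__setitem__)

try:
    guarded_write("src/main.py", "x = 1", decide,
                  approve=lambda p: False, write=store.__setitem__)
except PermissionError as exc:
    print("blocked:", exc)
```

Passing `decide` and `approve` in as functions keeps the guard independent of where the policy lives, so the same wrapper would work whether capability checks end up in the core agent model or in a plugin.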

I've been working on deterministic policy evaluation for AI agents and am happy to discuss patterns that fit MetaGPT's architecture.
