We Open Sourced Nexus: The Judgment Layer for Multi-Agent AI Systems

Written by Gabe | Mar 25, 2026 12:04:24 PM

Multi-agent AI systems have an execution problem. Actually, they do not. Execution is the easy part. What they have is a coordination and judgment problem. We built Nexus to solve it. It is now open source.

Repo: https://github.com/PermaShipAI/nexus

Why Nexus exists

When you build a multi-agent system — specialized agents running across a codebase, discovering work independently — the first thing you realize is that agents are individually correct and collectively a disaster.

The CISO agent finds a real vulnerability and proposes a patch. The SRE agent identifies the same component and proposes an architectural change that would eliminate the entire class of problem. Both proposals are valid. Neither agent knows the other exists. Without a coordination layer, they ship conflicting changes to the same files.

That is the easy version of the problem. The harder version: agents that are locally optimal but globally disruptive. An agent proposes a dependency upgrade. Good upgrade. But the CI pipeline is red, staging has a blocked circular dependency, and two days ago the team issued a directive to hold on non-critical changes. The agent does not know any of that. It just sees a stale dependency.

This is not an orchestration problem. Orchestration assumes you know what needs doing and assigns it. These agents are discovering work independently. What you need is a layer that evaluates whether what they found is actually worth acting on, at the right time, for the right reason.

That is Nexus.

How Nexus works

Nexus is an executive agent that sits above all others. The rule is simple: only Nexus can create a ticket. Every other agent identifies work and submits a proposal. Nexus decides.

Cross-agent review. When multiple agents propose work touching the same component, Nexus does not simply pick a winner. It synthesizes them: it rejects the narrower proposal as a standalone change, merges its requirements into the broader one, and adds the originating agents as mandatory reviewers. One ticket comes out, not two conflicting ones.
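To make the synthesis step concrete, here is a minimal sketch of merging two proposals into one ticket. It is illustrative only: the `Proposal` and `Ticket` types, the numeric `scope` field, and the `synthesize` function are hypothetical names, not taken from the Nexus codebase, and a real judgment layer would likely use an LLM rather than a breadth score to decide which proposal is broader.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    agent: str
    component: str
    requirements: list[str]
    scope: int  # hypothetical breadth score: higher means a broader fix


@dataclass
class Ticket:
    component: str
    base_agent: str
    requirements: list[str]
    reviewers: list[str] = field(default_factory=list)


def synthesize(proposals: list[Proposal]) -> Ticket:
    """Merge proposals that target one component into a single ticket."""
    if not proposals:
        raise ValueError("need at least one proposal")
    if len({p.component for p in proposals}) != 1:
        raise ValueError("proposals must target the same component")

    # The broadest proposal becomes the base plan; narrower ones are folded in.
    base = max(proposals, key=lambda p: p.scope)
    ordered = [base] + [p for p in proposals if p is not base]

    merged: list[str] = []
    for proposal in ordered:
        for requirement in proposal.requirements:
            if requirement not in merged:  # drop duplicate requirements
                merged.append(requirement)

    # Every originating agent becomes a mandatory reviewer.
    reviewers = sorted({p.agent for p in proposals})
    return Ticket(base.component, base.agent, merged, reviewers)


ciso = Proposal("CISO", "auth-service", ["patch token validation"], scope=1)
sre = Proposal(
    "SRE",
    "auth-service",
    ["replace session store", "patch token validation"],
    scope=3,
)
ticket = synthesize([ciso, sre])
```

In this sketch the SRE proposal becomes the base plan, the CISO requirement is kept once rather than duplicated, and both agents end up on the reviewer list.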

Temporal judgment. Nexus tracks system state: CI health, active incidents, error budgets, strategic directives. The same proposal that gets approved during normal operations gets deferred if the system is in incident mitigation mode. Same proposal, different outcome. Context matters more than correctness.
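The "same proposal, different outcome" behavior can be sketched as a function of the proposal plus a snapshot of system state. This is an assumption-laden illustration: the `SystemState` fields, the `judge` function, and the rule that critical work bypasses the checks are all hypothetical, chosen to mirror the signals named above rather than to describe how Nexus actually implements them.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    APPROVE = "approve"
    DEFER = "defer"


@dataclass
class SystemState:
    ci_green: bool
    active_incident: bool
    error_budget_remaining: float  # fraction between 0.0 and 1.0
    change_freeze: bool  # a directive to hold non-critical changes


def judge(critical: bool, state: SystemState) -> tuple[Outcome, str]:
    """Return an outcome and a rationale for one proposal in one state."""
    # Assumption for this sketch: critical work is not held back by state.
    if critical:
        return Outcome.APPROVE, "critical change"
    if state.active_incident:
        return Outcome.DEFER, "incident mitigation in progress"
    if state.change_freeze:
        return Outcome.DEFER, "directive holds non-critical changes"
    if not state.ci_green:
        return Outcome.DEFER, "CI pipeline is red"
    if state.error_budget_remaining <= 0.0:
        return Outcome.DEFER, "error budget exhausted"
    return Outcome.APPROVE, "system is healthy"


healthy = SystemState(True, False, 0.6, False)
incident = SystemState(True, True, 0.6, False)

# The same non-critical dependency upgrade, judged in two different states.
normal_outcome, normal_reason = judge(critical=False, state=healthy)
incident_outcome, incident_reason = judge(critical=False, state=incident)
```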

Conflict detection. Agents tag the files, routes, and components their proposals touch. Nexus evaluates actual overlap, not just text similarity, and prevents proposals from conflicting at execution time.
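One simple way to evaluate actual overlap is to compare the tagged sets directly instead of comparing proposal text. The sketch below assumes each proposal carries a dictionary of tags keyed by `files`, `routes`, and `components`; that shape, the `overlap` helper, and the example paths are hypothetical and are not the Nexus data model.

```python
KINDS = ("files", "routes", "components")


def overlap(a: dict[str, set[str]], b: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return the tags two proposals share, grouped by kind."""
    shared: dict[str, set[str]] = {}
    for kind in KINDS:
        common = a.get(kind, set()) & b.get(kind, set())
        if common:
            shared[kind] = common
    return shared


def conflicts(a: dict[str, set[str]], b: dict[str, set[str]]) -> bool:
    """Two proposals conflict if they share any tagged file, route, or component."""
    return bool(overlap(a, b))


# Hypothetical tags for three proposals.
upgrade = {"files": {"package.json"}, "components": {"build"}}
patch = {
    "files": {"src/auth/token.ts"},
    "routes": {"/login"},
    "components": {"auth"},
}
refactor = {
    "files": {"src/auth/token.ts", "src/auth/session.ts"},
    "components": {"auth"},
}

shared = overlap(patch, refactor)
```

Here `patch` and `refactor` collide on a file and a component even if their descriptions read nothing alike, while `upgrade` touches neither and can proceed independently.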

Rejection mechanics. Rejection is not binary. A proposal that fundamentally conflicts with core principles gets killed entirely — Hard Rejection. A proposal where the problem is valid but the execution plan is flawed gets kicked back to the originating agent with specific feedback to resubmit — Deferral. No proposal is ever silently dropped. Every decision is logged with explicit rationale.
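The three outcomes and the "never silently dropped" guarantee can be sketched as a decision type plus a log that refuses entries without a rationale. The names here (`Decision`, `LogEntry`, `DecisionLog`, `review`) and the two boolean inputs are hypothetical simplifications, not the Nexus API; in practice those judgments would come from the executive agent itself.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    HARD_REJECTION = "hard_rejection"  # conflicts with core principles
    DEFERRAL = "deferral"  # valid problem, flawed plan: resubmit


@dataclass(frozen=True)
class LogEntry:
    proposal_id: str
    decision: Decision
    rationale: str
    feedback: str = ""


class DecisionLog:
    def __init__(self) -> None:
        self.entries: list[LogEntry] = []

    def record(self, proposal_id: str, decision: Decision,
               rationale: str, feedback: str = "") -> LogEntry:
        # No decision is stored without an explicit rationale.
        if not rationale.strip():
            raise ValueError("every decision needs an explicit rationale")
        # A deferral is only useful if the agent knows what to fix.
        if decision is Decision.DEFERRAL and not feedback.strip():
            raise ValueError("a deferral must carry feedback for resubmission")
        entry = LogEntry(proposal_id, decision, rationale, feedback)
        self.entries.append(entry)
        return entry


def review(proposal_id: str, violates_core_principle: bool,
           plan_is_sound: bool, log: DecisionLog) -> Decision:
    """Classify one proposal and log the outcome before returning it."""
    if violates_core_principle:
        log.record(proposal_id, Decision.HARD_REJECTION,
                   "conflicts with a core principle")
        return Decision.HARD_REJECTION
    if not plan_is_sound:
        log.record(proposal_id, Decision.DEFERRAL,
                   "problem is valid but the plan is flawed",
                   feedback="revise the execution plan and resubmit")
        return Decision.DEFERRAL
    log.record(proposal_id, Decision.APPROVED, "problem and plan both hold up")
    return Decision.APPROVED


log = DecisionLog()
results = [
    review("p-1", violates_core_principle=True, plan_is_sound=True, log=log),
    review("p-2", violates_core_principle=False, plan_is_sound=False, log=log),
    review("p-3", violates_core_principle=False, plan_is_sound=True, log=log),
]
```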

Systemic fortification. If Nexus detects a pattern of rejections — multiple agents keep proposing work that violates the same architectural constraint — it does not keep rejecting them individually. It triggers a Knowledge Base update and encodes a new Project Rule. Future proposals stop making the same mistake before they ever reach Nexus.
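A rough sketch of the pattern-detection step: count rejections per violated constraint and promote the constraint to a rule once enough distinct agents have hit it. The `RejectionTracker` class, its thresholds, and the example constraint text are hypothetical; the source does not say what thresholds Nexus uses or how a Project Rule is stored.

```python
from collections import defaultdict


class RejectionTracker:
    def __init__(self, min_agents: int = 2, min_rejections: int = 3) -> None:
        # Hypothetical thresholds for what counts as a "pattern".
        self.min_agents = min_agents
        self.min_rejections = min_rejections
        self._by_constraint: dict[str, list[str]] = defaultdict(list)
        self.project_rules: list[str] = []

    def record_rejection(self, agent: str, constraint: str) -> bool:
        """Record a rejection; return True if it causes a new rule to be encoded."""
        agents = self._by_constraint[constraint]
        agents.append(agent)
        is_pattern = (
            len(agents) >= self.min_rejections
            and len(set(agents)) >= self.min_agents
        )
        if is_pattern and constraint not in self.project_rules:
            # Stand-in for the Knowledge Base update described above.
            self.project_rules.append(constraint)
            return True
        return False


tracker = RejectionTracker()
constraint = "no direct database access from request handlers"
first = tracker.record_rejection("SRE", constraint)
second = tracker.record_rejection("QA Manager", constraint)
third = tracker.record_rejection("FinOps", constraint)
fourth = tracker.record_rejection("SRE", constraint)
```

The third rejection crosses both thresholds and encodes the rule; the fourth is still counted but does not create a duplicate rule.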

Every proposal submitted to Nexus must follow a Decision Brief format:

- Problem statement (user harm / business risk)
- Evidence (metrics, incidents, frequency)
- Proposed change (what exactly)
- Alternatives considered
- Risks (security, reliability, correctness, UX)
- Dependencies / prerequisites
- Effort estimate
- Measurement plan
- Rollout / rollback plan
- Required reviewers

No brief, no ticket.
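The brief maps naturally onto a typed record with a completeness check in front of ticket creation. The sketch below is one possible encoding, not the Nexus schema: the field names are derived from the list above, while `DecisionBrief`, `missing_sections`, `create_ticket`, and the sample values are hypothetical.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class DecisionBrief:
    problem_statement: str
    evidence: str
    proposed_change: str
    alternatives_considered: str
    risks: str
    dependencies: str
    effort_estimate: str
    measurement_plan: str
    rollout_rollback_plan: str
    required_reviewers: list[str]


def missing_sections(brief: DecisionBrief) -> list[str]:
    """Return the names of sections that are blank or empty."""
    missing = []
    for f in fields(brief):
        value = getattr(brief, f.name)
        empty = not value.strip() if isinstance(value, str) else len(value) == 0
        if empty:
            missing.append(f.name)
    return missing


def create_ticket(brief: DecisionBrief) -> dict:
    """No brief, no ticket: refuse to create a ticket from an incomplete brief."""
    missing = missing_sections(brief)
    if missing:
        raise ValueError("incomplete brief, missing: " + ", ".join(missing))
    return {
        "title": brief.proposed_change,
        "reviewers": list(brief.required_reviewers),
    }


complete = DecisionBrief(
    problem_statement="Expired tokens are accepted for up to an hour",
    evidence="Two incidents in the last 30 days",
    proposed_change="Validate token expiry on every request",
    alternatives_considered="Shorter token lifetime",
    risks="Added latency on the auth path",
    dependencies="none",
    effort_estimate="2 days",
    measurement_plan="Track rejected-token rate",
    rollout_rollback_plan="Feature flag, disable to roll back",
    required_reviewers=["CISO", "SRE"],
)
ticket = create_ticket(complete)
incomplete = replace(complete, evidence="", required_reviewers=[])
```

Sections that genuinely do not apply, such as dependencies, are written as an explicit "none" in this sketch so that a blank field always means the author skipped it.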

The agent team

Nexus ships with nine built-in specialist agents: CISO, QA Manager, SRE, Product Manager, UX Designer, Release Engineering, FinOps, AgentOps, and VOC. Each is defined in a customizable markdown persona file. You can modify any of them or import more from the community.

Why we open sourced it

The gatekeeper architecture is the interesting part. The multi-agent coordination problem is real and most implementations punt on it entirely. We wanted to put the decision layer out in the open.

Nexus is also the proof of concept that made PermaShip possible. We built it to govern our own engineering operations, and it runs in production on our own codebase today. Open sourcing it fits how we think about the relationship between the OSS layer and the full platform: Nexus is the judgment layer that anyone can run, and PermaShip is the production-grade execution infrastructure built on top of it.

What you can do with it

Run it locally in minutes:

npx nexus-command

No Docker, no Postgres, no configuration required. It opens at localhost:3000, and the setup wizard prompts for your LLM API key on first run.

Nexus supports Gemini, Claude, OpenAI, Ollama (local/free), and OpenRouter, so you are not locked to any provider. Local model support is fully functional; that was the first question from the community, and the answer is yes.

Execution backends include Claude Code, Gemini CLI, Codex CLI, and OpenClaw. Approved tickets dispatch automatically to whichever backend you configure.

What we want to learn from the community

We are particularly interested in how people are hitting the coordination wall in their own multi-agent systems. Every architecture has its own version of the locally-optimal-but-globally-disruptive problem. If you are building in this space, the issues and Discord are open.

Repo: https://github.com/PermaShipAI/nexus

Discord: https://discord.gg/JMMMT9EDVq