Neat Autonomous Agent Framework (NAAF)

Every agent action gets evidence, justification, and rollback.

NAAF wraps any AI agent with EBAS epistemic gating — so every decision has a hash-chained audit trail, every refusal is a first-class record, and nothing touches production without proof.


The problem

Today

Agent → tool call → system change

if something breaks:
✗ no record of why
✗ no evidence chain
✗ rollback is manual
✗ no audit trail
✗ confidence = authorization

With NAAF

Agent → propose → evidence → gate → act

every action has:
✓ hash-chained justification
✓ evidence references
✓ verified rollback plan
✓ tamper-evident audit log
✓ refusal as first-class outcome
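To make "hash-chained" and "tamper-evident" concrete, here is a minimal sketch of an append-only log in which each entry's SHA-256 hash covers the previous entry's hash. The record fields, the `GENESIS` value, and the function names are illustrative assumptions, not NAAF's actual schema or API.

```python
import hashlib
import json

# Assumption: the first entry chains to a fixed all-zero placeholder hash.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(chain: list) -> bool:
    """Recompute every hash; an edit to any earlier record breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


# Hypothetical records: one allowed action and one refusal, both logged.
chain = []
append(chain, {"action": "file_write", "outcome": "PROCEED_ALLOWED", "evidence": ["ticket-123"]})
append(chain, {"action": "deploy", "outcome": "REFUSED", "reason": "missing rollback plan"})
intact = verify(chain)

# Rewriting history after the fact is detected on the next verification.
chain[0]["record"]["evidence"] = ["forged"]
tampered_detected = not verify(chain)
```

Refusals go into the same chain as allowed actions, which is one way to treat refusal as a first-class outcome.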

Benchmarks — 4,270 evaluations, three domains

0

False assertions

Across all domains. The gate catches everything.

93.45%

3B + EBAS accuracy

Exceeds 70B baseline. At 1/15th the cost.

91.7%

Pre-gate early exit

LLM never called. Compute saved before inference.

6μs

Gate latency

SHA-256 + set intersection. Effectively free.

The slide

System                     Accuracy   Cost (vs. 3B baseline)   Wrong assertions
70B baseline               93.15%     23x                      137
3B baseline                73.0%      1x                       529
3B + EBAS gate             73.0%      1x                       0
3B + EBAS gated recovery   93.45%     1.54x                    0

A 3B model with EBAS gating and retries exceeds 70B accuracy at 1/15th the cost — with zero wrong assertions. The gate is not overhead. It is the architecture.
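The "1/15th the cost" figure follows from the table's relative cost column. A quick check, using only the numbers shown above (variable names are illustrative):

```python
# Relative costs and accuracies as listed in the table (3B baseline = 1x).
cost_70b = 23.0
cost_gated_recovery = 1.54
accuracy_70b = 93.15
accuracy_gated_recovery = 93.45

# How many times cheaper gated recovery is than the 70B baseline.
cost_ratio = cost_70b / cost_gated_recovery

# Accuracy margin over the 70B baseline, in percentage points.
accuracy_margin = round(accuracy_gated_recovery - accuracy_70b, 2)

print(f"cost ratio: {cost_ratio:.1f}x, accuracy margin: {accuracy_margin} points")
```

The ratio comes out just under 15, so "1/15th" is a rounded figure, and the accuracy margin over the 70B baseline is 0.3 percentage points.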

How it works

Agent proposes action — shell command, file write, deploy, API call

↓

NAAF intercepts via MCP — classifies risk, loads taxonomy pack

↓

EBAS evaluates evidence — sufficiency gate checks justification, rollback, approval

↓

PROCEED_ALLOWED — evidence sufficient, action executes, hash-chained to proof

REFUSED — durable record with reason code, missing evidence, retry guidance
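The flow above can be sketched as a sufficiency check that returns one of the two outcomes. This is a minimal illustration under stated assumptions: the required evidence kinds are taken from the step description (justification, rollback, approval), while the `Decision` type, the `evaluate` function, the reason code, and the evidence references are hypothetical and not NAAF's or EBAS's real interface.

```python
from dataclasses import dataclass

# Assumption: the three evidence kinds named in the "EBAS evaluates evidence" step.
REQUIRED_EVIDENCE = frozenset({"justification", "rollback", "approval"})


@dataclass(frozen=True)
class Decision:
    outcome: str                  # "PROCEED_ALLOWED" or "REFUSED"
    reason_code: str = ""         # hypothetical code, set on refusal
    missing_evidence: tuple = ()  # evidence kinds that were absent
    retry_guidance: str = ""      # what to attach before retrying


def evaluate(action: str, evidence: dict) -> Decision:
    """Allow the action only if every required evidence kind has a reference."""
    provided = {kind for kind, ref in evidence.items() if ref}
    missing = REQUIRED_EVIDENCE - provided
    if missing:
        names = tuple(sorted(missing))
        return Decision(
            outcome="REFUSED",
            reason_code="INSUFFICIENT_EVIDENCE",
            missing_evidence=names,
            retry_guidance="Attach " + ", ".join(names) + " and resubmit.",
        )
    return Decision(outcome="PROCEED_ALLOWED")


# A deploy with an empty rollback reference and no approval is refused.
refused = evaluate("deploy api-service", {"justification": "ticket-123", "rollback": ""})

# The same deploy with all three evidence kinds is allowed.
allowed = evaluate(
    "deploy api-service",
    {"justification": "ticket-123", "rollback": "revert-plan-7", "approval": "reviewer-42"},
)
```

The decision depends only on which evidence is present, never on a model confidence score, and a refusal carries enough detail for the agent to retry.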

Who it's for

Claude Code

Multi-session agentic work with no audit trail between context compactions.

→ hash chain survives sessions
→ every file write justified
→ rollback on every mutation

Copilot

Agent touched prod. Nobody knows what it changed or why.

→ evidence-gated deploys
→ tamper-evident audit log
→ compliance-ready records

Any MCP agent

No gate between intent and execution. Confidence = authorization.

→ sufficiency gates, not confidence
→ refusal as first-class outcome
→ zero code changes to adopt

Get in touch

NAAF is in early development.

If you're building with autonomous agents and want epistemic gating that actually works — or you just want to see the chain explorer running on real data — reach out.

ebas.mangomadness.com