Neat Autonomous Agent Framework

Every agent action gets evidence, justification, and rollback.

NAAF wraps any AI agent with EBAS epistemic gating — so every decision has a hash-chained audit trail, every refusal is a first-class record, and nothing touches production without proof.

The problem
Today
Agent → tool call → system change

if something breaks:
no record of why
no evidence chain
rollback is manual
no audit trail
confidence = authorization
With NAAF
Agent → propose → evidence → gate → act

every action has:
hash-chained justification
evidence references
verified rollback plan
tamper-evident audit log
refusal as first-class outcome
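As an illustration of what "hash-chained" and "tamper-evident" can mean in practice, here is a minimal sketch, not NAAF's actual implementation. The record fields, the `GENESIS` value, and the function names are all assumptions made for this example. Each entry's hash covers the previous entry's hash, so editing any earlier record invalidates every later one.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def entry_hash(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited record or broken link fails the check."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


# Hypothetical records: refusals are logged alongside allowed actions.
log = []
append(log, {"action": "file_write", "outcome": "PROCEED_ALLOWED", "evidence": ["ticket-1"]})
append(log, {"action": "deploy", "outcome": "REFUSED", "evidence": []})
```

Calling `verify(log)` returns `True` for the untouched log and `False` once any stored record is modified after the fact.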
Benchmarks — 4,270 evaluations, three domains
False assertions: 0
Across all domains. The gate catches everything.

3B + EBAS accuracy: 93.45%
Exceeds 70B baseline. At 1/15th the cost.

Pre-gate early exit: 91.7%
LLM never called. Compute saved before inference.

Gate latency: 6 μs
SHA-256 + set intersection. Effectively free.
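To show why a check built from SHA-256 and set intersection can be cheap, and how it can exit before any model call, here is a rough sketch. It is an assumption about the shape of such a gate, not the EBAS code: the `REQUIRED` set borrows the three evidence kinds named in the workflow (justification, rollback, approval), and `gate`, `answer`, and `call_model` are hypothetical names.

```python
import hashlib

# Assumed evidence kinds, taken from the "justification, rollback, approval" wording.
REQUIRED = frozenset({"justification", "rollback", "approval"})


def gate(evidence_kinds):
    """Decide from set operations alone; no model is involved."""
    present = REQUIRED & set(evidence_kinds)   # set intersection
    missing = sorted(REQUIRED - present)       # what would have to be supplied
    decision = "REFUSED" if missing else "PROCEED_ALLOWED"
    # Digest of the decision and the evidence kinds it was based on.
    digest = hashlib.sha256(
        (decision + "|" + ",".join(sorted(present))).encode("utf-8")
    ).hexdigest()
    return decision, missing, digest


def answer(evidence_kinds, call_model):
    """Pre-gate early exit: the model callable runs only if the gate allows it."""
    decision, missing, _ = gate(evidence_kinds)
    if decision == "REFUSED":
        return decision, missing
    return decision, call_model()
```

With only `{"justification"}` supplied, `answer` returns the refusal and the list of missing kinds without ever invoking `call_model`.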
The slide
System                     Accuracy  Cost   Wrong assertions
70B baseline               93.15%    23x    137
3B baseline                73.0%     1x     529
3B + EBAS gate             73.0%     1x     0
3B + EBAS gated recovery   93.45%    1.54x  0

A 3B model with EBAS gating and retries exceeds 70B accuracy at 1/15th the cost — with zero wrong assertions. The gate is not overhead. It is the architecture.
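The "1/15th the cost" figure can be checked directly from the table. The short calculation below uses only the table's numbers and assumes the cost column is expressed relative to the 3B baseline (1x).

```python
# Figures copied from the comparison table; cost is assumed relative to the 3B baseline.
cost_70b = 23.0
cost_3b_recovery = 1.54
acc_70b = 93.15
acc_3b_recovery = 93.45

cost_ratio = cost_70b / cost_3b_recovery    # how many times more the 70B baseline costs
accuracy_gain = acc_3b_recovery - acc_70b   # difference in percentage points

print(f"70B baseline costs {cost_ratio:.1f}x the gated 3B setup")   # 14.9x, roughly 1/15th
print(f"accuracy difference: {accuracy_gain:+.2f} points")          # +0.30 points
```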

How it works
1. Agent proposes action: shell command, file write, deploy, API call
2. NAAF intercepts via MCP: classifies risk, loads taxonomy pack
3. EBAS evaluates evidence: sufficiency gate checks justification, rollback, approval
4a. PROCEED_ALLOWED: evidence sufficient, action executes, hash-chained to proof
or
4b. REFUSED: durable record with reason code, missing evidence, retry guidance
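The steps above can be sketched as a single evaluation function that returns one of the two outcomes. This is a simplified illustration under stated assumptions: the `TAXONOMY` table, the risk labels, the reason codes, and the `Decision` fields are hypothetical stand-ins for whatever NAAF's taxonomy packs and records actually contain.

```python
from dataclasses import dataclass

# Hypothetical taxonomy pack: action kind -> (risk level, required evidence kinds).
TAXONOMY = {
    "file_write": ("low", {"justification", "rollback"}),
    "deploy": ("high", {"justification", "rollback", "approval"}),
}


@dataclass(frozen=True)
class Decision:
    outcome: str                   # "PROCEED_ALLOWED" or "REFUSED"
    risk: str
    reason_code: str = ""          # set only on refusal
    missing_evidence: tuple = ()   # evidence kinds still needed
    retry_guidance: str = ""       # what to supply before retrying


def evaluate(kind: str, evidence: dict) -> Decision:
    """Classify the proposed action, then check that required evidence is present."""
    if kind not in TAXONOMY:
        return Decision("REFUSED", "unknown", "UNKNOWN_ACTION", (),
                        "Register the action kind before retrying.")
    risk, required = TAXONOMY[kind]
    supplied = {name for name, value in evidence.items() if value}
    missing = tuple(sorted(required - supplied))
    if missing:
        return Decision("REFUSED", risk, "INSUFFICIENT_EVIDENCE", missing,
                        "Supply: " + ", ".join(missing))
    return Decision("PROCEED_ALLOWED", risk)


# A deploy proposed without approval is refused with a usable record.
refused = evaluate("deploy", {"justification": "hotfix", "rollback": "redeploy previous build"})
allowed = evaluate("file_write", {"justification": "update config", "rollback": "restore backup"})
```

Here the refusal carries the same structure as a success, so it can be stored, audited, and acted on rather than surfacing as an opaque error.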
Who it's for
Claude Code
Multi-session agentic work with no audit trail between compacts.
→ hash chain survives sessions
→ every file write justified
→ rollback on every mutation
Copilot
Agent touched prod. Nobody knows what it changed or why.
→ evidence-gated deploys
→ tamper-evident audit log
→ compliance-ready records
Any MCP agent
No gate between intent and execution. Confidence = authorization.
→ sufficiency gates, not confidence
→ refusal as first-class outcome
→ zero code changes to adopt

NAAF is in early development.

If you're building with autonomous agents and want epistemic gating that actually works — or you just want to see the chain explorer running on real data — reach out.
