AgentiHooks — The In-Agent Harness

Four sub-pillars in one pip package.

Hooks isn't a daemon. It's the lifecycle subprocess Claude Code calls on every event — SessionStart, PreToolUse, PostToolUse, UserPromptSubmit, Stop, SessionEnd. Each event activates one or more of these four sub-pillars.

🎭

Identity

"Who is this agent?"

Profiles shape what each agent is — model, permissions, MCP servers, OTEL telemetry, persona rules. Profile chaining merges left-to-right. Bundles ship the whole environment as one git clone. Live rule refresh pushes updates to every running session without a restart.

🛡️

Guardrails

"What keeps it inside the lines?"

9+ guardrails active by default. Two-tier secrets (file writes hard-block; inline Bash args scan + log + note). Branch + PR gate. Prod lockdown. Retry breaker (5 fails → error-researcher; 10 → hard block). Dependency banner. Version guard. CLAUDE.md sanity. MCP surface area warn. Bash output filter. File read dedup. Kubectl mutation guard.

💰

Context Intelligence

"What keeps the bill down and the brain sharp?"

4-level token compression with a safety mask cuts context by ~30%. Attention refresh every 20 turns re-anchors the agent on its goal — defeats the slow drift that turns hour-long sessions into nonsense. Tool memory replays past errors so the agent doesn't repeat them. Context audit surfaces hidden bloat.

📢

Fleet Command

"How does the operator talk to every agent at once?"

Broadcast to every active Claude Code session simultaneously — info / alert / critical, with channel subscriptions, TTL bounds, and an AI-assisted broadcast emit mode sandboxed to Claude Haiku. The brain pump fetches /feed + /signal from AgentiBrain and publishes them on the broadcast channel, so every session is one prompt away from the latest fleet knowledge.

Operator ergonomics

Two features that make the operator's chair more comfortable.

Beyond cost and safety, AgentiHooks ships two operator-side toggles that reshape how you actually work the fleet.

🔊

Operator-side voice output

"Hear your agent. Stay in flow."

Say "enable voice". The Stop hook forks → Sonnet distils the response into a 12-18 word sentence (no lists, no code, no commit hashes, no URLs) → POSTs to your voice service → ffplay plays the OGG. WSL2 + macOS + Linux all auto-detected. 10-second cooldown stops overlap.

Quota self-protection: if the voice service returns 429, voice auto-disables for 1 hour and a banner tells you to top up. Per-session toggle stored on disk + Redis. Tunables: VOICE_SUMMARIZER_PREFIX, VOICE_SUMMARIZER_MODEL, VOICE_API_KEY, VOICE_SERVICE_URL. Bring any TTS service that speaks POST /speak.

🔑

Bypass mode for trusted dev work

"disable controls" — for when you mean it.

Say "disable controls" and the CI-manifesto signal gates short-circuit session-wide: branch creation, PR creation, release-merge, hotfix-category prod ops, force-push to non-main, kubectl mutation guard. Subagents inherit automatically — spawned children honour the parent's bypass without re-asking.

The HARD FLOOR stays enforced: direct push to main, commit-on-main, secrets-in-files, GitHub server-side main-prod-lockdown ruleset. None of these unlock under bypass. Restored by "enable controls" or by SessionEnd of the activating session.

A persistent CONTROLS banner is injected on every transition and on every turn while bypass is active. Operator and agent never silently forget the gates are down. Discipline rule: agents must NOT volunteer the toggle — the operator decides when to unlock.

Bundles

Portable Claude Code, in one git clone.

A bundle is one repo holding your slash commands, sub-agents, coding rules, skills, and named profiles (model, permissions, MCP servers, OTEL telemetry). Clone the repo on any machine, run agentihooks bundle init, and your full Claude Code identity is live — same on your laptop, your Codespaces, your team's pods.

Profiles can override globals. Bundles can chain. Your team shares one source of truth instead of 12 dotfile forks.

View the example bundle →

agentihooks-bundle-example/

├── .claude/                # global assets
│   ├── agents/             # sub-agents
│   ├── commands/           # slash commands
│   ├── rules/              # always-on coding rules
│   └── skills/             # reusable skill packages
├── profiles/
│   └── copilot/            # one folder per identity
│       ├── profile.yml     # model · perms · MCP
│       ├── CLAUDE.md       # persona rules
│       └── .claude/        # profile overrides
└── setup.sh                # one-shot installer

Cost & sharpness

Your agents stay cheap and sharp.

Long Claude Code sessions get expensive and forgetful. AgentiHooks defends both the bill and the brain — quietly, automatically, every turn.

Smart context compression shrinks ballooning conversations by roughly a third without dropping the parts that matter.

Retry circuit breaker stops the loop where an agent fails the same way five times in a row.

Attention refresh nudges the agent every twenty turns to re-anchor on the goal — defeating the slow drift that turns hour-long sessions into nonsense.

Tool memory replays the lessons of past errors so the agent doesn't repeat them.

~30% average context reduction

5 → stop retry breaker trips before runaway spend

Every 20 turns attention re-anchored on the goal

Replay-on-error past tool failures resurface in context

Zero touch runs on raw Claude Code, no plumbing

AgentiHooks.