Agentic Codex Dev Reviewer
by @zack-dev-cm
Review agentic software-development plans and release readiness for Codex, GitHub, and ClawHub work. Use when a user asks for scoped delivery planning, imple...
clawhub install agentic-codex-dev
About This Skill
name: agentic-codex-dev
description: Use when planning, implementing, reviewing, coordinating, or publishing agentic software development work with Codex, GitHub, and OpenClaw/ClawHub. Provides a production-grade multi-agent operating loop with role roster, model policy, task ledger, memory ledger, report artifacts, verification gates, and anti-bleed public-surface review.
version: 0.3.3
user-invocable: true
disable-model-invocation: true
metadata: {"openclaw":{"homepage":"https://github.com/zack-dev-cm/agentic-codex-dev-skill","skillKey":"agentic-codex-dev","requires":{"bins":["git","clawhub"],"anyBins":["python3","python"]},"install":[{"kind":"node","label":"Install ClawHub CLI","package":"clawhub","bins":["clawhub"]}],"tags":["codex","github","clawhub","agentic-development"]}}
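As a rough illustration of how the `requires` metadata above could be checked before running the skill, here is a minimal sketch. The semantics are an assumption, not confirmed by the source: every name in `bins` must be on PATH, and at least one name in `anyBins` must be. The function name `missing_requirements` is hypothetical.

```python
import json
import shutil

# Copied from the `requires` object in the skill metadata above.
REQUIRES = json.loads('{"bins": ["git", "clawhub"], "anyBins": ["python3", "python"]}')


def missing_requirements(requires, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Assumed semantics: every name in `bins` must be on PATH, and at least
    one name in `anyBins` must be on PATH.
    """
    problems = [
        f"missing required binary: {name}"
        for name in requires.get("bins", [])
        if which(name) is None
    ]
    any_bins = requires.get("anyBins", [])
    if any_bins and not any(which(name) for name in any_bins):
        problems.append("need at least one of: " + ", ".join(any_bins))
    return problems


if __name__ == "__main__":
    for problem in missing_requirements(REQUIRES):
        print(problem)
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it uses `shutil.which` from the standard library.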
Agentic Codex Dev
Operate Codex like a disciplined software team: clear goal, explicit roles, scoped ownership, evidence, tests, review, report.
When to Use
Use this skill for:
AGENTS.md, .codex/agents/, or skill instructions
Do not use it for one-line answers, pure brainstorming, or tasks that only need a command output.
Runtime Requirements
ClawHub requirement metadata for this skill declares git, python3, and clawhub, following the ClawHub skill metadata format at
antirot and codex_harness.
Do not run git push, clawhub publish, or other remote-changing commands unless the user asked for publish or remote update work.
Core Loop
1. Restate the goal and name the verification step before editing.
2. Read the repo map: AGENTS.md, README, package config, tests, and the files closest to the task.
3. Define concrete success criteria that would let a reviewer say "done".
4. Make the narrowest defensible change. Match local style. Avoid speculative abstractions.
5. Run the highest-signal local check. Add a focused smoke test when behavior changed.
6. Review the diff for bugs, regressions, secrets, private paths, and public-surface bleed.
7. Report what changed, how it was verified, and any residual risk.
If the task is unclear, stop early and name the ambiguity. Prefer one precise question over guessing.
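Step 6 of the loop above (reviewing the diff for secrets and private paths) can be partly mechanized. The following is a minimal sketch, not part of the skill itself: the patterns in `RISK_PATTERNS` are hypothetical heuristics chosen for illustration, and they will produce both false positives and misses, so they complement a human review rather than replace it.

```python
import re

# Hypothetical patterns for illustration; tune them to the repo being reviewed.
RISK_PATTERNS = {
    "secret": re.compile(r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+"),
    "private path": re.compile(r"(/Users/|/home/)[^\s/]+"),
}


def scan_diff(diff_text):
    """Return (line_number, label, text) for each added line matching a risk pattern."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in RISK_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label, line[1:].strip()))
    return findings
```

A typical use would be to pass the output of `git diff` as `diff_text` and treat any finding as a reason to stop and look before reporting.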
Operating Rules
Keep AGENTS.md short. Use it as an index to durable docs, not a giant prompt.
Scope Modes
Pick the mode that fits the risk:
Prefer Patch unless the task shows it needs more structure. Use Multi-Agent only when the user explicitly asks for subagents, delegation, or parallel agent work.
System Design
For non-trivial or multi-agent work, set up a control plane before coding:
When this structure is overkill, keep a solo Patch flow and still preserve the same verification discipline.
Task, Memory, and Report Ledgers
Create or update these artifacts when work is multi-agent, multi-turn, risky, or intended for publication:
docs/agentic/tasks.md: task id, owner role, goal, owned files, status, acceptance criteria, verification, result, blocker.
docs/agentic/memory.md: stable repo facts, architecture decisions, commands that actually work, hazards, rejected approaches, last-verified date. Do not store secrets, tokens, private paths, or raw logs.
docs/agentic/reports/-.md: final objective, source links, task outcomes, changed files, tests, review findings, unresolved risks, release or PR status.
If the target repo already has equivalent docs, use the local convention instead of inventing new paths.
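One possible shape for the task ledger is a Markdown table with one column per field listed for docs/agentic/tasks.md. The sketch below assumes that layout; the source does not prescribe a file format, and the `Task` class and `render_ledger` function are hypothetical names.

```python
from dataclasses import dataclass, fields


@dataclass
class Task:
    # Field names mirror the fields listed for docs/agentic/tasks.md.
    task_id: str
    owner_role: str
    goal: str
    owned_files: str
    status: str
    acceptance_criteria: str
    verification: str
    result: str = ""
    blocker: str = ""


def render_ledger(tasks):
    """Render tasks as a Markdown table, one row per task."""
    names = [f.name for f in fields(Task)]
    header = "| " + " | ".join(n.replace("_", " ") for n in names) + " |"
    divider = "| " + " | ".join("---" for _ in names) + " |"
    rows = []
    for task in tasks:
        # Escape pipes so cell text cannot break the table layout.
        cells = [str(getattr(task, n)).replace("|", "\\|") for n in names]
        rows.append("| " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *rows])
```

Keeping `result` and `blocker` optional lets a row be written when the task is assigned and completed later, which fits a ledger that is updated across turns.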
Role Roster
Use this roster as the default multi-agent team. The parent thread stays responsible for coordination and final judgment.
| Role | Default model | Reasoning | Scope | Required output |
| --- | --- | --- | --- | --- |
| Orchestrator | gpt-5.4 | xhigh for critical design/release, high otherwise | Owns task split, integration, report | plan, assignments, final decision |
| Analyst | gpt-5.4 | high | Turns vague request into requirements and risks | assumptions, open questions, acceptance criteria |
| Architect | gpt-5.4 | xhigh | System design, boundaries, dependency choices | design note, rejected options, invariants |
| Planner | gpt-5.4 | high | Breaks design into ordered tasks | task ledger rows with owners and gates |
| Explorer | gpt-5.4-mini or gpt-5.3-codex-spark | medium | Read-only code mapping and evidence gathering | files, symbols, execution path, uncertainty |
| Implementer | gpt-5.4 for risky code, gpt-5.3-codex-spark for bounded edits | high or medium | Writes only owned files | patch summary, tests, residual risks |
| Reviewer | gpt-5.4 | xhigh | Correctness, security, regressions, tests, public surface | findings first, file/line evidence, verdict |
| QA/CI Analyst | gpt-5.4 | high | Reproduction, failing checks, browser or CLI evidence | exact command, observed failure, fix owner |
| Memory Curator | gpt-5.4-mini | medium | Updates durable docs after decisions land | memory entries, stale entries removed |
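The roster can also be read as a lookup from role to model and reasoning level. The sketch below encodes the table under two stated assumptions: for the Implementer, "high" is paired with risky code and "medium" with bounded edits (the table lists both without pairing them), and for the Explorer the first listed model is chosen. The function name `assignment` is hypothetical.

```python
def assignment(role, *, critical=False):
    """Return (model, reasoning) for a role, following the roster table.

    `critical` stands for critical design/release work (Orchestrator) or
    risky code (Implementer). Pairing "high" with risky code and "medium"
    with bounded edits is an assumption of this sketch.
    """
    fixed = {
        "Analyst": ("gpt-5.4", "high"),
        "Architect": ("gpt-5.4", "xhigh"),
        "Planner": ("gpt-5.4", "high"),
        "Reviewer": ("gpt-5.4", "xhigh"),
        "QA/CI Analyst": ("gpt-5.4", "high"),
        "Memory Curator": ("gpt-5.4-mini", "medium"),
    }
    if role in fixed:
        return fixed[role]
    if role == "Orchestrator":
        return ("gpt-5.4", "xhigh" if critical else "high")
    if role == "Implementer":
        return ("gpt-5.4", "high") if critical else ("gpt-5.3-codex-spark", "medium")
    if role == "Explorer":
        # The table allows gpt-5.4-mini or gpt-5.3-codex-spark; this sketch picks the first.
        return ("gpt-5.4-mini", "medium")
    raise KeyError(f"unknown role: {role}")
```

Raising on an unknown role keeps a typo in a task ledger from silently falling back to some default model.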
Subagents
Only use subagents when the user explicitly asks for subagents, delegation, or parallel agent work.
Good delegation targets:
Bad delegation targets:
When delegating, give each agent a bounded task, a clear output shape, and explicit ownership. Keep the main thread focused on requirements, decisions, integration, and final review. Keep agents.max_depth = 1 unless the user explicitly accepts recursive delegation risk; this matches the Codex subagent configuration surface documented at
Delegation prompt shape:
Role: reviewer
Model: gpt-5.4
Reasoning: xhigh
Ownership: read-only review of
Task: find correctness, security, regression, test, and public-surface risks.
Output: findings first with file/line evidence, then open questions, then verdict.
Do not edit files. Do not inspect secrets. Do not broaden scope.
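The prompt shape above can be generated from structured fields so that no delegation goes out without explicit ownership or an output shape. This is a minimal sketch; `delegation_prompt` is a hypothetical helper, and the path in the usage note is an invented placeholder, not a value from the source.

```python
def delegation_prompt(*, role, model, reasoning, ownership, task, output, constraints):
    """Build a delegation prompt in the shape shown above, one field per line.

    Raises ValueError if any labeled field is blank, so an unbounded
    delegation is rejected before it is sent.
    """
    labeled = {
        "Role": role,
        "Model": model,
        "Reasoning": reasoning,
        "Ownership": ownership,
        "Task": task,
        "Output": output,
    }
    empty = [label for label, value in labeled.items() if not value.strip()]
    if empty:
        raise ValueError("missing fields: " + ", ".join(empty))
    lines = [f"{label}: {value.strip()}" for label, value in labeled.items()]
    if constraints:
        # Constraints go on a single closing line, as in the example prompt.
        lines.append(" ".join(constraints))
    return "\n".join(lines)
```

For example, calling it with `ownership="read-only review of src/parser/"` (a hypothetical path) and `constraints=["Do not edit files.", "Do not broaden scope."]` yields a seven-line prompt ending in the constraint line.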
Model Policy
gpt-5.4 with xhigh reasoning for architecture, security review, release decisions, and ambiguous multi-agent coordination; Codex custom-agent examples document gpt-5.4 reviewer roles at
gpt-5.4 with high reasoning for implementation where correctness or cross-module behavior matters; model selection follows the Codex custom-agent configuration surface at
gpt-5.4-mini or gpt-5.3-codex-spark for read-only exploration, docs checks, and bounded cleanup where speed matters and the output will be reviewed; both model families appear in Codex custom-agent examples at
Implementation Discipline
Before editing:
While editing:
After editing:
Review Checklist
Review every non-trivial result for:
Consistency and Effectiveness Gates
For multi-agent work, verify the process itself:
Real Example Eval
For a serious workflow eval, run this skill against a real repo task and archive the result in the report ledger. A valid eval has:
Use the example run as the minimum acceptance shape.
GitHub and ClawHub Publish Gate
Before publishing:
SKILL.md has frontmatter name, description, and version.
For this skill's source analysis, read references/source-review.md and references/comparison-matrix.md.
For multi-agent artifacts and templates, read references/system-design.md.
For release commands and manual checks, read references/publish-checklist.md.
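The frontmatter item in the publish gate (SKILL.md has name, description, and version) lends itself to an automated pre-publish check. The sketch below assumes the frontmatter is a leading block delimited by `---` lines with top-level `key: value` pairs; that delimiter convention is an assumption here, and the parser is deliberately simplistic rather than a full YAML reader. Function names are hypothetical.

```python
REQUIRED_KEYS = ("name", "description", "version")


def frontmatter_keys(text):
    """Return top-level keys found in a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Top-level keys start at column 0; indented lines belong to nested values.
        if line and not line[0].isspace() and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # No closing delimiter: treat the frontmatter as invalid.


def missing_frontmatter(text):
    """Return the required keys that are absent, in the order they are required."""
    found = frontmatter_keys(text)
    return [key for key in REQUIRED_KEYS if key not in found]
```

Running `missing_frontmatter` on the contents of SKILL.md and refusing to publish when the result is non-empty would turn that checklist item into a gate.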