
Self-Improving Operations

by @jose-compu

Captures process bottlenecks, incident patterns, capacity issues, automation gaps, SLA breaches, and toil accumulation to enable continuous operations improv...

Version v1.1.0
Tips & Best Practices

1. Conduct blameless postmortems: focus on systemic causes, not individual blame
2. Automate toil aggressively: if you do it manually 3 times, automate it
3. Define SLOs before SLAs: internal targets should be stricter than customer commitments
4. Maintain runbooks: keep them current, test them during game days, include verification steps
5. Track error budgets: use them to balance feature velocity and reliability work
6. Rotate on-call fairly: equitable distribution, adequate rest, compensatory time off
7. Rehearse incident response: run tabletop exercises and chaos engineering experiments
8. Log immediately: incident context fades fast after resolution
9. Include timelines: timestamps are critical for postmortems and pattern detection
10. Measure DORA metrics: track deployment frequency, lead time, change failure rate, and MTTR
11. Review before on-call shifts: check .learnings/ for known issues and recent patterns
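The error-budget tip can be made concrete with a little arithmetic. The sketch below is illustrative only and is not part of the skill: the `ErrorBudget` class, its field names, and the `should_freeze_features` policy are hypothetical, and it assumes a request-based SLO measured over a fixed window.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical request-based error budget for one SLO window."""

    slo_target: float     # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int   # requests observed in the window
    failed_requests: int  # requests that violated the SLO

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 = untouched budget, 0.0 = exhausted, negative = overspent.
        if self.allowed_failures == 0:
            return 0.0 if self.failed_requests else 1.0
        return 1.0 - self.failed_requests / self.allowed_failures


def should_freeze_features(budget: ErrorBudget, threshold: float = 0.0) -> bool:
    # Assumed policy: pause feature work once the remaining budget
    # falls to or below the threshold; real teams pick their own rule.
    return budget.remaining_fraction <= threshold


if __name__ == "__main__":
    window = ErrorBudget(slo_target=0.99, total_requests=10_000, failed_requests=50)
    print(f"remaining budget: {window.remaining_fraction:.0%}")
    print("freeze features:", should_freeze_features(window))
```

With a 99% target over 10,000 requests, roughly 100 failures are allowed, so 50 failures leave about half the budget. Raising `threshold` above zero would trigger reliability work earlier, before the budget is fully spent.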

TERMINAL
clawhub install self-improving-operations
