knowledge-health-checker
by @xb19960921
Audit and improve Markdown knowledge-base health across Obsidian, Logseq, Notion exports, docs folders, and wiki repositories. Detect empty placeholder notes...
clawhub install knowledge-health-checker
About This Skill
name: knowledge-health-checker
description: Audit and improve Markdown knowledge-base health across Obsidian, Logseq, Notion exports, docs folders, and wiki repositories. Detect empty placeholder notes, broken wiki links, weak content density, orphan notes, graph fragmentation, stale files, and repair opportunities. Generate health scores, actionable reports, and safe fix plans. Use for knowledge base audit, wiki lint, broken link detection, Obsidian vault cleanup, markdown graph health, content quality review, and documentation garden maintenance.
version: "1.1.0"
last_updated: "2026-04-25"
changelog: "ClawHub-ready Darwin optimization: public positioning, clearer workflow, safety boundaries, scoring rubric, output format, and test prompts."
Knowledge Health Checker
Knowledge Health Checker audits a Markdown-based knowledge base as a living system, not a folder full of files.
It detects whether the knowledge garden is:
The goal is not only to find problems, but to produce a prioritized, safe, actionable health report.
When to Use
Use this skill for:
Do not use it for semantic fact-checking. This skill checks structure, links, density, freshness, and maintainability, not whether every claim is true.
Core Principle
A healthy knowledge base has four properties:
1. Substance: notes contain enough content to be useful.
2. Connectivity: important notes are linked into the graph.
3. Navigability: links, headings, and structure help readers move through knowledge.
4. Maintainability: stale, broken, duplicate, or low-value content is visible and repairable.
A knowledge base can be large and still unhealthy. Size is not health.
Default Workflow
Step 1: Confirm scope and safety
Before scanning, identify:
Target path:
Formats: markdown / wiki links / relative links
External URL check: yes/no
Generate fix script: yes/no
Auto-apply fixes: no by default
Exclude directories:
Estimated file count:
Safe default:
scan only → report only → generate fix plan → user reviews → user applies
Never delete, rename, rewrite, or auto-apply fixes without explicit confirmation.
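The scope checklist above could be captured as a small configuration object. This is a minimal sketch, not the skill's actual interface: the `ScanConfig` class and its field names are hypothetical, and the `False` defaults for the external URL check and fix-script generation are assumptions (the source only fixes auto-apply to "no by default").

```python
from dataclasses import dataclass, field


@dataclass
class ScanConfig:
    """Hypothetical container for the Step 1 scope and safety answers."""

    target_path: str
    formats: tuple = ("markdown", "wiki links", "relative links")
    check_external_urls: bool = False   # assumed default
    generate_fix_script: bool = False   # assumed default
    auto_apply_fixes: bool = False      # "no by default" per the checklist
    exclude_dirs: set = field(default_factory=lambda: {
        ".git", "node_modules", "__pycache__", ".obsidian", ".trash", "dist", "build",
    })

    def mode(self) -> str:
        # Report-only unless the user has explicitly opted in to applying fixes.
        return "apply" if self.auto_apply_fixes else "report-only"


config = ScanConfig("/path/to/knowledge-base")
print(config.mode())
```

Keeping `auto_apply_fixes` as an explicit field makes the safe default visible in one place instead of scattering it across the scan logic.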
Step 2: Build file and heading index
Index:
.md files
[[note]] and [[note#heading]]
Exclude by default:
.git/
node_modules/
__pycache__/
.obsidian/
.trash/
dist/
build/
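One way to build the file and heading index while honoring the default excludes is sketched below. The function name `build_index` and the returned shape (relative path mapped to a list of headings) are illustrative assumptions, not the bundled script's interface. The heading pattern only handles ATX-style `#` headings and does not skip headings inside fenced code blocks.

```python
import os
import re
from pathlib import Path

# Directory names skipped by default, taken from the exclude list above.
DEFAULT_EXCLUDES = {".git", "node_modules", "__pycache__", ".obsidian", ".trash", "dist", "build"}

# ATX headings such as "## Setup"; optional trailing '#' characters are dropped.
HEADING_RE = re.compile(r"^#{1,6}\s+(.+?)\s*#*\s*$")


def build_index(root, excludes=DEFAULT_EXCLUDES):
    """Map each note's path (relative to root) to the headings it contains."""
    root = Path(root)
    index = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in excludes]
        for name in filenames:
            if not name.endswith(".md"):
                continue
            path = Path(dirpath) / name
            text = path.read_text(encoding="utf-8", errors="replace")
            headings = []
            for line in text.splitlines():
                match = HEADING_RE.match(line)
                if match:
                    headings.append(match.group(1))
            index[path.relative_to(root).as_posix()] = headings
    return index
```

The index is read-only by construction: it opens files for reading and never writes, which fits the scan-only default.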
Step 3: Detect hollow or low-value notes
Flag likely hollow notes when they match one or more:
Classify severity:
| Severity | Meaning | Typical action |
|---|---|---|
| P0 | Empty or pure placeholder | delete, archive, or fill immediately |
| P1 | Too thin to be useful | expand with definition, context, examples |
| P2 | Usable but weak | improve structure or add links |
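The severity classes can be approximated with a few heuristics. The sketch below is an assumption-heavy illustration: the placeholder word list, the 200-character threshold (borrowed from the density warning in Step 5), and the "no links means P2" rule are all hypothetical choices, and `classify_hollow` is not a function from the bundled scripts.

```python
import re

# Hypothetical placeholder markers; extend to match the vault's own conventions.
PLACEHOLDER_RE = re.compile(r"^(todo|tbd|wip|placeholder|coming soon)[.!:]*$", re.IGNORECASE)
HEADING_LINE_RE = re.compile(r"^#{1,6}\s")


def body_text(markdown):
    """Strip YAML frontmatter, headings, and blank lines, keeping body prose."""
    lines = markdown.splitlines()
    if lines and lines[0].strip() == "---":
        # Skip frontmatter up to the closing '---' if one exists.
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                lines = lines[i + 1:]
                break
    body = [ln.strip() for ln in lines if ln.strip() and not HEADING_LINE_RE.match(ln)]
    return "\n".join(body)


def classify_hollow(markdown, thin_chars=200):
    """Return 'P0', 'P1', 'P2', or None for a note's raw Markdown text."""
    body = body_text(markdown)
    if not body or all(PLACEHOLDER_RE.match(ln) for ln in body.splitlines()):
        return "P0"  # empty or pure placeholder
    if len(body) < thin_chars:
        return "P1"  # too thin to be useful
    if "[[" not in body and "](" not in body:
        return "P2"  # usable length, but no links at all
    return None
```

Intentionally atomic notes will trip the P1 rule, so results from a heuristic like this belong in the "needs human review" bucket rather than an automatic fix.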
Step 4: Detect broken links
Check:
[[filename]]
[[filename#heading]]
For each broken link, report:
source file
link text
target
link type
probable fix if a similar file exists
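A minimal sketch of the wiki-link check follows, producing the report fields listed above. It assumes notes are keyed by the name used inside `[[...]]` and that a heading index is already available; `find_broken_links` and both input shapes are hypothetical. The probable fix comes from `difflib.get_close_matches` in the standard library, which is one reasonable similarity heuristic among several.

```python
import difflib
import re

# Matches [[target]], [[target#heading]], and [[target|alias]].
WIKI_LINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#([^\]|]+))?(?:\|[^\]]*)?\]\]")


def find_broken_links(notes, headings):
    """notes: {note_name: text}; headings: {note_name: [heading, ...]}."""
    names = list(notes)
    broken = []
    for source, text in notes.items():
        for match in WIKI_LINK_RE.finditer(text):
            target = match.group(1).strip()
            heading = match.group(2)
            if target not in notes:
                guess = difflib.get_close_matches(target, names, n=1, cutoff=0.6)
                broken.append({
                    "source": source,
                    "link_text": match.group(0),
                    "target": target,
                    "link_type": "wiki",
                    "probable_fix": guess[0] if guess else None,
                })
            elif heading and heading.strip() not in headings.get(target, []):
                broken.append({
                    "source": source,
                    "link_text": match.group(0),
                    "target": f"{target}#{heading.strip()}",
                    "link_type": "wiki-heading",
                    "probable_fix": None,
                })
    return broken
```

The probable fix is only a suggestion for the report; rewriting the link itself falls under the high-risk category of the safe fix policy.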
Step 5: Analyze content density and structure
Measure:
Suggested ranges:
| Signal | Healthy range | Warning |
|---|---|---|
| Short note | 300+ words or intentionally atomic | <200 characters |
| Long note | still navigable with headings | >3000 words without structure |
| Internal links | at least 1-3 for durable notes | zero links = possible orphan |
| Freshness | depends on domain | stale if >90 days and marked active |
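The warning column of the table can be turned into simple per-note checks. In this sketch, `density_signals` and its warning labels are hypothetical names, "without structure" is interpreted as "no headings", and the caller is assumed to supply the note's age in days and whether it is marked active.

```python
import re

WIKI_LINK_RE = re.compile(r"\[\[[^\]]+\]\]")
HEADING_RE = re.compile(r"^#{1,6}\s+\S", re.MULTILINE)


def density_signals(markdown, age_days=0, active=False):
    """Compute basic density metrics and the warnings from the table above."""
    chars = len(markdown.strip())
    words = len(markdown.split())
    headings = len(HEADING_RE.findall(markdown))
    links = len(WIKI_LINK_RE.findall(markdown))

    warnings = []
    if chars < 200:
        warnings.append("short")
    if words > 3000 and headings == 0:
        warnings.append("long-unstructured")
    if links == 0:
        warnings.append("no-internal-links")
    if active and age_days > 90:
        warnings.append("stale-active")

    return {
        "chars": chars,
        "words": words,
        "headings": headings,
        "internal_links": links,
        "warnings": warnings,
    }
```

Because freshness "depends on domain", the 90-day cutoff is best treated as a tunable parameter rather than a fixed rule.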
Step 6: Analyze knowledge graph health
Build a graph:
node = markdown file
edge = internal link
Report:
A perfect graph is not required. The goal is to identify the highest-value repair points.
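The graph described above (node = Markdown file, edge = internal link) can be analyzed with the standard library alone. The sketch below is an illustration: `graph_report` and its output keys are hypothetical, orphans are defined here as notes with no inbound or outbound links, and fragmentation is measured as connected components with link direction ignored.

```python
from collections import deque


def graph_report(links):
    """links: {note: set of notes it links to}. Unknown targets are ignored."""
    nodes = set(links)
    neighbors = {n: set() for n in nodes}
    inbound = {n: 0 for n in nodes}
    for src, targets in links.items():
        for dst in targets:
            if dst in nodes and dst != src:
                neighbors[src].add(dst)
                neighbors[dst].add(src)
                inbound[dst] += 1

    orphans = sorted(n for n in nodes if not neighbors[n])

    # Breadth-first search to collect connected components.
    components, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(sorted(component))

    # Most-linked notes first; ties broken alphabetically.
    hubs = sorted(nodes, key=lambda n: (-inbound[n], n))[:3]
    return {"orphans": orphans, "components": components, "hubs": hubs}
```

Small components and orphans near a hub are usually the cheapest repair points, since a single added link reconnects them.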
Step 7: Score health
Default scoring:
| Dimension | Weight | Good state |
|---|---:|---|
| Hollow note rate | 25% | few or no empty placeholders |
| Broken link rate | 30% | no broken internal links |
| Content density | 25% | most notes have useful substance and structure |
| Network connectivity | 20% | important notes are connected; few accidental orphans |
Health score:
health = weighted score from 0 to 100
Use labels:
| Score | Label |
|---:|---|
| 90-100 | Excellent |
| 75-89 | Healthy |
| 60-74 | Needs maintenance |
| 40-59 | Fragile |
| 0-39 | Critical |
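The weights and labels above combine into a single number as sketched below. The source does not say how each dimension's own score is derived, so this sketch assumes every dimension is already normalized to the range 0.0 to 1.0 (for example, 1 minus the hollow note rate); the function and key names are hypothetical.

```python
# Weights from the default scoring table.
WEIGHTS = {
    "hollow": 0.25,
    "broken_links": 0.30,
    "density": 0.25,
    "connectivity": 0.20,
}

# (minimum score, label) pairs from the label table, highest first.
LABELS = [
    (90, "Excellent"),
    (75, "Healthy"),
    (60, "Needs maintenance"),
    (40, "Fragile"),
    (0, "Critical"),
]


def health_score(dimension_scores):
    """Weighted 0-100 score; each dimension score is clamped to [0.0, 1.0]."""
    total = sum(
        weight * min(1.0, max(0.0, dimension_scores[name]))
        for name, weight in WEIGHTS.items()
    )
    return round(100 * total)


def label_for(score):
    for floor, label in LABELS:
        if score >= floor:
            return label
    return "Critical"


scores = {"hollow": 0.9, "broken_links": 0.8, "density": 0.7, "connectivity": 0.5}
print(health_score(scores), label_for(health_score(scores)))
```

With these inputs the weighted sum is 0.225 + 0.24 + 0.175 + 0.10 = 0.74, which gives a score of 74 and the label "Needs maintenance".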
Step 8: Generate report and fix plan
Return a concise summary first. For large scans, provide a full report path.
Fix plans must be safe:
Never silently delete or rewrite knowledge files.
Output Format
Use this format:
## Knowledge Health Summary
Target:
Files scanned:
Health score:
Label:
Top risks:
### Findings
| Category | Count | Severity | Notes |
|---|---:|---|---|
| Hollow notes | | | |
| Broken links | | | |
| Orphan notes | | | |
| Overlong notes | | | |
| Stale active notes | | | |
### Highest-Impact Fixes
1. P0:
2. P1:
3. P2:
### Safe Fix Plan
Auto-safe fixes:
Needs human review:
Do not auto-apply:
### Artifacts
Report:
Fix script:
Raw JSON:
For small knowledge bases, include concrete file examples. For large ones, include top 10 examples per category and write full details to a report file.
Safe Fix Policy
Classify fixes by risk:
| Risk | Examples | Permission |
|---|---|---|
| Low | generate report, list broken links, suggest links | no extra confirmation |
| Medium | create fix script, add missing backlinks in draft output | ask before writing files |
| High | delete notes, rename files, rewrite links globally, split files | explicit confirmation required |
Default behavior: report and propose, do not mutate.
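The risk table can be enforced with a small gate in front of every action. This is a sketch under assumptions: the action identifiers and the `may_run` helper are hypothetical, and treating unknown actions as high risk is a design choice made here to keep the default conservative.

```python
# Hypothetical action identifiers mapped to the risk levels in the table above.
RISK_BY_ACTION = {
    "generate_report": "low",
    "list_broken_links": "low",
    "suggest_links": "low",
    "create_fix_script": "medium",
    "add_backlinks_draft": "medium",
    "delete_note": "high",
    "rename_file": "high",
    "rewrite_links_globally": "high",
    "split_file": "high",
}

PERMISSION_BY_RISK = {
    "low": "no extra confirmation",
    "medium": "ask before writing files",
    "high": "explicit confirmation required",
}


def risk_of(action):
    # Unknown actions are treated as high risk (an assumption of this sketch).
    return RISK_BY_ACTION.get(action, "high")


def may_run(action, confirmed=False):
    """Low-risk actions run freely; anything else needs user confirmation."""
    return risk_of(action) == "low" or confirmed


print(may_run("generate_report"), may_run("delete_note"))
```

Routing every mutation through one gate like this keeps "report and propose, do not mutate" as the behavior when no confirmation is given.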
Bundled Scripts
Use these when available:
scripts/health_check.py: core scanner for hollow files, broken links, density, and graph stats.
scripts/report_generator.py: HTML report generation.
scripts/auto_fix.py: fix-plan or repair-script generation.
Run scripts from the skill directory or pass absolute paths. If a script lacks CLI ergonomics, inspect it and adapt safely rather than guessing destructive behavior.
Example Commands
Basic scan:
python3 scripts/health_check.py /path/to/knowledge-base
Generate a report from scan results if supported:
python3 scripts/report_generator.py results.json --output health-report.html
Generate a fix plan, not auto-apply:
python3 scripts/auto_fix.py results.json --dry-run
If the bundled script does not support these exact flags, read the script first and use its actual interface.
Test Prompts
Use test-prompts.json for Darwin-style regression evaluation. Good test coverage should include:
Anti-Patterns
Avoid:
Quality Bar
A good knowledge health check must be:
If the output only says "you have broken links" without showing where, why it matters, and what to do next, it has failed.