Semantic Consistency Auditor

by @aipoch-ai

Semantic Consistency Auditor compares AI-generated text against a gold-standard reference using BERTScore and COMET. Use it for academic writing workflows that need structured execution, explicit assumptions, and clear output boundaries.

⚡ When to Use
- Use this skill for academic writing tasks that require explicit assumptions, bounded scope, and a reproducible output format.
- Use this skill when you need a documented fallback path for missing inputs, execution errors, or partial evidence.
💡 Examples

Command Line


# Evaluate a single case pair
python scripts/main.py \
  --ai-generated "Patient presented with fever for 3 days, highest temperature 39°C, accompanied by cough." \
  --gold-standard "Patient chief complaint of fever for 3 days, highest temperature 39°C, accompanied by cough symptoms." \
  --output results.json

# Batch evaluation from a JSON file
python scripts/main.py \
  --input-file batch_cases.json \
  --output results.json \
  --format detailed

# Use specific models
python scripts/main.py \
  --ai-generated "..." \
  --gold-standard "..." \
  --bert-model "bert-base-chinese" \
  --comet-model "Unbabel/wmt20-comet-da"
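The layout of the file passed to `--input-file` is not documented on this page. As a rough sketch, assuming it is a JSON list of objects with the same `ai` and `gold` keys that the Python API's `evaluate_batch` accepts, a batch file could be produced like this. The helper name `write_batch_file` and the schema are hypothetical; check the skill's own documentation before relying on them.

```python
import json
from pathlib import Path


def write_batch_file(pairs, path="batch_cases.json"):
    """Write (ai_text, gold_text) pairs to a JSON file and return its path.

    ASSUMPTION: the schema (a list of {"ai": ..., "gold": ...} objects) is
    borrowed from the evaluate_batch() example; it is not confirmed for
    the --input-file option.
    """
    cases = [{"ai": ai_text, "gold": gold_text} for ai_text, gold_text in pairs]
    out = Path(path)
    # ensure_ascii=False keeps non-ASCII text (Chinese, "°C") human-readable
    out.write_text(json.dumps(cases, ensure_ascii=False, indent=2), encoding="utf-8")
    return out


pairs = [
    (
        "Patient presented with fever for 3 days.",
        "Patient chief complaint of fever for 3 days.",
    ),
]
batch_path = write_batch_file(pairs)
```

The resulting file could then be passed as `--input-file batch_cases.json`, if the assumed schema matches what `scripts/main.py` expects.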

Python API

from semantic_consistency_auditor import SemanticConsistencyAuditor

# Initialize evaluator
auditor = SemanticConsistencyAuditor(
    bert_model="microsoft/deberta-xlarge-mnli",
    comet_model="Unbabel/wmt22-comet-da",
    lang="zh",
)

# Evaluate a single case
result = auditor.evaluate(
    ai_text="Patient presented with fever for 3 days...",
    gold_text="Patient chief complaint of fever for 3 days...",
)

print(f"BERTScore F1: {result['bertscore']['f1']:.4f}")
print(f"COMET Score: {result['comet']['score']:.4f}")
print(f"Consistency: {result['consistency']:.4f}")
print(f"Passed: {result['passed']}")

# Batch evaluation
results = auditor.evaluate_batch([
    {"ai": "...", "gold": "..."},
    {"ai": "...", "gold": "..."},
])
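The return shape of `evaluate_batch` is not shown here. As a sketch, assuming it returns a list of dicts shaped like the single-case result (with `consistency` and `passed` keys), a batch could be summarized as follows. `summarize_batch` is a hypothetical helper, and the sample values are placeholders, not real scores.

```python
def summarize_batch(results):
    """Return count, pass rate, and mean consistency for a list of result dicts.

    ASSUMPTION: each item has "consistency" (float) and "passed" (bool),
    mirroring the single-case result printed above.
    """
    if not results:
        return {"count": 0, "pass_rate": None, "mean_consistency": None}
    passed = sum(1 for r in results if r["passed"])
    mean_consistency = sum(r["consistency"] for r in results) / len(results)
    return {
        "count": len(results),
        "pass_rate": passed / len(results),
        "mean_consistency": mean_consistency,
    }


# Placeholder data standing in for auditor.evaluate_batch(...) output
sample_results = [
    {"consistency": 0.90, "passed": True},
    {"consistency": 0.70, "passed": False},
]
summary = summarize_batch(sample_results)
print(summary)
```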

βš™οΈ Configuration

Configure in ~/.openclaw/skills/semantic-consistency-auditor/config.yaml:


# BERTScore Configuration
bertscore:
  model: "microsoft/deberta-xlarge-mnli"  # Or "bert-base-chinese" for Chinese
  lang: "zh"                              # Language code: zh, en, etc.
  rescale_with_baseline: true
  device: "auto"                          # auto, cpu, cuda

# COMET Configuration
comet:
  model: "Unbabel/wmt22-comet-da"         # COMET model
  batch_size: 8
  device: "auto"

# Evaluation Thresholds
thresholds:
  bertscore_f1: 0.85
  comet_score: 0.75
  semantic_consistency: 0.80              # Comprehensive score threshold
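How the three thresholds combine into the final `passed` flag is not stated here. One plausible reading, sketched below, is that a case passes only when every score meets its own threshold. The function `check_thresholds` and the all-must-pass rule are assumptions for illustration, not the skill's confirmed logic.

```python
# Threshold values copied from the configuration above
THRESHOLDS = {
    "bertscore_f1": 0.85,
    "comet_score": 0.75,
    "semantic_consistency": 0.80,
}


def check_thresholds(bertscore_f1, comet_score, consistency, thresholds=THRESHOLDS):
    """Return (passed, per_metric_checks).

    ASSUMPTION: a case passes only if all three scores are at or above
    their thresholds; the actual combination rule may differ.
    """
    checks = {
        "bertscore_f1": bertscore_f1 >= thresholds["bertscore_f1"],
        "comet_score": comet_score >= thresholds["comet_score"],
        "semantic_consistency": consistency >= thresholds["semantic_consistency"],
    }
    return all(checks.values()), checks


# Placeholder scores: COMET falls below 0.75, so this case fails
passed, checks = check_thresholds(bertscore_f1=0.90, comet_score=0.70, consistency=0.85)
print(passed, checks)
```

Returning the per-metric checks alongside the overall flag makes it easier to see which metric caused a failure.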

Install from ClawHub:

clawhub install semantic-consistency-auditor
