kb-framework
by @minenclown
Creates a hybrid knowledge base with automatic Markdown, PDF, and OCR indexing, SQLite and ChromaDB integration, plus daily data-quality check...
Python API
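import os
import sys

# "~" is not expanded automatically in sys.path entries, so expand it explicitly
sys.path.insert(0, os.path.expanduser("~/.openclaw/kb"))

# Core Indexer
from kb.indexer import BiblioIndexer

with BiblioIndexer("~/.openclaw/kb/knowledge.db") as idx:
    idx.index_file("/path/to/file.md")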
# Hybrid Search
from kb.framework.hybrid_search import HybridSearch

hs = HybridSearch()
results = hs.search("Your search term", limit=10)

# LLM Engine API
from kb.biblio.config import LLMConfig
from kb.biblio.engine.registry import EngineRegistry
from kb.biblio.engine.factory import create_engine

config = LLMConfig.get_instance()
print(f"Source: {config.model_source}")

# Create engine (auto mode with HF primary + Ollama fallback)
engine = create_engine(config)

# Registry for multi-engine access
registry = EngineRegistry.get_instance(config)
primary, secondary = registry.get_both()

# Generator Parallel Support
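import asyncio

from kb.biblio.generator import EssenzGenerator

async def main():
    generator = EssenzGenerator()
    # generate_essence is awaited, so it has to run inside a coroutine
    return await generator.generate_essence(
        topic="Topic",
        parallel_strategy="primary_first",  # primary_first, aggregate, compare
    )

result = asyncio.run(main())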
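The auto mode mentioned above pairs a primary engine with a fallback. The following is a minimal, framework-independent sketch of that pattern. The `generate_with_fallback` helper and the two stub engine functions are hypothetical and are not part of the kb-framework API; how the framework itself implements fallback is not documented here.

```python
from typing import Callable, Sequence

# Hypothetical stand-ins for real engines: each takes a prompt and returns text.
def flaky_primary(prompt: str) -> str:
    raise RuntimeError("primary engine unavailable")

def stable_secondary(prompt: str) -> str:
    return f"[secondary] {prompt}"

def generate_with_fallback(prompt: str, engines: Sequence[Callable[[str], str]]) -> str:
    """Try each engine in order and return the first successful result."""
    errors = []
    for engine in engines:
        try:
            return engine(prompt)
        except Exception as exc:  # broad on purpose: any engine failure triggers fallback
            errors.append(exc)
    raise RuntimeError(f"all engines failed: {errors}")

text = generate_with_fallback("Topic", [flaky_primary, stable_secondary])
print(text)
```

Passing the engines as an ordered sequence keeps the helper independent of how many engines exist, so the same function covers a single-engine setup.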
CLI (Recommended)
# Core commands:
kb index /path/to/file.md # Index a file
kb search "machine learning" # Search knowledge base
kb sync # Sync ChromaDB with SQLite
kb audit # Run full audit
kb ghost # Find orphaned entries
kb warmup # Preload ChromaDB model

# LLM commands:
kb llm status # LLM system status
kb llm generate essence "topic" # Generate an essence
kb llm generate report daily # Generate a daily report
kb llm watch start # Start file watcher
kb llm scheduler list # List scheduled jobs
kb llm config show # Show LLM config

# LLM Engine management:
kb llm engine status # Show all engine status
kb llm engine switch huggingface # Switch to HuggingFace
kb llm engine test # Test both engines
Legacy Scripts (kb/scripts/)
# Index PDFs with OCR
python3 ~/.openclaw/kb/kb/scripts/index_pdfs.py /path/to/pdfs/

# Ghost Scanner (finds orphaned DB entries)
python3 ~/.openclaw/kb/kb/scripts/kb_ghost_scanner.py

# Full Audit
python3 ~/.openclaw/kb/kb/scripts/kb_full_audit.py

# ChromaDB Warmup (at boot)
python3 ~/.openclaw/kb/kb/scripts/kb_warmup.py
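An "orphaned" entry can be read as a database row that points to a file no longer on disk. The sketch below shows one way such a check might look. It assumes a hypothetical `files` table with a `path` column; the real kb-framework schema and the logic inside `kb_ghost_scanner.py` may differ.

```python
import os
import sqlite3
import tempfile

def find_orphans(conn: sqlite3.Connection) -> list[str]:
    """Return indexed paths whose file no longer exists on disk."""
    rows = conn.execute("SELECT path FROM files ORDER BY path").fetchall()
    return [path for (path,) in rows if not os.path.exists(path)]

# Demo with an in-memory database, one real file and one missing file.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "note.md")
    missing = os.path.join(tmp, "deleted.md")
    with open(existing, "w") as fh:
        fh.write("# note\n")

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY)")  # hypothetical schema
    conn.executemany("INSERT INTO files VALUES (?)", [(existing,), (missing,)])

    orphans = find_orphans(conn)
    conn.close()

print(orphans)
```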
Configuration is managed via kb/base/config.py:
from kb.base.config import KBConfig

# Get singleton instance
config = KBConfig.get_instance()

# Key properties:
config.base_path # ~/.openclaw/kb
config.db_path # ~/.openclaw/kb/knowledge.db
config.library_path # ~/.openclaw/kb/library
config.chroma_path # ~/.openclaw/kb/chroma_db

# Environment variable override:
# KB_BASE_PATH=/custom/path
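The default values listed above all sit under the base path, so overriding `KB_BASE_PATH` would presumably move them together. A minimal sketch of that relationship follows. `resolve_paths` is a hypothetical helper written for illustration; it is not the `KBConfig` implementation.

```python
import os
from pathlib import Path

DEFAULT_BASE = "~/.openclaw/kb"

def resolve_paths(env=None) -> dict:
    """Derive the four paths from KB_BASE_PATH, falling back to the default."""
    env = os.environ if env is None else env
    base = Path(env.get("KB_BASE_PATH", DEFAULT_BASE)).expanduser()
    return {
        "base_path": base,
        "db_path": base / "knowledge.db",
        "library_path": base / "library",
        "chroma_path": base / "chroma_db",
    }

# Pass a plain dict instead of os.environ to try an override without exporting anything.
paths = resolve_paths({"KB_BASE_PATH": "/custom/path"})
print(paths["db_path"])
```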
LLM Configuration (kb/biblio/config.py)
from kb.biblio.config import LLMConfig

config = LLMConfig.get_instance()
print(f"Source: {config.model_source}") # auto, ollama, huggingface, compare
print(f"Model: {config.model}") # Full model name
print(f"HF Model: {config.hf_model_name}") # google/gemma-2-2b-it
Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| KB_BASE_PATH | ~/.openclaw/kb | Base installation path |
| KB_LLM_MODEL_SOURCE | auto | Engine: ollama/huggingface/auto/compare |
| KB_LLM_OLLAMA_MODEL | gemma4:e2b | Ollama model name |
| KB_LLM_HF_MODEL | google/gemma-2-2b-it | HuggingFace model |
| KB_LLM_PARALLEL_MODE | false | Enable parallel generation |
| KB_LLM_PARALLEL_STRATEGY | primary_first | primary_first/aggregate/compare |
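To make the table concrete, here is a rough sketch of reading these variables with the documented defaults and rejecting values outside the listed options. `read_llm_settings` is a hypothetical helper, and the set of strings accepted as "true" is an assumption; the framework's own parsing in `LLMConfig` may behave differently.

```python
import os

ALLOWED_SOURCES = {"auto", "ollama", "huggingface", "compare"}
ALLOWED_STRATEGIES = {"primary_first", "aggregate", "compare"}

def read_llm_settings(env=None) -> dict:
    """Read the LLM variables from the table, applying the documented defaults."""
    env = os.environ if env is None else env
    source = env.get("KB_LLM_MODEL_SOURCE", "auto")
    strategy = env.get("KB_LLM_PARALLEL_STRATEGY", "primary_first")
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown model source: {source!r}")
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown parallel strategy: {strategy!r}")
    # Assumption: "1", "true" and "yes" (case-insensitive) count as enabled.
    parallel = env.get("KB_LLM_PARALLEL_MODE", "false").strip().lower() in {"1", "true", "yes"}
    return {
        "model_source": source,
        "ollama_model": env.get("KB_LLM_OLLAMA_MODEL", "gemma4:e2b"),
        "hf_model": env.get("KB_LLM_HF_MODEL", "google/gemma-2-2b-it"),
        "parallel_mode": parallel,
        "parallel_strategy": strategy,
    }

settings = read_llm_settings({"KB_LLM_MODEL_SOURCE": "ollama", "KB_LLM_PARALLEL_MODE": "True"})
print(settings)
```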
"ChromaDB slow on first start"
python3 ~/.openclaw/kb/kb/scripts/kb_warmup.py
# or
kb warmup
"Search finds nothing"
# Run audit
kb audit -v

# Ghost Scanner (find orphaned entries)
kb ghost

# Check sync status
kb sync --stats
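When search returns nothing, one thing worth checking is whether the two stores have drifted apart. The sketch below compares two sets of document IDs. The `diff_ids` helper and the sample IDs are hypothetical, and how `kb sync --stats` actually computes its numbers is not specified here.

```python
def diff_ids(sqlite_ids: set, chroma_ids: set) -> dict:
    """Summarize which IDs exist in only one of the two stores."""
    return {
        "missing_in_chroma": sorted(sqlite_ids - chroma_ids),
        "missing_in_sqlite": sorted(chroma_ids - sqlite_ids),
        "in_sync": len(sqlite_ids & chroma_ids),
    }

# Sample IDs for illustration only.
stats = diff_ids({"a", "b", "c"}, {"b", "c", "d"})
print(stats)
```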
"OCR too slow"
# Enable GPU in index_pdfs.py:
GPU_ENABLED = True # Default: False
"LLM engine not responding"
# Check engine status
kb llm engine status

# Test both engines
kb llm engine test

# Switch engine if needed
kb llm engine switch ollama
"Database locked"
# Check for running processes
ps aux | grep kb

# Restart if needed (caution: this pattern matches any process whose command line contains "kb")
pkill -f "kb.*"
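Before killing processes, it can help to confirm that the SQLite file really is write-locked. The following sketch uses only the standard `sqlite3` module. `is_locked` is a hypothetical helper, and the demo runs against a throwaway database rather than the real `knowledge.db`.

```python
import os
import sqlite3
import tempfile

def is_locked(db_path: str, timeout: float = 0.1) -> bool:
    """Return True if another connection currently holds a write lock."""
    conn = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # needs the write lock; fails if it is taken
        conn.execute("ROLLBACK")
        return False
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc).lower():
            return True
        raise
    finally:
        conn.close()

# Demo: a second connection holds the write lock while we probe.
with tempfile.TemporaryDirectory() as tmp:
    demo_db = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(demo_db, isolation_level=None)
    writer.execute("CREATE TABLE t (x INTEGER)")

    free_before = is_locked(demo_db)
    writer.execute("BEGIN IMMEDIATE")  # hold the write lock
    locked_during = is_locked(demo_db)
    writer.execute("ROLLBACK")
    writer.close()

print(free_before, locked_during)
```

The short `timeout` keeps the probe from blocking for the default five seconds when the lock is held.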
"Config not found"
# Set environment variable
export KB_BASE_PATH=~/.openclaw/kb

# Or programmatically
from kb.base.config import KBConfig
config = KBConfig.reload(base_path="/path/to/kb")
clawhub install knowledge-base-framework