🎁 Get the FREE AI Skills Starter Guide β€” Subscribe β†’
BytesAgainBytesAgain
πŸ¦€ ClawHub

Dify Knowledge Base Search

by @xiaowenzhou

Search Dify Knowledge Base (Dataset) to get accurate context for RAG-enhanced answers.

Versionv1.1.1
Downloads1,505
Installs3
Stars⭐ 1
TERMINAL
clawhub install dify-kb-search

πŸ“– About This Skill


name: dify-kb-search description: Search Dify Knowledge Base (Dataset) to get accurate context for RAG-enhanced answers. metadata: openclaw: requires: env: - DIFY_API_KEY - DIFY_BASE_URL install: - id: python kind: node package: python3 bins: - python3 - id: requests kind: node package: requests bins: [] label: Install Python requests library commandDispatch: tool commandTool: exec commandArgMode: json

Dify Knowledge Base Search Skill

πŸ” Search your Dify Knowledge Base to get accurate, contextual answers

This skill enables AI agents to query Dify datasets for RAG (Retrieval-Augmented Generation) context retrieval. Perfect for knowledge base Q&A, documentation search, and contextual AI responses.

!Dify Knowledge Base

✨ Features

  • List Knowledge Bases - Discover all available Dify datasets
  • Smart Search - Query datasets with hybrid, semantic, or keyword search
  • Auto-Discovery - Automatically find available datasets if ID not provided
  • Configurable Results - Adjust top-k, search method, and reranking
  • Error Handling - Graceful error messages for debugging
  • Zero Hardcoding - All configuration via environment variables
  • πŸš€ Quick Start

    1. Configure Environment Variables

    Set up in openclaw.json:

    {
      "env": {
        "vars": {
          "DIFY_API_KEY": "${DIFY_API_KEY}",
          "DIFY_BASE_URL": "https://dify.example.com/v1"
        }
      }
    }
    

    Environment Variables:

    | Variable | Required | Default | Description | |----------|----------|---------|-------------| | DIFY_API_KEY | βœ… Yes | - | Your Dify API Key (from Settings β†’ API) | | DIFY_BASE_URL | ❌ No | http://localhost/v1 | Your Dify instance base URL |

    2. Install Dependencies

    pip3 install requests
    

    πŸ› οΈ Tools

    dify_list

    Lists all available knowledge bases (datasets) in your Dify instance.

    Invocation: dify_list tool

    Example Response:

    {
      "status": "success",
      "count": 2,
      "datasets": [
        {
          "id": "dataset-abc123",
          "name": "Product Documentation",
          "doc_count": 42,
          "description": "All product guides and tutorials"
        },
        {
          "id": "dataset-xyz789",
          "name": "API Reference",
          "doc_count": 156,
          "description": "REST API documentation"
        }
      ]
    }
    

    Usage:

    {}
    

    dify_search

    Searches a Dify Dataset for relevant context chunks.

    Invocation: dify_search tool (mapped to python3 scripts/search.py)

    Parameters:

    | Parameter | Type | Required | Default | Description | |-----------|------|----------|---------|-------------| | query | string | βœ… Yes | - | Search query or question | | dataset_id | string | ❌ No | Auto-discover | Specific dataset ID to search | | top_k | integer | ❌ No | 3 | Number of results to return | | search_method | string | ❌ No | hybrid_search | Search strategy | | reranking_enable | boolean | ❌ No | false | Enable reranking for better results |

    Search Methods:

  • hybrid_search - Combine semantic + keyword search (recommended)
  • semantic_search - Meaning-based similarity search
  • keyword_search - Exact keyword matching
  • Example Usage:

    {
      "query": "How do I configure OpenClaw?",
      "top_k": 5
    }
    

    {
      "query": "API authentication methods",
      "dataset_id": "dataset-xyz789",
      "search_method": "semantic_search",
      "reranking_enable": true
    }
    

    Example Response:

    {
      "status": "success",
      "query": "How do I configure OpenClaw?",
      "dataset_id": "dataset-abc123",
      "count": 3,
      "results": [
        {
          "content": "To configure OpenClaw, edit the openclaw.json file...",
          "score": 0.8923,
          "title": "Installation Guide",
          "document_id": "doc-001"
        },
        {
          "content": "OpenClaw supports environment variables via...",
          "score": 0.8451,
          "title": "Configuration Options",
          "document_id": "doc-002"
        }
      ]
    }
    

    πŸ“‹ Complete Workflow Example

    [
      {
        "tool": "dify_list",
        "parameters": {}
      },
      {
        "tool": "dify_search",
        "parameters": {
          "query": "What are the system requirements?",
          "top_k": 5,
          "search_method": "hybrid_search"
        }
      }
    ]
    

    πŸ”§ Troubleshooting

    Common Errors

    | Error | Solution | |-------|----------| | Missing DIFY_API_KEY | Set DIFY_API_KEY in environment variables | | Connection refused | Check DIFY_BASE_URL is correct and accessible | | No datasets found | Verify dataset exists in your Dify workspace | | API request failed | Check network connectivity and API key permissions |

    Debug Mode

    Run manually to see detailed errors:

    DIFY_API_KEY=your-key python3 scripts/search.py <<< '{"query":"test"}'
    

    πŸ“š Integration Tips

    RAG Pipeline Integration

    # Example: Use search results in AI response
    results = dify_search(query, top_k=5)
    context = "\n".join([r["content"] for r in results["results"]])
    final_prompt = f"Answer based on context:\n\n{context}\n\nQuestion: {query}"
    

    Multiple Datasets

    For searching across multiple datasets, loop through them:

    {
      "query": "Find information about authentication",
      "dataset_id": "dataset-api-docs"
    }
    

    Then query another dataset separately.

    πŸ”’ Security

  • Never commit API keys - Use environment variables or .env files
  • Rotate keys regularly - Generate new keys in Dify Settings
  • Restrict access - Limit API key permissions where possible
  • πŸ“– Implementation Details

    This skill uses the Dify Dataset API:

  • List Datasets: GET /v1/datasets
  • Search: POST /v1/datasets/{id}/retrieve
  • For API documentation, see: https://docs.dify.ai/reference/api-reference

    πŸ“ Changelog

    v1.1.0 (2026-02-08):

  • βœ… Added search method selection (hybrid/semantic/keyword)
  • βœ… Added reranking support
  • βœ… Auto-discovery of datasets
  • βœ… Improved error handling
  • βœ… Removed hardcoded URLs (fully configurable)
  • βœ… Added detailed logging
  • v1.0.0 (2026-02-06):

  • Initial release
  • Basic list and search functionality