
HTML Analysis

by @mzlzyca

Analyze the structure and content of HTML documents using MinerU. Returns structured Markdown with layout information, headings, and content hierarchy preser...

Version: v0.4.0
Installs: 1
💡 Examples

# Analyze a local HTML file (requires token)
mineru-open-api extract page.html -o ./out/

# Analyze a remote HTML file by URL (requires token)
mineru-open-api extract https://example.com/page.html -o ./out/

# Crawl a live web page (requires token)
mineru-open-api crawl https://example.com/article -o ./out/
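
To run the local-file example over a whole folder, a loop along these lines may help. This is only a sketch: `extract_dir` and the `MINERU_BIN` override are hypothetical names introduced here, and giving each file its own output subdirectory is an assumption made to avoid results overwriting each other, not documented behavior of the tool. Only the `extract <file> -o <dir>` form shown above is used.

```shell
# Hypothetical override so the loop can be tried against a stub command.
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"

# extract_dir SRC_DIR OUT_DIR
# Runs `extract` once per *.html file in SRC_DIR.
extract_dir() {
  local src_dir="$1" out_dir="$2" f name
  for f in "$src_dir"/*.html; do
    # If nothing matched, the glob is left as a literal pattern; skip it.
    [ -e "$f" ] || continue
    name="$(basename "$f" .html)"
    # Assumption: creating the output directory up front is harmless.
    mkdir -p "$out_dir/$name"
    # Report a failure on stderr and continue with the next file.
    "$MINERU_BIN" extract "$f" -o "$out_dir/$name/" || echo "failed: $f" >&2
  done
}
```

For example, `extract_dir ./pages ./out` would call `extract` for each `.html` file in `./pages` and write to `./out/<name>/`. A token is still required, as in the examples above.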

📋 Tips & Best Practices

  • HTML is NOT supported by flash-extract; use extract with a token instead
  • For web page crawling, use mineru-open-api crawl instead of extract
  • Output goes to stdout by default; use -o to save to a file or directory
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU

Install from ClawHub:

    clawhub install html-analysis
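
The tips about stdout and stderr mean the document and the progress log can be captured separately with ordinary shell redirection. The following is a minimal sketch of that idea; `extract_to_files` and the `MINERU_BIN` override are hypothetical names introduced for illustration, and the sketch assumes only what the tips state: without `-o`, content goes to stdout and status messages go to stderr.

```shell
# Hypothetical override so the function can be tried against a stub command.
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"

# extract_to_files INPUT DOC_FILE LOG_FILE
# Runs `extract` without -o, so the document is written to stdout.
extract_to_files() {
  local input="$1" doc_file="$2" log_file="$3"
  # stdout (document content) -> DOC_FILE, stderr (progress/status) -> LOG_FILE
  "$MINERU_BIN" extract "$input" >"$doc_file" 2>"$log_file"
}
```

A call such as `extract_to_files page.html page.md extract.log` would leave the Markdown in `page.md` and the status messages in `extract.log`. The function returns the exit status of the underlying command, so it can be checked with `if` or `||`.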

