# html2md
> html2md is a high-fidelity HTML-to-Markdown converter and web scraper. It uses headless Chrome with advanced stealth plugins to bypass WAFs and Cloudflare, renders JavaScript-heavy SPAs, and extracts clean Markdown using Readability.js and Turndown.js. Designed for AI agents, RAG pipelines, and data extraction workflows. For complete documentation in a single file, see [Full Documentation](/llms-full.txt).
- This tool provides a REST API for programmatic access — no UI interaction required
- All API endpoints stream real-time progress logs as plain text, then append a `__JSON__` delimiter followed by the final JSON result
- If a client disconnects mid-stream, the server kills the headless browser processes to conserve memory
- The tool automatically handles anti-bot measures (Cloudflare, WAFs) using rotating User-Agents, stealth plugins, and browser fingerprint masking
## API Endpoints
- [Health Check](/api/health): `GET /api/health` — Returns `{"status":"ok","time":"..."}` to verify the server is running
- [Convert Single URL](docs/api-reference.md): `POST /api/convert` — Convert one URL to Markdown. Body: `{"url":"https://...","downloadImages":true,"frontMatter":true}`
- [Batch Convert](docs/api-reference.md): `POST /api/batch` — Convert multiple URLs. Body: `{"urls":["https://...","https://..."],"downloadImages":true,"frontMatter":true}`
- [Crawl Site](docs/api-reference.md): `POST /api/crawl` — Discover and convert an entire domain. Body: `{"url":"https://...","depth":3,"maxPages":50,"treeOnly":false}`
- [Download Archive](docs/api-reference.md): `GET /api/download/:site` — Download all crawled content for a domain as a ZIP archive
## Agentic Skills
- [html2md-scraper](/skills/html2md-scraper.md): Actionable instructions, usage guidelines, and CLI architectures for autonomous software agents to use html2md locally or via API.
## Documentation
- [Getting Started](docs/getting-started.md): Installation, first conversion, job folder structure
- [CLI Reference](docs/cli-reference.md): All command-line flags and options
- [API Reference](docs/api-reference.md): Full REST API documentation with request/response schemas
- [Architecture](docs/architecture.md): Pipeline steps, stealth evasion, caching, and file layout
## Optional
- [Interactive CLI](docs/interactive-cli.md): Menu-driven terminal UI for guided usage
- [Deployment](docs/deployment.md): Docker, Railway, and Netlify deployment guides