Open Source | MIT Licensed

The fastest open-source web scraper for LLMs.

Distill the web.

Convert any web page into clean, LLM-ready Markdown. Built in Rust with intelligent HTTP-to-browser fallback. Self-hosted, no API keys, no rate limits.


498ms median scrape (across 35 real-world URLs)

97% quality win rate (LLM judge, vs alternatives)

1.8x faster than nearest competitor

0 dependencies (no Redis, no Postgres)

Features

Everything you need to scrape the web

Built for developers who need fast, reliable web data extraction for LLM pipelines and AI agents.

Two-Tier Rendering

Starts with a fast HTTP fetch. If content density is too low, automatically escalates to full Chromium browser automation. Speed when possible, reliability when needed.
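A minimal sketch of what a content-density check could look like, to make the escalation rule concrete. This is not Essence's implementation: the density definition (visible text characters divided by total HTML characters), the `MIN_DENSITY` threshold, and the function names are all assumptions chosen for illustration.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def content_density(html: str) -> float:
    """Ratio of visible-text characters to total HTML characters."""
    if not html:
        return 0.0
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    text = re.sub(r"\s+", " ", "".join(collector.parts)).strip()
    return len(text) / len(html)


# Hypothetical threshold; the cutoff Essence actually uses is not stated here.
MIN_DENSITY = 0.10


def needs_browser(html: str, min_density: float = MIN_DENSITY) -> bool:
    """True when the HTTP response looks too sparse and should be re-rendered."""
    return content_density(html) < min_density
```

Under these assumptions, a single-page-app shell that is mostly `<script>` scores near zero and would be escalated to the browser, while a server-rendered article scores high and is served from the plain HTTP fetch.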

LLM-Optimized Markdown

Strips navigation, ads, footers, cookie banners, and boilerplate. Preserves code blocks, heading hierarchy, and extracts metadata including OG tags.
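The idea can be sketched with Python's standard-library HTML parser. This is an illustrative simplification, not the Essence extractor: it drops only a fixed set of tags (`SKIP_TAGS` is an assumption), handles only headings, paragraphs, and `<pre>` blocks, and does not attempt ad or cookie-banner detection or metadata extraction.

```python
from html.parser import HTMLParser

# Tags whose whole subtree is treated as boilerplate in this sketch (assumed list).
SKIP_TAGS = {"nav", "footer", "aside", "script", "style"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCK_TAGS = set(HEADINGS) | {"p", "pre"}
FENCE = "`" * 3  # Markdown code fence


class MarkdownSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.blocks = []       # finished Markdown blocks
        self._skip_depth = 0   # > 0 while inside a boilerplate subtree
        self._tag = None       # block tag currently being collected
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif self._skip_depth == 0 and tag in BLOCK_TAGS:
            self._tag, self._buf = tag, []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif self._skip_depth == 0 and tag == self._tag:
            raw = "".join(self._buf)
            if tag == "pre":
                # Keep code verbatim inside a fenced block.
                self.blocks.append(FENCE + "\n" + raw.strip("\n") + "\n" + FENCE)
            else:
                text = " ".join(raw.split())  # collapse whitespace
                if text:
                    self.blocks.append(HEADINGS.get(tag, "") + text)
            self._tag = None

    def handle_data(self, data):
        if self._skip_depth == 0 and self._tag is not None:
            self._buf.append(data)


def to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)
```

Anything inside a skipped subtree is ignored regardless of nesting, while heading levels and code blocks survive the conversion.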

Five Endpoints, One Server

Scrape a page, crawl a site, map URLs via sitemaps, search the web, or generate llms.txt files. All from a single lightweight server.

MCP Server for AI Agents

Built-in Model Context Protocol server. Connect Claude, Cursor, or any MCP client to give your agent web access.

Zero Dependencies

A single Rust binary. No Redis, no PostgreSQL, no external services. Run with cargo run or docker-compose up.

100% Open Source

MIT licensed. Self-hosted, no vendor lock-in, no API keys, no rate limits. Free forever.

How it works

URL to Markdown in milliseconds

Three steps from URL to LLM-ready Markdown.

Send a URL

POST to any of the five endpoints: scrape, crawl, map, search, or llmstxt.

Smart rendering

HTTP first. If the response is too sparse, as with a JavaScript SPA, Essence automatically falls back to headless Chromium.

Get clean Markdown

Structured output with metadata, ready for your LLM pipeline.

Benchmarks

Better output. Faster.

Benchmarked across 35 real-world URLs spanning 7 content categories against leading alternatives. Quality evaluated by LLM judge (Claude).

Quality (LLM Judge)

Per-category win rate across 35 URLs

Category	Essence	Alternatives	Ties
Structured	5/5	0/5	0
News	4/5	0/5	1
Reference	5/5	0/5	0
Content	5/5	0/5	0
Dynamic	4/5	1/5	0
Docs	5/5	0/5	0
E-Commerce	4/5	1/5	0
Total	32/35	2/35	1

Speed Comparison

Average response time by category

Category	Essence	Next best	Speedup
Docs	334ms	536ms	1.6x faster
News	449ms	1152ms	2.6x faster
Dynamic	540ms	1166ms	2.2x faster
Reference	826ms	968ms	1.2x faster
Structured	929ms	1178ms	1.3x faster
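The speedup factors follow directly from the response times: each one is the next-best time divided by the Essence time, rounded to one decimal place. A short check, using only the figures from the speed comparison above:

```python
# (Essence ms, next-best ms) per category, copied from the speed comparison.
timings_ms = {
    "Docs": (334, 536),
    "News": (449, 1152),
    "Dynamic": (540, 1166),
    "Reference": (826, 968),
    "Structured": (929, 1178),
}


def speedup(essence_ms: int, other_ms: int) -> float:
    """How many times faster Essence is, rounded to one decimal place."""
    return round(other_ms / essence_ms, 1)


speedups = {cat: speedup(e, o) for cat, (e, o) in timings_ms.items()}

for category, factor in speedups.items():
    print(f"{category}: {factor}x faster")
```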

Benchmark conducted April 2026 against Firecrawl and Crawl4AI (both self-hosted via Docker). LLM judge evaluated content relevance, noise removal, readability, structural coherence, and information completeness.

Comparison

Why Essence

How Essence compares to other scraping tools. No spin -- just data.

Feature	Essence	Alternatives
LLM-ready Markdown	Yes	Yes
Open source license	MIT	AGPL / Apache
Self-hosted	Single binary, zero deps	Redis + services / Docker
Browser fallback	Automatic (content-aware)	Manual / always-on
MCP server	Built-in	Separate package
API key required	No	Cloud tiers
Rate limits	None	Tiered pricing
Quality (LLM judge)	97% win rate	Best alternative: 26%
Median speed	498ms	908ms+
Built-in search	DuckDuckGo	Varies
Language	Rust	TypeScript / Python
Pricing	Free forever	Free tier + paid

Integration

Works with everything

A simple REST API. Use it from any language, any framework, or connect your AI agent via MCP.

curl -X POST http://localhost:8080/api/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
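The same scrape request can be sent from Python using only the standard library. This is a sketch mirroring the curl call above: the endpoint path, port, and `{"url": ...}` body come from that example, while the function names, the timeout, and the assumption that the response body is JSON are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default address from the curl example


def build_scrape_request(url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /api/v1/scrape without sending it."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/scrape",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(url: str, base_url: str = BASE_URL, timeout: float = 30.0):
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_scrape_request(url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running locally, `scrape("https://example.com")` would return the decoded response. The exact fields it contains are not specified here, so inspect the result before relying on any particular key.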

Endpoint	Purpose	Path
Scrape	Fetch a single page	/api/v1/scrape
Crawl	Traverse a site	/api/v1/crawl
Map	Discover URLs	/api/v1/map
Search	Search the web	/api/v1/search
LLMs.txt	Generate llms.txt	/api/v1/llmstxt

Get started

Ready to build?

Start getting web data for free and scale seamlessly. Self-hosted, no credit card needed.


Terminal

git clone https://github.com/ruchit-p/essence.git
cd essence/backend
cp .env.example .env
cargo run --release

# Server running at http://localhost:8080
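A first release build can take a while to compile, so scripts that start the server and then call it may want to wait until the port accepts connections. A small sketch of such a wait loop follows. The helper name and its defaults are assumptions, and it only checks that something is listening on the port, not that the API itself is healthy.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8080,
                  timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry after a short pause.
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` with the defaults waits up to a minute for `http://localhost:8080` before giving up.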