Configuration
Environment variables and server configuration
Essence is configured via environment variables. Copy .env.example to .env and adjust as needed.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| PORT | 8080 | Server port |
| RUST_LOG | essence=info | Log level (debug, info, warn, error) |
| BROWSER_HEADLESS | true | Run Chromium in headless mode |
| BROWSER_POOL_SIZE | 5 | Number of browser instances in the pool |
| BROWSER_TIMEOUT_MS | 30000 | Browser page load timeout (milliseconds) |
| ENGINE_WATERFALL_ENABLED | true | Enable HTTP-to-browser fallback |
| ENGINE_WATERFALL_DELAY_MS | 5000 | Delay before triggering browser fallback (milliseconds) |
| CRAWL_RATE_LIMIT_PER_SEC | 2 | Rate limit per domain for crawling (per second) |
| CRAWL_MAX_DURATION_SEC | 300 | Maximum crawl duration (seconds) |
| MAX_CONCURRENT_REQUESTS | 10 | Max concurrent crawl requests |
| MAX_PARALLEL_SCRAPES | 5 | Parallel scrapes for /api/v1/search |
| MAX_REQUEST_SIZE_MB | 1 | Max request body size (megabytes) |
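To illustrate how defaults like those in the table can be applied, here is a minimal Rust sketch that reads a handful of the variables and falls back to the documented default when a value is unset or fails to parse. The `Settings` struct, `parse_or`, `from_lookup`, and `from_env` are hypothetical names chosen for this example and are not taken from the Essence codebase; the fallback-on-invalid behavior is also an assumption.

```rust
use std::env;
use std::str::FromStr;

/// Hypothetical settings struct; fields mirror a few variables from the table.
#[derive(Debug, PartialEq)]
pub struct Settings {
    pub port: u16,
    pub browser_headless: bool,
    pub browser_pool_size: usize,
    pub browser_timeout_ms: u64,
    pub waterfall_enabled: bool,
    pub waterfall_delay_ms: u64,
}

/// Look up `key` and parse it, returning `default` when the variable is
/// missing or its value does not parse (an assumed policy for this sketch).
fn parse_or<T, F>(lookup: &F, key: &str, default: T) -> T
where
    T: FromStr,
    F: Fn(&str) -> Option<String>,
{
    lookup(key)
        .and_then(|raw| raw.trim().parse::<T>().ok())
        .unwrap_or(default)
}

impl Settings {
    /// Build settings from any key/value source, which keeps the logic testable.
    pub fn from_lookup<F>(lookup: F) -> Self
    where
        F: Fn(&str) -> Option<String>,
    {
        Settings {
            port: parse_or(&lookup, "PORT", 8080),
            browser_headless: parse_or(&lookup, "BROWSER_HEADLESS", true),
            browser_pool_size: parse_or(&lookup, "BROWSER_POOL_SIZE", 5),
            browser_timeout_ms: parse_or(&lookup, "BROWSER_TIMEOUT_MS", 30_000),
            waterfall_enabled: parse_or(&lookup, "ENGINE_WATERFALL_ENABLED", true),
            waterfall_delay_ms: parse_or(&lookup, "ENGINE_WATERFALL_DELAY_MS", 5_000),
        }
    }

    /// Read the same keys from the process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key: &str| env::var(key).ok())
    }
}
```

Calling `Settings::from_env()` with no variables set would yield the defaults from the table. Whether Essence itself rejects or silently ignores an invalid value is not stated in this page.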
Engine Selection
Essence uses a two-tier rendering strategy:
- HTTP Engine (default, fast) — lightweight fetch via reqwest. Handles most pages in under a second.
- Browser Engine (fallback) — full Chromium automation via CDP. Used for SPAs, JavaScript-rendered content, and anti-bot pages.
When engine is set to "auto" (default), Essence automatically detects whether a page needs browser rendering based on:
- Content density analysis
- JavaScript framework hydration markers (__NEXT_DATA__, __NUXT__, etc.)
- Meta-refresh redirects
- Anti-fetch response headers
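The signals listed above can be sketched in Rust roughly as follows. This is a deliberately crude illustration and not Essence's actual detector: `RenderSignals`, `detect_signals`, `needs_browser`, and the 0.1 density threshold are all assumptions made for the example, and the header check is left out because this page does not name the headers involved.

```rust
/// Hypothetical record of which signals fired for a fetched page.
#[derive(Debug, Default, PartialEq)]
pub struct RenderSignals {
    pub low_content_density: bool,
    pub hydration_marker: bool,
    pub meta_refresh: bool,
}

/// Markers named in the documentation; a real list would likely be longer.
const HYDRATION_MARKERS: [&str; 2] = ["__NEXT_DATA__", "__NUXT__"];

/// Assumed threshold: below this text-to-markup ratio the page looks empty.
const MIN_TEXT_DENSITY: f64 = 0.1;

/// Ratio of non-whitespace bytes outside tags to total bytes.
/// Naive on purpose: it does not skip script or style bodies.
fn text_density(html: &str) -> f64 {
    if html.is_empty() {
        return 0.0;
    }
    let mut in_tag = false;
    let mut text_bytes = 0usize;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' => in_tag = false,
            c if !in_tag && !c.is_whitespace() => text_bytes += c.len_utf8(),
            _ => {}
        }
    }
    text_bytes as f64 / html.len() as f64
}

/// Inspect the HTML returned by the HTTP engine and record each signal.
pub fn detect_signals(html: &str) -> RenderSignals {
    let lower = html.to_ascii_lowercase();
    RenderSignals {
        low_content_density: text_density(html) < MIN_TEXT_DENSITY,
        hydration_marker: HYDRATION_MARKERS.iter().any(|m| html.contains(*m)),
        meta_refresh: lower.contains("http-equiv=\"refresh\"")
            || lower.contains("http-equiv='refresh'"),
    }
}

/// In this sketch, any single signal is enough to request browser rendering.
pub fn needs_browser(signals: &RenderSignals) -> bool {
    signals.low_content_density || signals.hydration_marker || signals.meta_refresh
}
```

Treating any single signal as sufficient is the simplest policy; a production detector would more plausibly weight the signals or require a combination.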
Set ENGINE_WATERFALL_ENABLED=true (the default) to race both engines in parallel for maximum reliability.
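This page does not spell out how the race and ENGINE_WATERFALL_DELAY_MS interact. One plausible reading, sketched below with standard-library threads, is that the HTTP engine starts first, the browser engine is launched if HTTP has not succeeded within the delay, and the first successful result wins. The `waterfall` function and its closure parameters are hypothetical stand-ins for the two engines, not Essence APIs.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `http` first; if it has not succeeded within `delay`, start `browser`
/// and return whichever engine succeeds first. Each closure returns
/// `Some(body)` on success and `None` on failure.
pub fn waterfall<H, B>(http: H, browser: B, delay: Duration) -> Option<(&'static str, String)>
where
    H: FnOnce() -> Option<String> + Send + 'static,
    B: FnOnce() -> Option<String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();

    // Start the lightweight HTTP engine immediately.
    let tx_http = tx.clone();
    thread::spawn(move || {
        let _ = tx_http.send(("http", http()));
    });

    // Give HTTP a head start; return early if it succeeds in time.
    if let Ok((name, Some(body))) = rx.recv_timeout(delay) {
        return Some((name, body));
    }

    // HTTP failed or is still running: bring in the browser engine.
    thread::spawn(move || {
        let _ = tx.send(("browser", browser()));
    });

    // Drain results until one succeeds; the loop ends once both senders drop.
    for (name, result) in rx {
        if let Some(body) = result {
            return Some((name, body));
        }
    }
    None
}
```

With a 5000 ms delay, a page the HTTP engine fetches quickly never touches the browser pool, while a slow or failing fetch still has a rendered result racing against it. A real implementation would also cancel the losing engine, which this sketch omits.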