Essence
API Reference

Scrape

POST /api/v1/scrape — single-page capture

Single-page capture with automatic engine selection. Converts any web page into clean, LLM-ready Markdown.

Endpoint

POST /api/v1/scrape

Parameters

ParameterTypeDefaultDescription
urlstringRequired. URL to scrape
formatsstring[]["markdown"]Output formats: "markdown", "html", "rawHtml", "links", "images", "screenshot"
enginestring"auto""auto", "http", or "browser"
timeoutinteger30000Timeout in milliseconds
headersobject{}Custom HTTP headers
onlyMainContentbooleantrueExtract only main content (strips nav, footer, etc.)
waitForinteger0Wait time before scraping (ms, browser only)
waitForSelectorstringCSS selector to wait for (browser only)
removeBase64ImagesbooleantrueStrip base64-encoded images from output
skipTlsVerificationbooleanfalseSkip TLS certificate verification
screenshotbooleanfalseCapture page screenshot (browser only)
screenshotFormatstring"png""png" or "jpeg"
actionsarrayBrowser actions to perform before scraping
includeTagsstring[]CSS selectors to include
excludeTagsstring[]CSS selectors to exclude

Examples

cURL

curl -X POST http://localhost:8080/api/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown", "html"]
  }'

Python

import requests

response = requests.post("http://localhost:8080/api/v1/scrape", json={
    "url": "https://example.com",
    "formats": ["markdown", "html"]
})

data = response.json()
print(data["data"]["markdown"])

JavaScript

const response = await fetch("http://localhost:8080/api/v1/scrape", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    url: "https://example.com",
    formats: ["markdown", "html"],
  }),
});

const data = await response.json();
console.log(data.data.markdown);

Response

Metadata is always included in the response regardless of the formats parameter. Other fields (markdown, html, rawHtml, links, images, screenshot) are only present when requested via formats.

{
  "success": true,
  "data": {
    "title": "Example Domain",
    "description": "",
    "url": "https://example.com",
    "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
    "html": "<html>...</html>",
    "metadata": {
      "title": "Example Domain",
      "description": "",
      "language": "en",
      "keywords": null,
      "robots": null,
      "ogTitle": null,
      "ogDescription": null,
      "ogUrl": null,
      "ogImage": null,
      "url": "https://example.com",
      "sourceUrl": "https://example.com",
      "statusCode": 200,
      "contentType": "text/html",
      "canonicalUrl": null,
      "wordCount": 58,
      "readingTime": 1,
      "excerpt": null,
      "detectedFrameworks": null,
      "detectionReason": null,
      "contentScriptRatio": null
    }
  }
}

Fields like links, images, rawHtml, and screenshot appear when their corresponding format is requested. The title, description, and url fields are duplicated at the top level of the document for convenience.

Browser Actions

When using engine: "browser", you can perform actions before scraping:

curl -X POST http://localhost:8080/api/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/login",
    "engine": "browser",
    "actions": [
      {"type": "waitForSelector", "selector": "#username"},
      {"type": "type", "selector": "#username", "text": "user@example.com"},
      {"type": "type", "selector": "#password", "text": "password"},
      {"type": "click", "selector": "#submit"},
      {"type": "wait", "milliseconds": 2000}
    ]
  }'

Supported action types: click, type, scroll, wait, waitForSelector. All action type names use camelCase.

On this page