Python SDK

The Essence Python SDK provides a simple, typed client for all Essence endpoints.

Installation

pip install essence-sdk

Or install from source:

cd sdks/python
pip install .

Quick Start

from essence import EssenceClient

client = EssenceClient("http://localhost:8080")

# Scrape a page
result = client.scrape("https://example.com")
print(result["data"]["markdown"])

Client Reference

`EssenceClient(base_url, timeout)`

Parameter	Type	Default	Description
`base_url`	str	`"http://localhost:8080"`	Essence server URL
`timeout`	float	`60.0`	HTTP timeout in seconds

The client supports context manager usage:

with EssenceClient("http://localhost:8080") as client:
    result = client.scrape("https://example.com")
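
When several pages are scraped inside one `with` block, a single failing URL should usually not abort the rest. The sketch below is one way to do that. `scrape_all` is a hypothetical helper, not part of the SDK, and it catches a broad `Exception` only because the SDK's exception types are not documented on this page.

```python
from typing import Any


def scrape_all(client: Any, urls: list[str]) -> dict[str, Any]:
    """Scrape each URL with the given client and collect results per URL.

    Hypothetical helper: `client` is any object exposing `.scrape(url)`,
    such as an EssenceClient. Failures are recorded instead of raised.
    """
    results: dict[str, Any] = {}
    for url in urls:
        try:
            results[url] = client.scrape(url)
        except Exception as exc:  # assumption: SDK error types are unknown here
            results[url] = {"error": str(exc)}
    return results


# Intended usage (requires a running Essence server):
#
# with EssenceClient("http://localhost:8080") as client:
#     pages = scrape_all(client, ["https://example.com", "https://example.org"])
```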

`client.scrape(url, **kwargs)`

Scrape a single page and return its content in the requested formats.

result = client.scrape(
    "https://example.com",
    formats=["markdown", "links"],
    engine="auto",
    only_main_content=True,
    timeout_ms=30000,
)
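
The Quick Start reads `result["data"]["markdown"]` directly. If a scrape fails, or `markdown` is not among the requested formats, that lookup may raise `KeyError`. A minimal defensive sketch, assuming the response is a plain dict shaped as in the Quick Start (`get_markdown` is a hypothetical helper, not an SDK function):

```python
from typing import Any, Optional


def get_markdown(result: dict[str, Any]) -> Optional[str]:
    """Return the markdown body of a scrape response, or None if absent.

    Assumes the shape shown in the Quick Start: {"data": {"markdown": "..."}}.
    """
    data = result.get("data")
    if not isinstance(data, dict):
        return None
    return data.get("markdown")


# Sample dicts standing in for real responses (no server needed).
ok = {"data": {"markdown": "# Example Domain"}}
missing = {"error": "timeout"}
print(get_markdown(ok))
print(get_markdown(missing))
```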

`client.crawl(url, **kwargs)`

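Crawl a website starting from the given URL. `max_depth` and `limit` bound the crawl, and `include_paths` restricts it to matching paths.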

result = client.crawl(
    "https://docs.example.com",
    max_depth=3,
    limit=50,
    include_paths=["/docs/*"],
)
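
Before starting a large crawl, it can help to check which URLs a pattern such as `/docs/*` would select. This page does not document the server's matching rules, so the sketch below only approximates them with shell-style globbing from Python's `fnmatch`. `path_matches` is a hypothetical helper, and the real crawler may match differently.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def path_matches(url: str, patterns: list[str]) -> bool:
    """Return True if the URL's path matches any glob pattern.

    Approximation only: uses fnmatch semantics, where `*` also matches `/`.
    """
    path = urlparse(url).path or "/"
    return any(fnmatchcase(path, pattern) for pattern in patterns)


patterns = ["/docs/*"]
for candidate in [
    "https://docs.example.com/docs/intro",
    "https://docs.example.com/blog/post",
]:
    print(candidate, path_matches(candidate, patterns))
```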

`client.map(url, **kwargs)`

Discover URLs from a site.

result = client.map("https://example.com", limit=1000)
urls = result["links"]
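
A map result can contain duplicates or links to other hosts. The sketch below filters the list down to unique same-host URLs. It assumes `result["links"]` is a list of absolute URL strings, which this page does not state explicitly. `same_host_links` is a hypothetical helper.

```python
from urllib.parse import urlparse


def same_host_links(links: list[str], host: str) -> list[str]:
    """Keep links whose host equals `host`, dropping duplicates in order."""
    seen: set[str] = set()
    kept: list[str] = []
    for link in links:
        if urlparse(link).netloc == host and link not in seen:
            seen.add(link)
            kept.append(link)
    return kept


# Sample data standing in for result["links"].
sample_links = [
    "https://example.com/a",
    "https://example.com/a",
    "https://other.example.net/b",
    "https://example.com/c",
]
print(same_host_links(sample_links, "example.com"))
```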

`client.search(query, **kwargs)`

Search the web via DuckDuckGo.

result = client.search("rust web scraping", limit=5, scrape_results=True)
for item in result["data"]:
    print(item["title"], item["url"])

`client.extract(urls, **kwargs)`

Extract structured data from pages, using either CSS selectors or an LLM.

# CSS extraction (no API key needed)
result = client.extract(
    ["https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"],
    mode="css",
    selectors={"title": "h1", "price": "p.price_color"},
    schema={"properties": {"title": {"type": "string"}, "price": {"type": "number"}}},
)

# LLM extraction
result = client.extract(
    ["https://example.com/about"],
    mode="llm",
    prompt="Extract company info",
    schema={"properties": {"name": {"type": "string"}, "founded": {"type": "string"}}},
    llm_base_url="https://api.openai.com",
    llm_model="gpt-4o-mini",
    llm_api_key="sk-...",
)
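
The LLM example above shows the API key inline as `"sk-..."`. In real code the key is usually read from the environment instead. The sketch below builds the same keyword arguments that way. `llm_extract_kwargs` and the `OPENAI_API_KEY` variable name are assumptions for illustration, not SDK conventions.

```python
import os
from typing import Any


def llm_extract_kwargs(
    prompt: str,
    schema: dict[str, Any],
    env_var: str = "OPENAI_API_KEY",  # assumed variable name
) -> dict[str, Any]:
    """Build keyword arguments for LLM-mode extraction, reading the key from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before running LLM extraction")
    return {
        "mode": "llm",
        "prompt": prompt,
        "schema": schema,
        "llm_base_url": "https://api.openai.com",
        "llm_model": "gpt-4o-mini",
        "llm_api_key": api_key,
    }


# Intended usage (requires a running Essence server):
#
# result = client.extract(
#     ["https://example.com/about"],
#     **llm_extract_kwargs("Extract company info", {"properties": {"name": {"type": "string"}}}),
# )
```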

`client.llmstxt(url, **kwargs)`

Generate llms.txt from a website.

result = client.llmstxt("https://docs.example.com", max_urls=50)
print(result["llmstxt"])
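
The generated text is typically saved as a file and served from the site root. A minimal sketch, assuming `result["llmstxt"]` is a string as the `print` call above suggests (`save_llmstxt` is a hypothetical helper):

```python
from pathlib import Path
from typing import Any


def save_llmstxt(result: dict[str, Any], path: str = "llms.txt") -> Path:
    """Write the generated llms.txt content to disk and return the file path."""
    out = Path(path)
    out.write_text(result["llmstxt"], encoding="utf-8")
    return out


# Intended usage (requires a running Essence server):
#
# result = client.llmstxt("https://docs.example.com", max_urls=50)
# save_llmstxt(result)
```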
