LLMs.txt

Generate llms.txt files from any website. The endpoint discovers pages via the sitemap and in-page links, scrapes each page, and produces a structured index (llms.txt) plus optional full-text output (llms-full.txt).

Endpoint

POST /api/v1/llmstxt

Parameters

Parameter	Type	Default	Description
`url`	string	—	Required. Website URL to generate llms.txt for
`maxUrls`	integer	`20`	Maximum number of pages to process (max: 500)
`showFullText`	boolean	`true`	Also generate llms-full.txt with full page markdown
`engine`	string	`"auto"`	Scraping engine: `"auto"`, `"http"`, or `"browser"`
`maxConcurrentScrapes`	integer	`10`	Maximum number of pages scraped in parallel
`ignoreSitemap`	boolean	`false`	Skip sitemap.xml during URL discovery
`includeSubdomains`	boolean	`true`	Include subdomain pages in discovery
`llmBaseUrl`	string	—	OpenAI-compatible API base URL for generating descriptions
`llmModel`	string	`"gpt-4o-mini"`	LLM model name for description generation
`llmApiKey`	string	—	API key for the LLM service
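
The table above can be mirrored on the client side to catch bad values before a request is sent. The following is a minimal sketch, not part of the service: the `LlmsTxtRequest` class and its `to_payload` method are hypothetical names, and the lower bound of 1 for `maxUrls` is an assumption (the table only states the upper bound of 500).

```python
from dataclasses import dataclass, asdict
from typing import Optional

ALLOWED_ENGINES = {"auto", "http", "browser"}
MAX_URLS_LIMIT = 500  # upper bound from the parameters table


@dataclass
class LlmsTxtRequest:
    """Hypothetical client-side mirror of the parameters table."""
    url: str
    maxUrls: int = 20
    showFullText: bool = True
    engine: str = "auto"
    maxConcurrentScrapes: int = 10
    ignoreSitemap: bool = False
    includeSubdomains: bool = True
    llmBaseUrl: Optional[str] = None
    llmModel: str = "gpt-4o-mini"
    llmApiKey: Optional[str] = None

    def to_payload(self) -> dict:
        """Validate the fields and return a JSON-ready dict without unset values."""
        if not self.url:
            raise ValueError("url is required")
        # The lower bound of 1 is an assumption; only the maximum is documented.
        if not 1 <= self.maxUrls <= MAX_URLS_LIMIT:
            raise ValueError(f"maxUrls must be between 1 and {MAX_URLS_LIMIT}")
        if self.engine not in ALLOWED_ENGINES:
            raise ValueError(f"engine must be one of {sorted(ALLOWED_ENGINES)}")
        # Drop optional fields that were never set so the server defaults apply.
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Calling `LlmsTxtRequest(url="https://docs.example.com", maxUrls=50).to_payload()` yields a dict that can be serialized as the request body.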

Example Request

curl -X POST http://localhost:8080/api/v1/llmstxt \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "maxUrls": 50
  }'
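
The same request can be issued from Python using only the standard library. This is a sketch under assumptions: the endpoint address is copied from the curl example, and the helper names `build_request` and `generate_llmstxt` as well as the 300-second timeout are illustrative choices, not values defined by the service.

```python
import json
import urllib.request

# Assumed local deployment, copied from the curl example.
ENDPOINT = "http://localhost:8080/api/v1/llmstxt"


def build_request(payload: dict, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) the POST request with a JSON body."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_llmstxt(payload: dict, timeout: float = 300.0) -> dict:
    """Send the request and decode the JSON response body."""
    # The generous timeout is an assumption; crawling many pages can be slow.
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return json.load(resp)
```

With a server running, `generate_llmstxt({"url": "https://docs.example.com", "maxUrls": 50})` returns the decoded response dict.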

Response

{
  "success": true,
  "llmstxt": "# Example Docs\n\n> Documentation for Example\n\n## Pages\n\n- [Getting Started](https://docs.example.com/getting-started): Setup guide\n- [API Reference](https://docs.example.com/api): Full API docs\n",
  "llmsFulltxt": "# Example Docs\n\n> Documentation for Example\n\n## Getting Started\n\nFull page content here...\n\n## API Reference\n\nFull page content here...\n",
  "urlsProcessed": 12,
  "urlsDiscovered": 45
}
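
A client will typically write the two string fields to disk. The sketch below assumes that a failed request carries `"success": false` and that `llmsFulltxt` is absent or empty when `showFullText` is `false`; neither behavior is confirmed above. The `save_outputs` helper is a hypothetical name.

```python
from pathlib import Path


def save_outputs(response: dict, out_dir: str = ".") -> list:
    """Write llms.txt (and llms-full.txt when present); return the paths written."""
    # Assumption: failures are signalled by a falsy "success" field.
    if not response.get("success"):
        raise RuntimeError("llms.txt generation did not succeed")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []

    index_path = out / "llms.txt"
    index_path.write_text(response["llmstxt"], encoding="utf-8")
    written.append(index_path)

    # Assumption: the full-text field is missing or empty when not requested.
    full_text = response.get("llmsFulltxt")
    if full_text:
        full_path = out / "llms-full.txt"
        full_path.write_text(full_text, encoding="utf-8")
        written.append(full_path)

    return written
```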

How It Works

1. Discover URLs: Uses sitemap.xml and in-page link crawling to find pages on the site.
2. Prioritize: Ranks same-host pages first, then caps the list at maxUrls.
3. Scrape: Scrapes each page in parallel using the selected engine.
4. Generate: Produces llmstxt (an index with titles and descriptions) and, when showFullText is true, llmsFulltxt (the full markdown of all pages, concatenated).
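
The prioritization step can be pictured roughly as follows. This is an illustrative sketch, not the service's actual implementation: the `prioritize` function is hypothetical, and it assumes that a "subdomain" is any host ending in `.` plus the root host and that off-site URLs are dropped.

```python
from urllib.parse import urlsplit


def prioritize(urls, root_url, max_urls=20, include_subdomains=True):
    """Deduplicate, rank same-host URLs first, then cap the list at max_urls."""
    root_host = urlsplit(root_url).hostname or ""
    seen = set()
    same_host, subdomains = [], []

    for url in urls:
        if url in seen:
            continue  # keep only the first occurrence of each URL
        seen.add(url)
        host = urlsplit(url).hostname or ""
        if host == root_host:
            same_host.append(url)
        elif include_subdomains and host.endswith("." + root_host):
            subdomains.append(url)
        # Anything else is treated as off-site and dropped (an assumption).

    return (same_host + subdomains)[:max_urls]
```

Discovery order is preserved within each group, so the cap removes subdomain pages before same-host ones.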

LLM-Generated Descriptions

By default, page descriptions come from HTML metadata (<meta name="description">). For higher-quality descriptions, provide an OpenAI-compatible LLM endpoint:

{
  "url": "https://docs.example.com",
  "llmBaseUrl": "https://api.openai.com/v1",
  "llmApiKey": "sk-...",
  "llmModel": "gpt-4o-mini"
}

The LLM generates a one-line description for each page based on its content.
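
For comparison, the default metadata path mentioned at the start of this section can be approximated with the standard library. This sketch only illustrates reading `<meta name="description">`; the `MetaDescriptionParser` class and `extract_description` function are hypothetical, and the service's own extraction logic may differ.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaDescriptionParser(HTMLParser):
    """Capture the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first matching meta tag.
        if tag != "meta" or self.description is not None:
            return
        attr_map = {name: value for name, value in attrs}
        if (attr_map.get("name") or "").lower() == "description":
            self.description = (attr_map.get("content") or "").strip()


def extract_description(html: str) -> Optional[str]:
    """Return the page's meta description, or None if missing or empty."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description or None
```

Pages without a usable meta description return `None`, which is the case where an LLM-generated description adds the most value.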