Structured data extraction
Pull screenshots, page content, and structured JSON from any URL in a single API call
Turn any URL into a structured bundle — screenshot, main-content markdown, and selector-driven JSON — without stitching a browser, an HTML parser, and a Markdown converter together yourself.
The problem
Most "fetch data about a URL" workflows end up looking the same:
- Run a headless browser to capture a screenshot
- Run a separate fetch to download the HTML
- Parse the HTML to pull meta tags, favicons, and OG images
- Run a Markdown converter to feed an LLM
- Hope all four of those see the same version of the page
Four round trips, two browsers, drifting page state, and four billing surfaces.
How Allscreenshots helps
The outputs array on POST /v1/screenshots returns the screenshot, raw HTML, main-content markdown, and CSS-selector-driven JSON from a single browser session. The whole request counts as one screenshot against your quota, no matter how many output types you ask for.
- One call, one billing unit — five outputs cost the same as one screenshot
- Same DOM for every output — the screenshot and the scrape are consistent by construction
- No glue infrastructure — no Puppeteer fleet, no parser cache, no Markdown converter to maintain
- LLM-ready — pair the screenshot with the markdown for industry/tone classification
Multi-output works on the sync, async, and bulk endpoints. For batching prospect lists or nightly refreshes, see bulk processing.
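As a rough illustration of the single-call shape described above, the sketch below assembles one request body carrying three outputs and wraps it in a POST request without sending it. The field names (`url`, `blockCookieBanners`, `responseType`, `outputs`, `schema`) mirror the curl example on this page; the helper names `attribute_field`, `build_multi_output_body`, and `build_request` are hypothetical, and the response shape is not assumed here (see the Outputs API reference for that).

```python
import json
import urllib.request

API_URL = "https://api.allscreenshots.com/v1/screenshots"


def attribute_field(selector, attribute="content"):
    """Hypothetical helper: a schema entry that reads one attribute of a node."""
    return {"selector": selector, "type": "attribute", "attribute": attribute}


def build_multi_output_body(url, schema):
    """Hypothetical helper: one body asking for screenshot, markdown, and JSON."""
    return {
        "url": url,
        "blockCookieBanners": True,
        "responseType": "url",
        "outputs": [
            {"type": "screenshot", "format": "png"},
            {"type": "markdown", "mainContentOnly": True},
            {"type": "json", "schema": schema},
        ],
    }


def build_request(body, api_key):
    """Wrap the body in a POST request object; nothing is sent here."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


schema = {
    "title": {"selector": "title", "type": "text"},
    "description": attribute_field("meta[name=description]"),
}
body = build_multi_output_body("https://acme.com", schema)
request = build_request(body, "YOUR_API_KEY")
print(json.dumps(body, indent=2))
```

Passing `request` to `urllib.request.urlopen` would issue the call. Keeping body construction separate from sending makes it easy to reuse the same schema across the sync, async, and bulk endpoints.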
Scenarios
Onboarding auto-fill
Users paste their company URL; you pre-fill name, description, logo, and brand colors.
Sales / CRM enrichment
Paste a prospect URL into the contact card and auto-derive company info from one request.
Link previews
Generate rich URL cards with title, description, image, and favicon for chat or comments.
Competitor & SEO audits
Snapshot titles, metas, canonicals, and headings for any list of URLs on a schedule.
Onboarding auto-fill
Pull title, description, ogImage, themeColor, and favicon to seed a brand profile. Pair the screenshot and markdown with an LLM to classify industry and write a tagline.
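The seeding step might look like the sketch below. It assumes the JSON output has already been parsed into a flat dict keyed by the field names above, which is an assumption about the response shape rather than something this page specifies. The function name `seed_brand_profile` and the sample values are made up for illustration. Image and favicon paths are often relative, so they are resolved against the page URL, and empty values are dropped.

```python
from urllib.parse import urljoin

# Fields whose values are URLs and may be relative to the page.
URL_FIELDS = ("ogImage", "favicon")


def seed_brand_profile(page_url, extracted):
    """Hypothetical helper: clean extracted values and absolutize URL fields."""
    profile = {}
    for key, value in extracted.items():
        if value is None or not str(value).strip():
            continue  # skip fields the page did not provide
        value = str(value).strip()
        if key in URL_FIELDS:
            value = urljoin(page_url, value)
        profile[key] = value
    return profile


# Illustrative values only, not real API output.
extracted = {
    "title": " Acme Corp ",
    "description": "",
    "ogImage": "/static/og.png",
    "themeColor": "#0a84ff",
    "favicon": "favicon.ico",
}
profile = seed_brand_profile("https://acme.com/about/", extracted)
print(profile)
```

The schema for this scenario follows.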
{
  "title": { "selector": "meta[property=\"og:title\"]", "type": "attribute", "attribute": "content" },
  "description": { "selector": "meta[name=description]", "type": "attribute", "attribute": "content" },
  "ogImage": { "selector": "meta[property=\"og:image\"]", "type": "attribute", "attribute": "content" },
  "themeColor": { "selector": "meta[name=theme-color]", "type": "attribute", "attribute": "content" },
  "favicon": { "selector": "link[rel~=icon]", "type": "attribute", "attribute": "href" }
}
Sales / CRM enrichment
Add a prospect's URL to a contact card and auto-fill company name, tagline, and homepage hero copy. The markdown gives the LLM enough to write a one-line industry summary.
{
  "company": { "selector": "meta[property=\"og:site_name\"]", "type": "attribute", "attribute": "content" },
  "tagline": { "selector": "h1", "type": "text" },
  "description": { "selector": "meta[name=description]", "type": "attribute", "attribute": "content" }
}
Competitor & SEO audits
Run the same schema against every competitor URL on a schedule. Diff the results to catch title rewrites, meta-description changes, or canonical drift.
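The diff step can be sketched as follows, assuming each run's JSON output has been stored as a dict keyed by the schema's field names and that a field marked `"multiple": true` comes back as a list. Both are assumptions about the response shape. `diff_snapshots` is a hypothetical helper and the snapshot values are invented for illustration.

```python
def diff_snapshots(previous, current):
    """Return {field: (before, after)} for every field whose value changed."""
    changes = {}
    for field in sorted(set(previous) | set(current)):
        before, after = previous.get(field), current.get(field)
        if before != after:
            changes[field] = (before, after)
    return changes


# Illustrative snapshots from two scheduled runs of the same schema.
last_week = {
    "title": "Acme | Widgets",
    "metaDesc": "Widgets for teams.",
    "canonical": "https://acme.com/",
    "h1": ["Widgets for teams"],
}
today = {
    "title": "Acme | AI Widgets",
    "metaDesc": "Widgets for teams.",
    "canonical": "https://acme.com/",
    "h1": ["AI widgets for teams"],
}

changes = diff_snapshots(last_week, today)
for field, (before, after) in changes.items():
    print(f"{field}: {before!r} -> {after!r}")
```

Fields that appear in only one snapshot show up with `None` on the other side, which also surfaces a tag that was added or removed between runs. The schema used for each run: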
{
  "title": { "selector": "title", "type": "text" },
  "metaDesc": { "selector": "meta[name=description]", "type": "attribute", "attribute": "content" },
  "canonical": { "selector": "link[rel=canonical]", "type": "attribute", "attribute": "href" },
  "h1": { "selector": "h1", "type": "text", "multiple": true }
}
Quick example
One request that covers most workflows above:
curl -X POST 'https://api.allscreenshots.com/v1/screenshots' \
  -H 'X-API-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://acme.com",
    "blockCookieBanners": true,
    "responseType": "url",
    "outputs": [
      { "type": "screenshot", "format": "png" },
      { "type": "markdown", "mainContentOnly": true },
      { "type": "json", "schema": {
        "title": { "selector": "title", "type": "text" },
        "description": { "selector": "meta[name=description]", "type": "attribute", "attribute": "content" },
        "ogImage": { "selector": "meta[property=\"og:image\"]", "type": "attribute", "attribute": "content" },
        "themeColor": { "selector": "meta[name=theme-color]", "type": "attribute", "attribute": "content" },
        "favicon": { "selector": "link[rel~=icon]", "type": "attribute", "attribute": "href" }
      }}
    ]
  }'
Next steps
- Multi-output extraction guide — the step-by-step build, including LLM post-processing with Claude or OpenAI.
- Outputs API reference — the canonical option list and response shapes.
- Async jobs and bulk — for batches and webhooks.