Introduction

What HeadlessX ships today, how the platform is shaped, and where to start.

by @saifyxpro

HeadlessX Introduction

HeadlessX is a self-hosted scraping platform built around a TypeScript API, a Next.js operator dashboard, queue-backed crawl workflows, a Python YouTube engine, a published CLI, and a remote MCP endpoint for AI clients.

Recommended first path

Start with Quick Start if you want the fastest path to a running instance. Jump to Self-Hosting Overview when you are deciding between local services, mixed infrastructure, or a full Docker deployment.

What ships today


| Area | Current surface |
| --- | --- |
| Website scraping | HTML, JS-rendered HTML, content extraction, screenshots, map, crawl, and SSE progress |
| Search tools | Google AI Search, Tavily, and Exa workspaces plus API endpoints |
| YouTube | Metadata extraction, formats, subtitles, preview, and temporary save packaging |
| Operations | API keys, logs, settings, jobs, proxies, and runtime status |
| AI integrations | Remote MCP over /mcp, the published headlessx CLI, and the repository agent skill |

Product shape


HeadlessX is split into five layers:

  • the API backend exposes the authenticated /api/* HTTP surface and the /mcp endpoint
  • the web dashboard provides the operator interface and playground
  • the YouTube engine handles metadata extraction and temporary media packaging
  • the HTML-to-Markdown service supports content workflows
  • Redis and the worker process power queued crawl and background jobs
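
As a rough illustration of the first layer, the sketch below sorts request paths into the two surfaces the API backend exposes: the /api/* HTTP surface and the /mcp endpoint. It is not taken from the HeadlessX codebase. The names `Surface`, `classifyPath`, `surfaceOf`, and `BASE_URL`, the localhost address, the `/api/example` route, and the choice to treat subpaths of /mcp as MCP traffic are all assumptions made for the example.

```typescript
// Hypothetical helper, not part of HeadlessX: names and base URL are assumptions.
type Surface = "api" | "mcp" | "other";

// Classify a URL pathname by the surface it would belong to.
function classifyPath(pathname: string): Surface {
  // Assumption: the MCP endpoint is /mcp itself, plus any subpath under it.
  if (pathname === "/mcp" || pathname.startsWith("/mcp/")) return "mcp";
  // Everything under /api/ belongs to the authenticated HTTP surface.
  if (pathname.startsWith("/api/")) return "api";
  return "other";
}

// Assumed address of a local instance; replace with your own host and port.
const BASE_URL = "http://localhost:3000";

// Accepts absolute or relative URLs and resolves them against BASE_URL.
function surfaceOf(url: string): Surface {
  return classifyPath(new URL(url, BASE_URL).pathname);
}

// "/api/example" is a placeholder route, not a documented endpoint.
console.log(surfaceOf("/api/example"));
console.log(surfaceOf("/mcp"));
console.log(surfaceOf("/dashboard"));
```

A check like this can help when deciding what a reverse proxy or client should treat as API traffic versus MCP traffic; the real routes and authentication rules are described in the API and MCP pages of the documentation.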

Where to start


Use this reading order if you are new to the platform:

Who this documentation is for


This documentation is written for three audiences:

  • operators running HeadlessX in local, mixed, or Docker setups
  • developers extending the API, dashboard, or runtime services
  • automation users connecting via HTTP, workflow tools, CLI, skills, or MCP clients

Release history


The current release history lives in the Changelog. Use it for upgrade context, not as the main setup guide.
