Developers
Scrape Exchange is built API-first. Access bulk data, query specific records, stream real-time updates, and contribute your own scraped datasets.
Access Data
Bulk Data Dumps
Full database exports are available for bulk download. Whenever possible, use the BitTorrent protocol to reduce server load — we provide .torrent files for each dump, which any standard client (qBittorrent, Transmission, etc.) can open. Distributing load across seeders means faster downloads for you and lower infrastructure costs for us.
Data dumps are available on the downloads page.
REST API
Call the REST API to retrieve specific records or apply filters not available in the bulk dumps — filter by platform, entity type, uploader, creator ID, content ID, and more. No authentication is required for read access.
Full reference: scrape.exchange/docs
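As a sketch of what a filtered read looks like, the snippet below builds a query URL and fetches JSON. The endpoint path (`/api/records`) and parameter names are assumptions for illustration; consult scrape.exchange/docs for the actual routes.

```python
# Query the read-only REST API. Endpoint path and parameter names below
# are assumed for illustration -- see scrape.exchange/docs for the real ones.
import json
import urllib.parse
import urllib.request

API_BASE = "https://scrape.exchange/api"  # assumed base URL


def build_query_url(entity_type, platform, **filters):
    """Build a record-query URL from filter parameters."""
    params = {"entity_type": entity_type, "platform": platform, **filters}
    return f"{API_BASE}/records?" + urllib.parse.urlencode(params)


def fetch_records(url):
    """Fetch and decode a JSON response; no authentication needed for reads."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


if __name__ == "__main__":
    url = build_query_url("video", "youtube", uploader="alice")
    print(url)
```

Because reads are unauthenticated, this works without an account or API key.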
Real-Time WebSocket Feed
Subscribe to a live stream of new uploads via the WebSocket listener API. Choose what you receive: full scraped data, Scrape Exchange upload metadata only, or the platform metadata for each item. Filter by platform, uploader, entity type, or content creator.
A ready-to-use listener tool is available in the scrape-python repository.
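A subscription might be expressed roughly as below. The message shape (`action`, `mode`, `filters` keys) and the feed URL are assumptions for illustration; the listener tool in the scrape-python repository shows the real protocol.

```python
# Build a WebSocket subscription message. The message shape and endpoint
# are hypothetical -- see the listener in scrape-python for the real protocol.
import json

FEED_URL = "wss://scrape.exchange/ws"  # assumed endpoint


def build_subscription(mode="full", **filters):
    """mode selects what you receive: 'full' scraped data,
    'upload_meta' (Scrape Exchange upload metadata only),
    or 'platform_meta' (the platform metadata for each item)."""
    allowed = {"full", "upload_meta", "platform_meta"}
    if mode not in allowed:
        raise ValueError(f"mode must be one of {allowed}")
    return json.dumps({"action": "subscribe", "mode": mode, "filters": filters})


# Connecting would then look roughly like this (requires the third-party
# `websockets` package):
#
#   import asyncio, websockets
#   async def listen():
#       async with websockets.connect(FEED_URL) as ws:
#           await ws.send(build_subscription("full", platform="tiktok"))
#           async for message in ws:
#               print(json.loads(message))
#   asyncio.run(listen())
```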
Contribute Data
Upload via API
Upload scraped data programmatically via the REST API. Sign up for a free account to get your API keys. When choosing a schema for your upload, use an existing one where possible — it reduces friction for people who download the data you contribute.
Python examples are available in the scrape-python repository under the tools/ folder.
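A minimal upload might look like the sketch below. The endpoint path, payload fields, and bearer-token auth scheme are all assumptions; the working examples live in the tools/ folder of scrape-python.

```python
# Sketch of a programmatic upload. Endpoint, payload fields, and auth header
# are assumed for illustration -- see tools/ in scrape-python for real examples.
import json
import urllib.request

API_BASE = "https://scrape.exchange/api"  # assumed base URL


def build_upload(schema_name, schema_version, source_url, data):
    """Package one scraped record with the schema it validates against."""
    return {
        "schema": schema_name,
        "schema_version": schema_version,
        "source_url": source_url,
        "data": data,
    }


def upload(record, api_key):
    """POST one record; api_key comes from your free account."""
    req = urllib.request.Request(
        f"{API_BASE}/uploads",
        data=json.dumps(record).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Reusing an existing schema name in `build_upload` is what keeps downloads friction-free for consumers of your data.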
Register New JSON Schemas
If you are scraping data not yet covered by an existing schema — or scraping a new platform — you can register a new JSON Schema. Schemas must meet the following requirements:
- Written against the JSON Schema specification, draft 2020-12
- Describes an entity type on one of the supported platforms (e.g. a YouTube channel, TikTok video, or Instagram comment)
- Self-contained: no external $refs; URLs may only point to the jsonschema.org or scrape.exchange domains
- Versioned with semantic versioning (semver)
Schema names must follow the convention:
For examples of valid schemas, see the schemas in scrape-python/tests/collateral.
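To make the requirements concrete, here is a sketch of a schema meeting them, written as a Python dict. The schema name, `$id` layout, and fields are hypothetical; the schemas in scrape-python/tests/collateral are the authoritative examples, and real validation should use a proper JSON Schema validator (e.g. the third-party `jsonschema` package), not the crude check below.

```python
# Hypothetical self-contained schema in the required style: draft 2020-12,
# no external $refs, URLs on allowed domains only, semver in the $id.
TIKTOK_VIDEO_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://scrape.exchange/schemas/tiktok-video/1.0.0",
    "title": "tiktok-video",
    "type": "object",
    "required": ["video_id", "author_id", "url"],
    "properties": {
        "video_id": {"type": "string"},
        "author_id": {"type": "string"},
        "url": {"type": "string", "format": "uri"},
        "caption": {"type": "string"},
        "like_count": {"type": "integer", "minimum": 0},
    },
    "additionalProperties": False,
}


def has_required_fields(schema, record):
    """Crude presence check of required fields -- not full validation."""
    return all(key in record for key in schema.get("required", []))
```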
Data Quality
Data quality is a primary focus. Contributions are most valuable when they are complete and verifiable. Before uploading, ensure that:
- Your data validates against the JSON schema you specify
- Your data is consistent with the content at the source URL included in the upload
- You cover as many schema fields as possible, not just the required ones
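The third point can be checked mechanically before uploading: compute what fraction of the schema's declared properties your record actually fills. The metric below is illustrative, not an official Scrape Exchange formula.

```python
# Illustrative pre-upload coverage check: fraction of a schema's declared
# properties that a record fills with non-null values.
def field_coverage(schema, record):
    props = schema.get("properties", {})
    if not props:
        return 0.0
    filled = sum(1 for key in props if record.get(key) is not None)
    return filled / len(props)
```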
Data quality metrics are tracked and displayed in the stats section. A reputation system for uploaders is planned for a future release.
Community
Discuss Scrape Exchange with other developers, share projects built with the data, ask for help, or give feedback: