Developers

Scrape Exchange is built API-first. Access bulk data, query specific records, stream real-time updates, and contribute your own scraped datasets.

Scrape → Upload → Exchange → Download / Stream

Access Data

Bulk Data Dumps

Full database exports are available for bulk download. Whenever possible, use the BitTorrent protocol to reduce server load — we provide .torrent files for each dump, which any standard client (qBittorrent, Transmission, etc.) can open. Distributing load across seeders means faster downloads for you and lower infrastructure costs for us.

Data dumps are available on the downloads page.

REST API

Call the REST API to retrieve specific records or apply filters not available in the bulk dumps — filter by platform, entity type, uploader, creator ID, content ID, and more. No authentication is required for read access.

# Fetch records by schema owner, platform, and entity type
GET https://scrape.exchange/api/data/v1/param/boinko/youtube/video
# Get a specific item by ID
GET https://scrape.exchange/api/data/v1/item_id/{id}

Full reference: scrape.exchange/docs
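The endpoints above can be called from plain Python. A minimal sketch using only the standard library; the helper names are mine, and the meaning of the path segments (schema owner, platform, entity type) is an assumption based on the example:

```python
import json
import urllib.request

BASE = "https://scrape.exchange/api/data/v1"

def records_url(owner: str, platform: str, entity_type: str) -> str:
    """Build the filter URL from the example above (segment meanings assumed)."""
    return f"{BASE}/param/{owner}/{platform}/{entity_type}"

def fetch_json(url: str) -> object:
    """GET a URL and decode the JSON body; reads need no authentication."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    url = records_url("boinko", "youtube", "video")
    print(url)
    # records = fetch_json(url)  # uncomment to hit the live API
```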

Real-Time WebSocket Feed

Subscribe to a live stream of new uploads via the WebSocket listener API. Choose what you receive: full scraped data, Scrape Exchange upload metadata only, or the platform metadata for each item. Filter by platform, uploader, entity type, or content creator.

A ready-to-use listener tool is available in the scrape-python repository.

# Listen for all new uploads
python tools/listen_messages.py
# Filter to YouTube videos only
python tools/listen_messages.py --platform youtube --entity video
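If you would rather subscribe directly than use the listener tool, the CLI filters above map naturally onto a subscription message. This is only a sketch: the WebSocket URL, message field names, and payload values are assumptions, not the documented protocol.

```python
import json

WS_URL = "wss://scrape.exchange/api/data/v1/ws"  # assumed endpoint

def build_subscription(platform=None, entity=None, uploader=None,
                       payload="full"):
    """Build a subscription message; only filters that are set are included.

    payload: "full" scraped data, "upload" metadata only, or "platform"
    metadata, mirroring the three feed options described above.
    """
    msg = {"action": "subscribe", "payload": payload}
    for key, value in (("platform", platform), ("entity", entity),
                       ("uploader", uploader)):
        if value is not None:
            msg[key] = value
    return json.dumps(msg)

if __name__ == "__main__":
    # Connecting needs a WebSocket client, e.g. `pip install websockets`:
    # import asyncio, websockets
    # async def listen():
    #     async with websockets.connect(WS_URL) as ws:
    #         await ws.send(build_subscription(platform="youtube",
    #                                          entity="video"))
    #         async for message in ws:
    #             print(message)
    # asyncio.run(listen())
    print(build_subscription(platform="youtube", entity="video"))
```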

Contribute Data

Upload via API

Upload scraped data programmatically via the REST API. Sign up for a free account to get your API keys. When choosing a schema for your upload, use an existing one where possible — it reduces friction for people who download the data you contribute.

Python examples are available in the scrape-python repository under the tools/ folder.

# Step 1: obtain a JWT token using your API credentials
POST https://scrape.exchange/api/account/v1/token
{"api_key_id": "{api_key_id}", "api_key_secret": "{api_key_secret}"}
# Step 2: upload a scraped item using the JWT
POST https://scrape.exchange/api/data/v1/
Authorization: Bearer {jwt_token}
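In Python, the two-step flow above might look like this. The request bodies and headers come from the examples; the shape of the token response is an assumption:

```python
import json
import urllib.request

TOKEN_URL = "https://scrape.exchange/api/account/v1/token"
UPLOAD_URL = "https://scrape.exchange/api/data/v1/"

def token_request(api_key_id: str, api_key_secret: str) -> urllib.request.Request:
    """Step 1: build the POST that exchanges API credentials for a JWT."""
    body = json.dumps({"api_key_id": api_key_id,
                       "api_key_secret": api_key_secret}).encode()
    return urllib.request.Request(
        TOKEN_URL, data=body,
        headers={"Content-Type": "application/json"},
        method="POST")

def upload_request(jwt_token: str, item: dict) -> urllib.request.Request:
    """Step 2: build the authenticated POST that uploads one scraped item."""
    return urllib.request.Request(
        UPLOAD_URL, data=json.dumps(item).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {jwt_token}"},
        method="POST")

if __name__ == "__main__":
    # with urllib.request.urlopen(token_request("my_id", "my_secret")) as resp:
    #     jwt = json.loads(resp.read())["token"]  # response field name assumed
    pass
```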

Register New JSON Schemas

If you are scraping data not yet covered by an existing schema — or scraping a new platform — you can register a new JSON Schema. Schemas must meet the following requirements:

  • Written against the JSON Schema specification, draft 2020-12
  • Scoped to a single entity type on one of the supported platforms (e.g. YouTube channel, TikTok video, Instagram comment)
  • Self-contained: no external $refs, and URLs may only point to the jsonschema.org or scrape.exchange domains
  • Versioned with semantic versioning (semver)

Schema names must follow the convention:

{schema_owner}_{platform}_{entity_type}_{semver}
example: boinko_youtube_channel_0.1.0
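The convention is simple enough to check mechanically before registering. A small validator sketch; the character classes are an assumption (each underscore-separated part is assumed to contain no underscores itself):

```python
import re

# {schema_owner}_{platform}_{entity_type}_{semver}
SCHEMA_NAME = re.compile(
    r"^(?P<owner>[a-z0-9]+)_"
    r"(?P<platform>[a-z0-9]+)_"
    r"(?P<entity>[a-z0-9]+)_"
    r"(?P<semver>\d+\.\d+\.\d+)$")

def parse_schema_name(name: str) -> dict:
    """Split a schema name into its four parts, or raise ValueError."""
    m = SCHEMA_NAME.match(name)
    if m is None:
        raise ValueError(f"not a valid schema name: {name!r}")
    return m.groupdict()

# parse_schema_name("boinko_youtube_channel_0.1.0")
# -> {"owner": "boinko", "platform": "youtube",
#     "entity": "channel", "semver": "0.1.0"}
```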

For examples of valid schemas, see the schemas in scrape-python/tests/collateral.

Data Quality

Data quality is a primary focus. Contributions are most valuable when they are complete and verifiable. Before uploading, ensure that:

  • Your data validates against the JSON schema you specify
  • Your data can be verified against the source URL included in the upload
  • You cover as many schema fields as possible, not just the required ones
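The field-coverage guideline can be checked before uploading. A rough sketch using only the standard library; real schema validation would use a library such as jsonschema, which this does not attempt, and it only looks at top-level properties:

```python
def field_coverage(record: dict, schema: dict) -> float:
    """Fraction of top-level schema properties present and non-null in
    the record: a rough proxy for "cover as many schema fields as
    possible". Nested objects are ignored in this sketch."""
    props = schema.get("properties", {})
    if not props:
        return 0.0
    filled = sum(1 for name in props if record.get(name) is not None)
    return filled / len(props)

# Example: 2 of 3 schema fields populated
schema = {"properties": {"id": {}, "title": {}, "view_count": {}}}
record = {"id": "abc123", "title": "My video"}
# field_coverage(record, schema) -> 0.666...
```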

Data quality metrics are tracked and displayed in the stats section. A reputation system for uploaders is planned for a future release.

Community

Discuss Scrape Exchange with other developers, share projects built with the data, ask for help, or give feedback: