AI Visits API

A REST endpoint for agencies to pull a website's AI-Visits analytics: totals, per-day / per-hour trends, AI referral platforms, crawler activity, and a humans-vs-AI breakdown per page, to drop straight into client reports.

Endpoint

GET https://massblogger.com/api/analytics/ai-visits

One call returns a full report for one website over a date range, as plain JSON. Always call the production host https://massblogger.com — there is no localhost or self-hosted instance, so integrators on local / preview / staging must point at this host explicitly (do not rely on a relative or environment-derived base URL).

Authentication

Pass the website's content API key (a UUID) as a query parameter. It is the same key the content-delivery REST APIs use (e.g. /api/blog), and scopes every result to that one website, so an agency uses one key per client site.

?apiKey=YOUR_WEBSITE_API_KEY

Where to find it: Dashboard → your website's settings → the API section. It is generated automatically when you add a website.

Not the tracking key. The mbai_… key from the install snippet is the beacon key: it only sends visits to Massblogger and is never passed to this API. This endpoint always uses the UUID content apiKey.

Missing key → 401 { "error": "API key required" }; unknown key → 401 { "error": "Invalid API key" }.
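
A pre-flight check can catch the tracking-key mix-up before any request is sent. The sketch below is illustrative only: validate_api_key is a hypothetical helper, not part of any Massblogger SDK, and it assumes the content key arrives in standard UUID text form.

```python
import uuid


def validate_api_key(key: str) -> str:
    """Return the key unchanged if it looks like a content API key (a UUID)."""
    # The mbai_ prefix marks the beacon (tracking) key, which this API never accepts.
    if key.startswith("mbai_"):
        raise ValueError("this is the tracking key; use the UUID content apiKey")
    try:
        uuid.UUID(key)  # raises ValueError on anything that is not a UUID
    except ValueError:
        raise ValueError("apiKey does not look like a UUID") from None
    return key
```

A key that passes this check can still be rejected with 401 "Invalid API key" if it does not belong to a website, so the server response remains the authority.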

Query parameters

  • apiKey string, required: the website API key.
  • from date, optional: range start (YYYY-MM-DD or ISO). Default: 28 days ago.
  • to date, optional: range end. Default: now. Max range 366 days.
  • grain string, optional: timeseries granularity: day (default) or hour.
  • visibility string, optional: what to return: all (every visitor) or ai-only (AI-attributed only: visitors referred from AI plus AI bots, with ordinary human visits dropped). Default: the website's saved setting (all unless changed). A query value overrides the saved default for that call. See the warning below.
  • include string, optional: comma-separated sections to return (default: all): summary,timeseries,referrals,crawlers,topPages,byUrl,byNetwork,bySource. Use this to keep responses small.
  • limit number, optional: rows for the per-URL and network tables (1–500, default 50).
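
The parameters above can be assembled into a request URL as in the following sketch. build_url and its argument names are hypothetical; the validation mirrors the documented limits, but the 366-day check is a client-side approximation (how the server counts the range is an assumption), so treat the API's own response as final.

```python
from datetime import date
from typing import Iterable, Optional
from urllib.parse import urlencode

BASE_URL = "https://massblogger.com/api/analytics/ai-visits"
SECTIONS = {"summary", "timeseries", "referrals", "crawlers",
            "topPages", "byUrl", "byNetwork", "bySource"}


def build_url(api_key: str, start: Optional[date] = None, end: Optional[date] = None,
              grain: str = "day", visibility: Optional[str] = None,
              include: Optional[Iterable[str]] = None, limit: int = 50) -> str:
    """Build a request URL against the production host (hypothetical helper)."""
    if grain not in ("day", "hour"):
        raise ValueError("grain must be 'day' or 'hour'")
    if visibility not in (None, "all", "ai-only"):
        raise ValueError("visibility must be 'all' or 'ai-only'")
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    if start and end and (end - start).days > 366:
        raise ValueError("range must not exceed 366 days")

    params = {"apiKey": api_key}
    if start:
        params["from"] = start.isoformat()  # YYYY-MM-DD
    if end:
        params["to"] = end.isoformat()
    params["grain"] = grain
    if visibility:
        # Omitting visibility falls back to the website's saved setting.
        params["visibility"] = visibility
    if include is not None:
        sections = list(include)
        unknown = set(sections) - SECTIONS
        if unknown:
            raise ValueError(f"unknown sections: {sorted(unknown)}")
        params["include"] = ",".join(sections)
    params["limit"] = limit
    # safe="," keeps the include list readable instead of encoding commas as %2C.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"
```

The base URL is hard-coded on purpose, so a local or staging build cannot silently point at a relative or environment-derived host.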

Visibility (read this)

In ai-only mode, ordinary human visits (the human_other bucket) are dropped from every section, and summary.human_visits then reports only humans-who-came-from-AI. On a busy site that is a small fraction of real traffic, so a panel can look empty when the site is actually fine. If you are building a total-traffic view, always pass visibility=all. A site can also be saved with ai-only as its default, so a no-param call may already be filtered — read back meta.visibility to know which mode you got.
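
One way to guard a total-traffic panel is to refuse any report that was not produced in all mode. A minimal sketch, assuming the parsed JSON response is available as a dict; total_human_visits is a hypothetical helper name.

```python
def total_human_visits(report: dict) -> int:
    """Return summary.human_visits only when the report covers every visitor."""
    mode = report.get("meta", {}).get("visibility")
    if mode != "all":
        # In ai-only mode human_visits counts humans-from-AI only,
        # so it must not be shown as total traffic.
        raise ValueError(f"report was generated with visibility={mode!r}, not 'all'")
    return report["summary"]["human_visits"]
```

Passing visibility=all on the request and still checking meta.visibility on the response covers both the query parameter and the saved site default.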

Summary fields

  • total_events — every event in range (pageviews, the engagement echo, and bot hits).
  • pageviews — pageview events.
  • human_visits — distinct human visits (engagement-deduped). In ai-only mode this is humans-from-AI only.
  • human_from_ai_visits — humans who arrived from an AI assistant.
  • ai_bot_hits — verified AI crawler / agent hits. This equals the sum of the crawler_activity rows.
  • spoofed_hits — bot-UA hits from unverified IPs (impersonators claiming to be e.g. Googlebot). Counted separately; not itemized in crawler_activity (no verified identity), so never add it back into the crawler table.

crawler_activity rows carry the raw bot_name plus a readable bot_label, vendor, bucket_label and a brand favicon URL; referral_platforms rows likewise carry platform_label and favicon.
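
The relationship between summary.ai_bot_hits and the crawler_activity rows can be checked, and the rows rolled up per vendor, roughly as follows. Both function names are hypothetical, and the check assumes the full crawler_activity section was requested (not excluded through include=).

```python
from collections import defaultdict


def hits_by_vendor(rows: list) -> dict:
    """Sum crawler_activity hits per vendor (rows are per date and bot)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["vendor"]] += row["hits"]
    return dict(totals)


def crawler_total_matches(report: dict) -> bool:
    """True when the crawler rows add up to summary.ai_bot_hits."""
    row_sum = sum(row["hits"] for row in report["crawler_activity"])
    # spoofed_hits is deliberately left out: it is never itemized in the rows.
    return row_sum == report["summary"]["ai_bot_hits"]
```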

Visits by AI source vs AI crawlers

Two different things, often conflated. One is people, the other is machines:

  • referral_platforms — humans who arrived from an AI assistant, broken down by source (ChatGPT, Perplexity, Gemini, Claude, …). This is "how many visits came from which AI." Each row: ai_platform, platform_label, favicon, sessions, engaged (scrolled ≥50% or stayed ≥10s), conversions. For the same split over time (to chart it stacked by assistant), add include=bySource to get citations_by_source: one row per { t, ai_platform } with platform_label, favicon and events.
  • crawler_activity — AI bots / crawlers (GPTBot, ClaudeBot, OAI-SearchBot, …) fetching your pages. This is machine traffic, not human visits. Each row: bot_name + bot_label, vendor, bucket_label, hits, favicon.

So the "visits from AI" headline (summary.human_from_ai_visits, a human metric) breaks down by referral_platforms; the GPTBot / ClaudeBot list (summary.ai_bot_hits, a machine metric) is crawler_activity. Never fold a crawler into the "visits from AI" total.

Sessions vs the headline. referral_platforms.sessions is a per-platform session count, so summing it across platforms can come out a little higher than the deduped summary.human_from_ai_visits headline. Use human_from_ai_visits for the total and referral_platforms for the per-source split; do not expect them to reconcile to the unit.
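
A report can keep the two numbers apart by taking the headline from the summary and computing each platform's share against the session total rather than against the headline. A sketch under that assumption; referral_split is a hypothetical helper and the output shape is only one possible choice.

```python
def referral_split(report: dict) -> dict:
    """Pair the deduped headline with a per-platform share of sessions."""
    rows = report["referral_platforms"]
    total_sessions = sum(row["sessions"] for row in rows)
    split = []
    for row in rows:
        # Shares are relative to summed sessions, so they always add up to 1
        # even when the session total is a little above the headline.
        share = row["sessions"] / total_sessions if total_sessions else 0.0
        split.append({"platform": row["platform_label"],
                      "sessions": row["sessions"],
                      "share": share})
    return {"headline": report["summary"]["human_from_ai_visits"], "split": split}
```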

Humans vs AI, per page

The by_url section splits each page into:

  • humans — real people who viewed the page (one visit counts once; the engagement echo is excluded). Dropped in ai-only mode.
  • humans_from_ai — of those, how many arrived from an AI assistant.
  • ai_bots — verified AI crawlers / agents (needs server-log ingestion, Path 2).

Crawler / bot rows (crawler_activity, ai_bots) only populate once the server-side feeder (Path 2) is installed on the site. With the beacon alone you get human and AI-referral data, but the crawler table stays sparse or empty.
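
For a per-page table, the by_url counts can be turned into an "arrived from AI" rate. This is a sketch only: ai_referral_rate is a hypothetical helper, and because the text does not say whether humans is absent or zero in ai-only mode, the code treats both the same way.

```python
from typing import Optional


def ai_referral_rate(row: dict) -> Optional[float]:
    """Fraction of a page's human visits that arrived from an AI assistant."""
    humans = row.get("humans") or 0
    from_ai = row.get("humans_from_ai") or 0
    if humans == 0:
        # No human baseline (for example an ai-only report): the rate is undefined.
        return None
    return from_ai / humans
```

Returning None instead of 0.0 keeps an undefined rate distinguishable from a page that genuinely had no AI-referred visitors.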

Example

Request:

curl "https://massblogger.com/api/analytics/ai-visits?apiKey=YOUR_WEBSITE_API_KEY&from=2026-05-01&to=2026-05-31&grain=day&visibility=all"

Response (truncated):

{
  "website": { "id": "69739ebd9c4826b3e3106b94", "domain": "https://www.example.com/" },
  "range": { "from": "2026-05-01 00:00:00", "to": "2026-05-31 23:59:59", "grain": "day" },
  "summary": {
    "total_events": 4821,
    "pageviews": 2740,
    "human_visits": 2710,
    "human_from_ai_visits": 188,
    "ai_bot_hits": 1942,
    "spoofed_hits": 37,
    "by_bucket": [ { "event_type": "pageview", "bucket": "human_other", "events": 2552 }, ... ]
  },
  "timeseries":        [ { "t": "2026-05-01", "bucket": "human_ai", "events": 7 }, ... ],
  "referral_platforms":[ { "ai_platform": "chatgpt", "sessions": 92, "engaged": 41, "conversions": 0,
                           "platform_label": "ChatGPT",
                           "favicon": "https://www.google.com/s2/favicons?sz=64&domain=chatgpt.com" }, ... ],
  "crawler_activity":  [ { "date": "2026-05-01", "bot_name": "GPTBot", "bucket": "ai_training", "hits": 64,
                           "bot_label": "OpenAI GPTBot", "vendor": "OpenAI", "bucket_label": "AI training",
                           "favicon": "https://www.google.com/s2/favicons?sz=64&domain=openai.com" }, ... ],
  "top_pages":         [ { "url_path": "/guide", "ai_referrals": 31, "ai_fetches": 12, "total": 43 }, ... ],
  "by_url":            [ { "url_path": "/guide", "humans": 410, "humans_from_ai": 58, "ai_bots": 220, "total": 630 }, ... ],
  "by_network":        [ { "asn": 15169, "as_org": "Google LLC", "bot_category": "datacenter", "events": 88 }, ... ],
  "citations_by_source":[ { "t": "2026-05-01", "ai_platform": "chatgpt", "platform_label": "ChatGPT",
                            "favicon": "https://www.google.com/s2/favicons?sz=64&domain=chatgpt.com", "events": 31 }, ... ],
  "meta": { "generatedAt": "2026-06-01T12:00:00.000Z", "sections": "all", "visibility": "all" }
}
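
The timeseries and citations_by_source sections arrive as flat rows, one per time bucket and category, so a stacked chart usually needs them pivoted first. A minimal sketch, assuming the row shapes shown in the response above; pivot_series is a hypothetical helper.

```python
from collections import defaultdict


def pivot_series(rows: list, key: str) -> dict:
    """Group flat { t, <key>, events } rows into { t: { <key value>: events } }."""
    table = defaultdict(dict)
    for row in rows:
        series = table[row["t"]]
        # Add rather than overwrite, in case a (t, key) pair appears more than once.
        series[row[key]] = series.get(row[key], 0) + row["events"]
    return dict(table)
```

Call it with key="bucket" for timeseries rows and key="ai_platform" for citations_by_source rows.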

Usage

Calls are not credit-gated. Each successful pull is recorded in your API-usage log. Use include= to pull only the sections you need, so responses stay lean for scheduled reports. If every requested section fails upstream, the call returns 503 (retryable) rather than an empty report.
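
Since a 503 is retryable, a scheduled job can wrap the call in a small backoff loop. The sketch below is one possible approach, not a documented client: fetch is any caller-supplied function returning (status, body), and the attempt count and delays are arbitrary assumptions because no retry policy is specified.

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(fetch: Callable[[], Tuple[int, dict]], attempts: int = 3,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> Tuple[int, dict]:
    """Call fetch() until it returns something other than 503, with backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = fetch()
        if status != 503:
            # Success, or a non-retryable error such as 401: hand it back as-is.
            return status, body
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
    return status, body  # still 503 after the final attempt
```

Injecting sleep keeps the loop easy to test and lets a scheduler substitute its own wait mechanism.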