Why Raw Lighthouse JSON Is Too Big for AI Agents (And What to Do About It)

If you call the Google PageSpeed Insights API or run lighthouse --output=json, you get a file between 500KB and 2MB. Paste that into ChatGPT, Claude, or Cursor and one of three things happens: the upload fails, the model truncates mid-response, or it spends most of its context window parsing screenshots and passing audits instead of fixing your site.

The data is not wrong — it is just the wrong shape for AI workflows. Here is what is actually in a raw Lighthouse JSON file, why most of it is useless for AI agents, and what a stripped export should look like instead.

What Makes Raw PSI JSON So Large?

A Lighthouse Result (LHR) from PageSpeed Insights contains everything Google's audit engine produces. That includes:

| Payload | Typical size | Useful for AI fixes? |
|---|---|---|
| Base64 screenshots & filmstrips | 300–500 KB | No (binary image data) |
| Passing audits (score ≥ 0.9) | 100–300 KB | No (nothing to fix) |
| Script treemap visualization | 50–150 KB | Rarely (bundle charts, not fixes) |
| Category groups & config metadata | 20–50 KB | No (UI internals) |
| Failing audits & opportunities | 30–80 KB | Yes (this is what you need) |
| Core Web Vitals & scores | 5–10 KB | Yes |
| CrUX field data | 5–15 KB | Yes |
| Stack pack hints | 2–10 KB | Yes |

Roughly 80% of a raw PSI JSON file is noise when your goal is to get an AI agent to write code fixes. The actionable data (failing audits, metrics, CrUX, stack hints) typically fits in under 50KB once long detail lists are capped.
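To check the breakdown against your own report, a short script can rank audits by serialized size. This is a minimal sketch: `jsonBytes` and `largestAudits` are illustrative names rather than anything Lighthouse provides, and the input is assumed to be the `lighthouseResult` object from a PSI response (or the root of a file written by `lighthouse --output=json`).

```javascript
// Byte size of a value once serialized as UTF-8 JSON.
function jsonBytes(value) {
  const json = JSON.stringify(value);
  return json === undefined ? 0 : new TextEncoder().encode(json).length;
}

// Rank audits by serialized size, largest first.
// ASSUMPTION: `lhr` is a Lighthouse result with an `audits` object keyed by audit ID.
function largestAudits(lhr, limit = 10) {
  return Object.entries(lhr.audits ?? {})
    .map(([id, audit]) => ({ id, bytes: jsonBytes(audit) }))
    .sort((a, b) => b.bytes - a.bytes)
    .slice(0, limit);
}
```

On most reports the screenshot and treemap audits should land near the top of this list, although exact sizes depend on the page and the Lighthouse version.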

Why AI Agents Struggle With the Full Dump

Context window limits

Even models with large context windows perform worse when fed irrelevant data. A 1.5MB JSON file forces the model to scan hundreds of passing audits (color-contrast, document-title, meta-description) before it reaches render-blocking-resources or unused-javascript.

Upload and attachment limits

ChatGPT file uploads, Claude project attachments, and many API integrations have practical size limits. A 2MB Lighthouse dump may fail silently or get rejected.

Token cost

If you are paying per token, shipping 500KB of base64 screenshot data to analyze LCP is expensive and pointless. The screenshot does not tell the model which tag to add fetchpriority="high" to — the audit details do.
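A back-of-the-envelope calculation makes the cost gap concrete. The sketch below assumes roughly 4 characters per token, which is only a rule of thumb for English text; real tokenizers differ, and base64 data may tokenize less efficiently than prose. `estimateTokens` is a hypothetical helper.

```javascript
// ASSUMPTION: ~4 characters per token is a rough heuristic, not a tokenizer.
function estimateTokens(byteCount, charsPerToken = 4) {
  return Math.ceil(byteCount / charsPerToken);
}

const KB = 1024;
console.log(estimateTokens(500 * KB)); // 128000 (screenshot payload alone)
console.log(estimateTokens(50 * KB));  // 12800 (stripped report)
```

Under that assumption the screenshots alone cost about ten times as many tokens as the entire stripped report.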

Poor prioritization

Raw Lighthouse JSON stores audits in an object keyed by audit ID and groups them by category; neither ordering reflects impact. It does not tell your AI agent which three fixes will save the most milliseconds. An AI-ready export should sort opportunities by estimated time savings and include Lighthouse performance weights.
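The sorting step can be sketched in a few lines. This assumes opportunity audits expose `details.type === "opportunity"` and `details.overallSavingsMs`, as PSI v5 responses have done; newer Lighthouse versions may report savings differently, so treat the field names as version-dependent. `rankOpportunities` and `metricWeights` are hypothetical helper names.

```javascript
// Return the opportunities with the largest estimated time savings.
function rankOpportunities(lhr, limit = 3) {
  return Object.entries(lhr.audits ?? {})
    .filter(([, a]) => a.details?.type === "opportunity" && a.details.overallSavingsMs > 0)
    .sort(([, a], [, b]) => b.details.overallSavingsMs - a.details.overallSavingsMs)
    .slice(0, limit)
    .map(([id, a]) => ({
      id,
      title: a.title,
      savingsMs: Math.round(a.details.overallSavingsMs),
    }));
}

// Collect the non-zero weights from the performance category's auditRefs.
function metricWeights(lhr) {
  const refs = lhr.categories?.performance?.auditRefs ?? [];
  return Object.fromEntries(
    refs.filter((ref) => ref.weight > 0).map((ref) => [ref.id, ref.weight])
  );
}
```

Passing both results to the model gives it a ranked to-do list plus a hint about which metrics move the score most.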

What an AI Agent Actually Needs

When you ask ChatGPT or Claude to fix your Core Web Vitals, it needs:

Scores — performance, accessibility, best-practices, SEO (0–100)
Lab metrics with numeric values (LCP, FCP, TBT, CLS, Speed Index, TTI); of these, only LCP and CLS are Core Web Vitals
CrUX field data — real-user LCP/INP/CLS vs. your lab test
Resource summary — total page weight, DOM size, network requests, main-thread work
Stack pack hints — WordPress/React/Shopify-specific remediation advice
Prioritized failing audits — opportunities sorted by ms savings, diagnostics by worst score, with detail items (URLs, sizes, selectors)
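As an illustration of the first three items, the sketch below pulls scores, lab metrics, and CrUX field data out of a raw PSI response. The field paths (`lighthouseResult.categories`, `audits[id].numericValue`, `loadingExperience.metrics`) follow the PSI v5 response format; the `summarize` function and its output shape are assumptions for this example, not the actual AIReport schema. Resource summary and stack packs are left out for brevity.

```javascript
const LAB_METRIC_IDS = [
  "largest-contentful-paint",
  "first-contentful-paint",
  "total-blocking-time",
  "cumulative-layout-shift",
  "speed-index",
  "interactive",
];

function summarize(psi) {
  const lhr = psi.lighthouseResult ?? {};

  // Category scores are stored as 0 to 1 (or null); scale them to 0 to 100.
  const scores = {};
  for (const [id, category] of Object.entries(lhr.categories ?? {})) {
    scores[id] = category.score == null ? null : Math.round(category.score * 100);
  }

  // Lab metrics: keep only the numeric value and its unit.
  const lab = {};
  for (const id of LAB_METRIC_IDS) {
    const audit = lhr.audits?.[id];
    if (audit) lab[id] = { value: audit.numericValue, unit: audit.numericUnit };
  }

  // URL-level CrUX field data, flattened to percentile and rating.
  const field = {};
  for (const [name, metric] of Object.entries(psi.loadingExperience?.metrics ?? {})) {
    field[name] = { percentile: metric.percentile, category: metric.category };
  }

  return { url: lhr.finalDisplayedUrl ?? lhr.finalUrl ?? psi.id, scores, lab, field };
}
```

Origin-level field data could be flattened the same way from `originLoadingExperience`.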

It does not need:

final-screenshot (base64 PNG)
screenshot-thumbnails and filmstrip frames
script-treemap-data
120 audits with "score": 1
Internal configSettings and categoryGroups

The Fix: Strip Before You Export

PageSpeed Exporter calls the same Google PageSpeed Insights API v5 as pagespeed.web.dev, then processes the raw response through buildAIReport():

// Conceptual pipeline
const rawPSI = await fetchPageSpeedInsights(url);  // 500KB–2MB
const aiReport = buildAIReport(rawPSI);              // < 50KB
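For readers who want to reproduce the fetch step themselves, here is a hedged sketch of a request to the public PSI v5 endpoint. `buildPsiRequestUrl` and `fetchRawPsi` are hypothetical names and are not PageSpeed Exporter's `fetchPageSpeedInsights`; the sketch assumes a runtime with global `fetch` and `URLSearchParams` (Node 18+ or a browser).

```javascript
const PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed";

// Build the request URL; `category` may be repeated, `key` is optional.
function buildPsiRequestUrl(
  pageUrl,
  { strategy = "MOBILE", categories = ["PERFORMANCE"], apiKey } = {}
) {
  const params = new URLSearchParams({ url: pageUrl, strategy });
  for (const category of categories) params.append("category", category);
  if (apiKey) params.set("key", apiKey);
  return `${PSI_ENDPOINT}?${params}`;
}

// Fetch the raw (unstripped) PSI response as parsed JSON.
async function fetchRawPsi(pageUrl, options) {
  const response = await fetch(buildPsiRequestUrl(pageUrl, options));
  if (!response.ok) throw new Error(`PSI request failed: HTTP ${response.status}`);
  return response.json();
}
```

Requesting only the categories you need is a cheap first reduction, since audits from categories that are not requested are not included in the response.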

The AIReport keeps every actionable finding and removes:

Screenshots, filmstrips, and treemap data
Passing, informative, manual, and not-applicable audits
Redundant config and UI metadata
Detail items beyond the top 10 per failing audit

The result is a JSON file you can paste directly into any AI chat — with issues already sorted by impact.

DIY vs. PageSpeed Exporter

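You can strip Lighthouse JSON yourself with a script. The logic is straightforward: iterate lighthouseResult.audits, skip audits where score >= 1, skip audits whose scoreDisplayMode is informative, manual, or notApplicable (these carry a null score, so the score check alone will not catch them), trim details on the audits that remain, and drop screenshot and treemap audit IDs entirely.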
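A minimal version of that script might look like the following. It is a sketch under stated assumptions, not the `buildAIReport()` implementation: `stripAudits` is a hypothetical name, the dropped IDs are the three named in this article, and the `passThreshold` option defaults to 1 (set it to 0.9 to match Lighthouse's own passing cutoff).

```javascript
const DROPPED_IDS = new Set(["final-screenshot", "screenshot-thumbnails", "script-treemap-data"]);
const SKIPPED_MODES = new Set(["informative", "manual", "notApplicable"]);

// Return a new audits object containing only failing audits with trimmed details.
// The input is not modified.
function stripAudits(lhr, { maxItems = 10, passThreshold = 1 } = {}) {
  const kept = {};
  for (const [id, audit] of Object.entries(lhr.audits ?? {})) {
    if (DROPPED_IDS.has(id)) continue;                        // screenshots, treemap
    if (SKIPPED_MODES.has(audit.scoreDisplayMode)) continue;  // nothing to fix
    if (typeof audit.score === "number" && audit.score >= passThreshold) continue;

    const slim = {
      title: audit.title,
      score: audit.score,
      displayValue: audit.displayValue,
    };
    // Cap long detail lists (URLs, selectors, sizes) at `maxItems` entries.
    if (Array.isArray(audit.details?.items)) {
      slim.details = { ...audit.details, items: audit.details.items.slice(0, maxItems) };
    }
    kept[id] = slim;
  }
  return kept;
}
```

Metric audits that pass are dropped here too, so collect scores and metric values separately before stripping if you want them in the export.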

The reason to use PageSpeed Exporter instead:

No scripting — paste a URL, download the AIReport
CrUX field data mapped — URL-level and origin-level metrics in a flat structure
Stack packs preserved — framework hints included when Lighthouse detects WordPress, React, etc.
Dual strategy merge — mobile + desktop in one AIReport (Starter/Pro plans)
AI prompt templates — copy-paste prompts tuned for Full Analysis, Quick Wins, Code Diffs, and Performance Score workflows

For a free audit with no account, run any URL at speedexporter.com.

Further reading

Stripped Lighthouse JSON vs. Raw PageSpeed Insights: Which Should You Use?

What Is an AIReport? The Stripped-Down Lighthouse JSON Explained

How to Export PageSpeed Insights Results as JSON
