SEO

Feed GSC + GA4 Exports Into Claude: A 4-Step Content Analysis Workflow (and the 3 Questions It Answers)

March 9, 2025

Contents

Last quarter I ran this workflow on a 1,200-URL e-commerce catalog that had been "SEO-optimized" three times by three different agencies. Ninety minutes later, I had a list of 47 pages that were either mismatched to the queries bringing them in, draining engagement once users landed, or sitting one rewrite away from doubling their traffic. The agency proposals were still in draft.

The trick wasn't anything exotic. It was treating GSC (Google Search Console, 谷歌站长工具) and GA4 (Google Analytics 4, 谷歌分析 4) as raw inputs to Claude rather than as dashboards to read. Two free exports, one good prompt, three questions you actually need answered.

Here's the workflow.

The 3 questions first (so you know what you're building)

Before touching a single export, I write down what I'm trying to learn. Without a target, Claude just summarizes — which is the easiest way to waste a token budget on a 30,000-row spreadsheet. The workflow I keep coming back to answers exactly these:

Which pages rank for queries that don't match what the page actually says? The classic intent-mismatch problem. A page titled "Best Running Shoes 2024" pulling in queries about "running shoe lacing techniques." Traffic that will never convert.
Which pages get organic traffic but kill engagement? Brings sessions, but users bounce, scroll nothing, and leave. The page is solving the wrong problem or burying the answer.
Where are the 8-to-20 quick wins? Pages ranking in the second page of Google with decent impressions. A small rewrite, a new H2, an extra 200 words — these are the moves that pay back fastest.

If a Claude session doesn't surface something useful for at least one of those three, the prompt is wrong or the data isn't loaded correctly. That's the bar.

Step 1: Pull the two exports (10 minutes)

You need exactly two CSV files, scoped tight. Don't dump everything.

From GSC → Performance report:

Date range: last 90 days (long enough to smooth out weekly noise, short enough to stay under 50K rows on most sites)
Dimensions: Query and Page, exported separately
Filter: Impressions ≥ 10 (kills the long tail that's just statistical noise)
Export: Click the row limit up to 1,000 rows. If you have more, export the top 1,000 by Impressions and accept that you're working from the head of the distribution.

You now have two CSVs: gsc-queries.csv and gsc-pages.csv.

From GA4 → Pages and screens report:

Date range: same 90 days
Dimensions: Page path + screen class
Metrics: Views, Engaged sessions, Average engagement time, Engagement rate, Event count (for key events like purchase or sign_up if you've configured them)
Filter: Views ≥ 50 (you're not making decisions on a page with 12 sessions)
Export: CSV, same row limit discipline

That gives you ga4-pages.csv.

The join key between the two is the page URL. This matters because GSC normalizes URLs differently than GA4 — https://example.com/shoe/ vs https://example.com/shoe vs https://example.com/shoe?utm_source=newsletter. Clean the URLs in a spreadsheet first: lowercase, strip UTM parameters (utm_*), strip trailing slashes, strip the protocol and host so you're left with path-only.

Step 2: Merge into one flat table (15 minutes)

Don't paste three files into Claude and ask it to "join them." That works sometimes and hallucinates the rest of the time. Pre-join in a spreadsheet.

Build one master table with these columns:

Page path	Total clicks (GSC)	Total impressions (GSC)	Avg position (GSC)	Top query (GSC)	Views (GA4)	Engaged sessions (GA4)	Engagement rate (GA4)	Avg engagement time (GA4)

The "Top query" column is the one to spend an extra minute on — sort your GSC queries export by Page first, then by Clicks descending, and pull the #1 query into a new column against each page. That single field is what makes the intent-mismatch question (Question 1) answerable.

If the table is over ~2,000 rows, sample down. A 1,200-row table is the sweet spot for Claude's analysis context. Beyond that, you'll get the summary-of-summary problem and the answers get vague.

Step 3: The prompt (the actual work, 30 minutes)

Here's the template I use, with the parts that don't change and the parts you swap in bold.

System prompt:

You are a content strategist auditing a website's organic performance.
You will receive a CSV with one row per page containing:
- Google Search Console metrics: clicks, impressions, average position, top query
- GA4 metrics: views, engaged sessions, engagement rate, average engagement time

You will answer three specific questions about each page:
1. Intent match: Does the top query align with what this page is about?
   Flag pages where it does not.
2. Engagement health: Given the traffic, is this page retaining users?
   Flag pages with high views but low engagement rate or low avg engagement time.
3. Striking distance: Is this page ranking 8-20 with decent impressions?
   These are the highest-leverage rewrites.

Output as a structured table. For each flagged row, give:
- Page path
- Flag type (1, 2, 3, or combo)
- Specific reason (cite the numbers)
- One concrete rewrite suggestion

Be ruthless. If a page is fine, don't flag it. I'd rather have 30
clear flags than 200 uncertain ones.

User prompt:

Here's the data. Audit it against the three questions.



After the per-page table, give me a 5-bullet summary of patterns
across the whole site that I would miss looking at individual rows.

The "5-bullet cross-site pattern" ask is the one that consistently produces the most useful output. A page-level table tells you what to fix on Monday. The cross-site pattern tells you what's structurally wrong with how the site is built — duplicate intent across multiple thin pages, a category template that's actively hurting engagement, a content gap that 14 queries are exposing.

Step 4: Iterate with a second pass (30 minutes)

Don't stop at the first answer. The first response is a draft. Run two follow-ups:

Follow-up A — Stress-test the flags. "For each page you flagged as intent-mismatch, look at the top 3 queries (not just #1) and tell me if the flag still holds. Intent isn't always single-query."

Follow-up B — Quantify the upside. "For the striking-distance flags, estimate the click increase if each page moved from its current average position to position 5. Use the GSC impressions at the current position to anchor the math."

The second one is the one that turns this from "an audit" into "a business case." A page with 4,000 impressions at position 12 is roughly worth 2x the traffic of a page with 4,000 impressions at position 18. Showing that on a 30-page list, with rough dollar estimates attached, is the artifact that actually gets acted on.

What to watch out for

A few things will go wrong if you don't pre-empt them:

Token limits. A 2,000-row table plus the system prompt plus the analysis will run 200K-400K tokens for a Sonnet or Opus-class model. Make sure you're on a plan that handles that, or chunk the data and run separate passes per section of the site. Don't try to compress the table — the model needs the numbers.

PII and consent. GA4 exports shouldn't contain PII (you've configured it that way, right?), but check before pasting. The day you accidentally feed customer email addresses into a third-party LLM is a bad day.

The model will over-flag. Be skeptical of any "Flag type: 2" on a page with 200 views. That's not enough data to make a call. Filter flags by traffic volume before acting.

Stale GSC data. GSC lags by 2-3 days. Don't make decisions on data that's still moving.

Don't trust the rewrite suggestions blindly. Claude's rewrite advice is pattern-based, not editorial. Use it as a starting prompt, not a final draft. Especially for topics where E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness — 经验、专业、权威、可信) actually matters — health, finance, legal — the rewrite still needs a human who knows the subject.

What I'd do differently

If I were starting over, I'd add a third export: a Screaming Frog crawl of the page titles and H1s. Feeding Claude the on-page heading alongside the top query and engagement metrics turns the intent-mismatch question from "the top query doesn't match the URL slug" into "the top query doesn't match the H1 a human will actually see." That single column has caught more misalignment than anything else in this workflow.

The 90 minutes I quoted at the top assumes you've done this three or four times before. The first run will take two to three hours — most of which is fighting CSV exports and finding the right row limits. By the third site, it's a 90-minute muscle. By the tenth, you're noticing the same patterns across different verticals, which is when the work actually compounds.

That's the workflow. Two free exports, one prompt, three questions, 90 minutes. Run it before the next quarterly review, not after, and the conversation with stakeholders changes from "we should probably audit content" to "here are the 30 pages we're rewriting in Q2."

Twitter LinkedIn Facebook Reddit Email

UTM Hygiene Audit: Broken, Duplicated, Cannibalizing Tags (1,000 URLs, Claude) AI Content Refresh: I Updated 200 Old Posts in One Weekend With Claude + GSC Claude Computer Use agent: monitor your top 20 keyword rankings daily and alert you on Slack when something changes Build Content Briefs That Match the SERP Using Claude + the Top 10 Ranking Pages