n8n + Claude: A Weekly Content Audit Agent That Catches Underperformers Before You Do
Contents
Every Monday at 8:14 a.m., a Slack message lands in #content-audit. It is a list of four to nine posts from the last 90 days that are quietly underperforming — pages that still get impressions but no clicks, or that have slipped three positions in the last month for no obvious reason. Last week's catch was an 1,800-word comparison guide sitting on page 2 for 11 weeks, pulling 220 impressions a month and exactly zero clicks. I rewrote the intro, added a real comparison table, and 18 days later the post was ranking at position 7. That single fix earned back more than the cost of running the agent for a year.
The agent that flagged it is a self-hosted n8n (a fair-code workflow automation tool) workflow pulling from GSC (Google Search Console, 谷歌站长工具) and GA4 (Google Analytics 4, 谷歌分析 4), scoring pages with Claude Sonnet 4.5, and writing the result to Slack and a Google Sheet. The whole thing runs in 3 to 4 minutes on a Monday morning, costs under $0.20 in API calls, and has been quietly catching things I would never have noticed in a manual monthly review.
This is the build.
What you are actually building
The workflow has eight nodes, fires weekly, and ends with a Slack message and a spreadsheet row per flagged page. Here is the shape:
| # | Node | Job |
|---|---|---|
| 1 | Schedule Trigger | Fires every Monday at 08:00 local time |
| 2 | HTTP Request | Pulls the last 28 days of GSC page-level data |
| 3 | HTTP Request | Pulls the same window from GA4 (engagement metrics) |
| 4 | Code | Joins the two, computes derived columns (CTR, position trend, days since publish) |
| 5 | Filter | Keeps only rows that look "interesting" (impressions > 50, position > 8, or trend slipping) |
| 6 | Loop Over Items + Anthropic Chat Model | Claude scores each surviving row and returns a structured verdict |
| 7 | IF | Branches on needsAction — true rows go to Slack, false rows go to a "watchlist" sheet |
| 8 | Slack + Google Sheets | Final output |
The reason this is an "agent" and not just a dashboard is step 6. Claude is the only piece that can read a row and decide why it is underperforming, not just that it is. The Code node can tell you CTR is 0.9%. It cannot tell you that the meta description is mismatched to the search intent (the underlying goal behind a query) or that the H2s read like an outline instead of questions. That is the human-shape part of the work, and it is the part the LLM (Large Language Model, 大语言模型) eats for breakfast.
Prerequisites
You need five things wired up before you start:
- n8n, either n8n Cloud (the $24/mo Starter plan is enough) or self-hosted (the docker image is a 90-second deploy)
- An Anthropic API key with access to Claude Sonnet 4.5, set as an n8n credential
- GSC API access to your verified property, scoped to read search analytics
- A GA4 property with the Analytics Data API enabled and a service account JSON key in n8n credentials
- A Slack incoming webhook (or a Slack OAuth credential) pointed at the channel where you want the report
If you are not on n8n Cloud, you can also do the same build in Make, Zapier, or even a cron-triggered Python script. n8n is what I landed on because the data-shaping Code nodes are first-class, the LangChain (an open-source framework for building LLM-powered applications) integration is two clicks, and the per-execution cost is flat.
Step 1 — Schedule and pull GSC
The trigger is a Schedule Trigger node set to fire at 8:00 a.m. on Monday. The first action is an HTTP Request node hitting the GSC Search Analytics API:
GET https://www.googleapis.com/webmasters/v3/sites/{siteUrl}/searchAnalyticsWith query parameters:
startDateandendDateset to the last 28 daysdimensions[]=pagerowLimit= 5,000 (more than enough for a single site)dataState=final
The response gives you a row per URL with clicks, impressions, ctr, and position. Save these into the workflow state under a key called gscPages and move on. I cap the rowLimit because anything above 5,000 is almost always a tag/category archive problem you do not want the audit to chase.
Step 2 — Pull GA4 and merge
A second HTTP Request hits the Analytics Data API for the same window, this time with dimensions: ["pagePath", "pageTitle"] and metrics screenPageViews, engagementRate, averageSessionDuration, conversions. The endpoint is:
POST https://analyticsdata.googleapis.com/v1beta/properties/{propertyId}:runReportThe two outputs do not line up perfectly because GSC normalizes URLs with trailing slashes and GA4 sometimes does not. A Code node with a small normalization pass — strip trailing slashes, lowercase the host, and join on the cleaned path — gets you a clean merged row. 30 lines of JavaScript (JS). It is the part of the build that looks boring and is the part that will save you from garbage results the most.
The merged row now looks roughly like:
{
url, title, clicks, impressions, ctr, position,
pageViews, engagementRate, avgEngagementTime, conversions,
gsc28d: {...}, ga4Previous28d: null
}Step 3 — Compute the deltas that matter
One Code node, one pass. For each merged row, I compute:
positionDelta— current period position vs the prior 28-day period (negative = improving, positive = slipping)ctrVsExpected— actual CTR (Click-Through Rate, 点击率) vs a position-impression-weighted benchmark (a #3 result should get ~10% CTR; a #9 should get ~3%)trafficShare— this page's clicks as a percentage of total site clicks in the windowdaysSinceLastUpdate— pulled from your CMS (Content Management System, 内容管理系统) if you have an API, or from the page'slastmodif you can fetch it
The deltas are the load-bearing piece. A page at position 11 with 1,200 impressions is interesting; a page at position 11 with 90 impressions is not, no matter what the LLM later says about it. The Code node's job is to make sure only the former ever reaches Claude.
Step 4 — Filter aggressively
The Filter node is the single biggest cost-control lever in the workflow. I keep only rows where any of these is true:
impressions >= 100andposition > 8positionDelta > 2(slipping at least 2 positions in the last 28 days)ctrVsExpected < 0.5(getting less than half the CTR it should for its position)engagementRate < 0.4andpageViews > 50(people are landing but bouncing)
That filter usually takes a 1,000- to 5,000-row dataset down to 30-80 rows. The workflow only sends those to Claude. At $3 per million input tokens and roughly 400 tokens per row, a 50-row audit is about $0.06 of model cost. A "send everything" version would be 50x that, and Claude would still produce roughly the same flagged set — just slower and noisier.
Step 5 — Score with Claude
This is the LangChain Agent / Anthropic Chat Model node, looping over the filtered rows one at a time. The system prompt is the part that took me five rewrites to get right. The current version:
You are a senior SEO editor reviewing a weekly content audit.
For each post, you will be given:
- 28-day GSC metrics (clicks, impressions, CTR, position, prior position)
- 28-day GA4 metrics (page views, engagement rate, average engagement time, conversions)
- Computed deltas (position trend, CTR vs expected, traffic share)
- The page title and URL
Your job: decide whether this post is underperforming in a way a human editor
should look at, and if so, explain why in one sentence a non-SEO can act on.
Output a JSON object with exactly these keys:
{
"url": string,
"verdict": "OK" | "WATCH" | "FIX",
"reason": string, // 1 sentence, max 30 words, cite the specific metric
"suggestedAction": "rewrite_intro" | "add_table" | "update_facts" |
"tighten_meta" | "consolidate" | "delete" | "none"
}
Rules:
- "OK" if the metrics are healthy for the position, even if absolute traffic is low
- "WATCH" if the trend is slipping but the page still has potential
- "FIX" only if there is a specific, concrete issue a human could address
- Be conservative. False positives waste editor time. If unsure, return "WATCH"
- Never invent data not in the input. Cite the metric in the reason.I cap max_tokens at 400 per row. Output goes through a small Code node that validates the JSON and drops malformed rows into a separate "agent errors" sheet for inspection. Roughly 1 in 200 rows comes back malformed — almost always a reason field that exceeded the 30-word limit and got truncated. The retry path is a single re-prompt with "Previous response was truncated. Be more concise." appended.
Step 6 — Branch and notify
A simple IF node splits on verdict == "FIX" || verdict == "WATCH". The "FIX" branch gets a Slack message with a one-line summary per post, sorted by potential impact (positionDelta * impressions, descending). The "WATCH" branch appends to a Watchlist tab in the audit Google Sheet — no Slack, no human attention required, but the row is preserved so the next week's audit can compare.
The Slack message itself uses Slack's Block Kit (a JSON format for rich Slack messages) so each flagged post is a clickable link with the verdict emoji and the cited metric. Sample:
:rotating_light: *Content Audit — 2025-02-24*
6 posts need attention this week
• :writing_hand: *FIX* — `/blog/best-llm-tools-2025/`
2,400 impressions at #12, CTR 0.8% vs expected 3.2%. Meta description
mismatches the query. Suggested: `tighten_meta`
• :chart_with_downwards_trend: *WATCH* — `/blog/zapier-vs-n8n/`
Position slipped from 6 to 9 in 28 days. Engagement rate dropped to 31%.
Suggested: `update_facts`What I changed after six weeks of running it
Three production learnings that are not in the docs.
The first version scored too aggressively. My initial prompt said "flag anything that could be improved." The result was 30+ FIX rows every Monday, most of them false positives — pages that were fine but had one weak section. The fix was a stricter "FIX" definition in the prompt and a hard cap of 10 FIX rows per run (anything below the cap is re-classified as WATCH). The Slack message is now actually read.
The "days since last update" metric broke the audit on a quarter-end week. A client's blog had 80 posts that had not been touched in 18+ months. The audit flagged almost all of them as stale, drowning out actual underperformers. The fix was to add a pages_within_quota filter: only flag staleness if the post still gets at least 100 impressions in the window. A stale post with no traffic is not a problem the audit should be solving — it is a problem the content calendar should be solving.
Replacing Claude Sonnet with Haiku for the WATCH branch saved 70% of the cost with no measurable quality loss. The reasoning here is simple: WATCH rows are not human-actioned, so the standard for "good enough" is low. I run FIX verdicts through Sonnet 4.5 and WATCH verdicts through Haiku 4.5 in the same loop. Total weekly cost dropped from $0.22 to $0.08.
What it is not
This agent is not a content strategy tool and it is not a writer. It does not generate new post ideas, it does not rewrite anything, and it does not publish. Its only job is to surface the 5-10 posts a human should look at this week, with a one-sentence reason for each. Everything else — deciding what to do, writing the rewrite, shipping it — is human work.
That boundary is also why it works. The agent fails in interesting, recoverable ways: a row that comes back malformed, a prompt that drifts and starts flagging healthy pages, a GSC pull that returns 0 rows because the site was re-verified last week. None of those failures are catastrophic, and the audit Google Sheet becomes a record of what the agent has been saying over time. Six months of weekly reports in one tab is a content strategy document in its own right.
If you build it, the first thing to expect is that the first two weeks' output will be noisy. Tweak the prompt, tighten the filters, accept that "delete" should almost never be the suggested action, and the signal-to-noise ratio settles within a month. The first post it catches will be the one you would have missed for another quarter.