Subject Line A/B: The 15-Hypothesis Spreadsheet I Build Before Writing a Single Variant

My old subject-line testing process had three candidates and a coin flip. The new one starts with a 15-row spreadsheet, no copy written, and a clear answer to "what are we even testing?"

The coin-flip version is still the default on most in-house teams I work with. Someone drafts two subject lines, "Special 50% off today" and "🔥 Limited time 🔥", the team tests them, one wins by a few points, and everyone moves on having learned nothing. The two lines differ in length, emoji, urgency, and tone all at once, so the winner is a stew of variables you can't isolate. The next test starts from scratch, and the team is back to guessing.

The fix is simple: write down your hypotheses before you write a single subject line. A 15-row spreadsheet takes 25 minutes and saves you from running a year of tests that all blur together.

The 15-hypothesis grid

The 15 rows cover every lever that meaningfully moves opens, with 2-4 cells per row showing the direction of the test. Here are 6 of the 15 rows I use:

| # | Hypothesis | Control | Cell A | Cell B |
|---|---|---|---|---|
| 1 | Shorter subject lines (<50 chars) lift opens on mobile | "Our new collection is here — shop the 12 best pieces" | "New collection: 12 picks you'll wear all week" (45) | "New collection is live" (22) |
| 2 | One emoji at the start lifts opens | "Sale ends tonight" | "🎁 Sale ends tonight" | "Sale ends tonight 🎁" |
| 3 | First-name tokens lift opens | "Your weekly recap" | "Sarah, your weekly recap" | |
| 4 | Specific numbers beat round numbers | "Save 50% this week" | "Save 47% this week" | "Save 53% on your first order" |
| 5 | Bracketed urgency tags survive truncation | "Last chance: 24 hours left" | "[Last 24 hrs] 50% off everything" | |
| 6 | Lowercase, no-punctuation reads as personal | "We just shipped a new feature" | "we just shipped a new feature" | "We just shipped a new feature!" |

If you can't fill the "Cell A" column for a row, you don't actually have a hypothesis — you have a vibe. Skip it.
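If the grid lives somewhere scriptable rather than only in a spreadsheet, the "no Cell A, no hypothesis" rule is easy to enforce mechanically. The sketch below is one possible shape, not a prescribed format: the `Hypothesis` class, its field names, and the row 7 entry (its number and wording) are hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    """One row of the grid. Field names are illustrative assumptions."""
    row: int
    claim: str
    control: str
    cell_a: Optional[str] = None
    cell_b: Optional[str] = None

    def is_testable(self) -> bool:
        # A row counts only if it has a control and at least one variant (Cell A).
        has_control = bool(self.control.strip())
        has_cell_a = bool(self.cell_a and self.cell_a.strip())
        return has_control and has_cell_a


grid = [
    Hypothesis(2, "One emoji at the start lifts opens",
               "Sale ends tonight", "🎁 Sale ends tonight", "Sale ends tonight 🎁"),
    Hypothesis(3, "First-name tokens lift opens",
               "Your weekly recap", "Sarah, your weekly recap"),
    # Hypothetical row: a claim with no variant written yet, i.e. a vibe.
    Hypothesis(7, "Questions beat statements", "Your weekly recap"),
]

testable = [h for h in grid if h.is_testable()]
vibes = [h for h in grid if not h.is_testable()]

for h in vibes:
    print(f"Row {h.row} has no Cell A yet: {h.claim!r}")
```

Rows that land in `vibes` are the ones to skip or rewrite before they reach a send.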

The other 9 rows in my template cover: question vs statement, benefit-first vs curiosity-first, sender name (brand vs person), preheader alignment, time-of-day, day-of-week, segment-specific copy, reactivation framing, and discount-on vs value-on.

One hypothesis per send

The most common mistake I see is "we'll test length AND emoji AND personalization in this campaign" — which is a multivariate stew disguised as A/B. Pick one row per send. Run it. Move to the next row.

In Klaviyo, set the test to 20% of the list (10% control, 10% variant), wait for 90% confidence, then send the winner to the remaining 80%. A 40,000-subscriber list gives you ~4,000 per arm, enough to detect an open-rate lift of roughly 1.5 to 2 points, depending on your baseline open rate. Below ~1,000 per arm, you're guessing.
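The arithmetic behind those arm sizes can be sketched in a few lines. This is a back-of-the-envelope approximation, not Klaviyo's own calculation: it assumes a one-sided two-proportion z-test, reads "90% confidence" as the significance level, and adds an assumed 80% power. The function names and the baseline open rates are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def arm_size(list_size: int, test_percent: int = 20, arms: int = 2) -> int:
    """Subscribers per arm when test_percent of the list is split evenly."""
    return list_size * test_percent // (100 * arms)


def min_detectable_lift(n_per_arm: int, baseline: float,
                        confidence: float = 0.90, power: float = 0.80) -> float:
    """Approximate smallest absolute open-rate lift a test of this size can detect.

    Assumes a one-sided z-test and that both arms sit near the baseline rate.
    """
    z_alpha = NormalDist().inv_cdf(confidence)
    z_beta = NormalDist().inv_cdf(power)
    standard_error = sqrt(2 * baseline * (1 - baseline) / n_per_arm)
    return (z_alpha + z_beta) * standard_error


n = arm_size(40_000)  # 20% of the list, split across two arms
for baseline in (0.10, 0.20, 0.30):  # assumed baseline open rates
    lift = min_detectable_lift(n, baseline)
    print(f"baseline {baseline:.0%}: ~{lift * 100:.1f} points at {n} per arm")
```

Under these assumptions, 4,000 per arm resolves roughly 1.4 points at a 10% baseline and about 1.9 points at 20%. At 1,000 per arm and a 20% baseline, the detectable lift is close to 4 points, which is why small lists rarely produce a clear winner.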

Why the spreadsheet comes first

Three reasons it's worth 25 minutes of pre-work:

  1. You write less, test more. Each row has one job. The copy for row 4 takes 60 seconds, not 20 minutes. You ship 5-6 testable hypotheses per quarter instead of one.
  2. You learn cumulatively. Once "specific numbers beat round numbers" is a confirmed row 4, the next test can skip it and test the next unknown. After 3 quarters the spreadsheet is a learning history, not a planning document.
  3. You stop testing vibes. "I think emoji works for our brand" stops being a debate. It's row 2, with a control and a cell, and the data decides.
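For point 3, "the data decides" usually comes down to a significance check on the two arms. The sketch below uses a one-sided pooled two-proportion z-test as a stand-in; your email platform may compute its confidence figure differently, and the open counts in the example are invented.

```python
from math import sqrt
from statistics import NormalDist


def lift_confidence(opens_control: int, n_control: int,
                    opens_variant: int, n_variant: int) -> float:
    """Confidence (0 to 1) that the variant's open rate beats the control's.

    One-sided pooled two-proportion z-test; an approximation for large arms.
    """
    rate_control = opens_control / n_control
    rate_variant = opens_variant / n_variant
    pooled = (opens_control + opens_variant) / (n_control + n_variant)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))
    if standard_error == 0:
        return 0.5  # no variation at all, so no evidence either way
    z = (rate_variant - rate_control) / standard_error
    return NormalDist().cdf(z)


# Invented counts: 4,000 recipients per arm.
clear_win = lift_confidence(800, 4000, 880, 4000)   # 20.0% vs 22.0%
too_close = lift_confidence(800, 4000, 820, 4000)   # 20.0% vs 20.5%

print(f"20.0% vs 22.0%: {'confirmed' if clear_win >= 0.90 else 'inconclusive'}")
print(f"20.0% vs 20.5%: {'confirmed' if too_close >= 0.90 else 'inconclusive'}")
```

An inconclusive row is still a result worth logging: it tells you the lever is either weak for your list or needs a bigger arm to measure.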

The 15-hypothesis grid doesn't replace creative judgment. It just forces the creative judgment to land on a testable claim, not a guessable one. Build the grid first, write a single subject line second, and "best subject line" stops being a coin flip.