Pricing Experiments

Metricuno
May 18, 2026
Quick answer

Pricing experiments are constrained A/B tests on price-related elements — anchors, comparison framing, discount mechanics — designed to avoid the legal and brand fallout of charging different shoppers different headline prices.

Definition
Experimentation

Pricing Experiments

Controlled tests on price-adjacent elements (anchors, framing, discount mechanics) — rarely on the headline price itself.

Pricing experiments are A/B or hold-out tests that measure how changes to price presentation, anchors, comparison framing, or discount mechanics affect conversion, average order value, and margin. Unlike UX experiments, they're constrained by real-world risk: charging two shoppers different headline prices for the same SKU at the same time invites chargebacks, marketplace policy violations (Shopify, Amazon), GDPR-flavoured fairness complaints, and brand damage if a screenshot lands on social.

In practice, mature teams test the scaffolding around price — strike-through reference prices, tier ordering, bundle composition, free-shipping thresholds, urgency cues — rather than the number itself. When the headline price does change, it's almost always sequential (before/after with a hold-out cohort) rather than concurrent.
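Whichever design you choose, a shopper has to stay in the same cohort across sessions. The sketch below shows one common way to do that: hash a stable visitor identifier into a bucket. The function name, the experiment key, and the 10% hold-out share are illustrative assumptions, not something the article prescribes.

```python
import hashlib


def assign_cohort(visitor_id: str, experiment: str, holdout_share: float = 0.10) -> str:
    """Deterministically map a visitor to 'holdout' or 'exposed'.

    Assumption: visitor_id is stable across sessions (e.g. a first-party
    cookie or customer ID). The 10% default share is illustrative only.
    """
    if not 0.0 <= holdout_share <= 1.0:
        raise ValueError("holdout_share must be between 0 and 1")
    # Salt the hash with the experiment key so cohorts differ between experiments.
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).hexdigest()
    # First 8 hex characters -> integer in [0, 2**32), scaled to [0, 1).
    bucket = int(digest[:8], 16) / 2**32
    return "holdout" if bucket < holdout_share else "exposed"


if __name__ == "__main__":
    # The same visitor always lands in the same cohort.
    print(assign_cohort("visitor-123", "winter-coat-anchor"))
    print(assign_cohort("visitor-123", "winter-coat-anchor"))
```

Because the assignment is a pure function of the identifier, no lookup table is needed, and the cohort can be recomputed at analysis time.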

Also known as
Price testing
Price A/B testing
Price optimization experiments

The instinct to A/B test price the way you'd test a button colour is the single most expensive mistake in this category. Concurrent price tests on identical SKUs expose you to consumer-protection regulators in the EU and UK, can put you at odds with marketplace and payment-processor policies (Amazon, Shopify), and, the part teams underestimate, produce noisy results because the people who screenshot a £49 vs £59 split and post it to Reddit are not in your conversion model.

What you can safely test is everything that shapes how a shopper perceives the price: the strike-through anchor, the position of the tier you want them to pick, the framing of the discount (£10 off vs 20% off), bundle composition, and shipping-threshold mechanics. These sit inside the broader discipline of Pricing Psychology and inherit their statistical rigour from Behavioral Experimentation.

Formula

Revenue Lift % = ((CR_v × AOV_v) − (CR_c × AOV_c)) / (CR_c × AOV_c) × 100

Variables

CR_v: Variant conversion rate. Conversion rate of the test variant (price-framing change).

AOV_v: Variant average order value. Average order value in the variant cohort.

CR_c: Control conversion rate. Conversion rate of the unchanged price presentation.

AOV_c: Control average order value. Average order value in the control cohort.
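The formula translates directly into a few lines of code. This is a minimal sketch: the function names are illustrative, and conversion rates are assumed to be passed as fractions (0.02 for 2%), not percentages.

```python
def revenue_per_visitor(cr: float, aov: float) -> float:
    """Revenue per visitor = conversion rate (as a fraction) x average order value."""
    return cr * aov


def revenue_lift_pct(cr_v: float, aov_v: float, cr_c: float, aov_c: float) -> float:
    """Revenue Lift % = ((CR_v * AOV_v) - (CR_c * AOV_c)) / (CR_c * AOV_c) * 100."""
    rpv_c = revenue_per_visitor(cr_c, aov_c)
    if rpv_c <= 0:
        raise ValueError("control revenue per visitor must be positive")
    rpv_v = revenue_per_visitor(cr_v, aov_v)
    return (rpv_v - rpv_c) / rpv_c * 100


if __name__ == "__main__":
    # Hypothetical inputs: CR rises from 2.0% to 2.2% while AOV stays at 50.
    print(f"{revenue_lift_pct(0.022, 50.0, 0.020, 50.0):.1f}%")
```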

Worked example

An apparel store on Shopify tests a strike-through anchor (£89 → £69) against a clean £69 price on a winter-coat PDP for 14 days. Control gets the clean price; variant gets the strike-through.

Control CR: 2.8%

Control AOV: £74

Variant CR: 3.2%

Variant AOV: £71

+9.7% revenue per visitor

The anchor lifted conversion enough to offset a small AOV drop (some shoppers bought only the anchored item rather than adding accessories). Net revenue per visitor rises from £2.07 to £2.27, up about 9.7%. Ship the variant, but monitor margin in case the anchor pulls forward demand from full-price periods.
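For reference, here is the arithmetic behind the headline figure, written out with the formula and the four inputs above (RPV is shorthand for revenue per visitor, CR × AOV):

```latex
\begin{aligned}
\text{RPV}_c &= 0.028 \times 74 = 2.072 \\
\text{RPV}_v &= 0.032 \times 71 = 2.272 \\
\text{Revenue Lift \%} &= \frac{2.272 - 2.072}{2.072} \times 100 \approx 9.65\%
\end{aligned}
```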

Note the metric: revenue per visitor, not conversion rate alone. Pricing changes almost always trade conversion against AOV or margin, so a CR-only readout will mislead you. If you have COGS data wired up, swap revenue for gross profit per visitor — it's the only number that survives margin-eroding discount tests.
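To show why the swap matters, the sketch below computes gross profit per visitor from cohort totals. The `CohortTotals` structure, its field names, and the figures in the usage example are all hypothetical; they assume you can export visitors, net revenue, and COGS per cohort.

```python
from dataclasses import dataclass


@dataclass
class CohortTotals:
    """Hypothetical per-cohort export; field names are assumptions."""
    visitors: int
    revenue: float  # net revenue after discounts
    cogs: float     # cost of goods sold for the same orders


def gross_profit_per_visitor(cohort: CohortTotals) -> float:
    """(revenue - COGS) / visitors for one cohort."""
    if cohort.visitors <= 0:
        raise ValueError("cohort must have at least one visitor")
    return (cohort.revenue - cohort.cogs) / cohort.visitors


def lift_pct(variant: float, control: float) -> float:
    """Relative change of variant over control, in percent."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (variant - control) / control * 100


if __name__ == "__main__":
    # Made-up totals: the variant earns more revenue but on thinner margin.
    control = CohortTotals(visitors=10_000, revenue=20_000.0, cogs=12_000.0)
    variant = CohortTotals(visitors=10_000, revenue=22_000.0, cogs=14_500.0)
    rev_lift = lift_pct(variant.revenue / variant.visitors,
                        control.revenue / control.visitors)
    gp_lift = lift_pct(gross_profit_per_visitor(variant),
                       gross_profit_per_visitor(control))
    print(f"revenue per visitor: {rev_lift:+.2f}%")
    print(f"gross profit per visitor: {gp_lift:+.2f}%")
```

With these made-up totals, revenue per visitor rises while gross profit per visitor falls, which is the pattern a revenue-only readout would miss.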

Benchmark

Typical revenue-per-visitor lift ranges by pricing experiment type (DTC e-commerce, AOV €40–€200)

| Experiment type | Apparel / accessories | Beauty / consumables | Electronics / homeware |
| --- | --- | --- | --- |
| Strike-through anchor on PDP | +4% to +12% | +2% to +8% | +3% to +9% |
| Tier reordering (decoy pricing) | +3% to +9% | +5% to +14% | +2% to +7% |
| % off vs absolute £ off framing | −2% to +6% | +1% to +7% | +3% to +11% |
| Free-shipping threshold change | +5% to +18% AOV | +8% to +22% AOV | +4% to +12% AOV |
| Bundle vs single-SKU framing | +6% to +15% | +10% to +24% | +3% to +9% |
| Urgency / scarcity cue near price | +1% to +5% | +2% to +6% | +1% to +4% |

Two patterns worth flagging from the table. First, bundles and free-shipping thresholds consistently move the most money because they change AOV without changing the perceived unit price — shoppers don't feel manipulated. Second, urgency cues underperform their reputation; the lifts are real but small, and they erode trust over repeat sessions, so measure 30-day repeat-purchase rate alongside the test, not just session conversion.

Frequently asked

Pricing experiments FAQ

Showing two shoppers different headline prices for the same SKU at the same time is risky under EU consumer-protection and price-transparency rules (notably the Omnibus Directive on reference prices). Sequential tests with a hold-out cohort, or tests on anchors, framing, and bundles, are fine. When in doubt, talk to your legal team before going live.

Shopify's terms don't ban price testing outright, but charging different customers different prices for identical products at the same moment can trigger payment-processor disputes. Most Shopify brands test price presentation (strike-throughs, anchors, tier order) via theme variants rather than the price field itself.

The safe things to test are anchors and strike-through reference prices, tier ordering, decoy pricing, bundle composition, free-shipping thresholds, payment-plan visibility (e.g. Klarna installments), discount framing (% vs £), and urgency cues. These move revenue per visitor without exposing you to fairness or compliance complaints.

Run a pricing test for a minimum of two full weekly cycles (14 days) to absorb day-of-week buying patterns, and long enough to reach statistical significance on revenue per visitor: typically 3–6 weeks for stores doing 500–2,000 orders/week. Pricing tests need more runtime than UX tests because revenue per visitor has higher variance than click-through rate.
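A rough way to turn "long enough to hit significance" into a number is the standard two-sample normal approximation, n per arm = 2(z_α/2 + z_power)² σ² / δ². The sketch below applies it to revenue per visitor. The inputs (per-visitor revenue standard deviation, minimum detectable lift, weekly traffic) are assumptions you would estimate from your own data. Revenue per visitor is heavy-tailed, so treat the result as a lower bound rather than a guarantee.

```python
import math
from statistics import NormalDist


def visitors_per_arm(rpv_control: float, rpv_std: float, min_lift_pct: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per arm to detect a relative RPV lift.

    Uses the two-sided, two-sample normal approximation with equal variance
    in both arms (an assumption, not a property of real revenue data).
    """
    delta = rpv_control * min_lift_pct / 100  # absolute RPV difference to detect
    if delta <= 0 or rpv_std <= 0:
        raise ValueError("rpv_control, rpv_std and min_lift_pct must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * rpv_std ** 2 / delta ** 2
    return math.ceil(n)


def weeks_needed(n_per_arm: int, weekly_visitors: int,
                 arms: int = 2, min_weeks: int = 2) -> int:
    """Weeks of traffic required, never below two full weekly cycles."""
    if weekly_visitors <= 0:
        raise ValueError("weekly_visitors must be positive")
    return max(math.ceil(n_per_arm * arms / weekly_visitors), min_weeks)


if __name__ == "__main__":
    # Hypothetical store: RPV of 2.00, per-visitor std of 12.00, 40,000 visitors/week.
    n = visitors_per_arm(rpv_control=2.0, rpv_std=12.0, min_lift_pct=10.0)
    print(n, "visitors per arm,", weeks_needed(n, weekly_visitors=40_000), "weeks")
```

The two-week floor in `weeks_needed` mirrors the guidance above: even when the sample-size estimate is small, the test should still cover two full weekly cycles.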

The primary metric should always be revenue per visitor, or gross profit per visitor if you have COGS wired in. Pricing changes trade conversion against AOV and margin, so a conversion-only readout will tell you to ship variants that destroy profit. This is the single most common analysis mistake in price testing.

Pricing psychology is the body of evidence about how shoppers perceive prices (anchoring, charm pricing, decoy effects, loss aversion). Pricing experiments are how you validate which of those effects actually move money for your specific catalogue and customer base. One is the theory, the other is the proof.

Discount depth can be tested, and should be, but run it as a sequential test or segment by traffic source so the same shopper doesn't see both offers. Discount-depth tests almost always reveal that the deeper discount adds conversion but destroys margin; the winning depth is usually the smallest one that hits your target conversion lift.
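The "smallest depth that hits the target" rule can be written as a simple selection over test results. This is a sketch only: the function name, the shape of the results mapping, and the lift figures are invented for illustration.

```python
from typing import Optional


def smallest_winning_depth(results: dict[float, float],
                           target_lift_pct: float) -> Optional[float]:
    """Return the shallowest discount whose conversion lift meets the target.

    results maps discount depth (%) to observed conversion lift (%) against
    the no-discount baseline. Returns None if no depth reaches the target.
    """
    qualifying = [depth for depth, lift in results.items() if lift >= target_lift_pct]
    return min(qualifying) if qualifying else None


if __name__ == "__main__":
    # Made-up readout from a sequential discount-depth test.
    observed = {10.0: 4.0, 15.0: 7.5, 20.0: 9.0, 30.0: 11.0}
    print(smallest_winning_depth(observed, target_lift_pct=7.0))
```

In practice you would apply this only to depths whose lift is statistically distinguishable from the baseline, and check the chosen depth against margin before shipping.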

The effect is real but smaller than folklore suggests — typical lifts are 1–4% on conversion, and it works better on lower-AOV consumables than on considered purchases above £100. Test it once on your catalogue, ship the winner, and move on; it's not a recurring experiment.

You don't usually need a dedicated pricing tool. Any A/B testing tool that can swap product-page elements (price block, anchor, tier order, badges) can run pricing-framing experiments. What matters more is your analytics setup: you need revenue per visitor and gross profit per visitor as primary metrics, not the conversion-rate default most tools ship with.

The biggest mistake is calling a winner on conversion lift without checking AOV and margin impact. A variant that lifts conversion 8% but drops AOV 12% is a loss (revenue per visitor falls by about 5%), but a CR-only dashboard will show it as green. The second biggest mistake is testing on holiday traffic and assuming the result generalises to evergreen demand.
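Because revenue per visitor is the product of conversion rate and AOV, relative changes multiply rather than add. A small sketch with an illustrative function name makes the 8% / 12% case concrete:

```python
def rpv_change_pct(cr_change_pct: float, aov_change_pct: float) -> float:
    """Combined revenue-per-visitor change from relative CR and AOV changes.

    RPV = CR x AOV, so the relative changes compound multiplicatively.
    """
    return ((1 + cr_change_pct / 100) * (1 + aov_change_pct / 100) - 1) * 100


if __name__ == "__main__":
    # Conversion up 8%, AOV down 12%: the net effect on RPV is negative.
    print(f"{rpv_change_pct(8.0, -12.0):+.2f}%")
```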
