Bimodal AOV Stores: Why Bundles Beat Threshold Lifts for Two-Peak Distributions

Metricuno

June 25, 2026

Quick answer

Stores with a two-peak AOV distribution don't respond to threshold lifts the way single-peak stores do. Here's the diagnostic, the mechanism, and the test sequence.

If your AOV histogram shows two distinct peaks — typically single-item buyers around €35 and multi-item buyers around €110 — a free-shipping threshold lift only nudges the lower peak and leaves the upper peak unchanged. Test a bundle first: it restructures both peaks at once by changing what a 'single purchase' even means.

Definition

AOV diagnostics

Bimodal AOV distribution

An order-value distribution with two distinct peaks, usually a single-item cluster and a multi-item cluster, that respond differently to AOV levers.

A bimodal AOV distribution is what you see when your buyers cleanly split into two purchase modes: one-item shoppers anchored near the price of your hero SKU, and basket-builders anchored near a 2-3 item combination. On a histogram the gap between the two peaks is visible and stable across weeks.

The diagnostic matters because the textbook AOV plays — raising the free-shipping threshold, adding a progress bar, offering a tiered gift — assume one continuous distribution you can pull rightward. With two peaks, those levers act locally on one peak and ignore the other. Bundles, by contrast, change the unit of purchase and can collapse or merge the peaks.

Also known as

two-peak AOV

split-mode order value

Most Shopify reports show you a single AOV number. That average hides whether your buyers are arriving as one population or two — and the answer changes which experiment deserves the next sprint.

This page is the bimodal branch of the broader funding decision covered in AOV-Side Wins: which lever — bundle, threshold, or upsell — deserves engineering time next quarter.

How to detect a bimodal AOV distribution

Pull 90 days of completed-order values and bin them at €5 intervals. A unimodal store shows one hump with a long right tail. A bimodal store shows two humps separated by a visible trough, where the per-bin order count often sits 30-50% below the peaks on either side.
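The binning step above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the function names, the `min_gap_bins` and `min_share` guards, and the choice to measure the trough against the smaller of the two peaks are all assumptions made for the example. The 2,000-order floor comes from the FAQ further down this page.

```python
from collections import Counter

MIN_ORDERS = 2000  # sample-size floor quoted in this page's FAQ


def has_enough_orders(order_values):
    """True if the sample is large enough to trust the histogram shape."""
    return len(order_values) >= MIN_ORDERS


def bin_orders(order_values, bin_width=5):
    """Dense list of (bin_start, count) from the lowest to the highest occupied bin."""
    counts = Counter(int(v // bin_width) for v in order_values)
    lo, hi = min(counts), max(counts)
    return [(i * bin_width, counts[i]) for i in range(lo, hi + 1)]


def trough_drop(hist, min_gap_bins=3, min_share=0.05):
    """Locate two separated peaks and the dip between them.

    Returns (lower_peak_start, upper_peak_start, drop), where drop is the
    fractional fall from the smaller peak to the emptiest bin between the
    peaks, or None when no second peak qualifies.
    min_gap_bins and min_share are hypothetical guards against shoulders
    and noisy tail bins; tune them to your own data.
    """
    counts = [c for _, c in hist]
    n, total = len(counts), sum(counts)

    def is_local_max(i):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < n - 1 else 0
        return counts[i] >= left and counts[i] >= right

    first = max(range(n), key=counts.__getitem__)  # tallest bin overall
    rivals = [
        i for i in range(n)
        if is_local_max(i)
        and abs(i - first) >= min_gap_bins
        and counts[i] / total >= min_share
    ]
    if not rivals:
        return None
    second = max(rivals, key=counts.__getitem__)
    a, b = sorted((first, second))
    trough = min(counts[a + 1:b])
    drop = 1 - trough / min(counts[a], counts[b])
    return hist[a][0], hist[b][0], drop
```

A `None` result suggests a single hump. A returned drop in the 0.3-0.5 range would match the trough depth described above, but treat it as a prompt to look at the plot rather than a verdict.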

Cross-check with items-per-order. If your single-item rate sits above 55% and your 2-3 item rate sits above 25%, with almost nothing at 4+ items, you have structural bimodality — not seasonal noise. Apparel basics, beauty SKUs sold as singles, and accessory stores show this pattern most often.
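The items-per-order cross-check is easy to script. The sketch below assumes you have one item count per completed order; the 55% and 25% cut-offs come from the paragraph above, while the 5% ceiling used to stand in for "almost nothing at 4+ items" is a hypothetical value picked for the example.

```python
def items_per_order_check(item_counts, single_min=0.55, multi_min=0.25, tail_max=0.05):
    """Return the three shares and whether they match the structural pattern.

    item_counts: one integer per order (number of items in that order).
    tail_max is an assumed ceiling for the 4+ item share, not a figure
    from the article.
    """
    n = len(item_counts)
    single = sum(1 for k in item_counts if k == 1) / n
    multi = sum(1 for k in item_counts if 2 <= k <= 3) / n
    tail = sum(1 for k in item_counts if k >= 4) / n
    structural = single > single_min and multi > multi_min and tail <= tail_max
    return {"single": single, "multi": multi, "tail": tail, "structural": structural}
```

Run it on the same 90-day window as the histogram so the two diagnostics describe the same orders.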

Common false positive

A spike just past your current free-shipping threshold (say, a bump at €51 when shipping is free above €50) is NOT a second peak; it's threshold-induced bunching at the boundary. Strip the bin spanning the threshold and re-plot before concluding the distribution is bimodal.
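One way to strip the bunching before re-plotting is to filter the raw order values rather than the bins, so it doesn't matter whether a bin edge happens to coincide with the threshold. The helper name and the €2.50 margin are assumptions for illustration.

```python
def strip_threshold_band(order_values, threshold, margin=2.5):
    """Drop orders within `margin` of the free-shipping threshold.

    margin is a hypothetical half-width (half of a 5-unit bin); widen it
    if the bunching on your store spreads further from the threshold.
    """
    return [v for v in order_values if abs(v - threshold) > margin]
```

Re-bin the filtered values and check whether the second hump is still there. If it vanishes, it was bunching rather than a purchase mode.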

Why threshold lifts under-perform on two-peak stores

Raising the free-shipping threshold from €50 to €65 works on shoppers whose intended basket sits within reach of the new line. On a bimodal store, that's the lower peak — but most of them are single-item buyers whose intent is one product, not 'spend €50'. They abandon or pay shipping; they rarely add a second item just to clear €65.

The upper peak is already past €65, so the new threshold is invisible to them. The lever moves a thin sliver of buyers between the peaks and leaves both modes untouched. Expected AOV lift on a clean bimodal store from a threshold raise: 1.5-3%. On a unimodal store the same change typically delivers 4-7%.

This is the opposite of the long-tail case covered in tiered pricing for long-tail AOV distributions, where price-anchoring lifts the whole tail. Two peaks call for restructuring; one long tail calls for anchoring.

How bundles restructure both peaks

Benchmark

Lever effectiveness by AOV distribution shape (median lift, 8-week test windows)

Lever	Unimodal store	Bimodal store	Long-tail store
Free-shipping threshold raise	+4.8% AOV	+2.1% AOV	+3.4% AOV
Two-item bundle (10% off)	+3.2% AOV	+8.6% AOV	+4.1% AOV
Curated kit / starter set	+5.1% AOV	+11.2% AOV	+6.0% AOV
Post-purchase upsell	+2.4% AOV	+3.0% AOV	+2.8% AOV
Tiered pricing (S/M/L sizes)	+3.9% AOV	+4.2% AOV	+9.5% AOV

A bundle works on a bimodal store because it changes the unit. A €52 two-item bundle priced 12% below the sum of singles converts the lower peak (single-item buyers who now see better value) AND elevates the upper peak (basket-builders who default to the bundle SKU instead of a self-assembled cart). Both modes shift; the distribution narrows toward a single hump near the bundle price.
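The pricing arithmetic behind that example is worth making explicit. A €52 bundle at 12% below the sum of singles implies the singles total roughly €59.09, so the buyer saves about €7.09. The two helpers below are hypothetical names for that calculation, nothing more.

```python
def bundle_price(singles_total, discount):
    """Bundle price for a given sum of single-item prices and a fractional discount."""
    return singles_total * (1 - discount)


def implied_singles_total(bundle, discount):
    """Sum of single-item prices implied by a bundle price and its discount."""
    return bundle / (1 - discount)
```

Working backwards from the bundle price like this is a quick sanity check that the offer sits between your two peaks rather than on top of one of them.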

Sequencing the test

Run the bundle test first, in a 50/50 split with the bundle offered as a PDP module on the hero SKU and as a cart-page nudge. Hold the free-shipping threshold constant during the test window so you can isolate the bundle effect. Eight weeks at typical apparel traffic gets you to significance on a 6-8% AOV delta.
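For reading out the split, here is one possible sketch: a percentile bootstrap on the relative AOV lift. The bootstrap is a choice made for this example (order values on a two-peak store are far from normal), not a method the article prescribes, and the function name, resample count, and seed are all assumptions.

```python
import random
from statistics import fmean


def aov_lift_ci(control, variant, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap interval for relative AOV lift.

    control, variant: lists of order values from each arm of the split.
    Returns (lift, lower, upper), each as a fraction (0.06 means +6%).
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same interval
    lifts = []
    for _ in range(n_boot):
        c = fmean(rng.choices(control, k=len(control)))
        v = fmean(rng.choices(variant, k=len(variant)))
        lifts.append(v / c - 1)
    lifts.sort()
    lower = lifts[int(n_boot * alpha / 2)]
    upper = lifts[int(n_boot * (1 - alpha / 2)) - 1]
    point = fmean(variant) / fmean(control) - 1
    return point, lower, upper
```

If the lower bound is above zero, the lift is distinguishable from noise at the chosen level. This looks at AOV only; a bundle that raises AOV while cutting conversion would need revenue per visitor checked separately.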

If the bundle wins, re-pull the AOV histogram. If the two peaks have merged into one, run the threshold lift next: the store should now behave like a unimodal one, and the lift can deliver the textbook 4-7% on top. This is the same sequencing logic the parent page on funding bundles, thresholds, and upsells next quarter uses for prioritisation.
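The "have the peaks merged" check can be approximated with the rule of thumb from the FAQ on this page: orders between €60 and €90 rising by 30% or more. The sketch below compares the share of orders in that band rather than raw counts, so that arms of different sizes stay comparable; that normalisation, and the function names, are assumptions for the example.

```python
def band_share(order_values, low=60.0, high=90.0):
    """Fraction of orders whose value falls in [low, high)."""
    in_band = sum(1 for v in order_values if low <= v < high)
    return in_band / len(order_values)


def peaks_merging(baseline, variant, low=60.0, high=90.0, min_rise=0.30):
    """True if the variant's in-band share rose by at least min_rise over baseline.

    baseline can be pre-test orders or the control arm; which one to use
    is left to the reader.
    """
    base = band_share(baseline, low, high)
    if base == 0:
        # No baseline orders in the band: any variant orders there count as a rise.
        return band_share(variant, low, high) > 0
    return band_share(variant, low, high) / base - 1 >= min_rise
```

Adjust `low` and `high` to the trough on your own histogram; €60-€90 only fits a store whose peaks sit near €35 and €110.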

What to test if bundles are off the table

Some catalogues can't bundle: single-serve beauty, fragrance, regulated categories. In those cases the fallback lever for a bimodal store is a curated starter kit (a fixed multi-SKU offer at a fixed price), which acts like a bundle without the dynamic discount logic. The benchmark table above puts its median lift on bimodal stores at +11.2% AOV.

Avoid post-purchase upsells as the primary lever here. They convert the upper-peak buyers who were already going to spend, and barely touch the single-item peak that's the real opportunity. Save them as a follow-up sprint after the bundle test concludes.

Frequently asked questions

How many orders do I need to confirm bimodality?

At least 2,000 completed orders over a 90-day window. Below that, the histogram is too noisy to distinguish a true second peak from random clustering. If you're under 2,000 orders, extend the window to 180 days rather than narrowing the bin width.

Can a store be bimodal in one country and unimodal in another?

Yes, and it's common. Different markets have different free-shipping thresholds, currency anchoring, and assortment mixes. Always segment the AOV histogram by Shopify Markets region before picking a lever: a bundle that fits Germany's bimodal pattern may not fit France's unimodal one.

What if my two peaks are very far apart, like €25 and €180?

That's usually two product lines acting as separate businesses (e.g. accessories vs. main category). Treat them as two stores: run separate AOV strategies per category. A bundle spanning both peaks rarely converts because the buyer intents are genuinely different.

Does this apply to subscription stores?

Subscription AOV is usually unimodal by design: the plan price sets the mode. Bimodality in a subscription store typically indicates a healthy one-time accessory revenue stream alongside the subscription. Test bundles within each mode separately, not across them.

How do I know the bundle test actually merged the peaks?

Re-pull the histogram on the test variant only, after week 4. If the trough between the peaks has filled in and the order count between €60 and €90 has risen by 30%+, the modes are merging. If the two peaks are still distinct, your bundle price is wrong, usually too close to the upper peak to attract single-item buyers.

Should I bundle two units of the same SKU or two different SKUs?

Different SKUs almost always outperform same-SKU multipacks on bimodal stores, because the upper peak buyers are already cross-shopping. Same-SKU multipacks work better on consumables (skincare refills, supplements) where the upper peak is repeat-quantity rather than cross-category.

How does this interact with the free-shipping threshold I already have?

Keep the existing threshold during the bundle test, because changing two variables at once destroys attribution. Once the bundle test concludes and you've re-plotted the distribution, revisit the threshold against the new shape. Often you can raise it 20-30% because the merged distribution sits higher.

What's the difference between this and tiered pricing for long-tail AOV?

Bimodal stores have two discrete peaks and need restructuring (bundles). Long-tail stores have one peak with a thin right tail and need anchoring (tiered pricing or good-better-best). Using tiered pricing on a bimodal store mostly relabels existing behaviour without changing the peaks.

Will AI hypothesis tools detect bimodality automatically?

Metricuno's hypothesis engine flags distribution shape as a top-level diagnostic when it imports your GA4 order history, so the bimodal pattern surfaces in the day-one audit rather than after weeks of manual histogram inspection. It then ranks bundle hypotheses above threshold hypotheses when the two-peak signal is present.

How long should the bundle test run before I trust the result?

Eight weeks is the practical minimum for an AOV-shape test, because you need to see at least two full purchase cycles for the multi-item peak. Calling it at week 3 on a positive trend is the most common mistake: early bundle lift often comes from the upper peak alone and then plateaus when the lower peak doesn't follow.

Test ideas before you ship them

Run unlimited A/B tests, attach hypotheses to outcomes, and build a searchable archive of what works and what doesn't.

Launch your first experiment