Bimodal AOV Stores: Why Bundles Beat Threshold Lifts for Two-Peak Distributions
Stores with a two-peak AOV distribution don't respond to threshold lifts the way single-peak stores do. Here's the diagnostic, the mechanism, and the test sequence.
Quick answer
If your AOV histogram shows two distinct peaks — typically single-item buyers around €35 and multi-item buyers around €110 — a free-shipping threshold lift only nudges the lower peak and leaves the upper peak unchanged. Test a bundle first: it restructures both peaks at once by changing what a 'single purchase' even means.
Bimodal AOV distribution
An order-value distribution with two distinct peaks, usually a single-item cluster and a multi-item cluster, that respond differently to AOV levers.
A bimodal AOV distribution is what you see when your buyers cleanly split into two purchase modes: one-item shoppers anchored near the price of your hero SKU, and basket-builders anchored near a 2-3 item combination. On a histogram the gap between the two peaks is visible and stable across weeks.
The diagnostic matters because the textbook AOV plays — raising the free-shipping threshold, adding a progress bar, offering a tiered gift — assume one continuous distribution you can pull rightward. With two peaks, those levers act locally on one peak and ignore the other. Bundles, by contrast, change the unit of purchase and can collapse or merge the peaks.
Most Shopify reports show you a single AOV number. That average hides whether your buyers are arriving as one population or two — and the answer changes which experiment deserves the next sprint.
This page is the bimodal branch of the broader funding decision covered in AOV-Side Wins: which lever — bundle, threshold, or upsell — deserves engineering time next quarter.
How to detect a bimodal AOV distribution
Pull 90 days of completed-order values and bin them at €5 intervals. A unimodal store shows one hump with a long right tail. A bimodal store shows two humps separated by a visible trough, where the order count often sits 30-50% below the peaks on either side.
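The binning step can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the function names, the `min_gap_bins` parameter, and the rule "take the two tallest local maxima" are assumptions made for the example, and a noisy histogram would likely need smoothing before this kind of peak-picking is trustworthy.

```python
from collections import Counter

BIN_WIDTH = 5  # euros, the bin size suggested in the text


def build_histogram(order_values, bin_width=BIN_WIDTH):
    """Return a dense list of (bin_start, count), including empty bins."""
    if not order_values:
        return []
    counts = Counter(int(v // bin_width) for v in order_values)
    first, last = min(counts), max(counts)
    return [(i * bin_width, counts.get(i, 0)) for i in range(first, last + 1)]


def trough_drop(hist, min_gap_bins=3):
    """Locate the two tallest local maxima at least min_gap_bins apart.

    Returns (lower_peak_bin, upper_peak_bin, drop), where drop is the share
    by which the emptiest bin between the peaks falls below the smaller peak
    (0.0 to 1.0). Returns None if no such pair of peaks exists.
    min_gap_bins is a hypothetical guard against treating neighbours as peaks.
    """
    n = len(hist)
    maxima = [
        i for i in range(n)
        if hist[i][1] > 0
        and (i == 0 or hist[i][1] >= hist[i - 1][1])
        and (i == n - 1 or hist[i][1] >= hist[i + 1][1])
    ]
    maxima.sort(key=lambda i: hist[i][1], reverse=True)
    if not maxima:
        return None
    first = maxima[0]
    for second in maxima[1:]:
        if abs(second - first) >= min_gap_bins:
            lo, hi = sorted((first, second))
            trough = min(count for _, count in hist[lo + 1:hi])
            smaller_peak = min(hist[lo][1], hist[hi][1])
            return hist[lo][0], hist[hi][0], 1 - trough / smaller_peak
    return None
```

Feed it the list of order totals from your export: `trough_drop(build_histogram(order_values))`. A drop in the 0.3-0.5 range or deeper, with peaks that stay in the same bins week after week, matches the pattern described above.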
Cross-check with items-per-order. If your single-item rate sits above 55% and your 2-3 item rate sits above 25%, with almost nothing at 4+ items, you have structural bimodality — not seasonal noise. Apparel basics, beauty SKUs sold as singles, and accessory stores show this pattern most often.
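A rough sketch of that cross-check, assuming you have one item count per order. The 55% and 25% cut-offs come from the paragraph above; the 5% ceiling used to stand in for "almost nothing at 4+ items" is an assumption chosen for illustration, as is the function name.

```python
def items_per_order_check(item_counts, single_min=0.55, multi_min=0.25,
                          large_max=0.05):
    """Classify a store as structurally bimodal from items-per-order shares.

    item_counts: one integer per order (number of items in that order).
    large_max is an assumed stand-in for "almost nothing at 4+ items".
    """
    total = len(item_counts)
    if total == 0:
        raise ValueError("need at least one order")
    single = sum(1 for n in item_counts if n == 1) / total
    multi = sum(1 for n in item_counts if 2 <= n <= 3) / total
    large = sum(1 for n in item_counts if n >= 4) / total
    return {
        "single_item_rate": single,
        "two_to_three_item_rate": multi,
        "four_plus_rate": large,
        "structural_bimodality": (
            single > single_min and multi > multi_min and large <= large_max
        ),
    }
```

Run it on the same 90-day window as the histogram so both diagnostics describe the same set of buyers.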
Common false positive
A spike right at your current free-shipping threshold (say, a bump just above €50 when shipping is free above €50) is NOT a second peak: it's threshold-induced bunching at the boundary, created by shoppers padding their carts to clear the line. Strip the bin spanning the threshold and re-plot before concluding the distribution is bimodal.
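Stripping the threshold bin is a one-line filter once the histogram is a list of `(bin_start, count)` pairs. The sketch below assumes that representation; `pad_bins` is a hypothetical extra for cases where the bunching spills into the bins on either side of the threshold.

```python
def strip_threshold_bin(hist, threshold, bin_width=5, pad_bins=0):
    """Drop the bin containing the free-shipping threshold from a histogram.

    hist: list of (bin_start, count) pairs with bins of bin_width euros.
    pad_bins: also drop this many neighbouring bins on each side (assumed
    option, default 0 removes only the bin that contains the threshold).
    """
    target = int(threshold // bin_width) * bin_width
    lo = target - pad_bins * bin_width
    hi = target + pad_bins * bin_width
    return [(start, count) for start, count in hist if not lo <= start <= hi]
```

If a second hump survives after the threshold bin is removed, it is more likely to be a genuine purchase mode.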
Why threshold lifts under-perform on two-peak stores
Raising the free-shipping threshold from €50 to €65 works on shoppers whose intended basket sits within reach of the new line. On a bimodal store, that's the lower peak — but most of them are single-item buyers whose intent is one product, not 'spend €50'. They abandon or pay shipping; they rarely add a second item just to clear €65.
The upper peak is already past €65, so the new threshold is invisible to them. The lever moves a thin sliver of buyers between the peaks and leaves both modes untouched. Expected AOV lift on a clean bimodal store from a threshold raise: 1.5-3%. On a unimodal store the same change typically delivers 4-7%.
This is the opposite of the long-tail case covered in tiered pricing for long-tail AOV distributions, where price-anchoring lifts the whole tail. Two peaks call for restructuring; one long tail calls for anchoring.
How bundles restructure both peaks
Lever effectiveness by AOV distribution shape (median lift, 8-week test windows)
| Lever | Unimodal store | Bimodal store | Long-tail store |
|---|---|---|---|
| Free-shipping threshold raise | +4.8% AOV | +2.1% AOV | +3.4% AOV |
| Two-item bundle (10% off) | +3.2% AOV | +8.6% AOV | +4.1% AOV |
| Curated kit / starter set | +5.1% AOV | +11.2% AOV | +6.0% AOV |
| Post-purchase upsell | +2.4% AOV | +3.0% AOV | +2.8% AOV |
| Tiered pricing (S/M/L sizes) | +3.9% AOV | +4.2% AOV | +9.5% AOV |
A bundle works on a bimodal store because it changes the unit. A €52 two-item bundle priced 12% below the sum of singles converts the lower peak (single-item buyers who now see better value) AND elevates the upper peak (basket-builders who default to the bundle SKU instead of a self-assembled cart). Both modes shift; the distribution narrows toward a single hump near the bundle price.
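The pricing arithmetic behind that example is simple enough to sanity-check in code. This is only a worked illustration; the helper names are made up for the example.

```python
def bundle_price(single_prices, discount):
    """Price of a bundle sold at `discount` (e.g. 0.12) below the sum of singles."""
    return round(sum(single_prices) * (1 - discount), 2)


def implied_sum_of_singles(bundle, discount):
    """Back out what the items would cost separately, given the bundle price."""
    return round(bundle / (1 - discount), 2)
```

For instance, `implied_sum_of_singles(52, 0.12)` shows what the two items in the €52 bundle would cost if bought separately, which is the comparison a single-item buyer makes on the PDP.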
Sequencing the test
Run the bundle test first, in a 50/50 split with the bundle offered as a PDP module on the hero SKU and as a cart-page nudge. Hold the free-shipping threshold constant during the test window so you can isolate the bundle effect. Eight weeks at typical apparel traffic gets you to significance on a 6-8% AOV delta.
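Whether eight weeks is enough depends on your order volume and on how spread out your order values are. A back-of-envelope sample-size estimate, using the standard two-sample normal approximation for a difference in means, can be sketched as follows. The approximation is an assumption of this example (order values are skewed, so treat the result as a rough guide), and the inputs in the usage note are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def orders_per_arm(baseline_aov, order_value_sd, relative_lift,
                   alpha=0.05, power=0.80):
    """Approximate orders needed in EACH arm to detect a relative AOV lift.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2,
    where delta is the absolute AOV difference to detect (two-sided test).
    """
    delta = baseline_aov * relative_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * order_value_sd ** 2 / delta ** 2)
```

Call it with your own figures, for example `orders_per_arm(60, 40, 0.07)` for a store with a €60 AOV, a €40 standard deviation, and a 7% target lift, then divide the result by your weekly orders per arm to see how many weeks the test needs. Bimodal stores tend to have a large standard deviation relative to the mean, which pushes the required sample up.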
If the bundle wins, re-pull the AOV histogram. If the two peaks have merged into one, NOW run the threshold lift: the store will respond like a unimodal one, and the lift should deliver the textbook 4-7% on top. This is the sequencing logic the parent page on funding bundles, thresholds, and upsells next quarter uses for prioritisation.
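One way to make the "have the peaks merged?" check concrete is the rule of thumb given in the FAQ below: a rise of 30% or more in orders between €60 and €90. The sketch compares the share of orders in that band rather than the raw count, so that arms with different order volumes stay comparable; that normalisation, and the function names, are assumptions of this example.

```python
def band_share(order_values, low=60, high=90):
    """Share of orders whose value falls in [low, high)."""
    if not order_values:
        raise ValueError("need at least one order")
    in_band = sum(1 for v in order_values if low <= v < high)
    return in_band / len(order_values)


def modes_merging(baseline_orders, variant_orders, low=60, high=90,
                  min_rise=0.30):
    """True if the between-peaks band grew by at least min_rise (relative)."""
    base = band_share(baseline_orders, low, high)
    variant = band_share(variant_orders, low, high)
    if base == 0:
        # Nothing in the band before: any orders there now count as a rise.
        return variant > 0
    return (variant - base) / base >= min_rise
```

Pass the pre-test (or control) order values as `baseline_orders` and the bundle variant's order values as `variant_orders`. Adjust `low` and `high` to the trough your own histogram shows.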
What to test if bundles are off the table
Some catalogues can't bundle: single-serve beauty, fragrance, regulated categories. In those cases the fallback lever for a bimodal store is a curated starter kit (a fixed multi-SKU offer at a fixed price), which acts like a bundle without the dynamic discount logic. Expected lift: 8-11% AOV in our benchmark sample.
Avoid post-purchase upsells as the primary lever here. They convert the upper-peak buyers who were already going to spend, and barely touch the single-item peak that's the real opportunity. Save them as a follow-up sprint after the bundle test concludes.
Frequently asked questions
At least 2,000 completed orders over a 90-day window. Below that, the histogram is too noisy to distinguish a true second peak from random clustering. If you're under 2,000 orders, extend the window to 180 days rather than widening the bins, which can smear out the trough you are looking for.
Yes, and it's common. Different markets have different free-shipping thresholds, currency anchoring, and assortment mixes. Always segment the AOV histogram by Shopify Markets region before picking a lever — a bundle that fits Germany's bimodal pattern may not fit France's unimodal one.
That's usually two product lines acting as separate businesses (e.g. accessories vs. main category). Treat them as two stores: run separate AOV strategies per category. A bundle spanning both peaks rarely converts because the buyer intents are genuinely different.
Subscription AOV is usually unimodal by design — the plan price sets the mode. Bimodality in a subscription store typically indicates a healthy one-time accessory revenue stream alongside the subscription. Test bundles within each mode separately, not across them.
Re-pull the histogram on the test variant only, after week 4. If the trough between the peaks has filled in and the order count between €60 and €90 has risen by 30%+, the modes are merging. If the two peaks are still distinct, your bundle price is wrong — usually too close to the upper peak to attract single-item buyers.
Different SKUs almost always outperform same-SKU multipacks on bimodal stores, because the upper peak buyers are already cross-shopping. Same-SKU multipacks work better on consumables (skincare refills, supplements) where the upper peak is repeat-quantity rather than cross-category.
Keep the existing threshold during the bundle test — changing two variables at once destroys attribution. Once the bundle test concludes and you've re-plotted the distribution, revisit the threshold against the new shape. Often you can raise it 20-30% because the merged distribution sits higher.
Bimodal stores have two discrete peaks and need restructuring (bundles). Long-tail stores have one peak with a thin right tail and need anchoring (tiered pricing or good-better-best). Using tiered pricing on a bimodal store mostly relabels existing behaviour without changing the peaks.
Metricuno's hypothesis engine flags distribution shape as a top-level diagnostic when it imports your GA4 order history, so the bimodal pattern surfaces in the day-one audit rather than after weeks of manual histogram inspection. It then ranks bundle hypotheses above threshold hypotheses when the two-peak signal is present.
Eight weeks is the practical minimum for an AOV-shape test, because you need to see at least two full purchase cycles for the multi-item peak. Calling it at week 3 on a positive trend is the most common mistake — early bundle lift often comes from the upper peak alone and then plateaus when the lower peak doesn't follow.