Why GA4 Funnels and Hotjar Replays Disagree on Drop-off
GA4's modeled funnel and Hotjar's replay-derived drop-off rarely agree, and neither matches your VWO experiment counters. Here's what's actually causing the gap — and how to stop spending Mondays reconciling dashboards.
Quick answer
GA4 and Hotjar disagree because they count different things on different identities. GA4 reports modeled, sampled, session-scoped events keyed to a GA client ID; Hotjar reports observed pageviews and rage-clicks keyed to a Hotjar user ID, with its own sampling cap. Add VWO's experiment-bucketed counters and you have three tools measuring three populations. The fix is one tracking layer feeding all three views — not a weekly reconciliation spreadsheet.
GA4 / Hotjar funnel drop-off mismatch
The systematic gap between GA4's modeled session-based funnel and Hotjar's replay-derived drop-off, caused by different sampling, identity, and event-timing rules.
When a Shopify or WooCommerce team checks checkout drop-off in GA4 and then opens Hotjar to watch the same step, the numbers almost never match. The cause is structural, not a bug. GA4 collects events client-side, applies sampling and modeling when it builds reports, and stitches identity via a first-party cookie. Hotjar records client-side pageviews and interaction signals on its own sampled subset, with its own cookie and its own session-timeout rules. VWO adds a third layer: counters bucketed by experiment exposure. The three populations rarely overlap cleanly, so drop-off percentages drift 5-25 percentage points apart.
Most teams discover the mismatch the painful way. A weekly CRO review shows GA4 reporting 42% drop-off on the shipping step, Hotjar showing 58%, and the VWO test running on that step claiming the control converts at 4.1%. Three numbers, one funnel, zero confidence.
Why the numbers don't match
There are four mechanical reasons GA4 and Hotjar will never converge on the same drop-off figure. Each one is fixable individually, but stacking three tools means stacking three sets of assumptions — and the gaps compound.
Cause 1: sampling thresholds. GA4 begins sampling and modeling once a query covers more than ~10M events, and its standard reports never expose the raw rows (those come from the BigQuery export instead). Hotjar caps free plans at 35 daily sessions and paid plans at fixed session quotas; once you hit the cap, recording stops mid-day. You are comparing a modeled estimate against a truncated sample of the morning.
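To see why a truncated sample matters, here is a minimal Python sketch. It assumes a made-up day of 20 sessions in which morning shoppers complete the step more often than evening shoppers, and a hypothetical daily cap of 10 recordings. The `Session` class, the numbers, and the cap are illustrative only, not real Hotjar data or behavior.

```python
from dataclasses import dataclass


@dataclass
class Session:
    hour: int        # local hour of day the session started (0-23)
    completed: bool  # True if the shopper got past the step


def drop_off(sessions):
    """Share of sessions that did not complete the step, as a percentage."""
    if not sessions:
        raise ValueError("no sessions to measure")
    dropped = sum(1 for s in sessions if not s.completed)
    return 100.0 * dropped / len(sessions)


def first_n_recorded(sessions, daily_cap):
    """Keep only the first `daily_cap` sessions of the day, in time order."""
    return sorted(sessions, key=lambda s: s.hour)[:daily_cap]


# Hypothetical day: 10 morning sessions (6 complete), 10 evening sessions (3 complete).
day = (
    [Session(hour=9, completed=True)] * 6
    + [Session(hour=9, completed=False)] * 4
    + [Session(hour=20, completed=True)] * 3
    + [Session(hour=20, completed=False)] * 7
)

full_day = drop_off(day)                      # every session counted
capped = drop_off(first_n_recorded(day, 10))  # only the morning was recorded

print(f"full day: {full_day:.0f}%  capped sample: {capped:.0f}%")
# prints "full day: 55%  capped sample: 40%"
```

Same store, same day, and a 15-point difference that comes purely from where the recording stopped.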
The identity problem is the biggest contributor
GA4's client ID and Hotjar's user ID are independent first-party cookies set by different scripts at different moments in the page lifecycle. A shopper who blocks one script, clears cookies between sessions, or switches from mobile Safari to desktop Chrome is counted as one user in GA4 and three users in Hotjar — or vice versa. On iOS Safari with ITP, both cookies expire after 7 days, but on different clocks.
Cause 2 and 3 — event timing and session definitions
GA4 fires a `begin_checkout` event when its dataLayer push runs, which on most Shopify themes is after the checkout page's DOM is interactive. Hotjar logs the pageview as soon as its snippet executes — often 300-800ms earlier. Shoppers who bounce in that gap show up in Hotjar's denominator but not GA4's.
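The effect of that gap on the drop-off percentage is plain arithmetic. The sketch below uses hypothetical counts (1,000 Hotjar pageviews, 120 early bounces, 500 completions) to show how the same completions produce two different drop-off figures once the denominators differ.

```python
def drop_off_pct(entered, completed):
    """Drop-off as a percentage of the sessions that entered the step."""
    if entered <= 0:
        raise ValueError("entered must be positive")
    return 100.0 * (entered - completed) / entered


# Hypothetical counts for one checkout step.
hotjar_entered = 1000  # pageviews logged as soon as the Hotjar snippet runs
early_bounces = 120    # shoppers who left before GA4's begin_checkout fired
completed = 500        # completions, assumed visible to both tools

ga4_entered = hotjar_entered - early_bounces  # GA4 never saw the early bounces

hotjar_drop_off = drop_off_pct(hotjar_entered, completed)
ga4_drop_off = drop_off_pct(ga4_entered, completed)

print(f"Hotjar: {hotjar_drop_off:.1f}%  GA4: {ga4_drop_off:.1f}%")
# prints "Hotjar: 50.0%  GA4: 43.2%"
```

Nothing was miscounted here. The roughly 7-point gap exists only because the two tools started counting at different moments.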
Session windows differ too. GA4 closes a session after 30 minutes of inactivity by default; Hotjar uses its own inactivity timer, and VWO-bucketed sessions follow the experiment's exposure rule. A shopper who leaves the checkout tab open for 45 minutes and returns is one continuing session in Hotjar, two sessions in GA4, and a re-exposed visitor in VWO.
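A rough way to picture this is to run the same event timestamps through two inactivity timeouts. In the sketch below, the 30-minute value is GA4's default; the 60-minute value is a placeholder for "a longer timer", not Hotjar's documented setting, and `count_sessions` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def count_sessions(event_times, inactivity_timeout):
    """Count sessions: a new one starts when the gap since the
    previous event is longer than the inactivity timeout."""
    times = sorted(event_times)
    if not times:
        return 0
    sessions = 1
    for previous, current in zip(times, times[1:]):
        if current - previous > inactivity_timeout:
            sessions += 1
    return sessions


# One shopper: two events, a 45-minute pause with the tab open, two more events.
start = datetime(2024, 1, 15, 14, 0)
events = [
    start,
    start + timedelta(minutes=2),
    start + timedelta(minutes=47),
    start + timedelta(minutes=49),
]

ga4_like = count_sessions(events, timedelta(minutes=30))      # GA4 default timeout
longer_timer = count_sessions(events, timedelta(minutes=60))  # assumed longer timer

print(ga4_like, longer_timer)
# prints "2 1"
```

One shopper, one visit, and the session count already differs before any sampling or identity issue comes into play.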
How to detect which tool is wrong (and when neither is)
Start with a single-source-of-truth test. Pick one high-traffic day, pull GA4's raw event export via BigQuery, and join it to Hotjar's session export on timestamp + URL. If the joined row count is under 60% of either tool's total, you have an identity-stitching problem, not a counting problem.
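As a sketch of that test, the Python below matches rows from the two exports on URL plus a small timestamp tolerance, then applies the 60% rule. The `(timestamp_seconds, url)` row shape, the 2-second tolerance, and the function names are assumptions for illustration; real BigQuery and Hotjar exports have their own schemas and would need to be mapped into this form first.

```python
from collections import defaultdict


def match_rows(ga4_rows, hotjar_rows, tolerance_s=2.0):
    """Greedy one-to-one match of rows that share a URL and whose
    timestamps differ by at most `tolerance_s` seconds."""
    hotjar_by_url = defaultdict(list)
    for ts, url in hotjar_rows:
        hotjar_by_url[url].append(ts)
    for stamps in hotjar_by_url.values():
        stamps.sort()

    matched = 0
    for ts, url in sorted(ga4_rows):  # ascending time, so pruning below is safe
        stamps = hotjar_by_url.get(url, [])
        # Drop Hotjar rows too old to match this or any later GA4 row.
        while stamps and stamps[0] < ts - tolerance_s:
            stamps.pop(0)
        if stamps and stamps[0] <= ts + tolerance_s:
            stamps.pop(0)  # consume the match so it is not reused
            matched += 1
    return matched


def looks_like_identity_problem(ga4_rows, hotjar_rows, threshold=0.60):
    """True if matched rows are under `threshold` of either tool's total."""
    matched = match_rows(ga4_rows, hotjar_rows)
    rate_ga4 = matched / len(ga4_rows)
    rate_hotjar = matched / len(hotjar_rows)
    return min(rate_ga4, rate_hotjar) < threshold


# Hypothetical exports, already reduced to (timestamp_seconds, url).
ga4 = [(100.0, "/checkout"), (160.0, "/checkout"), (220.0, "/checkout/shipping"),
       (300.0, "/checkout"), (400.0, "/checkout/payment")]
hotjar = [(99.4, "/checkout"), (159.5, "/checkout"), (219.6, "/checkout/shipping"),
          (250.0, "/checkout"), (330.0, "/checkout/shipping"), (500.0, "/cart")]

matched = match_rows(ga4, hotjar)
flag = looks_like_identity_problem(ga4, hotjar)
print(matched, flag)
# prints "3 True"
```

Here 3 of 5 GA4 rows match (60%) but only 3 of 6 Hotjar rows do (50%), so the check flags an identity-stitching problem. The tolerance is worth tuning: too tight and the timing gap between the two snippets hides real matches, too loose and unrelated pageviews get paired.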
Next, check the consent layer. If your consent management platform (CMP) blocks Hotjar for a portion of EU traffic but allows GA4, or vice versa, the two tools are literally measuring different audiences. Beauty and apparel stores with heavy EU traffic typically see 15-30% of sessions visible to only one of the two tools.
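If your CMP can log per-session consent decisions, the share of sessions visible to only one tool is easy to estimate. The sketch below assumes a hypothetical log with one record per session and two boolean flags; the field names `ga4_allowed` and `hotjar_allowed` are invented for this example.

```python
def single_tool_share(consent_log):
    """Percentage of sessions where exactly one of the two tools was allowed."""
    if not consent_log:
        raise ValueError("empty consent log")
    only_one = sum(
        1 for s in consent_log if s["ga4_allowed"] != s["hotjar_allowed"]
    )
    return 100.0 * only_one / len(consent_log)


# Hypothetical log of 10 sessions.
consent_log = (
    [{"ga4_allowed": True, "hotjar_allowed": True}] * 7    # both tools see these
    + [{"ga4_allowed": True, "hotjar_allowed": False}] * 2  # GA4 only
    + [{"ga4_allowed": False, "hotjar_allowed": False}] * 1  # neither tool
)

share = single_tool_share(consent_log)
print(f"{share:.0f}% of sessions are visible to only one tool")
# prints "20% of sessions are visible to only one tool"
```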
How to fix it — one tracking layer, three views
The structural fix is to stop treating GA4, Hotjar, and VWO as three sources of truth and start treating them as three views over one event stream. Capture events once — server-side or via a single first-party snippet — and fan them out to each tool with consistent identity and timing.
On Shopify and WooCommerce, this usually means replacing three tracking snippets with one lightweight collector that writes to a unified data layer, then forwards events to GA4, your replay tool, and your experimentation tool with a shared user ID. Drop-off numbers will still differ slightly — sampling is unavoidable on the free GA4 tier — but the gap drops from 15+ points to under 3.
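The shape of that collector can be sketched in a few lines of Python. This is an illustration of the fan-out idea under simple assumptions, not any vendor's implementation: `Event`, `Collector`, and the three list-backed "views" are hypothetical stand-ins for real forwarding code.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Event:
    name: str
    user_id: str       # one shared ID, so all views count the same person
    timestamp_ms: int  # stamped once at capture, so all views agree on timing
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)


Sink = Callable[[Event], None]


class Collector:
    """Captures each event once and forwards the same object to every sink."""

    def __init__(self, sinks: List[Sink]):
        self._sinks = list(sinks)

    def capture(self, name: str, user_id: str,
                timestamp_ms: Optional[int] = None) -> Event:
        if timestamp_ms is None:
            timestamp_ms = int(time.time() * 1000)
        event = Event(name=name, user_id=user_id, timestamp_ms=timestamp_ms)
        for sink in self._sinks:
            sink(event)
        return event


# Stand-ins for the analytics, replay, and experiment destinations.
analytics_view, replay_view, experiment_view = [], [], []
collector = Collector(
    [analytics_view.append, replay_view.append, experiment_view.append]
)

collector.capture("begin_checkout", user_id="shopper-123",
                  timestamp_ms=1_700_000_000_000)

print(analytics_view == replay_view == experiment_view)
# prints "True"
```

The design choice that matters is that identity and timestamp are assigned before the fan-out. Each destination may still sample or model on its own, but they all start from the same event.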
What this looks like in practice
An apparel store running Metricuno replaced GA4 + Hotjar + VWO snippets with one tag. Checkout drop-off across the three derived views (analytics, replay, experiment) now sits within 1.8 percentage points — down from a 14-point spread. The Monday reconciliation meeting was deleted from the calendar.
Experiments to run once you trust the data
With reconciled drop-off, the next questions become productive: which step has the highest exit intent among returning customers, which device cohort drops at payment, which referrer source skips the shipping step the fastest. Each is testable in a week instead of three.
The broader topic — when to consolidate analytics, heatmaps, and experimentation — is covered in our comparison of fragmented stacks versus a unified CRO platform. The short version: once your team spends more than two hours a week reconciling tool outputs, the fragmented stack is costing more than it saves.
Frequently asked questions
GA4 fires events later in the page lifecycle and applies modeling thresholds, so it under-counts short sessions Hotjar captures. The gap is usually 10-20% on checkout pages, larger on mobile Safari where ITP shortens the GA4 cookie lifetime.
Partially. GA4 samples once a query covers more than ~10M events, and its standard reports don't expose unsampled data. But sampling alone usually explains only 3-5 points of the gap. Identity stitching and event timing explain the rest.
Neither is 'correct' on its own — they're measuring different populations. The closest you'll get to truth is joining GA4's BigQuery export to Hotjar's session export on timestamp and URL, then accepting that 100% reconciliation isn't possible across independently-sampled tools.
It fixes most of the GA4 side: server-side events are far less exposed to ad blockers and ITP cookie expiry. But Hotjar still runs client-side and still loses to ad blockers, so the gap narrows but doesn't close. The full fix requires a unified data layer that feeds both tools from one source.
Yes, materially. Once you hit the daily session quota, Hotjar stops recording for the rest of the day. If your peak traffic hour is 8pm but you hit the cap at 2pm, your evening drop-off data simply doesn't exist in Hotjar — while GA4 captures it all.
VWO counts conversions only among users exposed to the experiment, using its own bucketing cookie. GA4 counts all sessions hitting the success event, regardless of experiment exposure. A user who entered before the test started and converted later is a conversion in GA4 and nothing in VWO.
Intelligent Tracking Prevention caps first-party cookie lifetime at 7 days for cookies set via JavaScript. GA4's and Hotjar's cookies expire on different clocks because they were set at different moments, so returning shoppers get classified as new users at different times in each tool.
Mobile, almost always. Mobile Safari's ITP, more aggressive ad blocking on mobile browsers, and shorter session durations all compound. Expect 1.5-2x the desktop discrepancy on a typical Shopify mobile checkout.
You can, and many teams do — but then you lose either the funnel math (Hotjar) or the qualitative behavior signal (GA4). The better path is consolidating to one platform that produces both views from a single event stream, which is what a unified CRO platform delivers.
Most teams we audit spend 3-6 hours per week on reconciliation: one analyst pulling exports, one CRO lead questioning the gap, one engineer fixing dataLayer pushes. That's roughly 150-300 hours a year, which is the implicit cost of running three tools instead of one.