Retail Media Incrementality in Practice
Measuring the true impact of retail media campaigns remains one of the biggest challenges for brands and retailers today. This article breaks down two proven methods for determining incrementality: running geo holdouts across matched markets and comparing customer journeys to track new buyers. Industry experts share practical approaches that deliver reliable results without the complexity of traditional measurement frameworks.
Run Geo Holdout Across Matched Markets
Geo-holdout testing is, in my experience, the most reliable way to prove incrementality in retail media. It's the hard-and-fast approach for seeing the true value of my marketing budget.
The idea is simple: instead of looking at who clicked an ad, I split the country into two groups of cities. In the Test markets, I show the ads; in the Control markets, I show zero ads.
By comparing sales between these two groups, I calculate the sales that occurred only because of the ads.
A real-world example is a skincare campaign we ran. Over a 4-week test, we compared similar cities to see the real impact: we matched cities with similar populations and shopping habits, ran ads in the "Test" cities, and kept the "Control" cities organic-only. The ads not only performed well but also drove a 22% lift in sales. As a result, the actual return on investment worked out to 2.3 times what we had been measuring before.
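The arithmetic behind a read like this is straightforward: the control markets' growth supplies the counterfactual for what the test markets would have done without ads. A minimal sketch with made-up numbers (not the actual campaign data):

```python
def geo_lift(test_before, test_during, control_before, control_during):
    """Percent sales lift in test markets vs. the counterfactual implied
    by matched control markets (a simple difference-in-differences ratio)."""
    # Control-market growth estimates what test markets would have done
    # with zero ads.
    counterfactual = test_before * (control_during / control_before)
    return (test_during - counterfactual) / counterfactual

# Hypothetical 4-week unit totals, purely for illustration.
lift = geo_lift(test_before=10_000, test_during=13_420,
                control_before=9_800, control_during=10_780)
print(f"{lift:.1%}")  # lift attributable to the ads
```

Because the control group absorbs seasonality and baseline growth, the remaining gap is the sales that occurred only because of the ads.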

Compare Journeys and Track New Customers
In retail media, a big challenge is understanding whether ads actually create new sales - or whether they just take credit for sales that would have happened anyway. Many strategies still rely on "last-click" reporting, which gives all the credit to the final ad someone clicks before buying. This often makes results look better than they truly are, especially for brand search ads. To avoid this, we focus on incrementality: would this sale still have happened if we had not shown the ad? We usually use two strategies for this:
1. Full Customer Journey
Instead of only looking at the final click, we analyze the full customer journey. Using Amazon Marketing Cloud (AMC), we can connect ad impressions (who saw which ads) with purchase data (who actually bought).
This allows us to understand what happened before the purchase, not just the final interaction. We then compare two groups of shoppers:
Group A: Shoppers who first saw an upper-funnel ad (such as Sponsored Display or Streaming TV), later searched for the brand and then bought the product
Group B: Shoppers who only searched for the brand (and then bought) but did not see those earlier ads
If Group A converts at a higher rate than Group B, it indicates that the earlier ad helped create demand. This approach gives us a much more realistic view of performance than standard ROAS metrics.
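The comparison above reduces to two conversion rates and the gap between them. A minimal sketch, using hypothetical counts in the shape an AMC-style export might give us (not real data):

```python
def conversion_lift(exposed_buyers, exposed_total,
                    unexposed_buyers, unexposed_total):
    """Relative conversion lift of Group A (saw the upper-funnel ad first)
    over Group B (branded search only, no prior exposure)."""
    cr_a = exposed_buyers / exposed_total      # Group A conversion rate
    cr_b = unexposed_buyers / unexposed_total  # Group B conversion rate
    return cr_a, cr_b, (cr_a - cr_b) / cr_b

# Hypothetical shopper counts, purely for illustration.
cr_a, cr_b, rel_lift = conversion_lift(840, 12_000, 510, 10_000)
```

A positive `rel_lift` is the signal that the earlier ad helped create demand rather than merely harvesting it.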
2. Using "New-to-Brand" as a Practical Signal
Both Amazon and Walmart provide New-to-Brand (NTB) metrics, which show whether a customer is buying from a brand for the first time.
While NTB is not a perfect measure of incrementality, it is a very useful indicator.
For example, a campaign may show strong ROAS, but if 90% of its sales come from existing customers, it is likely low-incremental. Campaigns that drive a healthy share of new customers are generally much more incremental and valuable for long-term growth.
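In code, this check is just a share calculation with a cutoff. The 30% threshold below is our own illustrative choice, not a platform standard:

```python
def ntb_share(ntb_sales, total_sales):
    """Share of sales coming from New-to-Brand customers."""
    return ntb_sales / total_sales

def looks_incremental(ntb_sales, total_sales, threshold=0.30):
    # Illustrative cutoff: campaigns below it get flagged as likely
    # low-incremental even when ROAS looks strong.
    return ntb_share(ntb_sales, total_sales) >= threshold

# A campaign where 90% of sales come from existing customers:
flag = looks_incremental(ntb_sales=100, total_sales=1_000)  # False
```

A flag like this is a screening tool, not a verdict; it tells us which campaigns deserve a proper lift test.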
Hope this helps!
Cheers,
Moritz
Exploit Stockouts To Expose True Impact
Stockouts create sudden drops in shelf supply that break the usual link between demand and sales. When ads keep running during a stockout in some stores but not others, the gap shows the ad’s true lift on units that can be bought. Compare the change in affected stores to the change in stable stores while also controlling for season.
Link store inventory to ad logs to stop false reads from hidden supply limits. The read can also show waste from serving ads when no units are on shelf. Connect inventory feeds to media data and run the stockout study this quarter.
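One way to frame the read is a difference-in-differences: the sales change in stockout-affected stores minus the change in stable stores over the same period, so the stable stores absorb seasonal movement. A sketch with invented store totals:

```python
def stockout_did(affected_before, affected_during,
                 stable_before, stable_during):
    """Difference-in-differences on unit sales: stable stores soak up
    seasonality, so the remainder is the effect of the broken
    demand-supply link in the affected stores."""
    return (affected_during - affected_before) - (stable_during - stable_before)

# Invented weekly unit totals for the two store groups.
effect = stockout_did(affected_before=500, affected_during=320,
                      stable_before=480, stable_during=495)
```

Joined with ad logs and inventory feeds, the same comparison also surfaces spend wasted on stores with nothing on the shelf.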
Use Auction Split For Unbiased Lift
Auction-time randomization assigns some bids to a holdout at the moment an ad is about to serve. This creates equal groups that differ only in ad exposure, which gives an unbiased lift read. Randomizing at that moment avoids bias from user intent, time of day, and pacing. It also keeps spend stable because unserved impressions stay in the market.
Lift can be read quickly by linking exposed and control users to sales at the retailer. Add rules for frequency and location to reduce spillover. Set up an auction-time holdout and measure the lift this month.
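The mechanic can be sketched in a few lines. Here `auctions` is a list of user ids about to be served and `buys` records later purchases; both names and the 10% holdout rate are our own assumptions for illustration:

```python
import random

def auction_holdout_lift(auctions, buys, holdout_rate=0.10, seed=7):
    """Randomize each would-be impression into served vs. ghost (held out)
    at the moment of serving, then read lift as the purchase-rate gap.
    `buys` maps user id -> bought (bool)."""
    rng = random.Random(seed)
    served, ghost = [], []
    for uid in auctions:
        # Randomizing here, after the auction is won, keeps the two
        # groups identical in intent, time of day, and pacing.
        (ghost if rng.random() < holdout_rate else served).append(uid)
    rate = lambda grp: sum(buys[u] for u in grp) / len(grp) if grp else 0.0
    return rate(served) - rate(ghost)
```

In production the join to retailer sales replaces the `buys` dictionary, but the lift read is the same subtraction.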
Leverage Delivery Thresholds For Local Causal Read
Some retail auctions use a score that gates delivery at a set cutoff. Users just above and just below that cutoff are alike, so their different exposure acts like a natural test. A local comparison around the line gives a clean read of incremental sales.
Try narrow and wider windows and check that user traits match on both sides. The result is a local effect for marginal impressions, which helps with bids and budget. Pull score and delivery logs, define the window, and run the threshold test now.
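This is a regression-discontinuity-style read. A minimal sketch, where `rows` holds (delivery score, bought) pairs and the window width is the knob to vary, as noted above:

```python
def threshold_lift(rows, cutoff, window):
    """Purchase rate just above the delivery cutoff (ads served) minus
    the rate just below it (ads gated), within a narrow score window."""
    above = [bought for score, bought in rows
             if cutoff <= score < cutoff + window]
    below = [bought for score, bought in rows
             if cutoff - window <= score < cutoff]
    rate = lambda grp: sum(grp) / len(grp) if grp else 0.0
    return rate(above) - rate(below)

# Toy data: scores near a 0.50 delivery cutoff; 1 = purchased.
lift = threshold_lift([(0.52, 1), (0.51, 1), (0.49, 0), (0.48, 1)],
                      cutoff=0.50, window=0.05)
```

Because only marginal impressions sit near the line, the estimate applies to the bids a budget change would actually add or remove.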
Tie Mix Models to Panel Purchases
A media mix model gets better when tied to retailer transaction panels that show real item buys. Panel data ground the sales effect, show loyalty patterns, and reveal when brands steal sales from each other. Short lift tests can guide the model so it does not over-credit impressions.
Add price, promotion, and shelf presence as controls to keep the media effect clean. A pooled model can share data across SKUs while keeping brand detail. Build the model with panel signals and run the calibration before the next plan.
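At its simplest, the calibration step shrinks the model's media coefficient toward the experimental read. A sketch, with the blend weight as our own illustrative choice rather than a standard value:

```python
def calibrate_media_effect(mmm_coef, lift_test_effect, trust=0.5):
    """Blend the mix model's media coefficient with a short lift-test
    result so the model does not over-credit impressions. `trust` is
    how much weight the experiment gets (illustrative, not standard)."""
    return (1 - trust) * mmm_coef + trust * lift_test_effect

# Model credits 0.80 incremental units per 1k impressions; the lift
# test reads 0.40 (both numbers invented for the example).
calibrated = calibrate_media_effect(0.80, 0.40)
```

Real calibrations impose the experiment as a prior or constraint during fitting, but the intuition is the same pull toward measured lift.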
Target High Uplift Users By Score
Heterogeneous treatment models predict the extra sales a user may add because of an ad, not just the chance of a click. Signals like past spend, category interest, price sensitivity, and visit recency help find who moves when shown an ad. Uplift trees or similar tools can learn these patterns while keeping effect and outcome apart.
Check the scores by seeing if high score groups show more extra sales than low score groups in a holdout read. Target then shifts to high uplift users and pulls back on low or negative uplift users. Train an uplift model and route bids to the top uplift bands this week.
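A two-model sketch of the idea: per segment, the uplift score is the purchase rate under exposure minus the rate without it, estimated from a randomized holdout. Segment names and counts below are invented:

```python
def uplift_scores(treated, control):
    """Two-model uplift: score = P(buy | ad) - P(buy | no ad), estimated
    per segment. `treated`/`control` map segment -> (buyers, total)."""
    scores = {}
    for seg in treated:
        t_buy, t_n = treated[seg]
        c_buy, c_n = control[seg]
        scores[seg] = t_buy / t_n - c_buy / c_n
    return scores

# Hypothetical segments: "lapsed" shoppers move when shown an ad,
# "loyal" shoppers buy regardless, so their uplift is near zero.
scores = uplift_scores(
    treated={"lapsed": (120, 1_000), "loyal": (450, 1_000)},
    control={"lapsed": (40, 1_000), "loyal": (445, 1_000)},
)
```

Bids then route to the high-uplift band ("lapsed" here) and pull back where uplift is near zero or negative, which is exactly the holdout check described above.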

