Ad Tech Insights Methodology

What is Ad Tech Insights?

Ad Tech Insights is a suite of reports detailing Ad Tech industry trends. Currently it hosts three reports. All of them look at the Top 10K US and Top 10K UK sites according to Alexa's Site Rankings. Every report is updated quarterly, during the first month of each quarter.

Header Bidding Industry Index (HBIX)

This tracks header bidding adoption and vendor usage, including which bidders and wrappers publishers are using.

Consent Management Platform (CMP) tracker

GDPR has prompted many publishers to use Consent Management Platforms for storing consent and passing that to programmatic partners. This tracker analyzes how many companies in the Top 10K US and UK sites use either IAB-registered CMPs or other 3rd-party consent tools.

Ads.txt tracker

Ads.txt is an IAB initiative to help fight ad fraud. It's a text file that publishers host on their servers that lists which companies are authorized to sell or resell their inventory.

For the CMP and Ads.txt trackers, when you filter by 'publishers only', what does that mean?

For the CMP and Ads.txt trackers, we broke down the adoption graph into two buckets: all sites, and just sites that show programmatic ads. Why? Because some sites would likely never show programmatic ads or care about collecting consent (say, Wikipedia.org). Therefore, it's more interesting to look at the adoption rates for just programmatic publishers.

This list is manually compiled by identifying sites that make an ad ping to an exchange/network, do header bidding, and/or have an ads.txt file.
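The two buckets can be illustrated with a small calculation. This is only a sketch: the field names `is_programmatic` and `adopted` are assumptions made for the example, and the sample rows are invented rather than taken from the report.

```python
def adoption_rates(sites):
    """Return adoption rates for all sites and for programmatic publishers only.

    `sites` is a list of dicts with two boolean keys (hypothetical names):
    'is_programmatic' and 'adopted'.
    """
    def rate(group):
        # Share of sites in the group that adopted; 0.0 for an empty group.
        return sum(s["adopted"] for s in group) / len(group) if group else 0.0

    publishers = [s for s in sites if s["is_programmatic"]]
    return {"all_sites": rate(sites), "publishers_only": rate(publishers)}


# Invented sample: two programmatic publishers, two non-programmatic sites.
sample_sites = [
    {"is_programmatic": True, "adopted": True},
    {"is_programmatic": True, "adopted": False},
    {"is_programmatic": False, "adopted": False},
    {"is_programmatic": False, "adopted": False},
]
rates = adoption_rates(sample_sites)
```

With this sample, the 'publishers only' rate is higher than the 'all sites' rate because the non-programmatic sites are removed from the denominator.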

Header Bidding Industry Index (HBIX) Methodology

Overview

The Header Bidding Industry Index tracks header bidding adoption - as well as what vendors publishers are using - across the Top 10K US and Top 10K UK sites.

Methodology Notes

Set-Up
1. We analyze just the homepages. This will undercount usage slightly for sites that show ads on sub-pages versus the homepage itself.
2. We run the tool twice across all the sites, once with a desktop user agent and once with a mobile user agent. The listener stays on each page for 1 minute.
3. We also check the source content of the site's JS files.

Eligibility
1. We consider a site to do header bidding if it makes a call to one or more header bidding partners. We do not count sites that have header bidding code but are not actively making calls.
2. Some sites only "do" header bidding via a network's JS code (networks include Disqus, 33Across, and SRAX). This is also called post-bid. Sites that do this are included in the report.
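The eligibility rule can be sketched as a check over the network requests observed on a page. This is an illustration under assumptions, not the tool's actual code: the partner hostnames are placeholders and the function name is hypothetical.

```python
from urllib.parse import urlparse

# Placeholder hostnames standing in for real header bidding partners.
HB_PARTNER_HOSTS = {"bidder-a.example", "bidder-b.example"}


def does_header_bidding(request_urls):
    """Return True if any observed request goes to a known header bidding partner.

    Only calls that were actually made count; header bidding code that is
    present on the page but never fires produces no request and is ignored.
    """
    for url in request_urls:
        host = urlparse(url).hostname or ""
        # Match the partner host itself or any of its subdomains.
        if any(host == p or host.endswith("." + p) for p in HB_PARTNER_HOSTS):
            return True
    return False
```

A page that only loads a wrapper script from a CDN, without calling a partner, would return False under this rule.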

Client-Side Wrappers
1. A client-side wrapper is a container that holds the code for two or more header bidding adapters.
2. Individual HB tags - like Criteo's - are not considered wrappers, since they contain only their own code. These are included in the bidder breakdown, but not the wrapper breakdown.
3. When we say "Proprietary" or "Custom" wrapper, we are referring to home-grown solutions not based on Prebid.js.

Server-Side Endpoints
1. A server-side endpoint is a header bidding endpoint that pings additional exchanges server-side.
2. Any known S2S endpoints are marked as (S2S) in the report.
3. We have excluded Amazon TAM due to the difficulty of differentiating between a standard Amazon call and a server-side one.

Misc
1. For the "average bidders per site" data, the denominator includes only sites that do header bidding.
2. There will be a discrepancy between the total number of wrappers and the number of sites doing header bidding. This is because some sites use multiple wrappers (including multiple client-side and server-side wrappers).
3. Multiple domains: Some sites in our list may redirect to the same place (such as Twitter.co and Twitter.com). Other sites may have different domains for different countries (like CNN.com and CNN.gr). Due to the complexity of identifying duplicates, as well as the fact that a site with multiple country versions may use different products on different domains, the data does not de-dupe based on publisher name. Instead, we analyze adoption rate by URL and thus treat CNN.com and CNN.gr as two different sites.
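The counting rules above can be sketched as follows. The data shape (a mapping from site URL to sets of bidders and wrappers) and all names are assumptions made for illustration, and the sample values are invented.

```python
def summarize(site_data):
    """Summarize header bidding usage per URL.

    `site_data` maps a site URL to {"bidders": set, "wrappers": set}
    (a hypothetical shape). Each URL is its own row, so country variants
    of one publisher are counted separately.
    """
    # Only sites with at least one bidder enter the denominator.
    hb_sites = {url: d for url, d in site_data.items() if d["bidders"]}
    total_bidders = sum(len(d["bidders"]) for d in hb_sites.values())
    avg = total_bidders / len(hb_sites) if hb_sites else 0.0
    # A site with several wrappers contributes several wrapper instances.
    wrapper_instances = sum(len(d["wrappers"]) for d in hb_sites.values())
    return {
        "hb_sites": len(hb_sites),
        "avg_bidders_per_site": avg,
        "wrapper_instances": wrapper_instances,
    }


# Invented sample: two URLs of one publisher plus a site without header bidding.
sample = {
    "news.example": {"bidders": {"a", "b", "c"}, "wrappers": {"prebid", "custom"}},
    "news-gr.example": {"bidders": {"a"}, "wrappers": {"prebid"}},
    "wiki.example": {"bidders": set(), "wrappers": set()},
}
summary = summarize(sample)
```

In this sample, two sites do header bidding but three wrapper instances are counted, which is the kind of discrepancy described in note 2.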

Consent Management Platform (CMP) Tracker Methodology

Overview

The Adzerk CMP Tracker looks at the Top 10K US and UK sites to determine who uses Consent Management Platforms (CMPs) or other 3rd-party consent tools.

CMPs are a relatively new ad tech term and have arisen thanks to the General Data Protection Regulation. They are a way to track consent and show programmatic ads in a GDPR-compliant manner. You can read more about them in our blog post "CMPs: The Definitive Guide".

While CMPs will differ by company, the more robust ones share similar qualities, including:

1. Detect the user's location and show or suppress a consent prompt accordingly
2. Track whether the user has consented or not
3. Track what types of data the user has approved
4. Track which vendors the user has given permission to share data with
5. Based on 2-4, integrate with ad server/programmatic partners to determine which partners ads can be sourced from
6. Allow users to exercise data rights (such as having their data deleted)
7. Provide analytics on all of the above

How we built the CMP Tracker

1. We first manually built a list of URL endpoints that signify the publisher is using a CMP and which one. This list includes over 500 expressions, including the IAB URL formatting, open-source code from AppNexus and Axel Springer, WordPress plug-ins, and miscellaneous other vendors
2. Next, we pull the Top 10K US and UK sites using Amazon Alexa's API. This list is updated every 3 months to account for traffic fluctuations
3. Finally, we look at every site in the list using multiple geo IPs (France, Spain) to see if they are pinging any of the CMP endpoints and which ones
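The matching step can be sketched as a scan of each page's request URLs against the expression list. This is a minimal illustration, not the tracker's code: the host 'quantcast.mgr.consensu.org' is the example this document gives for an IAB-format endpoint, while the second expression and all function names are hypothetical.

```python
import re

# (expression, vendor label) pairs. The real list holds over 500 expressions;
# the second entry here is a made-up placeholder.
CMP_EXPRESSIONS = [
    (re.compile(r"quantcast\.mgr\.consensu\.org"), "Quantcast"),
    (re.compile(r"cmp\.vendor-b\.example/consent"), "Vendor B (hypothetical)"),
]


def detect_cmps(request_urls):
    """Return the set of CMP vendors whose expressions match any request URL."""
    found = set()
    for url in request_urls:
        for expression, vendor in CMP_EXPRESSIONS:
            if expression.search(url):
                found.add(vendor)
    return found
```

Returning a set means a site that pings the same vendor several times is still recorded once for that vendor.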

CMP Definition

While doing the research, we identified five main types of consent collection tools:

1. IAB-Registered Consent Management Platforms: these integrate with the IAB list of vendors, enable company-level consent, and in general offer more complexity than other solutions. Most are 3rd-party vendors, but some are individual publishers/media groups that wanted to certify their in-house solution

2. Other 3rd-Party Consent Tools: these are consent collection tools not registered with the IAB. They vary in complexity, with some enabling company-level consent, while others are just basic cookie notification banners, such as WordPress plugins

3. In-House Code Using an Open-Source Solution: these are pubs or media companies that built their own consent tools using an open-source solution like AppNexus or Axel Springer

4. In-House Code Using the IAB 'vendorlist' File: these are pubs or media companies that built their own consent tools using the IAB 'vendorlist' file, effectively building their own CMP using the IAB framework

5. In-House With Proprietary Code: these are pubs or media companies that built their own cookie notification bar, but which are unlikely to then pass the data downstream to ad partners. Think of these as basic "we use cookies" messages.

In the report, we track #1-#4. We exclude #5 because the goal of this report is to track 3rd-party CMP adoption, not whether sites are asking for consent at all (nearly all are). This methodology does mean that some 1st-party solutions are included if (1) they use open-source code or (2) they are registered with the IAB / use the IAB 'vendorlist' file. These two buckets account for less than 5% of all CMP usage.

Since adoption isn't at 100%, does this mean other sites aren't tracking consent?

Not at all. We are tracking 3rd-party usage, and many publishers have written their own consent-collection code. Therefore, if we say 10% of UK sites use a CMP, we aren't necessarily saying that only 10% of sites ask for cookie tracking consent; just that only 10% of sites have chosen to use a 3rd-party tool.

Methodology Notes

1. How we pull data: We scrape just the desktop homepages of the sites on the list. We run the tool once, using a French IP. The listener sits on each page for 70 seconds.

2. Multiple CMP codes: Registered IAB vendors use a specific endpoint URL like 'quantcast.mgr.consensu.org'. However, in doing our research, we found that some of these IAB vendors had other CMP codes too (likely due to building their CMP before registering with the IAB). In compiling the data, we decided to group by vendor, not by endpoint, meaning that when we say a vendor is IAB-registered, some of their instances may come from endpoints that are not in the IAB format

3. Multiple domains: Some sites in our list may redirect to the same place (such as Twitter.co and Twitter.com). Other sites may have different domains for different countries (like CNN.com and CNN.gr). Due to the complexity of identifying duplicates, as well as the fact that a site with multiple country versions may use different products on different domains, the data does not de-dupe based on publisher name. Instead, we analyze adoption rate by URL and thus treat CNN.com and CNN.gr as two different sites

4. Publishers with multiple CMPs: In a few cases (< 3%), some sites had multiple CMP codes. This means the total number of CMP users will be lower than the total instances of CMPs seen
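Notes 2 and 4 can be illustrated together: endpoints are first mapped to a vendor, then sites and CMP instances are counted separately. The endpoint-to-vendor table, the vendor names, and the sample data below are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping: one vendor reachable through an IAB-format endpoint
# and through an older, non-IAB endpoint.
ENDPOINT_TO_VENDOR = {
    "vendora.mgr.consensu.org": "VendorA",
    "cmp.vendora.example": "VendorA",
    "cmp.vendorb.example": "VendorB",
}


def cmp_counts(site_to_endpoints):
    """Return (sites using a CMP, total CMP instances, instances per vendor)."""
    per_vendor = Counter()
    users = 0
    for endpoints in site_to_endpoints.values():
        # Group by vendor, not endpoint: two endpoints of one vendor count once.
        vendors = {ENDPOINT_TO_VENDOR[e] for e in endpoints if e in ENDPOINT_TO_VENDOR}
        if vendors:
            users += 1
            per_vendor.update(vendors)
    return users, sum(per_vendor.values()), per_vendor


# Invented sample: one site with two CMPs, one with a single CMP, one with none.
sample = {
    "a.example": ["vendora.mgr.consensu.org", "cmp.vendorb.example"],
    "b.example": ["cmp.vendora.example"],
    "c.example": [],
}
users, instances, per_vendor = cmp_counts(sample)
```

Here two sites use a CMP but three instances are seen, so the user count is lower than the instance count, as note 4 describes.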

Ads.txt Tracker Methodology

The Ads.txt tracker is a way to see the adoption rates of the IAB's Ads.txt file initiative, as well as the breakdown of top direct sellers and resellers.

Unlike our CMP and HBIX trackers, this one is pretty basic - we scrape the Ads.txt file (https://www.site.com/ads.txt) of the domains in the Top 10K US and UK site list, and then parse the results. These lists were built using Amazon Alexa's API and are updated every 2-3 months.
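A minimal sketch of the scrape-and-parse step follows. It assumes the standard ads.txt record layout (ad system domain, seller account ID, DIRECT or RESELLER, optional certification authority ID) with '#' comments; the function names are hypothetical and this is not the tracker's actual code.

```python
from urllib.request import urlopen


def fetch_ads_txt(domain, timeout=10):
    """Download https://www.<domain>/ads.txt and return its text."""
    with urlopen(f"https://www.{domain}/ads.txt", timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_ads_txt(text):
    """Return a list of (ad_system_domain, account_id, seller_type) records."""
    records = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3:
            continue  # skips variable lines (e.g. contact=...) and malformed rows
        seller_type = fields[2].upper()
        if seller_type not in ("DIRECT", "RESELLER"):
            continue
        records.append((fields[0].lower(), fields[1], seller_type))
    return records


# Invented file content for illustration.
sample_text = """# ads.txt for site.example
adexchange.example, pub-123, DIRECT, abc123
reseller.example, 456, reseller # trailing comment
contact=ads@site.example
"""
sample_records = parse_ads_txt(sample_text)
```

Keeping the fetch and the parse separate lets the parser be checked against saved files without any network access.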

Some misc methodology notes include:

1. De-dupes: For a given domain, if the same vendor/seller type combo appears in multiple records, we de-dupe them

2. Geo breakdowns: Some Ads.txt files include vendor breakdown by location. This tracker doesn't break that down

3. Publishers with different domains: Some sites in our list may redirect to the same place (such as Twitter.co and Twitter.com). Other sites may have different domains for different countries (like CNN.com and CNN.gr). Due to the complexity of identifying duplicates, as well as the fact that a site with multiple country versions may use different products on different domains, the data does not de-dupe based on publisher name. Instead, we analyze adoption rate by URL and thus treat CNN.com and CNN.gr as two different sites.

4. Aggregated by Vendor: Some sellers may have more than one domain - or the domains were incorrectly written by the site - so we have aggregated by company brand name, not domain, in the vendor breakdown.

5. Ads.txt File, No Rows: If a site has an Ads.txt file but it's empty, we have NOT included it in the analysis.
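Notes 1, 4 and 5 can be sketched in one aggregation pass over already-parsed records. The domain-to-brand table, the record shape `(domain, account_id, seller_type)`, and the sample data are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical table folding several seller domains into one brand name.
DOMAIN_TO_BRAND = {
    "adexchange.example": "AdExchange",
    "adexchange-ads.example": "AdExchange",
}


def vendor_breakdown(site_records):
    """Return (number of sites analyzed, Counter of (brand, seller_type) -> sites)."""
    counts = Counter()
    analyzed = 0
    for records in site_records.values():
        if not records:
            continue  # note 5: a file with no rows is left out of the analysis
        analyzed += 1
        # Notes 1 and 4: aggregate by brand, then de-dupe vendor/seller-type
        # combos within the site by collecting them in a set.
        combos = {(DOMAIN_TO_BRAND.get(d, d), t) for d, _account, t in records}
        counts.update(combos)
    return analyzed, counts


# Invented sample data.
sample = {
    "site-a.example": [
        ("adexchange.example", "1", "DIRECT"),
        ("adexchange-ads.example", "2", "DIRECT"),
        ("adexchange.example", "3", "RESELLER"),
    ],
    "site-b.example": [("adexchange.example", "9", "DIRECT")],
    "site-c.example": [],
}
analyzed, counts = vendor_breakdown(sample)
```

In this sample, site-a.example lists the same brand as DIRECT under two domains, which counts once, and site-c.example is excluded because its file has no rows.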