Zero Trust Security in Digital Advertising
Q: What happens if you leave a loophole in online fraud detection?
A: It will surely be exploited!
In 2024, IT security has to be based on the Zero Trust security model [1]. This means: no one is trusted by default, and verification is required before trust is granted.
This security model has worked for years when dealing with untrusted files, emails, and network connections. And not just in IT, but also in real life. That's why:
All email attachments are scanned before you're allowed to open them: 100% is scanned
Your virus scanner scans every file before you're allowed to open it: 100% is scanned
The routing and header information (SPF, DKIM, DMARC) of every email is checked in order to detect and flag spam: 100% is verified
Firewalls block any unsolicited inbound connection: 100%
At US airports all passengers are security checked by the TSA: 100%
etc.
Does this zero trust security model ever fail? Yes, sometimes, and in most cases because of human error. Zero-day exploits that are not yet detected by the virus scanner are very rare, and if you're a target of those, you already know what to do. But clicking on a malicious link in an email, or opening a malicious attachment that wasn't flagged, is the typical reason it fails: the human verification failed.
Once you have clicked such an attachment or malicious link, you're screwed. The result can be cyber extortion because of a data breach, a ransomware attack, or an APT being installed [2]. This is all well known, and many companies have implemented a ton of security measures to mitigate these threats. But this protects their own infrastructure and the data within their corporate networks. Apart from state-sponsored actors, the majority of bad actors are not really interested in the stolen data itself. They are interested in its monetary value: extortion, and/or blocking access to your private or company computers and their files until you pay a hefty sum of money.
Ad fraud totals US market
In 2024 global digital ad spending will reach ~US $667 billion. At nine percent annual growth, total digital ad spending will cross the US $1 trillion mark before 2030. At that point a single percent of ad fraud represents US $10 billion. Looking at the US market, which is about 1/3rd of the global market, a single percent represents ~US $3.3 billion. If the total fraud is 20%, the monetary damage of this fraud in the US will be 20 × ~US $3.3B ≈ US $66 billion.
That’s a lot of money! And this implies a lot of responsibility for the vendors fighting this type of fraud.
Q: How many successful ransomware installs and payments are required to reach US $66 billion?
A: At an average of $300 per device, that's 220 million devices and payments!
So potential ad-fraud profits are way bigger than ransomware profits, and the money is less distributed. This raises the question: does this mean that the money spent by brands and/or advertisers is protected better?
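The back-of-the-envelope math above can be checked in a few lines of Python. All inputs are the article's own rough estimates, not exact market data:

```python
# Back-of-the-envelope check of the ad-fraud figures above.
# All inputs are the article's rough estimates, not exact market data.

global_spend = 1_000_000_000_000       # ~US $1 trillion projected global ad spend
one_percent_global = global_spend * 0.01   # US $10 billion per fraud percent
one_percent_us = 3.3e9                 # US market ~ 1/3rd of global: ~$3.3B per percent
us_fraud_damage = 20 * one_percent_us  # 20% total fraud -> ~US $66 billion

ransom_avg = 300                       # average ransomware payment per device
devices = us_fraud_damage / ransom_avg

print(f"1% of global spend:  ${one_percent_global / 1e9:.0f}B")
print(f"20% fraud in the US: ${us_fraud_damage / 1e9:.0f}B")
print(f"Equivalent ransomware payments: {devices / 1e6:.0f} million devices")
```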
Zero-trust in ad-tech?
One of the protections is ads.txt, where ads is an acronym for Authorized Digital Sellers [3][4][5]. If you want to know how such a file looks: go to the location bar in your browser, type in the domain name, a slash, and ads.txt. For example: https://www.wsj.com/ads.txt or https://www.wsj.com/app-ads.txt.
Who enforces these files? The DSPs (Google DV360, The Trade Desk, Centro, etc.) load these text files from a publisher's domain and only serve ads from authorized monetization partners on those domains. In theory this should protect advertisers from MFA sites and other shady websites, but look at the current situation: how did that work out? Even if you have implemented whitelists of allowed domains, fraudsters still find a way around them.
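A minimal sketch of what a DSP does first: fetch a publisher's ads.txt and parse the authorized-seller records. This uses only the Python standard library, and error handling is kept minimal on purpose:

```python
# Minimal sketch: fetch and parse a publisher's ads.txt records
# (ad system domain, seller account id, relationship).
import urllib.request

def parse_ads_txt(text: str) -> list[dict]:
    """Parse ads.txt lines into authorized-seller records."""
    records = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or "=" in line:            # skip blanks and variables (contact=, ...)
            continue
        fields = [f.strip() for f in line.split(",")]
        if len(fields) >= 3:
            records.append({
                "ad_system": fields[0],        # e.g. google.com
                "seller_id": fields[1],        # publisher's account id at that system
                "relationship": fields[2].upper(),  # DIRECT or RESELLER
            })
    return records

def fetch_ads_txt(domain: str) -> list[dict]:
    with urllib.request.urlopen(f"https://{domain}/ads.txt", timeout=10) as resp:
        return parse_ads_txt(resp.read().decode("utf-8", errors="replace"))

# Example: fetch_ads_txt("www.wsj.com")
```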
But that's the supply side. What about the demand side? The demand side consists of the millions of browsers, apps, and connected TVs downloading a webpage with ad slots requesting ads, or playing a video requesting an in-stream ad. To understand what happens under the hood, let's first look at the lifecycle of a typical advertisement in programmatic. Its lifecycle has two logical stages: 1) pre-bid and 2) post-bid.
Pre-bid
The pre-bid stage requests an advertisement. This means the app, connected TV, or browser sends a request to the ad-tech infrastructure, which starts an auction to determine whether an ad is available for your geo, within the min/max bid price, your device, your targeting (your cookies), etc. If the bid is won, an advertisement is returned and the post-bid stage starts.
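To make this concrete, here is an illustrative pre-bid request in the OpenRTB style. Field names follow the OpenRTB convention, but this is a simplified sketch of my own, not a complete or validated request:

```python
# Illustrative, simplified bid request in the OpenRTB style.
# Note: everything the auction knows about the client (ua, ip, user id)
# is self-reported by that client.
import json
import uuid

bid_request = {
    "id": str(uuid.uuid4()),
    "imp": [{                          # one impression = one ad slot
        "id": "1",
        "banner": {"w": 300, "h": 250},
        "bidfloor": 0.50,              # minimum bid price (USD CPM)
    }],
    "device": {
        "ua": "Mozilla/5.0 ...",       # self-reported user agent
        "ip": "203.0.113.7",           # used for the geo lookup
    },
    "user": {"id": "cookie-or-device-id"},  # targeting
}

print(json.dumps(bid_request, indent=2))
```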
Post-bid
The post-bid stage loads the advertisement in the ad slot. If applicable it also measures viewability, detects ad fraud, and fires completion pixels.
Figure 2 shows the (simplified) lifecycle of an advertisement in programmatic as seen from the browser [7]. Each stage is a contact moment between the browser and the ad tech infrastructure. In a zero-trust environment each stage needs to revalidate whether the browser is really the browser it claims to be. And, if not: Failure, bot, or something else you don’t want to pay for!
How is this implemented in pre-bid?
In pre-bid this is achieved by looking at the IP address and the user agent. The IP address roughly tells you where the request comes from: if you're a US insurance company licensed in 30 states, you only want your ads served in those 30 states. The user agent tells you what kind of device is requesting the ad: if the user agent is a bot, e.g. HeadlessChrome, bingbot, or Applebot, you don't want your ads served. But, unfortunately, browsers are able to set their own user agent, and proxy servers can be used to fake the geo-location. That means these two precautions don't fit the zero-trust model.
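How easy is it to spoof? The user agent is just a self-reported request header. In this sketch (the URL is a placeholder, not a real ad endpoint), a plain Python script claims to be desktop Chrome on Windows:

```python
# Sketch: the user agent is just a self-reported request header.
# This request claims to be desktop Chrome on Windows, regardless of
# what actually sends it. The URL is a placeholder, not a real endpoint.
import urllib.request

req = urllib.request.Request(
    "https://ads.example.com/request",   # hypothetical ad endpoint
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/120.0.0.0 Safari/537.36",
    },
)
# urllib.request.urlopen(req) would now arrive looking like desktop Chrome;
# routed through a residential proxy, the IP-based geo check is fooled too.
print(req.get_header("User-agent"))
```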
How is this implemented in post-bid?
The typical setup is that a piece of JavaScript code is sent along with the advertisement. Once loaded in the browser, it verifies whether the advertisement was seen within the viewport (viewability) and whether the request was made and viewed by a human or a bot. On Connected TV no JavaScript can be executed; only completion pixels can be fired. Whether this fits the zero-trust model can be read below.
What would the fraudster do?
A few weeks ago I posted this article: ‘How to make money using fake browsers’ [6]. This article describes how you can successfully request an advertisement using ~100 lines of Python. One of the images in that article shows the returned advertisement including the post-bid IAS verification loader JavaScript.
Figure 3 shows the pre-bid response from the ad server. This response contains a field named content, which holds the content to be loaded in the ad slot. The content field in Figure 3 is highlighted in red and starts with "content". It also contains the IAS post-bid verification tag loader. This small piece of JavaScript starts just before the green marked text, after the word <script>, and ends just after the yellow marked text, before the </script> tag. Its purpose is to load the real verification JavaScript, which is hosted at IAS.
IAS ad-tech: 99% trust?
Looking at the IAS verification tag loader JavaScript code in Figure 3, the green label contains the following code:
if (Math.random() * 100 < 1 && …)
This means that only 1% of the browsers will load the verification tag; 99% of the browsers will not verify the advertisements. This is called sampling. But the decision is made at the client, in the browser! This means the bot decides whether it loads the verification script or not. The bot decides whether to be in the 99% trusted group. With this implementation you don't know whether the bot didn't load the verification code ON PURPOSE, or whether the random number was >= 1.
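Translated to Python, the flaw in the client-side decision looks like this (a sketch of the logic, not IAS's actual code):

```python
# Sketch of client-side sampling (not IAS's actual code): the 1% decision
# runs on the client, so a bot can simply refuse to take the branch.
import random

def honest_client_loads_verification() -> bool:
    # Equivalent of: if (Math.random() * 100 < 1 && ...)
    return random.random() * 100 < 1

def bot_loads_verification() -> bool:
    # The bot controls its own runtime, so it always "rolls" itself
    # into the 99% unverified group.
    return False

# From the server's point of view, a missing verification callback is
# indistinguishable from an honest client that drew a number >= 1.
```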
To give IAS some free advice: if you still want to stick with sampling at 1%, then please do it well! Move the decision about which ad is verified and which is not to the server side. In this new scenario the server selects which ad is verified: not the browser, not the bot! If a bot still doesn't load the verification tag and thus nothing is returned, that means the advertisement was not loaded, not viewed, and shouldn't be paid for.
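A server-side version of the same 1% sampling could look like this sketch (the function names and the hash-bucket scheme are my own illustration):

```python
# Sketch of server-side sampling: the server, not the client, decides which
# impressions must be verified. The hash-bucket scheme is illustrative.
import hashlib

SAMPLE_RATE = 0.01  # verify 1% of impressions

def server_selects_for_verification(impression_id: str) -> bool:
    """Deterministically map the impression id to a bucket in [0, 1)."""
    digest = hashlib.sha256(impression_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < SAMPLE_RATE

def settle(impression_id: str, callback_received: bool) -> str:
    if not server_selects_for_verification(impression_id):
        return "not sampled: pay"
    if callback_received:
        return "sampled and verified: pay"
    # The server expected a callback and got none: the ad was not loaded
    # or not viewed, so it should not be paid for.
    return "sampled, no callback: do NOT pay"
```

Because the server records its own selection, a bot that silently skips the verification tag now produces a visible gap instead of blending into the 99%.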
Other post-bid vendors sample as well, Human for example. The difference is that the decision about which client is verified lies on the server side, which is fine if the decision is really random. As far as I know, DoubleVerify doesn't sample.
Now what?
Is every pre-bid request verified? That depends on the ad-tech infrastructure. Some vendors have partnerships in order to verify all pre-bid requests; others might do the verification themselves. In post-bid it all depends on the sophistication of the detection (browsers can spoof properties) and on how well the callback containing the collected data is protected (bots are able to capture and rewrite the returned payload). It also depends on whether sampling decisions are made server side (the correct way) or client side (wrong!). But how well does the detection solution actually work? At this moment we just don't know, because no regular independent security audits are being done to verify this.
Pre-bid
More technical checks need to be done, and CAN be done, in order to verify pre-bid requests. IP address and user agent alone are just not enough, as both can be spoofed.
Post-bid
To verify and detect invalid traffic, better technology needs to be implemented. You'll never detect and flag all invalid traffic, but that's not the goal. The goal of better detection is to make it more expensive for bad actors to operate. When bad actors need to continuously improve their technology and have to route their traffic over expensive residential proxies, operating becomes much more expensive. The end goal is to make it economically unfeasible to operate. The first step towards this goal is to close all blind spots and loopholes. Unfortunately, at this moment huge blind spots and many loopholes still exist.
The second step is to improve the detection technology itself. This technology exists, but you do have to use it. For example, the current naive detection technology reads many browser properties but takes them at face value. To improve this, your technology needs to be ramped up. If you want to know more about this, Oxford Biochronometrics can help.
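To make the "face value" problem concrete, here is a hypothetical cross-check: instead of trusting any single property, it flags combinations that cannot occur in a genuine browser. The property names mirror common navigator/header fields, but the rules are illustrative only, not any vendor's actual detection:

```python
# Hypothetical sketch: cross-check browser-reported properties instead of
# taking any single one at face value. Rules are illustrative only.

def consistency_flags(props: dict) -> list[str]:
    flags = []
    ua = props.get("user_agent", "")
    # A UA claiming Chrome should come with a matching client-hints brand.
    if "Chrome" in ua and "Chrome" not in props.get("sec_ch_ua", ""):
        flags.append("ua/client-hints mismatch")
    # Automation environments often advertise themselves outright.
    if "HeadlessChrome" in ua or props.get("webdriver") is True:
        flags.append("automation marker")
    # A claimed Windows desktop with zero screen width is impossible.
    if "Windows" in ua and props.get("screen_width", 0) == 0:
        flags.append("impossible screen geometry")
    return flags

honest = {"user_agent": "Mozilla/5.0 (Windows NT 10.0) ... Chrome/120 ...",
          "sec_ch_ua": '"Chromium";v="120", "Google Chrome";v="120"',
          "webdriver": False, "screen_width": 1920}
bot = {"user_agent": "Mozilla/5.0 (Windows NT 10.0) ... Chrome/120 ...",
       "sec_ch_ua": "", "webdriver": True, "screen_width": 0}

print(consistency_flags(honest))  # no flags
print(consistency_flags(bot))     # three flags
```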
Red team audit
Brands and/or advertisers: do you want to know how well your digital spend is protected against invalid traffic and/or sophisticated invalid traffic, aka fraud? To find out how well your ad-tech infrastructure protects its inventory, and thus your spend, you can run an audit. The idea behind such an audit comes from the red team/blue team approach [8]. The bots represent the red team, whose goal is to demonstrate which attacks get past the defenders. The verification vendors represent the blue team and must defend against these real or simulated attacks.
The red team unleashes a number of different bots, varying in level of sophistication. Based on the returned advertisement, or a blank HTTP 204 response, you know whether a bot was detected in pre-bid. The same can be done for post-bid: if the detailed logs contain these bots and show them as human, the bots were not detected. The blue team (the verification vendors) tries to detect the bots. Because the bots differ in sophistication, some might be detected and some not; from the subset of detected bots you can determine how well the blue team did its work. Of course, if (server-side) sampling is used, you only have to look at the sample.
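The pre-bid half of the red team's scorecard is mechanical. A sketch of the classification, using the response semantics described above (HTTP 204 or an empty body = no ad returned):

```python
# Sketch of the red team's pre-bid scoring: classify each bot's ad request
# by the response it gets (HTTP 204 / empty body = no ad returned).

def classify_prebid(status_code: int, body: bytes) -> str:
    if status_code == 204 or not body:
        return "no ad: bot blocked (or simply no demand)"
    return "ad returned: bot NOT detected pre-bid"

# Post-bid works in reverse: if the vendor's detailed logs later show the
# same session as 'human', the post-bid detection failed as well.
```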
The purpose of this audit is NOT to share technology on how to detect bots. Acquiring and improving knowledge is up to the blue team. They’ll need to hire better staff, build a research lab and run internal blue/red simulations within their lab to improve their bot detection technology.
Knowing fraudsters, we know that if any blind spot or loophole exists, they will find it! To prevent this, your verification vendor will have to adopt the zero trust security model and verify everything. Just like the TSA, your virus scanner, and your firewall: 100%!
Questions? Feel free to connect, comment and/or DM.
#adfraud #digitalmarketing #prebid #frauddetection #CMO
[1] https://en.wikipedia.org/wiki/Zero_trust_security_model
[2] https://en.wikipedia.org/wiki/Advanced_persistent_threat
[3] https://iabtechlab.com/ads-txt/
[4] https://support.google.com/admanager/answer/7441288?hl=en
[5] https://support.google.com/admanager/answer/9422067?hl=en
[6] https://www.linkedin.com/pulse/how-make-money-using-fake-browsers-sander-kouwenhoven-esgce
[7] https://support.google.com/admanager/answer/7128958?
[8] https://csrc.nist.gov/glossary/term/red_team_blue_team_approach