We define three major values for our IVT category which follow the MRC guidelines:
IVT Category
OK: no signs of invalid traffic were found.
General Invalid Traffic (GIVT): includes known crawlers (e.g. SEO Crawlers), traffic originating from an ASN which is used for hosting/datacenter purposes, or traffic originating from software other than a browser (e.g. python requests). The major components of detection here include the IP and the User-Agent checks.
Sophisticated Invalid Traffic (SIVT): includes invalid traffic that is more difficult to detect, as both the IP and the User-Agent may have been masked to look legitimate. We use deeper forensic signals, behavioral analysis, and the correlation of several factors to identify this type of advanced abuse.
Summary of the ivt_category values
IVT Category name | ivt_category value |
OK | ok |
General Invalid Traffic (GIVT) | gi |
Sophisticated Invalid Traffic (SIVT) | si |
We further subdivide each category into subcategories (ivt_subcategory) as follows.
GIVT Subcategories
Crawler: traffic that identifies itself as a crawler at the User-Agent level. A good example of User-Agents that represent crawlers can be found here. We classify all major known crawlers (e.g. SEO, Marketing, Ads) in this category.
ivt_subcategory is "crawl"
Datacenter (short name: "dc"): traffic classified as datacenter has an IP whose origin has been identified as belonging to a known datacenter. The method used ranges from public IP ranges published by the datacenters themselves to a curated list of datacenter ASNs that we maintain internally.
ivt_subcategory is "dc"
Invalid User Agent (short name: "ua"): traffic whose User-Agent clearly does not represent a browser. This could be, for example, a web request sent from a program written in Python or Java.
ivt_subcategory is "ua"
Summary of the GIVT ivt_subcategory values
GIVT subcategory name | ivt_subcategory value |
Crawler | crawl |
Datacenter | dc |
Invalid User Agent | ua |
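The three GIVT checks above can be sketched as simple lookups on the User-Agent and the ASN. This is only an illustration under stated assumptions: the token lists, the ASN set, and the function name `givt_subcategories` are hypothetical placeholders, not the curated lists or the detection logic actually used.

```python
from typing import List, Optional

# Hypothetical placeholder data, for illustration only.
CRAWLER_TOKENS = ("googlebot", "bingbot", "ahrefsbot")
NON_BROWSER_TOKENS = ("python-requests", "java/", "curl/")
# ASNs reserved for documentation use (RFC 5398), standing in for a real list.
DATACENTER_ASNS = {64496, 64511}


def givt_subcategories(user_agent: str, asn: Optional[int]) -> List[str]:
    """Return every GIVT ivt_subcategory value triggered by one request."""
    ua = user_agent.lower()
    found = []
    if any(token in ua for token in CRAWLER_TOKENS):
        found.append("crawl")  # self-declared crawler
    if asn in DATACENTER_ASNS:
        found.append("dc")     # IP belongs to a known datacenter ASN
    if any(token in ua for token in NON_BROWSER_TOKENS):
        found.append("ua")     # User-Agent of a non-browser program
    return found
```

The function returns a list rather than a single value because one request can trigger several checks at once, for example a Python script running in a datacenter.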
SIVT Subcategories
Bot: traffic where traces of automation were found. A common example is a browser controlled through Selenium.
Spoofed Device: traffic where the User-Agent does not match the actual OS/Browser/Device. An example would be a User-Agent claiming to be an iPhone running the latest iOS while the request actually comes from a Linux virtual machine.
Geo Masking: traffic whose geolocation has been masked through the use of a VPN, proxies and/or Tor.
Suspicious IP: traffic originating from IPs that are known to have been used for cyber crimes. These are often IPs used by devices infected with malware.
Malicious referer: traffic originating from malicious referers, often related to phishing, scams, adware and malware. In the context of adfraud, these are often used as redirectors to broken traffic coming from dubious placements.
Brand unsafe referer: traffic originating from brand unsafe referers (e.g. containing sexually explicit content).
Summary of the SIVT ivt_subcategory values
SIVT subcategory name | ivt_subcategory value |
Bot | bot |
Spoofed Device | spoofed_ua |
Geo Masking | geo |
Suspicious IP | ip |
Malicious referer | ref_malicious |
Brand unsafe referer | ref_brand |
These SIVT subcategories are common to all our products. Below are additional product-specific SIVT categories we also support.
Adfraud (PPC) specific subcategories
Frequency Cap: this is an adfraud PPC-only category that flags devices clicking on ads more than the pre-configured limit within a fixed time window (default: 10 times in the last 24 hours).
Re-occurring sessions: this is an adfraud PPC-only category that flags devices clicking on ads more than the pre-configured limit (default: 15 times per 24 hours).
Summary of the SIVT ivt_subcategory values that are Adfraud-specific
SIVT subcategory name | ivt_subcategory value |
Frequency Cap | repeat |
Re-occurring Sessions | reoccuring |
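A frequency cap of this kind can be sketched as a sliding-window counter per device. The sketch below is a minimal illustration, not the product's implementation: the class name `FrequencyCap`, the `device_id` key, and the choice to flag only when the count strictly exceeds the limit are assumptions, and clicks are assumed to arrive in chronological order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FrequencyCap:
    """Hypothetical sliding-window click counter, one window per device."""

    def __init__(self, limit: int = 10, window: timedelta = timedelta(hours=24)):
        self.limit = limit      # documented default: 10 clicks
        self.window = window    # documented default: last 24 hours
        self._clicks = defaultdict(deque)  # device_id -> click timestamps

    def record_click(self, device_id: str, at: datetime) -> bool:
        """Record a click and return True if the device now exceeds the limit."""
        clicks = self._clicks[device_id]
        clicks.append(at)
        # Discard clicks that fall outside the window ending at `at`.
        while clicks and clicks[0] <= at - self.window:
            clicks.popleft()
        return len(clicks) > self.limit
```

With the default settings, the eleventh click from the same device within 24 hours is the first one flagged; the same structure covers the re-occurring sessions default by constructing `FrequencyCap(limit=15)`.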
Affiliate specific subcategories
Hijacked conversions: this is an affiliate-specific category that flags conversions stolen through click injection methods. Click injection (often referred to as cookie stuffing) often happens because the device is infected with malware/adware, and it targets specific affiliate campaigns in order to steal the attribution.
Flood conversions: this is an affiliate-specific category that flags conversions stolen through click flooding methods. Click flooding can also happen because the device is infected with malware/adware, but also through other, less sophisticated methods such as popups. Click flooding is used to steal conversions that should be attributed to organic channels.
Farmed conversions: this is an affiliate specific category that flags conversions farmed by a dedicated group of users exploiting the affiliate payout program.
Repeat conversions: this is an affiliate-specific category that flags conversions coming from the same user once they reach the pre-configured limit (default: 2 times or more within 7 days / 3 times or more within 24 hours).
Summary of the SIVT ivt_subcategory values that are Affiliate-specific
SIVT subcategory name | ivt_subcategory value |
Hijacked conversions | cv_hijack |
Flood conversions | cv_flood |
Farmed conversions | cv_farm |
Repeat conversions | cv_repeat_user |
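Taken together, the summary tables in this section define which ivt_subcategory values belong to which ivt_category value. The lookup below collects them in one place; the values come from the tables, while the names `SUBCATEGORY_TO_CATEGORY` and `category_of` are hypothetical helpers introduced only for this illustration.

```python
from typing import Optional

SUBCATEGORY_TO_CATEGORY = {
    # GIVT subcategories
    "crawl": "gi", "dc": "gi", "ua": "gi",
    # SIVT subcategories common to all products
    "bot": "si", "spoofed_ua": "si", "geo": "si",
    "ip": "si", "ref_malicious": "si", "ref_brand": "si",
    # SIVT subcategories specific to Adfraud (PPC)
    "repeat": "si", "reoccuring": "si",
    # SIVT subcategories specific to Affiliate
    "cv_hijack": "si", "cv_flood": "si", "cv_farm": "si", "cv_repeat_user": "si",
}


def category_of(ivt_subcategory: str) -> Optional[str]:
    """Return the ivt_category value for a subcategory, or None if unknown."""
    return SUBCATEGORY_TO_CATEGORY.get(ivt_subcategory)
```

A lookup like this can serve as a sanity check when consuming events, for instance to confirm that an event with ivt_subcategory "geo" carries the ivt_category "si".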
Difference between subcategory and subcategories:
We provide both a subcategory and a list of subcategories for the same event. The difference is as follows:
When an event is classified, we store all the subcategories detected on that event in the subcategories field, because these subcategories are not mutually exclusive. For example, a bot whose IP originates from a datacenter would have the subcategories "bot,dc".
The subcategory is a simplified version in which we keep only one of the triggered subcategories: the one with the highest priority. In the previous example, the subcategory would be "dc" because it has a higher priority than "bot".
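Deriving the single subcategory from the comma-separated subcategories value could look like the sketch below. The only ordering documented here is that "dc" outranks "bot"; the rest of the `PRIORITY` list, and the function name `pick_subcategory`, are assumptions made for the sake of the example.

```python
from typing import Optional

# Hypothetical priority order, highest first. Only "dc" ranking above "bot"
# is documented; the position of every other value is a placeholder.
PRIORITY = ["crawl", "dc", "ua", "bot", "spoofed_ua", "geo", "ip"]
RANK = {name: position for position, name in enumerate(PRIORITY)}


def pick_subcategory(subcategories: str) -> Optional[str]:
    """Pick the highest-priority value from a string such as "bot,dc"."""
    triggered = [s.strip() for s in subcategories.split(",") if s.strip()]
    if not triggered:
        return None
    # Values missing from PRIORITY rank last instead of raising an error.
    return min(triggered, key=lambda s: RANK.get(s, len(RANK)))
```

For the example above, `pick_subcategory("bot,dc")` returns "dc", while the full list stays available in the subcategories field for analyses that need every triggered signal.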