IVT category and sub-categories
Written by Eurico Doirado
Updated over a week ago

We define three major values for our IVT category, following the Media Rating Council (MRC) guidelines:

IVT Category

  • OK: no signs of invalid traffic were found.

  • General Invalid Traffic (GIVT): includes known crawlers (e.g. SEO crawlers), traffic originating from an ASN that is used for hosting/datacenter purposes, and traffic originating from software other than a browser (e.g. Python requests). The major components of detection here are the IP and User-Agent checks.

  • Sophisticated Invalid Traffic (SIVT): includes invalid traffic that is more difficult to detect because both the IP and the User-Agent may have been masked to look legitimate. We use deeper forensic signals, behavioral analysis, and the correlation of several factors to identify this type of advanced abuse.

Summary of the ivt_category values

IVT Category name                    | ivt_category value
OK                                   | ok
General Invalid Traffic (GIVT)       | gi
Sophisticated Invalid Traffic (SIVT) | si

We further subdivide each category into subcategories (ivt_subcategory) as follows.

GIVT Subcategories

  • Crawler (short name: "crawl"): traffic that identifies itself as a crawler at the User-Agent level. A good example of User-Agents that represent crawlers can be found here. We classify all major known crawlers (e.g. SEO, Marketing, Ads) in this category.

    • ivt_subcategory is "crawl"

  • Datacenter (short name: "dc"): traffic classified as datacenter has an IP whose origin has been identified as belonging to a known datacenter. The method used varies from public IP ranges published by the datacenters themselves to a curated list of datacenter ASNs that we maintain internally.

    • ivt_subcategory is "dc"

  • Invalid User Agent (short name: "ua"): traffic whose User-Agent clearly does not represent a browser. This could be, for example, a web request sent from a program written in Python or Java.

    • ivt_subcategory is "ua"

Summary of the GIVT ivt_subcategory values

GIVT subcategory name | ivt_subcategory value
Crawler               | crawl
Datacenter            | dc
Invalid User Agent    | ua
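
As a rough illustration of how the IP and User-Agent checks described above could combine, here is a minimal Python sketch. The token lists, the ASN set, and the function name `givt_subcategories` are hypothetical placeholders and not our actual detection rules; only the returned values ("crawl", "dc", "ua") come from this article.

```python
from __future__ import annotations

# Hypothetical, heavily simplified GIVT checks. Real detection relies on
# curated lists; the tokens and ASNs below are illustrative assumptions.
CRAWLER_TOKENS = ("bot", "crawler", "spider")
NON_BROWSER_TOKENS = ("python-requests", "java/", "curl/")
DATACENTER_ASNS = {64496, 64497}  # placeholder ASNs reserved for documentation


def givt_subcategories(user_agent: str, asn: int) -> list[str]:
    """Return the GIVT ivt_subcategory values triggered by one request."""
    ua = user_agent.lower()
    found = []
    # Crawler: the User-Agent identifies itself as a crawler.
    if any(token in ua for token in CRAWLER_TOKENS):
        found.append("crawl")
    # Datacenter: the IP belongs to an ASN on the datacenter list.
    if asn in DATACENTER_ASNS:
        found.append("dc")
    # Invalid User Agent: the User-Agent is clearly not a browser.
    if any(token in ua for token in NON_BROWSER_TOKENS):
        found.append("ua")
    return found
```

A request can trigger more than one check at once, which is why the sketch returns a list rather than a single value.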

SIVT Subcategories

  • Bot: traffic where traces of automation were found. A common example is a browser controlled through Selenium.

  • Spoofed Device: traffic where the User-Agent does not match the actual OS/Browser/Device. An example would be a User-Agent claiming to be an iPhone running the latest iOS while the request actually comes from a Linux virtual machine.

  • Geo Masking: traffic whose geographic origin has been masked through the use of a VPN, proxies and/or Tor.

  • Suspicious IP: traffic originating from IPs that are known to have been used for cyber crimes. These are often IPs used by devices infected with malware.

  • Malicious referer: traffic originating from malicious referers, often related to phishing, scams, adware and malware. In the context of adfraud, these are often used as redirectors for traffic coming from dubious placements.

  • Brand unsafe referer: traffic originating from brand unsafe referers (e.g. containing sexually explicit content).

Summary of the SIVT ivt_subcategory values

SIVT subcategory name | ivt_subcategory value
Bot                   | bot
Spoofed Device        | spoofed_ua
Geo Masking           | geo
Suspicious IP         | ip
Malicious referer     | ref_malicious
Brand unsafe referer  | ref_brand
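
When consuming these values, it can be handy to derive the ivt_category from an ivt_subcategory. The following is a minimal sketch, assuming only the GIVT values and the common SIVT values listed in this article; the function name `category_for` is hypothetical, and product-specific subcategories would need to be added to the SIVT set.

```python
# ivt_subcategory values as listed in this article, grouped by category.
GIVT_SUBCATEGORIES = {"crawl", "dc", "ua"}
SIVT_SUBCATEGORIES = {"bot", "spoofed_ua", "geo", "ip", "ref_malicious", "ref_brand"}


def category_for(subcategory: str) -> str:
    """Map an ivt_subcategory value to its ivt_category value ("gi" or "si")."""
    if subcategory in GIVT_SUBCATEGORIES:
        return "gi"
    if subcategory in SIVT_SUBCATEGORIES:
        return "si"
    # Unknown values are rejected rather than silently treated as "ok".
    raise ValueError(f"unknown ivt_subcategory: {subcategory!r}")
```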

These SIVT subcategories are common to all our products. Below are additional product-specific SIVT subcategories we also support.

Adfraud (PPC) specific subcategories

  • Frequency Cap: this is an adfraud PPC-only subcategory that flags devices clicking on ads more than the pre-configured limit within a fixed time window (default: 10 times in the last 24 hours).

  • Re-occurring sessions: this is an adfraud PPC-only subcategory that flags devices clicking on ads more than the pre-configured limit (default: 15 times per 24 hours).

Summary of the SIVT ivt_subcategory values that are Adfraud-specific

SIVT subcategory name | ivt_subcategory value
Frequency Cap         | repeat
Re-occurring sessions | reoccuring
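
To make the "limit within a time window" idea concrete, here is a minimal sliding-window sketch in Python. The class name `FrequencyCap`, the use of a device identifier string, and timestamps in seconds are all assumptions for illustration; this is not our production logic. The defaults mirror the Frequency Cap default above (10 clicks in 24 hours), and clicks are assumed to arrive in chronological order.

```python
from collections import defaultdict, deque


class FrequencyCap:
    """Hypothetical per-device click counter over a sliding time window."""

    def __init__(self, limit: int = 10, window_seconds: int = 24 * 3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clicks = defaultdict(deque)  # device_id -> click timestamps

    def record_click(self, device_id: str, timestamp: float) -> bool:
        """Record a click; return True if the device now exceeds the limit."""
        clicks = self._clicks[device_id]
        clicks.append(timestamp)
        # Discard clicks that have fallen out of the window.
        while clicks and clicks[0] <= timestamp - self.window_seconds:
            clicks.popleft()
        return len(clicks) > self.limit
```

With the default settings, the first ten clicks from a device return False and the eleventh click inside the same 24 hours returns True. A different limit, such as the 15 per 24 hours default mentioned for re-occurring sessions, can be passed through `limit`.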

Affiliate specific subcategories

  • Hijacked conversions: this is an affiliate-specific subcategory that flags conversions stolen through click injection methods. Click injection (often referred to as cookie stuffing) often happens when a device is infected with malware/adware, and it targets specific affiliate campaigns in order to steal the attribution.

  • Flood conversions: this is an affiliate-specific subcategory that flags conversions stolen through click flooding methods. Click flooding can also happen when a device is infected with malware/adware, but also through less sophisticated methods such as popups. Click flooding is used to steal conversions that should be attributed to organic channels.

  • Farmed conversions: this is an affiliate-specific subcategory that flags conversions farmed by a dedicated group of users exploiting the affiliate payout program.

  • Repeat conversions: this is an affiliate-specific subcategory that flags conversions coming from the same user once they reach the pre-configured limit (default: 2 times or more within 7 days / 3 times or more within 24 hours).

Summary of the SIVT ivt_subcategory values that are Affiliate-specific

SIVT subcategory name | ivt_subcategory value
Hijacked conversions  | cv_hijack
Flood conversions     | cv_flood
Farmed conversions    | cv_farm
Repeat conversions    | cv_repeat_user
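
The repeat conversions rule can be pictured as a count of one user's conversions inside a time window. Below is a minimal Python sketch under that assumption; the function name `is_repeat_conversion` and its parameters are hypothetical, and how the two default thresholds are combined in practice is not specified here, so the sketch evaluates one (limit, window) pair at a time.

```python
from datetime import datetime, timedelta


def is_repeat_conversion(conversion_times, limit, window):
    """Return True if the latest conversion is at least the `limit`-th
    conversion by the same user within `window` (a timedelta)."""
    if not conversion_times:
        return False
    latest = max(conversion_times)
    # Keep only conversions that fall inside the window ending at `latest`.
    recent = [t for t in conversion_times if latest - t < window]
    return len(recent) >= limit
```

For example, two conversions four days apart satisfy the "2 times or more within 7 days" default but not the "3 times or more within 24 hours" one.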

Difference between subcategory and subcategories

We provide both a subcategory and a list of subcategories for the same event. The difference is as follows:

  • when an event is classified, we store all the subcategories detected on that event in the subcategories field, because these subcategories are not mutually exclusive. For example, a bot whose IP originates from a datacenter would have the subcategories "bot,dc".

  • the subcategory is a simplified version in which we keep only one of the triggered subcategories: the one with the highest priority. In the previous example, the subcategory would be "dc" because it has higher priority than "bot".
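
The selection described above can be sketched as a lookup against a priority list. In the Python sketch below, the `PRIORITY` order and the function name `pick_subcategory` are assumptions made for illustration: the only ordering stated in this article is that "dc" outranks "bot", and every other position in the list is a guess. The sketch also assumes the subcategories field is a comma-separated string, as in the "bot,dc" example.

```python
# Hypothetical priority order, highest first. Only "dc" before "bot" is
# confirmed by this article; all other positions are assumed.
PRIORITY = [
    "crawl", "dc", "ua",
    "bot", "spoofed_ua", "geo", "ip", "ref_malicious", "ref_brand",
]


def pick_subcategory(subcategories: str) -> str:
    """Pick the highest-priority value from a comma-separated list."""
    triggered = [s.strip() for s in subcategories.split(",") if s.strip()]
    if not triggered:
        raise ValueError("no subcategories to choose from")
    # A lower index in PRIORITY means a higher priority.
    return min(triggered, key=PRIORITY.index)
```

Calling `pick_subcategory("bot,dc")` returns "dc", matching the example above.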
