Introduction
It’s Christmas and your child wants the latest gaming console, just like everyone else. As a doting parent, you say you will do what you can. On the day the console goes on sale, promptly at 9am, you try to buy one using multiple browsers and mobile devices. Just one. Before you finish typing your credit card number, the inventory is sold out. You just lost out to an automated shopping bot, aptly called a “Grinch Bot”, which uses automation, machine speed and massive scale to buy high-demand items. The massive scale in this scenario comes from a network of many thousands of IP addresses, also referred to as proxies, which malicious bots use to increase their chances of success and, equally important, to maintain anonymity. This blog outlines some of the different types of proxies, how threat actors use them, and how security teams can use them to uncover and block malicious transactions, possibly increasing your chance of scoring that new game console for your child.
What is a proxy?
A proxy is a server that receives your traffic and re-sends it to the destination address, so that the source address the destination sees is the proxy’s address rather than the original sender’s. A potential legitimate use case for a proxy is a whistleblower who wants to bring a critical issue to light but wishes to maintain anonymity. On the flip side, as shown in the image below, a bot will use a large pool of these proxy addresses to disguise its credential stuffing traffic, so that it appears to come from many distinct addresses instead of from a single actor. Because of this, proxies are an essential tool for any threat actor operating at scale while trying to avoid being blocked.
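To make the address substitution concrete, here is a minimal Python sketch of what the destination observes when one sender rotates its requests through a proxy pool. It is a toy simulation, not real network code: the function names, the request dictionary, and the addresses (taken from the reserved documentation ranges) are all illustrative assumptions.

```python
from collections import Counter
from itertools import cycle


def forward_via_proxy(request: dict, proxy_ip: str) -> dict:
    """Return the request as the destination sees it: same content,
    but with the proxy's address as the source."""
    forwarded = dict(request)
    forwarded["source_ip"] = proxy_ip
    return forwarded


def simulate_attack(attacker_ip: str, proxy_pool: list[str], attempts: int) -> Counter:
    """Send `attempts` requests from one sender, rotating through the pool,
    and count how many requests the destination sees per source address."""
    proxies = cycle(proxy_pool)
    seen = Counter()
    for attempt in range(attempts):
        request = {"source_ip": attacker_ip, "path": "/login", "attempt": attempt}
        observed = forward_via_proxy(request, next(proxies))
        seen[observed["source_ip"]] += 1
    return seen


# Hypothetical pool of 100 proxies and a single hypothetical sender.
proxy_pool = [f"203.0.113.{i}" for i in range(1, 101)]
seen = simulate_attack("198.51.100.7", proxy_pool, attempts=1000)

print(len(seen))               # distinct source addresses seen by the destination
print(max(seen.values()))      # highest request count for any single address
print("198.51.100.7" in seen)  # whether the real sender address ever appears
```

In this sketch, 1,000 login attempts are spread evenly over 100 addresses, so a simple per-IP rate limit would see only 10 requests from each address and never see the real sender.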
Bulletproof Proxy Services
So, now that we know how essential a proxy network is for conducting a bot attack, how do threat actors get access to one? It is not financially feasible for every threat actor to build their own proxy network, as acquiring and maintaining access to a large network is very costly and would render most attacks unprofitable. This is where some savvy providers have stepped in to offer proxies as a service. These Bulletproof proxy providers, whose name is derived from Bulletproof hosting, discovered that many actors are willing to pay for access to a proxy network and quickly jumped on the opportunity. Bulletproof hosting refers to hosting providers that advertise indifference to what their customers host while still offering those customers uptime guarantees. From the outside, this type of service might appear innocent, as it seems to be a fair and content-blind way to host sites. In practice, these providers and the sites they host are not innocent, with many dark web sites hosted using this type of service.
Similarly, Bulletproof proxy providers advertise unrestricted access to websites for market research or access from countries where traffic may otherwise be blocked. In practice, however, legitimate users would only need access to a handful of proxies, not the massive networks that these providers give access to. Many providers advertise access to millions of different proxies, both residential and datacenter, with varying pricing plans. Bulletproof proxy vendor examples include SmartProxy, Oxylabs, and Bright Data.
Residential vs. Datacenter Proxies
Datacenter proxies generally come from large datacenters, where individuals have acquired large blocks of network infrastructure for relatively little money. The caveat is that these proxies are lower quality and come from organizations and Autonomous System Numbers (ASNs) that one would not expect general user traffic for a public-facing website to come from. Residential proxies, on the other hand, are generally higher quality and come from organizations and ASNs that one might actually expect user traffic to come from. These proxies can be anything we might have deployed in our homes: PCs, servers, cable boxes, garage door openers, and even a refrigerator. Residential proxies are more expensive than datacenter proxies for two reasons: the individual IPs need to be acquired or compromised one at a time instead of purchased in bulk, and their residential location makes the traffic appear more like that of a real user.
Using Bot Infrastructure as a Defense Technique
So, how can we identify bot actors based on their infrastructure? We can target the proxies they are using. In order to beat threats at scale, you need scale yourself. Boasting the largest collection of malicious infrastructure, Cequence Network IQ is home to our infrastructure threat data. Within Network IQ are several sub-systems that allow us to make practical use of its rich data. The system within Network IQ that is used to identify and tag proxies that bots are using every day to attack our customers is called IP Threat Score, so let’s take a look at how it works.
Sourcing IP Intelligence from Multiple Sources: IP Threat Score
The IP Threat Score utilizes and parses large amounts of threat data to identify proxy IPs in real time. The IP data is collected from multiple sources (outlined below) and is applied to the benefit of all Cequence customers.
- Passive Threat Research: Cequence has crawlers that are constantly active and searching the web for public data feeds that contain proxy or IP reputation information. They automatically populate our databases with their findings.
- Active Threat Research: The CQ Prime threat research team also goes directly to the source, the Bulletproof proxy providers themselves. After obtaining access to their networks, we are able to enumerate the IPs they use and add them to our databases in an automated fashion.
- Field Data: Finally, the third threat data feed for IP Threat Score is sourced directly from the field. We automatically tag and ingest proxy IPs that we see actually carrying out attacks every day. This is perhaps the most essential source of data for our system, as it reflects the infrastructure bots are really using to carry out their attacks.
Importantly, the majority of the data collection is automated and allows our system to adapt to the ever-changing landscape of active bot infrastructure on the fly.
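The three feeds described above could be combined along the following lines. This is only a sketch of the general idea of merging feeds while remembering where each IP came from; the feed names, sample addresses, and the `corroboration` helper are assumptions for illustration and do not describe how IP Threat Score is actually implemented.

```python
from collections import defaultdict


def merge_feeds(feeds: dict[str, set[str]]) -> dict[str, set[str]]:
    """Merge several feeds into one mapping of IP -> set of feeds that reported it."""
    merged: dict[str, set[str]] = defaultdict(set)
    for source, ips in feeds.items():
        for ip in ips:
            merged[ip].add(source)
    return dict(merged)


def corroboration(merged: dict[str, set[str]], ip: str) -> int:
    """Number of independent feeds that reported `ip` (0 if never seen)."""
    return len(merged.get(ip, set()))


# Hypothetical sample data using documentation address ranges.
feeds = {
    "passive": {"192.0.2.1", "192.0.2.2"},
    "active": {"192.0.2.2", "198.51.100.9"},
    "field": {"192.0.2.2", "198.51.100.9", "203.0.113.40"},
}
merged = merge_feeds(feeds)

print(corroboration(merged, "192.0.2.2"))     # reported by all three feeds
print(corroboration(merged, "203.0.113.40"))  # seen only in field data
print(corroboration(merged, "203.0.113.99"))  # never reported
```

Keeping the source tags rather than a flat set of IPs makes it possible to weight feeds differently later, for example to trust field data more than public lists.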
Putting Data to Work Blocking Bots
Rich data means nothing if we cannot extract useful information from it. For data of this scope, it is important to extract meaningful features that let us rate how bad a piece of infrastructure is on a sliding scale, whether it is an individual IP address, network block, organization, or ASN. Using the rich data we collect and the features we generate, we can identify not only the worst offending individual IPs, but also entire network blocks, organizations, or ASNs. Using both the individual and the aggregate levels as block criteria, we can identify groupings of proxy IPs corresponding to datacenter proxies while keeping the granularity to identify individual residential proxies. As an example, for individual attack campaigns using datacenter proxies, 98-100% of attack traffic is tagged as such by IP Threat Score even if the individual IPs themselves are not in our threat database.
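One way to picture how an IP that is not in the database can still be tagged is to score it by the network block it sits in. The sketch below is an assumption-laden illustration, not the IP Threat Score algorithm: it handles IPv4 only, uses a fixed /24 block size, and uses "fraction of the block already known as proxies" as a stand-in for the features the real system would generate.

```python
import ipaddress
from collections import Counter


def block_of(ip: str, prefix_len: int = 24) -> ipaddress.IPv4Network:
    """Return the IPv4 network block of the given prefix length containing `ip`."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def build_block_counts(known_proxy_ips: set[str], prefix_len: int = 24) -> Counter:
    """Count how many known proxy IPs fall into each network block."""
    return Counter(block_of(ip, prefix_len) for ip in known_proxy_ips)


def threat_score(ip: str, known_proxy_ips: set[str], block_counts: Counter,
                 prefix_len: int = 24) -> float:
    """Sliding-scale score in [0, 1]: 1.0 for a known proxy IP, otherwise
    the fraction of its block's addresses that are known proxies."""
    if ip in known_proxy_ips:
        return 1.0
    block = block_of(ip, prefix_len)
    return block_counts.get(block, 0) / block.num_addresses


# Hypothetical data: half of 192.0.2.0/24 is already known proxy infrastructure.
known = {f"192.0.2.{i}" for i in range(1, 129)}
counts = build_block_counts(known)

print(threat_score("192.0.2.5", known, counts))     # individually known IP
print(threat_score("192.0.2.200", known, counts))   # unknown IP in a heavily used block
print(threat_score("198.51.100.5", known, counts))  # unknown IP in an unseen block
```

The same aggregation could be repeated at the organization or ASN level; the block level is shown here only because it needs nothing beyond the standard library.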
This is the power of being able to group and identify malicious infrastructure. Of course, we also have mechanisms in place to address the potential for false positives to ensure that legitimate user traffic is not blocked by our system. In practice, the infrastructure information is combined with our behavioral ML models and rules in Bot Management to distinguish and mitigate malicious bots.
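To illustrate why combining signals helps with false positives, here is a deliberately simple sketch in which an infrastructure score alone is never enough to block a request. The equal weighting, the thresholds, and the three actions are invented for this example; the source does not describe how Bot Management actually combines its models and rules.

```python
def decide(infra_score: float, behavior_score: float,
           block_threshold: float = 0.8, challenge_threshold: float = 0.5) -> str:
    """Combine two scores in [0, 1] with equal (hypothetical) weights and
    map the result to an action."""
    combined = 0.5 * infra_score + 0.5 * behavior_score
    if combined >= block_threshold:
        return "block"
    if combined >= challenge_threshold:
        return "challenge"
    return "allow"


print(decide(1.0, 0.9))  # bad infrastructure and bot-like behavior
print(decide(1.0, 0.1))  # bad infrastructure but human-like behavior
print(decide(0.2, 0.1))  # nothing suspicious on either signal
```

With these assumed weights, a real user who happens to share an address with a proxy is challenged rather than blocked outright, which is one plausible way to keep legitimate traffic flowing.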
Conclusion
Intelligent threat actors have access to enormous amounts of network infrastructure to use as proxies for their attacks through Bulletproof proxy providers, allowing them to avoid being easily blocked. At Cequence Security, we are uniquely able to use the massive amount of bot data we process every day to enumerate and tag malicious bot infrastructure with high confidence and efficacy. Incorporating this information into our Bot Management product allows Cequence to add value on day one for new customers while making sure existing customers can detect emerging threats and attacks. When combined with our ML-based behavioral fingerprinting and mitigation capabilities, malicious bots stand no chance.
Schedule a personalized demo to see how Network IQ and Bot Management can help protect your applications and APIs.