Network IQ: How the Largest API Threat Database Protects Your APIs

August 9, 2022 | by Zack Kaplan


Introduction

It’s Christmas and your child wants the new PS5 game console, just like everyone else. As a doting parent, you say you will do what you can. On the day the console becomes available, promptly at 9 AM, you try to buy just one using multiple browsers and mobile devices. Before you finish typing your credit card number, the inventory is sold out. You just lost out to an automated shopping bot, aptly called a Grinch Bot, which uses automation, machine speed and massive scale to buy high-demand items. The massive scale in this scenario is a network of many thousands of IP addresses, also referred to as proxies, which bot operators use to increase their chances of success and, equally important, to maintain anonymity. This blog outlines some of the different types of proxies, how threat actors use them, and how security teams can use them to uncover and block malicious transactions, possibly increasing the chance of a PS5 for your child.

What is a proxy?

A proxy is a server that receives your traffic and forwards it to the destination, so that the source address the destination sees is the proxy’s rather than the original sender’s. A legitimate use case for a proxy is a whistleblower who wants to bring a critical issue to light but wishes to remain anonymous. On the flip side, as shown in the image below, a bot will use a large pool of these proxy addresses to disguise its credential stuffing traffic, so that it appears to come from many distinct addresses instead of from a single actor. Because of this, proxies are an essential tool for any bot actor operating at scale and trying to avoid being blocked.

Network IQ - What is a Proxy
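To make the effect of a proxy pool concrete, here is a minimal sketch of a naive per-IP rate limiter facing the same traffic sent directly and sent through rotating proxies. The limit of 5 attempts, the pool size of 250, and the function name `blocked_requests` are assumptions chosen for illustration, and the addresses come from reserved documentation ranges.

```python
from collections import Counter
from itertools import cycle

MAX_ATTEMPTS_PER_IP = 5  # hypothetical per-IP login limit


def blocked_requests(source_ips):
    """Count the requests a naive per-IP limiter would reject."""
    seen = Counter()
    blocked = 0
    for ip in source_ips:
        seen[ip] += 1
        if seen[ip] > MAX_ATTEMPTS_PER_IP:
            blocked += 1
    return blocked


# 1,000 login attempts sent directly from one address.
direct = ["203.0.113.7"] * 1000

# The same 1,000 attempts rotated through a pool of 250 proxy addresses.
pool = [f"198.51.100.{i}" for i in range(1, 251)]
rotation = cycle(pool)
proxied = [next(rotation) for _ in range(1000)]

print(blocked_requests(direct))   # 995
print(blocked_requests(proxied))  # 0
```

With rotation, each proxy address carries only four attempts, so a limiter that looks at one address at a time never triggers.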

Bulletproof Proxy Services

Now that we know how essential a proxy network is for conducting a bot attack, how do bot actors get access to one? Building and maintaining a large proxy network is very costly, so it is not financially feasible for every bot actor to construct their own; doing so would make most bot use cases unprofitable. This is where some savvy providers have stepped in to offer proxies as a service. These Bulletproof proxy providers, whose name is derived from Bulletproof hosting, discovered that many actors are willing to pay for access to a proxy network and quickly jumped on the opportunity. Bulletproof hosting refers to hosting providers that advertise indifference to the content they host while still guaranteeing uptime to their customers. On the outside, this type of service might appear innocent, as it seems to be a fair and content-blind way to host sites. In practice, neither these providers nor the sites they host are innocent, and many dark web sites are hosted using this type of service.

Similarly, Bulletproof proxy providers advertise unrestricted access to websites for market research, or access from countries where traffic may otherwise be blocked. In practice, however, legitimate users would need only a handful of proxies, not the massive networks that these providers offer. Many providers advertise access to millions of different proxies, both residential and datacenter, with varying pricing plans. Examples of Bulletproof proxy vendors include the recently shuttered RSOCKS, Microleaves and SmartProxy.

Residential vs. Datacenter Proxies

Datacenter proxies generally come from large datacenters, where individuals have acquired large blocks of network infrastructure for relatively little money. The caveat is that these proxies are lower quality and come from organizations and ASNs that one would not expect general user traffic for a public-facing website to come from. Residential proxies, on the other hand, are generally higher quality and come from organizations and ASNs that one might actually expect user traffic to come from. These proxies can be anything we might have deployed in our homes: PCs, servers, cable boxes, garage door openers, and even a refrigerator. Residential proxies are more expensive than datacenter proxies because the individual IPs need to be acquired or compromised instead of purchased in bulk, and because their residential location makes the traffic appear more like that of a real user.
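The distinction above rests on which organization and ASN an address belongs to. As a rough sketch, assuming a lookup table that maps network prefixes to an ASN and an organization type, the source of an address could be labeled as follows. The table `PREFIX_INFO`, the `"hosting"` and `"isp"` labels, and the function `classify` are hypothetical; the prefixes and ASNs come from ranges reserved for documentation. Note that this only says where an address lives, not whether it is actually acting as a proxy.

```python
import ipaddress

# Hypothetical lookup table: network prefix -> (ASN, organization type).
# A real system would source this from routing and registry data.
PREFIX_INFO = {
    ipaddress.ip_network("198.51.100.0/24"): (64500, "hosting"),
    ipaddress.ip_network("203.0.113.0/24"): (64501, "isp"),
}


def classify(ip):
    """Return (asn, kind) for an address, where kind is
    'datacenter', 'residential', or 'unknown'."""
    addr = ipaddress.ip_address(ip)
    for net, (asn, org_type) in PREFIX_INFO.items():
        if addr in net:
            kind = "datacenter" if org_type == "hosting" else "residential"
            return asn, kind
    return None, "unknown"


print(classify("198.51.100.23"))  # (64500, 'datacenter')
print(classify("203.0.113.80"))   # (64501, 'residential')
print(classify("192.0.2.1"))      # (None, 'unknown')
```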

Using Bot Infrastructure as a Defense Technique

So, how can we identify bot actors based on their infrastructure? We can target the proxies they are using. To beat threats at scale, you need scale yourself. Boasting the largest collection of data on malicious infrastructure, Cequence Network IQ is part of CQAI and is home to our infrastructure threat data. Within Network IQ are several sub-systems that allow us to make practical use of its rich data. The sub-system that identifies and tags the proxies bots use every day to attack our customers is called IP Threat Score, so let’s take a look at how it works.

Sourcing IP Intelligence from Multiple Sources: IP Threat Score

The IP Threat Score utilizes and parses large amounts of threat data to identify proxy IPs the moment we see them in our product. The IP data is collected from multiple sources outlined below.

  • Passive Threat Research: Cequence has crawlers that are constantly active and searching the web for public data feeds that contain proxy or IP reputation information. They automatically populate our databases with their findings.
  • Active Threat Research: The CQ Prime Threat Research team also goes directly to the source: the Bulletproof proxy providers themselves. After obtaining access to their networks, we are able to enumerate the IPs they use and add them to our databases in an automated fashion.
  • Field Data: Finally, the third threat data feed for IP Threat Score is data directly from the field. We automatically tag and ingest proxy IPs that we see actually carrying out attacks every day. This is perhaps the most essential source of data for our system, as it reflects the infrastructure bots are really using to carry out their attacks.

Network IQ - Field Data

Importantly, the majority of the data collection is automated and allows our system to adapt to the ever-changing landscape of active bot infrastructure on the fly.
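As an illustration of how the three feeds described above might be combined, the sketch below merges them into a single mapping that records which sources reported each IP. This is not the actual Network IQ pipeline; the feed names, the function `merge_feeds`, and the sample addresses (from reserved documentation ranges) are assumptions made for the example.

```python
from collections import defaultdict


def merge_feeds(feeds):
    """Merge feeds (source name -> iterable of IP strings) into a
    mapping of IP -> set of sources that reported it."""
    sources_by_ip = defaultdict(set)
    for source, ips in feeds.items():
        for ip in ips:
            sources_by_ip[ip].add(source)
    return dict(sources_by_ip)


# Hypothetical sample data for the three feeds.
feeds = {
    "passive": ["198.51.100.10", "198.51.100.11"],
    "active": ["198.51.100.11", "203.0.113.5"],
    "field": ["198.51.100.11", "203.0.113.5", "203.0.113.9"],
}

merged = merge_feeds(feeds)
for ip, sources in sorted(merged.items()):
    print(ip, sorted(sources))
```

Keeping the set of sources per IP, rather than a flat list of addresses, lets later stages weigh an address seen in the field differently from one that only appears in a public feed.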

Putting Data to Work Blocking Bots

Rich data means nothing if we cannot extract useful information from it. For data of this scope, the key is to extract meaningful features that let us place infrastructure on a sliding scale of how bad it is, whether that infrastructure is an individual IP address, a network block, an organization, or an ASN. With these features, we can identify not only the worst offending individual IPs, but also entire network blocks, organizations, or ASNs. Using both individual and group-level criteria for blocking, we can identify groupings of proxy IPs corresponding to datacenter proxies while keeping the granularity to identify individual residential proxies. As an example, for individual attack campaigns using datacenter proxies, 98-100% of attack traffic is tagged as such by IP Threat Score, even if the individual IPs themselves are not in our threat database.
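One way to picture how traffic can be tagged even when the individual IP is not in the database is to roll known proxy IPs up into network blocks. The sketch below is an assumption-laden simplification, not the IP Threat Score algorithm: the /24 grouping, the threshold of three known IPs per block, and the names `block_of`, `bad_blocks`, and `tag` are all hypothetical.

```python
import ipaddress
from collections import Counter

BLOCK_PREFIX = 24          # hypothetical grouping size
MIN_BAD_IPS_PER_BLOCK = 3  # hypothetical threshold for flagging a block


def block_of(ip):
    """Return the network block that contains the given address."""
    return ipaddress.ip_network(f"{ip}/{BLOCK_PREFIX}", strict=False)


def bad_blocks(known_bad_ips):
    """Flag blocks that contain enough known proxy IPs."""
    counts = Counter(block_of(ip) for ip in known_bad_ips)
    return {net for net, n in counts.items() if n >= MIN_BAD_IPS_PER_BLOCK}


def tag(ip, known_bad_ips, flagged_blocks):
    """Tag an address at the most specific level available."""
    if ip in known_bad_ips:
        return "known proxy IP"
    if block_of(ip) in flagged_blocks:
        return "in flagged network block"
    return "untagged"


known_bad = {"198.51.100.7", "198.51.100.42", "198.51.100.200", "203.0.113.9"}
flagged = bad_blocks(known_bad)

print(tag("198.51.100.99", known_bad, flagged))  # in flagged network block
print(tag("203.0.113.9", known_bad, flagged))    # known proxy IP
print(tag("203.0.113.50", known_bad, flagged))   # untagged
```

The first address was never observed individually, yet it is tagged because its block is dense with known proxies, which mirrors the datacenter case. The lone address in the second block is tagged only at the individual level, which mirrors the residential case.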

Network IQ - By The Numbers

This is the power of being able to group and identify malicious infrastructure. Of course, we also have mechanisms in place to address the potential for false positives to ensure that legitimate user traffic is not blocked by our system. In practice, the infrastructure information is combined with our behavioral ML models and rules in CQAI and Bot Defense to distinguish and mitigate malicious bots.

Conclusion

Through Bulletproof Proxy providers, intelligent threat actors have access to enormous amounts of network infrastructure to use as proxies for their attacks, allowing them to avoid being easily blocked. At Cequence Security, we are in a unique position to use the massive amounts of bot data we process every day to enumerate and tag bot infrastructure with high confidence and efficacy. Identifying bot infrastructure allows us to add value on day one, with bot infrastructure being tagged out of the box for Bot Defense. When combined with our ML-based behavioral fingerprinting and mitigation capabilities, bots stand no chance.

Schedule a personalized demo to see how Network IQ and Bot Defense can help protect your APIs and web apps.

Zack Kaplan


Software Engineer
