How to Use Residential Proxies to Avoid IP Bans When Scraping (Step-by-Step Guide)
A concise overview of how residential proxies help prevent IP bans by making scraper traffic appear human, and how tools like Playwright and MrScraper simplify proxy rotation, fingerprinting, and large-scale anti-bot evasion.
You've been here before. Your scraper runs beautifully for twenty minutes, then hits a wall — 403 Forbidden, a redirect to a CAPTCHA page, or just silence where data used to be. You check the logs. Your IP is banned. You restart from a new IP. Forty minutes later, same thing. Rinse, repeat, lose entire afternoons.
IP bans are the most common scraping obstacle, and they're not random. Sites ban IPs because the traffic pattern — velocity, origin, consistency — looks automated. The fix isn't just more IPs. It's the right kind of IPs, used the right way.
Here's the core answer: residential proxies work where datacenter proxies fail because they route your requests through real home internet connections — the same type of IPs real users browse from. When combined with proper rotation, request pacing, and session management, they make your scraper's traffic statistically indistinguishable from real user traffic. This guide walks through the complete setup, step by step.
Why IP Bans Happen in the First Place
Before building the solution, let's understand the problem. Sites don't ban IPs by accident. They ban them because something in the traffic pattern triggered a detection threshold. The most common triggers:
Datacenter IP origin — AWS, Google Cloud, and DigitalOcean IP ranges are classified in public databases like IPinfo and MaxMind. Any request from these ranges is pre-flagged as potential bot traffic before a single behavioral signal is evaluated.
Request velocity — Ten requests per second from one IP, with 100ms precision between each one, is not human behavior. Even if each individual request looks normal, the pattern across requests reveals automation.
No session history — Real users arrive at deep pages with cookies from previous visits, referrer headers from search engines, and session state accumulated over time. A scraper jumping cold into /products/item-12345 with zero prior session history stands out immediately.
Consistent fingerprint across many requests — The same User-Agent, same IP, same response to JavaScript challenges, same time-of-day pattern. Real users are inconsistent. Scrapers are predictable.
IP reuse after a block — When one request gets soft-blocked (challenged or rate-limited), immediately retrying from the same IP confirms it's automated. Real users don't retry a CAPTCHA page instantly.
Understanding which trigger is causing your bans determines which solution to apply. Residential proxies solve the IP origin problem definitively. The others require additional techniques — covered in the steps below.
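The velocity trigger is easy to see in numbers: what gives a bot away is not the average delay but the lack of variance. A quick illustration (self-contained, no scraping involved) comparing the coefficient of variation of fixed versus randomized inter-request gaps:

```python
import random
import statistics

def timing_variance(deltas):
    """Coefficient of variation of inter-request gaps: near 0 = robotic."""
    return statistics.stdev(deltas) / statistics.mean(deltas)

robotic = [0.1] * 50                                    # fixed 100ms spacing
human = [random.uniform(1.0, 8.0) for _ in range(50)]   # variable think time

print(f"robotic CV: {timing_variance(robotic):.3f}")    # ~0.000
print(f"human CV:   {timing_variance(human):.3f}")      # well above zero
```

A detection system watching the gap distribution per IP can separate the two with trivial statistics, which is why Step 4 below randomizes delays rather than sleeping a fixed interval.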
What Are Residential Proxies?
A residential proxy routes your request through an IP address assigned by an Internet Service Provider (ISP) to a real household device — a home router, laptop, or mobile phone on a consumer internet connection.
The critical difference from datacenter proxies: residential IPs belong to ASN ranges owned by consumer ISPs (Comcast, AT&T, Verizon, BT, Deutsche Telekom). They're the same IP type that hundreds of millions of real users browse from every day. Anti-bot systems that instantly flag AS14618 (Amazon AWS) have no equivalent flag for a Comcast residential IP — blocking it would mean blocking real customers.
Residential proxy providers build their pools by partnering with apps and services whose users opt in to share idle bandwidth. When your scraper makes a request through the pool, it literally travels through someone's home internet connection before reaching the target.
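You can check what a target site sees by asking an IP metadata service which network your traffic exits from. A minimal sketch using ipinfo.io's public JSON endpoint; the proxy credentials are placeholders for your provider's gateway:

```python
import requests

# Placeholder gateway credentials -- substitute your provider's details
PROXIES = {
    "http": "http://user:pass@residential-proxy.provider.com:8080",
    "https": "http://user:pass@residential-proxy.provider.com:8080",
}

def exit_ip_info(proxies=None):
    """Report the exit IP and owning organization as seen by the outside world."""
    return requests.get("https://ipinfo.io/json", proxies=proxies, timeout=10).json()

try:
    info = exit_ip_info()           # direct: expect your ISP's or cloud provider's ASN
    # info = exit_ip_info(PROXIES)  # through the pool: expect a consumer ISP
    print(info.get("ip"), info.get("org"))
except requests.RequestException as e:
    print(f"Lookup failed: {e}")
```

Run it once directly and once through the proxy: the `org` field should flip from your datacenter's ASN to a consumer ISP.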
For a deeper dive on residential vs. datacenter proxies, see our guide: Residential Proxy vs Datacenter Proxy for Web Scraping: What's the Difference?
Step-by-Step Guide: Using Residential Proxies to Avoid IP Bans
Step 1: Choose a Residential Proxy Provider
Not all residential proxy providers are equal. The pool quality — how fresh the IPs are, how their reputation scores look, how diverse the geographic coverage is — varies dramatically between providers. What to evaluate:
Pool size and geographic coverage — You want a provider with millions of IPs across multiple countries, cities, and carriers. Thin pools cycle through the same IPs too quickly, and those IPs accumulate bad reputation scores.
Rotation options — You need both rotating proxies (new IP per request) and sticky sessions (same IP for a configurable duration). Rotating is best for most scraping; sticky is needed for multi-step workflows that require session continuity.
IP reputation screening — Ask whether the provider pre-screens IPs for reputation. The best providers remove IPs with high fraud scores before adding them to the pool. You can verify a sample yourself using IPQualityScore.
Ethical sourcing — Check the provider's terms to understand how devices are recruited into their pool. Legitimate providers pay device owners or use opt-in SDK integrations. Avoid providers with opaque sourcing.
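Spot-checking reputation can be scripted. The sketch below assumes IPQualityScore's IP reputation endpoint and its `fraud_score`/`recent_abuse` response fields as described in their public docs; verify the exact shape against the current documentation, and note the API key and threshold are placeholders:

```python
import requests

IPQS_KEY = "YOUR_IPQS_API_KEY"  # placeholder

def looks_clean(ipqs_result, max_fraud_score=75):
    """Decide whether a sampled proxy IP is usable based on its reputation report."""
    return (
        ipqs_result.get("fraud_score", 100) < max_fraud_score
        and not ipqs_result.get("recent_abuse", False)
    )

def check_ip(ip):
    """Query IPQualityScore for one IP (endpoint shape per their docs)."""
    url = f"https://ipqualityscore.com/api/json/ip/{IPQS_KEY}/{ip}"
    return requests.get(url, timeout=10).json()

# Example report shape; real responses contain many more fields
sample = {"fraud_score": 12, "proxy": True, "recent_abuse": False}
print(looks_clean(sample))  # True: low fraud score, no recent abuse
```

Running a sample of 20 to 50 pool IPs through a check like this before a big job tells you quickly whether a provider's pool is as clean as advertised.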
MrScraper's Scraping Browser includes residential proxy rotation built into the infrastructure — if you'd rather not manage a proxy provider at all, this is covered in Step 6.
Step 2: Set Up Your Proxy Configuration in Python
Most residential proxy providers give you a gateway endpoint — a single host:port combination that handles routing and rotation automatically. Here's how to configure it for requests:
```python
import requests

# Your residential proxy provider credentials
PROXY_HOST = "residential-proxy.provider.com"
PROXY_PORT = 8080
PROXY_USER = "your_username"
PROXY_PASS = "your_password"

def get_proxy():
    """Build proxy config — most providers use a single endpoint for rotation."""
    proxy_url = f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"
    return {
        "http": proxy_url,
        "https": proxy_url,
    }

def scrape_with_residential_proxy(url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
        "Referer": "https://www.google.com/",
        "DNT": "1",
    }
    try:
        response = requests.get(
            url,
            proxies=get_proxy(),
            headers=headers,
            timeout=20,
        )
        if response.status_code == 200:
            return response.text
        elif response.status_code == 403:
            print("403 Forbidden — IP may be flagged, rotating on next request")
            return None
        elif response.status_code == 429:
            print("429 Too Many Requests — slow down")
            return None
        else:
            print(f"Unexpected status: {response.status_code}")
            return None
    except requests.exceptions.ProxyError as e:
        print(f"Proxy connection failed: {e}")
        return None
    except requests.exceptions.Timeout:
        print("Request timed out — residential proxies can be slower")
        return None

# Test on your target
result = scrape_with_residential_proxy("https://example.com/products")
if result:
    print(f"Success — got {len(result)} characters")
```
The Referer: https://www.google.com/ header is worth highlighting — it tells the server your "user" arrived from a Google search, which is the most common traffic source for real visitors. A request with no referrer header on a first visit is mildly unusual; a realistic referrer is a small but meaningful signal improvement.
Step 3: Implement Proper IP Rotation
The most common mistake with residential proxies is using the same IP for too many requests. Even a residential IP starts accumulating velocity signals after repeated use against the same target. Here's how to implement smart rotation:
```python
import requests

# Some providers support session-level rotation via username parameters.
# Check your provider's documentation for the exact syntax.
PROXY_CONFIGS = [
    {
        "username": "user-session-1",  # Each session = different IP
        "password": "your_password",
        "host": "residential-proxy.provider.com",
        "port": 8080,
    },
    {
        "username": "user-session-2",
        "password": "your_password",
        "host": "residential-proxy.provider.com",
        "port": 8080,
    },
    # Add more sessions as needed
]

class ResidentialProxyRotator:
    def __init__(self, proxy_configs, requests_per_ip=10):
        self.configs = proxy_configs
        self.requests_per_ip = requests_per_ip
        self.current_index = 0
        self.request_count = 0

    def get_current_proxy(self):
        config = self.configs[self.current_index]
        proxy_url = f"http://{config['username']}:{config['password']}@{config['host']}:{config['port']}"
        return {"http": proxy_url, "https": proxy_url}

    def rotate_if_needed(self):
        self.request_count += 1
        if self.request_count >= self.requests_per_ip:
            # Move to next proxy session
            self.current_index = (self.current_index + 1) % len(self.configs)
            self.request_count = 0
            print(f"Rotated to proxy session {self.current_index + 1}")

    def force_rotate(self):
        """Call this when you get a 403 or 429 — don't wait for the rotation schedule."""
        self.current_index = (self.current_index + 1) % len(self.configs)
        self.request_count = 0
        print(f"Force-rotated to proxy session {self.current_index + 1}")

rotator = ResidentialProxyRotator(PROXY_CONFIGS, requests_per_ip=8)

def scrape_with_rotation(url):
    rotator.rotate_if_needed()
    proxies = rotator.get_current_proxy()
    try:
        response = requests.get(
            url,
            proxies=proxies,
            headers={
                "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
            },
            timeout=15,
        )
        if response.status_code in (403, 429):
            rotator.force_rotate()  # Immediate rotation on block signals
            return None
        return response.text
    except Exception as e:
        rotator.force_rotate()
        print(f"Error: {e}")
        return None
```
The force_rotate() on 403/429 is the key detail here. Continuing to use a flagged IP after getting a block signal is how scrapers burn through IPs fast. Rotating immediately gives you a fresh identity on the next request.
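Rotating on a block is only half of the recovery story; the other half is retrying the failed URL from the new IP without retrying instantly. The sketch below is provider-agnostic: it takes any fetch function that returns None on a block signal (like scrape_with_rotation above, which has already force-rotated by the time it returns) and retries after a jittered exponential backoff:

```python
import random
import time

def scrape_with_retries(fetch, url, max_attempts=3, base_delay=2.0):
    """Retry a blocked URL from a fresh IP with jittered exponential backoff.
    fetch(url) should return the page body, or None on a block signal
    (at which point the rotator has already moved to a new IP)."""
    for attempt in range(1, max_attempts + 1):
        result = fetch(url)
        if result is not None:
            return result
        if attempt == max_attempts:
            break
        # Never retry instantly after a block: that itself looks automated
        delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
        print(f"Attempt {attempt} blocked; retrying in {delay:.1f}s")
        time.sleep(delay)
    return None
```

Used with the rotator from the example above: `scrape_with_retries(scrape_with_rotation, "https://example.com/products")`.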
Step 4: Add Request Pacing and Human-Like Delays
Even with perfect residential IPs, robotic request velocity will trigger rate limiting at the application layer — which operates independently of IP reputation checks. Here's pacing that looks human:
```python
import asyncio
import random

async def paced_scrape_loop(urls):
    results = []
    for i, url in enumerate(urls):
        # Scrape the current URL
        result = scrape_with_rotation(url)
        if result:
            results.append(result)
        # Variable delay between requests — humans don't click at fixed intervals
        base_delay = random.uniform(2.0, 5.0)
        # Occasionally pause longer — simulates reading, tabbing away, etc.
        if random.random() < 0.15:  # 15% of requests
            base_delay += random.uniform(5.0, 15.0)
            print(f"Taking a longer pause ({base_delay:.1f}s)...")
        # After every 20 requests, take a meaningful break
        if (i + 1) % 20 == 0:
            break_duration = random.uniform(30.0, 90.0)
            print(f"Extended break after {i+1} requests ({break_duration:.0f}s)...")
            await asyncio.sleep(break_duration)
        else:
            await asyncio.sleep(base_delay)
    return results
```
The 15% random long-pause probability is deliberate. Real users don't spend exactly 3.2 seconds on every page — sometimes they read for 30 seconds, sometimes they switch tabs. That randomness in timing is a behavioral signal that's hard to fake with fixed delays.
Step 5: Handle Geo-Targeting for Location-Specific Content
One of the underrated advantages of residential proxies is precise geographic targeting. Most providers let you specify country, state, and city in your proxy credentials:
```python
def get_geo_targeted_proxy(country="US", state=None, city=None):
    """
    Most residential proxy providers support geo-targeting via username parameters.
    Syntax varies by provider — check your provider's documentation.
    """
    # Example for a provider that uses username params for geo-targeting
    geo_params = f"country-{country}"
    if state:
        geo_params += f"-state-{state}"
    if city:
        geo_params += f"-city-{city.replace(' ', '_')}"
    proxy_url = f"http://{PROXY_USER}-{geo_params}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"
    return {"http": proxy_url, "https": proxy_url}

# Target US - California - San Francisco
sf_proxy = get_geo_targeted_proxy(country="US", state="california", city="San Francisco")

# Target UK - London
london_proxy = get_geo_targeted_proxy(country="GB", city="London")

response = requests.get(
    "https://example.com/local-listings",
    proxies=sf_proxy,
    headers={"User-Agent": "Mozilla/5.0 ..."},
)
```
This is particularly useful for:
- Price comparison — Many e-commerce sites serve different prices by location
- Local SEO research — Google and Bing return different SERP results per city
- Real estate scraping — Location-based listings only show to local visitors
- Retail availability — In-store availability often varies by region
Critically, when you geo-target, make sure your browser's timezone and language settings match the proxy's location. A UK proxy with an Accept-Language: en-US header is a detectable inconsistency.
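One way to keep those signals consistent is a single lookup table keyed by proxy country, so timezone, locale, and Accept-Language are always drawn from the same row and can never drift apart. A sketch with illustrative values; extend the table to the locations you actually target:

```python
# Per-country signal profiles: timezone, locale, and Accept-Language
# come from the same row, so they can't contradict each other.
GEO_PROFILES = {
    "US": {"timezone_id": "America/New_York", "locale": "en-US",
           "accept_language": "en-US,en;q=0.9"},
    "GB": {"timezone_id": "Europe/London", "locale": "en-GB",
           "accept_language": "en-GB,en;q=0.9"},
    "DE": {"timezone_id": "Europe/Berlin", "locale": "de-DE",
           "accept_language": "de-DE,de;q=0.9,en;q=0.8"},
}

def signals_for(country):
    """Return the aligned browser/header settings for a proxy country."""
    return GEO_PROFILES[country]

profile = signals_for("GB")
headers = {"Accept-Language": profile["accept_language"]}
# Pass profile["timezone_id"] and profile["locale"] to your browser context too
```

With this in place, changing the proxy country is a one-argument change and every location signal follows automatically.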
Step 6: Use Playwright with Residential Proxies for JavaScript Sites
For JavaScript-rendered sites — React, Vue, Angular — requests won't work regardless of how good your proxy is. You need a real browser. Here's how to configure Playwright with a residential proxy:
```python
from playwright.async_api import async_playwright
import asyncio

PROXY_CONFIG = {
    "server": "http://residential-proxy.provider.com:8080",
    "username": "your_username",
    "password": "your_password",
}

async def scrape_js_site_with_residential_proxy(url):
    async with async_playwright() as p:
        # Pass the proxy config to the browser launch
        browser = await p.chromium.launch(
            headless=True,
            proxy=PROXY_CONFIG,
        )
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
            viewport={"width": 1920, "height": 1080},
            locale="en-US",
            timezone_id="America/New_York",
        )
        page = await context.new_page()
        # Patch webdriver flag before the page's anti-bot scripts run
        await page.add_init_script("""
            Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
        """)
        await page.goto(url, wait_until="domcontentloaded")
        await page.wait_for_selector(".product-card", timeout=15000)
        products = await page.eval_on_selector_all(
            ".product-card",
            "els => els.map(el => el.textContent.trim())"
        )
        await browser.close()
        return products

results = asyncio.run(scrape_js_site_with_residential_proxy("https://js-site.com/products"))
print(results[:3])
```
The timezone must match your proxy's geo-location. A New York residential proxy with timezone_id="Europe/Berlin" creates a geolocation inconsistency that behavioral analysis systems flag. Always align timezone, locale, and Accept-Language with your proxy country.
Step 7: Skip All of This With MrScraper
The steps above are what managing residential proxies yourself looks like. It's doable, but it's an ongoing maintenance burden — you're buying proxy credits, managing rotation logic, aligning geolocation signals, and fixing issues when providers change their API.
MrScraper bundles residential proxy rotation into its infrastructure, so none of this setup is required. The proxy_country parameter handles geo-targeting in one line:
```python
import asyncio
from mrscraper import MrScraperClient

async def scrape_without_proxy_management():
    client = MrScraperClient(token="YOUR_MRSCRAPER_API_TOKEN")
    # Residential proxy rotation, fingerprinting, and CAPTCHA solving — all automatic
    result = await client.create_scraper(
        url="https://protected-site.com/products",
        message="Extract all product names, prices, and availability",
        agent="listing",
        proxy_country="US",  # Routes through US residential IPs automatically
    )
    print("Scraper ID:", result["data"]["data"]["id"])

asyncio.run(scrape_without_proxy_management())
```
No proxy provider account. No rotation class. No credential management. No geolocation alignment — MrScraper matches the browser timezone and language to the proxy country automatically.
Or connect your existing Playwright scraper to MrScraper's cloud browser — one line change, residential proxies included:
```python
from playwright.async_api import async_playwright
import asyncio

async def scrape_via_mrscraper_browser(url):
    async with async_playwright() as p:
        # Residential proxies, fingerprinting, CAPTCHA handling — all in the connection
        browser = await p.chromium.connect_over_cdp(
            "wss://browser.mrscraper.com?token=YOUR_API_TOKEN"
        )
        page = await browser.new_page()
        await page.goto(url, wait_until="domcontentloaded")
        await page.wait_for_selector(".product-card", timeout=15000)
        products = await page.eval_on_selector_all(
            ".product-card",
            "els => els.map(el => el.textContent.trim())"
        )
        await browser.close()
        return products

asyncio.run(scrape_via_mrscraper_browser("https://protected-site.com/products"))
```
Same Playwright code. Zero proxy infrastructure.
Common Challenges and Limitations
Residential proxies don't guarantee bypass on their own. A clean residential IP with a headless Chrome fingerprint and robotic behavior still gets flagged by advanced systems like Cloudflare Enterprise. The IP fixes the ASN reputation problem. You still need proper browser fingerprinting and behavioral randomization on top of it — which is why managed scraping browsers that bundle all three are often more effective than proxies alone.
Pool quality degrades over time. Residential IPs used heavily for scraping accumulate bad reputation scores. Good providers rotate their pools and remove flagged IPs, but no pool is infinitely clean. Monitor your success rates over time and switch providers if you see block rates climbing without explanation.
Latency is higher than datacenter proxies. Home internet connections are slower and more variable than data center infrastructure. Residential proxy requests can take 2–5 seconds where a datacenter request takes 0.5 seconds. Set your timeouts accordingly (15–20 seconds minimum) and account for the added latency in your throughput calculations.
Sticky sessions complicate rotation. Sticky sessions are useful for multi-step workflows (login → navigate → extract), but holding the same IP across many requests increases per-IP velocity. Use rotating sessions for bulk scraping and sticky sessions only when session continuity is genuinely required.
Some sites block entire ISP ranges. High-value targets that are extremely aggressive about bot prevention sometimes block known residential proxy provider IP ranges — because the same IPs show up repeatedly in their logs. If you're seeing residential proxy blocks on a specific target, try switching providers or using mobile proxies (4G/5G IPs), which are harder to identify as proxy traffic.
Conclusion
IP bans are frustrating, but they're solvable — and residential proxies are the core of the solution for any site that takes bot detection seriously. The key is using them correctly: rotating frequently, pacing requests variably, aligning geolocation signals, and pairing them with proper browser fingerprinting for JavaScript-rendered targets.
The DIY path — provider account, rotation logic, Playwright proxy config, geolocation alignment — works and gives you full control. The managed path — MrScraper's Scraping Browser with proxy_country set — skips all the infrastructure work and pairs residential proxies with fingerprinting and CAPTCHA handling automatically.
Either way, the principle is the same: make your traffic look like it comes from real people, because once it does, most IP ban triggers simply stop applying to you.
What We Learned
- IP bans are triggered by patterns, not individual requests — velocity, datacenter origin, missing session history, and consistent fingerprints across requests are the main triggers; residential proxies fix the origin problem definitively
- Residential proxies route through real ISP-assigned home connections — making them indistinguishable from regular user traffic at the network level, unlike datacenter IPs that are pre-flagged by public IP classification databases
- Force-rotate immediately on 403 or 429 responses — continuing to use a flagged IP after a block signal burns through your proxy budget and confirms automated behavior; rotate and retry from a fresh IP
- Variable delays — especially occasional long pauses — matter more than fixed sleeps — fixed intervals are detectable as automation; randomized timing with 15% probability of longer breaks mimics real browsing patterns
- Geo-targeting requires full signal alignment — proxy country, browser timezone, `Accept-Language` header, and locale must all point to the same location; mismatches are a detectable inconsistency that behavioral analysis flags
- MrScraper's `proxy_country` parameter eliminates all proxy infrastructure management — residential rotation, geolocation alignment, fingerprinting, and CAPTCHA handling are all bundled into the infrastructure rather than your codebase
FAQ
- How many requests can I make per residential IP before rotating? A safe default is 5–15 requests per IP for protected sites, rotating after each session. For sites with lighter protection, you can push to 20–30. Monitor your 403/429 rate per IP — if a specific IP starts getting flagged, rotate immediately rather than waiting for the scheduled rotation. Some scrapers rotate on every single request for maximum safety.
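That per-IP monitoring can be automated with a small counter: rotate as soon as a session's block rate crosses a threshold instead of waiting for a fixed request count. A sketch with illustrative thresholds:

```python
from collections import defaultdict

class BlockRateMonitor:
    """Track 403/429 rates per proxy session and flag sessions to retire early."""
    def __init__(self, threshold=0.2, min_requests=5):
        self.threshold = threshold        # block rate that triggers rotation
        self.min_requests = min_requests  # don't judge on tiny samples
        self.totals = defaultdict(int)
        self.blocks = defaultdict(int)

    def record(self, session_id, status_code):
        self.totals[session_id] += 1
        if status_code in (403, 429):
            self.blocks[session_id] += 1

    def should_rotate(self, session_id):
        total = self.totals[session_id]
        if total < self.min_requests:
            return False
        return self.blocks[session_id] / total >= self.threshold

monitor = BlockRateMonitor()
for code in (200, 200, 403, 200, 429, 200, 200, 403):
    monitor.record("session-1", code)
print(monitor.should_rotate("session-1"))  # True: 3 blocks in 8 requests
```

Call `record()` after every response and `should_rotate()` before the next request; when it returns True, force-rotate and start a fresh session counter.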
- Why do I still get blocked with residential proxies? Most likely one of three reasons: (1) your browser fingerprint (headless Chrome signals, `navigator.webdriver`) is giving you away even with a clean IP; (2) your request velocity is too high for that target site's rate limiting; or (3) the residential proxy provider's IPs have already been flagged by that specific site. Try pairing residential proxies with browser fingerprint patches, lower your request rate, and test a different provider's IP sample.
- Is it expensive to use residential proxies at scale? At $5–$15/GB, costs add up for data-heavy scraping. Optimize by only routing through residential proxies for requests that need them — pre-checking whether a URL requires residential access, using datacenter IPs for non-sensitive requests, and compressing where possible. Alternatively, MrScraper's plans bundle proxy usage into a predictable monthly cost rather than per-GB billing.
- What's the difference between rotating and sticky residential proxies? Rotating proxies assign a new IP for every request or every connection. Sticky proxies maintain the same IP for a configurable window (typically 1–60 minutes). Use rotating for most bulk scraping to keep per-IP velocity low. Use sticky sessions when you need to maintain a logged-in session, complete a multi-step checkout flow, or interact with a site that tracks session continuity.
- Can I use residential proxies with Scrapy? Yes — Scrapy supports proxy configuration through its `DOWNLOADER_MIDDLEWARES` setting. Set `HTTP_PROXY` or use a proxy middleware like `scrapy-rotating-proxies` to integrate residential proxy rotation. The same principles apply: rotate frequently, add realistic delays via `DOWNLOAD_DELAY` and `RANDOMIZE_DOWNLOAD_DELAY`, and match headers to the proxy's geo-location.
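As a sketch, a settings.py fragment using the third-party scrapy-rotating-proxies package might look like this; the middleware paths and priorities follow that package's README, so verify them against the current docs, and the proxy URLs are placeholders:

```python
# settings.py (fragment)

# Realistic pacing: the actual delay is randomized between
# 0.5x and 1.5x DOWNLOAD_DELAY when RANDOMIZE_DOWNLOAD_DELAY is on
DOWNLOAD_DELAY = 3
RANDOMIZE_DOWNLOAD_DELAY = True

# Residential proxy gateway sessions (placeholders)
ROTATING_PROXY_LIST = [
    "http://user-session-1:your_password@residential-proxy.provider.com:8080",
    "http://user-session-2:your_password@residential-proxy.provider.com:8080",
]

DOWNLOADER_MIDDLEWARES = {
    "rotating_proxies.middlewares.RotatingProxyMiddleware": 610,
    "rotating_proxies.middlewares.BanDetectionMiddleware": 620,
}
```

The ban-detection middleware marks IPs as dead on block signals, which gives you the same force-rotate-on-403 behavior as the hand-rolled rotator in Step 3.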