How to Use cURL with Proxy and Mrscraper API

Learn how to use cURL with a proxy to interact with the Mrscraper API for web scraping. This guide covers the step-by-step process of setting up proxies, handling authentication, and maximizing scraping efficiency with cURL.

Scraping data from websites through an API is a powerful tool for businesses and developers. Mrscraper is a service that offers several endpoints for web scraping. Some scenarios, however, require a proxy to preserve anonymity or to reach content behind geographic or network restrictions.

This article shows how to use cURL with a proxy to interact with the Mrscraper API, based on its JSON documentation.

What Is cURL?

cURL is a command-line tool for making HTTP requests. It’s often used for interacting with web servers, testing APIs, and downloading files. With cURL, you can send requests with various HTTP methods like GET, POST, PUT, DELETE, etc.

Why Use a Proxy?

Using a proxy allows you to route your request through a different server. This can help:

  • Mask your IP address.
  • Bypass geo-restrictions.
  • Enhance security and anonymity.

When combined with the Mrscraper API, a proxy lets you scrape data while keeping control over where your requests come from.

Step-by-Step Guide: Using cURL with a Proxy and Mrscraper API

Step 1: Understanding the API from mrscraper.json

The API documentation found in the mrscraper.json file defines several endpoints you can interact with. A typical endpoint might look like this:

{
  "url": "/api/v1/scrape",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer your_api_key",
    "Content-Type": "application/json"
  },
  "body": {
    "url": "https://targetwebsite.com",
    "options": {
      "headers": { "User-Agent": "custom-agent" }
    }
  }
}

This example shows a POST request to /api/v1/scrape where you provide:

  • Authorization: An API key to authenticate.
  • Content-Type: Specify that the data format is JSON.
  • URL: The website you want to scrape.
  • Options: Custom headers such as User-Agent.
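The body portion of that definition can be kept in a file of its own, which is often easier to edit than an inline string. The following is a minimal sketch: the file name body.json is an arbitrary choice for illustration, and the field names are copied from the example above rather than verified against the live API.

```shell
# Write the request body from the example above into a file.
# body.json is an arbitrary file name chosen for this sketch.
cat > body.json <<'EOF'
{
  "url": "https://targetwebsite.com",
  "options": {
    "headers": { "User-Agent": "custom-agent" }
  }
}
EOF

# cURL can later read the body from this file instead of an inline string,
# using its "@file" syntax for the data option:
#   curl ... -d @body.json
cat body.json
```

The quoted 'EOF' delimiter stops the shell from expanding anything inside the heredoc, so the JSON is written exactly as typed.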

Step 2: Setting Up the cURL Command

To send a POST request with cURL, you need to follow this basic structure:

curl -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  https://mrscraper.com/api/v1/scrape \
  -d '{ "url": "https://targetwebsite.com", "options": { "headers": { "User-Agent": "custom-agent" } } }'

In this command:

  • -H: Adds request headers such as Authorization and Content-Type.
  • -d: Sends the given data as the request body (here, a JSON document). It also makes cURL issue a POST request, so no explicit -X POST is needed.

This sends the request directly to the Mrscraper API, without a proxy.
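To avoid pasting the API key into every command, the call can be wrapped in a small shell function that reads the key from an environment variable. This is only a sketch: the function name mrscraper_scrape and the variables MRSCRAPER_API_KEY and DRY_RUN are invented for illustration and are not part of Mrscraper or cURL.

```shell
# Hypothetical wrapper around the command above. MRSCRAPER_API_KEY and
# DRY_RUN are names invented for this sketch.
mrscraper_scrape() {
  target_url=$1
  # Build the cURL argument list in the positional parameters.
  # Note: target_url is inserted as-is, so it must not contain double quotes.
  set -- \
    -H "Authorization: Bearer ${MRSCRAPER_API_KEY:?set MRSCRAPER_API_KEY first}" \
    -H "Content-Type: application/json" \
    -d "{\"url\": \"$target_url\"}" \
    https://mrscraper.com/api/v1/scrape
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' curl "$@"   # print one argument per line, send nothing
  else
    curl "$@"
  fi
}

# Preview the command without contacting the API.
MRSCRAPER_API_KEY=your_api_key DRY_RUN=1 mrscraper_scrape "https://targetwebsite.com"
```

Running the function without DRY_RUN=1 sends the real request. The dry run is useful for checking quoting before spending an API call.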

Step 3: Adding a Proxy to cURL

To use a proxy server with cURL, simply add the -x or --proxy option. Here’s how you can route the request through a proxy server:

curl -x http://proxy-server-address:port \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  https://mrscraper.com/api/v1/scrape \
  -d '{ "url": "https://targetwebsite.com", "options": { "headers": { "User-Agent": "custom-agent" } } }'

In this command:

  • -x or --proxy: Specifies the proxy server. Replace proxy-server-address with the IP or domain of your proxy, and port with the correct port (e.g., 8080).
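As an alternative to passing -x on every command, cURL also picks up proxy settings from environment variables. The sketch below sets them for the current shell session. The proxy address is the same placeholder used above, and the no_proxy list is an illustrative assumption.

```shell
# proxy-server-address:port is a placeholder, as in the command above.
# cURL reads https_proxy for https:// URLs, so no -x flag is needed.
export https_proxy="http://proxy-server-address:port"

# Hosts listed in no_proxy bypass the proxy (here: local addresses).
export no_proxy="localhost,127.0.0.1"

# Show which proxy settings are active in this shell session.
env | grep -i '_proxy=' | sort
```

A -x option given on the command line takes precedence over these variables, so individual requests can still use a different proxy.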

Step 4: Handling Proxy Authentication

If your proxy server requires authentication, you can add the -U option to pass your credentials:

curl -x http://proxy-server-address:port \
  -U proxyUsername:proxyPassword \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  https://mrscraper.com/api/v1/scrape \
  -d '{ "url": "https://targetwebsite.com", "options": { "headers": { "User-Agent": "custom-agent" } } }'

In this command:

  • -U: Adds proxy credentials, where proxyUsername and proxyPassword are your proxy login details.
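Credentials typed on the command line can end up in shell history and process listings. One possible way around that is a cURL config file passed with -K (--config). The sketch below is an illustration under assumptions: the file name proxy.curlrc is arbitrary, and the values are the placeholders from the command above.

```shell
# Hypothetical config file; the name proxy.curlrc is arbitrary.
umask 077   # files created from here on are readable by the current user only
cat > proxy.curlrc <<'EOF'
# Long cURL option names, written without the leading dashes.
proxy = "http://proxy-server-address:port"
proxy-user = "proxyUsername:proxyPassword"
EOF

# The file is then passed with -K (--config) in place of -x and -U:
#   curl -K proxy.curlrc -H "Authorization: Bearer your_api_key" ...
ls -l proxy.curlrc
```

The rest of the command (headers, endpoint, and body) stays the same as before.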

Step 5: Debugging the Request

To troubleshoot or see detailed information about your request, you can add the -v (verbose) option. This will output useful debugging information such as the connection details, request headers, and the server’s response:

curl -v -x http://proxy-server-address:port \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  https://mrscraper.com/api/v1/scrape \
  -d '{ "url": "https://targetwebsite.com", "options": { "headers": { "User-Agent": "custom-agent" } } }'

In this command:

  • -v: Prints a step-by-step breakdown of how the request is processed, which helps identify errors.
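Verbose output can be long, so it sometimes helps to look at the HTTP status code first. The helper below is a rough sketch: explain_status is a name invented for illustration, and the 401 message assumes the API rejects bad keys with that status, which this article's source documentation does not confirm. Status 407 is the standard HTTP code for "Proxy Authentication Required".

```shell
# Map an HTTP status code to a likely cause (hypothetical helper).
explain_status() {
  case $1 in
    000) echo "no HTTP response: proxy unreachable or connection failed" ;;
    2??) echo "success" ;;
    401) echo "API rejected the request: check the Authorization header" ;;
    407) echo "proxy authentication required: check the -U credentials" ;;
    *)   echo "unexpected status $1: rerun with -v for details" ;;
  esac
}

# In a real run the code would come from cURL's -w option, for example
# %{http_code} for the final response or %{http_connect} for the proxy's
# reply to the CONNECT request:
#   code=$(curl -s -o response.json -w '%{http_code}' -x ... )
explain_status 407
```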

Example: Using a Proxy to Scrape a Website via Mrscraper API

Here’s a full example that scrapes a webpage using the Mrscraper API, routes the request through a proxy, and includes proxy authentication:

curl -x http://your-proxy.com:8080 \
  -U myProxyUser:myProxyPass \
  -H "Authorization: Bearer my_api_key" \
  -H "Content-Type: application/json" \
  https://mrscraper.com/api/v1/scrape \
  -d '{ "url": "https://example.com", "options": { "headers": { "User-Agent": "Mozilla/5.0" } } }'

This command will:

  • Use a proxy (http://your-proxy.com:8080).
  • Authenticate with the proxy using myProxyUser and myProxyPass.
  • Send the API request to scrape https://example.com through Mrscraper.
  • Include a User-Agent header to mimic a browser request.
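If more than one proxy is available, the full example could be wrapped in a loop that falls back to the next proxy when a request fails. This is a sketch under assumptions, not a recommended setup: the proxy host names, the PROXIES and CURL_BIN variables, and the function name are all invented for illustration.

```shell
# Hypothetical proxy list; replace with proxies you actually control.
PROXIES="http://proxy-one.example:8080 http://proxy-two.example:8080"

# Try each proxy in turn until one request succeeds.
# CURL_BIN lets the sketch be previewed with "echo" instead of real cURL.
scrape_with_fallback() {
  for proxy in $PROXIES; do
    if "${CURL_BIN:-curl}" --fail --silent --max-time 60 -x "$proxy" \
         -U "myProxyUser:myProxyPass" \
         -H "Authorization: Bearer my_api_key" \
         -H "Content-Type: application/json" \
         -d '{ "url": "https://example.com" }' \
         https://mrscraper.com/api/v1/scrape; then
      return 0
    fi
  done
  return 1
}

# Preview: echo prints the first command and "succeeds", so the loop stops.
CURL_BIN=echo scrape_with_fallback
```

With real cURL, --fail makes HTTP errors (status 400 and above) count as failures, and --max-time caps each attempt at 60 seconds before the next proxy is tried.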

Conclusion

Combining cURL, proxies, and the Mrscraper API is an effective way to scrape data while keeping control over where your requests originate. Whether you're bypassing geographic restrictions or simply want more anonymity, routing requests through a proxy can help you get there.

By structuring your cURL requests properly and adding the right proxy settings, you can use the Mrscraper API to gather the data you need.

To deepen your cURL knowledge, especially for web scraping, check out our blog post on "Converting cURL Commands to Python for Efficient Web Scraping." It’s a great resource for transitioning from cURL commands to Python code, making your scraping tasks more efficient and scalable.

Feel free to experiment with different proxies and APIs to maximize your scraping capabilities!
