
How to Scrape Amazon with Node.js: A Beginner-Friendly Guide

Learn how to scrape Amazon using Node.js and Puppeteer in a simple, beginner-friendly guide. This tutorial covers setup, scrolling, pagination, code examples, and tips for extracting product data safely and efficiently.

The modern web thrives on data. Companies rely on it to monitor competition, understand customer behavior, study market trends, and build smarter internal tools. Amazon, being one of the largest e-commerce platforms in the world, holds an enormous amount of product information. The problem? Manually collecting that data is slow, repetitive, and unrealistic at scale.

This is where web scraping becomes valuable. Web scraping allows a program to visit a website, read its structure, and extract the exact information you need—automatically. Instead of scrolling through pages and copying product names or prices by hand, a scraper handles the work for you in seconds. In a workflow where speed and accuracy matter, automation isn’t just helpful—it’s necessary.

In this guide, we’ll walk through how to scrape Amazon using Node.js and Puppeteer, one of the most reliable tools for browser automation. To keep things realistic, we’ll use Amazon’s Health & Beauty category as our example. It’s a crowded section with diverse items—skincare, supplements, cosmetics—making it perfect for demonstrating how a scraper handles dynamic product listings and multiple layouts.

By the end, you’ll understand how to run the script, how it works, and why each step is necessary when extracting data from Amazon’s dynamic pages.

1. Preparing Your Node.js Environment

Before writing any scraping code, you’ll need to install Node.js.

Download Node.js from:

https://nodejs.org

After installing, confirm everything works:

node --version
npm --version

If both return a version number, you’re ready.
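Both commands print a version number on success. The exact values below are only an example; yours will differ:

v20.11.0
10.2.4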

2. Setting Up Your Project

Create a new folder and initialize a Node project:

mkdir amazon-scraper
cd amazon-scraper
npm init -y

Then install Puppeteer Core:

npm install puppeteer-core

We use puppeteer-core so the script drives your system’s existing Chrome/Chromium instead of downloading a bundled browser. The trade-off is that you must tell Puppeteer where that browser lives via the executablePath launch option, as you’ll see in the code below.
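Before writing the real scraper, it’s worth sanity-checking that puppeteer-core can find your browser. Here is a minimal sketch; the executablePath is a common Linux default and is an assumption, so adjust it to wherever Chrome or Chromium is installed on your machine:

// smoke-test.js: confirm puppeteer-core can drive your local browser
const puppeteer = require("puppeteer-core");

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    executablePath: "/usr/bin/google-chrome" // assumption: adjust for your OS
  });
  const page = await browser.newPage();
  await page.goto("https://example.com");
  console.log("Page title:", await page.title());
  await browser.close();
})();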

3. Creating the Scraper File

Create a file:

touch main.js

This file will contain logic for:

  • launching Chrome/Chromium
  • navigating to Amazon
  • scrolling to load dynamic content
  • extracting product titles and prices
  • paginating through search results
  • saving results to JSON

Puppeteer is ideal here because Amazon renders much of its listing content dynamically with JavaScript, which a plain HTTP request would never see.

4. Writing the Amazon Scraper (Complete Code)

Below is the full working script:

const puppeteer = require("puppeteer-core");
const fs = require("fs");

let browser = null;
let page = null;
let all = [];

// Save scraped data
function saveJSON() {
  fs.writeFileSync("results.json", JSON.stringify(all, null, 2));
  console.log("Saved results to results.json");
}

// Graceful exit
async function finishAndExit() {
  console.log("\nFinalizing scraper...");
  console.log("Total items collected:", all.length);
  saveJSON();

  try { if (browser) await browser.close(); } catch {}
  process.exit(0);
}

process.on("SIGINT", finishAndExit);

(async () => {
  browser = await puppeteer.launch({
    headless: false,
    // puppeteer-core does not download a browser, so point this at your
    // installed Chrome/Chromium, e.g. "/usr/bin/google-chrome" on Linux or
    // "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" on macOS
    executablePath: "/usr/bin/google-chrome",
    args: [
      "--disable-http2",
      "--disable-features=IsolateOrigins,site-per-process",
      "--no-sandbox",
      "--disable-setuid-sandbox"
    ]
  });

  page = await browser.newPage();
  await page.setUserAgent(
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120 Safari/537.36"
  );

  const startUrl = "https://www.amazon.com/s?k=health+and+beauty";
  console.log("Opening:", startUrl);

  await page.goto(startUrl, { waitUntil: "networkidle2", timeout: 90000 });

  // Scroll to the bottom to trigger lazy loading, then extract titles and prices
  const scrapePage = async () => {
    await page.evaluate(async () => {
      await new Promise(resolve => {
        let total = 0;
        const distance = 400;
        const timer = setInterval(() => {
          window.scrollBy(0, distance);
          total += distance;
          if (total >= document.body.scrollHeight) {
            clearInterval(timer);
            resolve();
          }
        }, 200);
      });
    });

    // Multiple selectors cover Amazon's different result-card layouts
    return await page.$$eval(
      `
      div[data-asin][data-component-type="s-search-result"],
      div[data-asin].s-result-item,
      div.puis-card-container[data-asin]
      `,
      nodes => nodes.map(n => ({
        title:
          n.querySelector("h2 a span")?.textContent?.trim() ||
          n.querySelector("span.a-text-normal")?.textContent?.trim() ||
          null,
        price:
          n.querySelector(".a-price .a-offscreen")?.textContent?.trim() ||
          null
      }))
    );
  };

  // Safety limit: visit at most 10 result pages to avoid an endless loop
  let limit = 10;

  while (limit-- > 0) {
    console.log("\nScraping current page...");
    const items = await scrapePage();
    all.push(...items);

    // Read the current page number from Amazon's pagination bar
    const current = await page.$eval(
      "span.s-pagination-selected",
      el => parseInt(el.textContent.trim(), 10)
    ).catch(() => null);

    if (!current) break;

    const next = current + 1;

    // Build the link to the next page; stop when there isn't one
    const nextHref = await page.$eval(
      `a.s-pagination-item.s-pagination-button[aria-label="Go to page ${next}"]`,
      el => el.href
    ).catch(() => null);

    if (!nextHref) break;

    await page.goto(nextHref, { waitUntil: "networkidle2", timeout: 90000 });
  }

  await finishAndExit();
})();

Step-by-Step Explanation

Below is a summary of what each block of the script does.

Top-Level Imports

Imports Puppeteer and file system utilities.

Save Helper

Writes all scraped results into results.json.

Graceful Shutdown

Ensures that even if you press CTRL+C, your results are saved.

Browser Launch

Launches your local Chrome/Chromium with flags that improve stability under automation (disabling HTTP/2, for instance, avoids protocol errors some scrapers hit on Amazon).

User-Agent Spoofing

Helps avoid basic bot-detection issues.
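If you want to go a step beyond the user agent, Puppeteer can also set a realistic viewport and an Accept-Language header. Both are standard page methods; these lines are optional additions right after the setUserAgent call, not part of the script above:

// Optional extras after page.setUserAgent(...)
await page.setViewport({ width: 1366, height: 768 });
await page.setExtraHTTPHeaders({ "Accept-Language": "en-US,en;q=0.9" });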

Scrolling Logic

Forces Amazon to render all products by scrolling the page slowly.
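If a page occasionally yields zero items, you can also wait explicitly for at least one result card before extracting. This sketch uses Puppeteer’s standard waitForSelector and would go at the top of scrapePage:

// Wait until at least one search result card exists before extracting
await page.waitForSelector(
  'div[data-asin][data-component-type="s-search-result"]',
  { timeout: 30000 }
);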

Product Extraction

Extracts product titles and prices using multiple selectors to handle layout variations.
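The same $$eval pattern extends naturally if you need more fields. In the sketch below, the rating and image selectors are assumptions about Amazon’s current markup and may break when the layout changes:

// Hypothetical extension: also collect rating and image URL
const items = await page.$$eval(
  'div[data-asin][data-component-type="s-search-result"]',
  nodes => nodes.map(n => ({
    title: n.querySelector("h2 a span")?.textContent?.trim() || null,
    price: n.querySelector(".a-price .a-offscreen")?.textContent?.trim() || null,
    rating: n.querySelector("span.a-icon-alt")?.textContent?.trim() || null, // assumed selector
    image: n.querySelector("img.s-image")?.src || null // assumed selector
  }))
);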

Pagination

Reads the current page number and moves to the next page by URL.
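An alternative to building the "Go to page N" selector is to follow Amazon’s Next control directly. The a.s-pagination-next selector is an assumption about the current markup:

// Hypothetical alternative: follow the "Next" link inside the loop
const nextHref = await page
  .$eval("a.s-pagination-next", el => el.href)
  .catch(() => null);
if (nextHref) {
  await page.goto(nextHref, { waitUntil: "networkidle2", timeout: 90000 });
}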

Safety Limit

Prevents infinite loops.

5. Running the Scraper

Inside your project folder:

node main.js

A browser will open and the scraper will scroll, load data, and save everything into:

results.json
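The file holds an array of objects with the fields the script extracts. The values below are illustrative only:

[
  {
    "title": "Example Vitamin C Facial Serum, 1 fl oz",
    "price": "$14.99"
  },
  {
    "title": "Example Daily Multivitamin, 90 Count",
    "price": "$9.49"
  }
]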

You can stop anytime with CTRL+C — your progress is still saved.

Quick Tips & Notes

  • Use headless: false while developing so you can watch what the scraper is doing.
  • Some sites block default headless browsers; setting a realistic user agent helps avoid that.
  • For large-scale scraping, rotating proxies are recommended (see the sketch after this list).
  • Amazon may return CAPTCHAs; handle them responsibly rather than trying to force past them.
  • Always follow local laws and the website’s terms of service.
  • Save partial progress regularly when scraping multiple pages, e.g. by calling saveJSON() after each loop iteration.
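As a rough sketch of the proxy tip: Puppeteer takes the proxy as a Chrome launch flag, and page.authenticate supplies credentials when the proxy requires them. The host, port, and credentials below are placeholders:

// Hypothetical proxy setup; all values are placeholders
const browser = await puppeteer.launch({
  headless: false,
  executablePath: "/usr/bin/google-chrome", // adjust for your OS
  args: ["--proxy-server=http://proxy.example.com:8000"]
});

const page = await browser.newPage();
// Only needed when the proxy requires authentication
await page.authenticate({ username: "user", password: "pass" });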

Final Thoughts

Scraping Amazon with Node.js gives you a powerful and flexible way to collect data at scale. With Puppeteer, you can handle dynamic pages, scroll-based content, and pagination with ease.

As your project grows, you can enhance the scraper with:

  • rotating proxies
  • CAPTCHA solving
  • scheduling
  • pipelines and dashboards

But even with this beginner-friendly setup, you already have a robust scraping foundation.
