How to Scrape a YouTube Channel with Python

Learn how to scrape YouTube channel videos using Python and Playwright. This guide covers scrolling, extracting titles, views, upload dates, and saving data as JSON—no API key required.

YouTube is one of the largest sources of public information on the internet. Creators upload millions of videos every day, and businesses rely on this data for competitor research, keyword analysis, trend monitoring, and content performance tracking.

But manually collecting video information—titles, publish dates, views, thumbnails—takes a long time. Web scraping solves this problem.

Web scraping allows you to extract data from websites automatically. Instead of opening every video one by one, a Python script can load a channel page, scroll through it, and collect everything you need in minutes.

In this tutorial, we’ll scrape a real YouTube channel:

Example channel:
https://www.youtube.com/@mr_scraper

You’ll learn how to:

  • Set up Python
  • Build a working scraper with Playwright
  • Scroll through a channel feed
  • Export video metadata into JSON

Why Scrape a YouTube Channel?

When working with data, channel metadata is extremely valuable. Scraping allows you to automatically gather:

  • Video titles
  • URLs
  • Publish dates
  • View counts
  • Thumbnails
  • Channel patterns
  • Upload frequency
  • Content categories

These insights help with:

  • Competitor research
  • Content strategy
  • SEO and keyword optimization
  • Social media analytics
  • Market intelligence
  • AI dataset creation

While YouTube provides an official API, it requires API keys, enforces quota limits, and restricts certain fields. Scraping lets you collect everything that is publicly visible on the page, with no API key or quota to manage.

Tools We’ll Use

We’ll scrape the channel using Python + Playwright, because YouTube relies heavily on JavaScript and loads its content dynamically as you scroll.

You will need:

  • Python 3
  • (Optional) Virtual environment
  • Playwright

Step 1 — Project Setup

Create a working folder:

mkdir yt-channel-scraper
cd yt-channel-scraper

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate      # Windows: venv\Scripts\activate

Install Playwright:

pip install playwright
playwright install

This installs Chromium, the browser Playwright will use to render YouTube.

Step 2 — Create the Scraper

Create a file called main.py, paste the script below, and run it with python main.py:

import asyncio
from playwright.async_api import async_playwright
import json

CHANNEL = "mr_scraper"
URL = f"https://www.youtube.com/@{CHANNEL}/videos"

async def scrape():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)  # set headless=True to run without a visible window
        page = await browser.new_page()

        print("Opening:", URL)
        await page.goto(URL, wait_until="networkidle")

        # Scroll enough to load all videos
        for _ in range(20):
            await page.evaluate("window.scrollTo(0, document.documentElement.scrollHeight)")
            await asyncio.sleep(2)

        print("Extracting videos...")

        videos = await page.evaluate("""
            () => {
                const results = [];
                const items = document.querySelectorAll("ytd-rich-grid-media");
                items.forEach(item => {
                    const title = item.querySelector("#video-title")?.innerText?.trim() || null;
                    const url = item.querySelector("#video-title-link")?.href || null;
                    const thumb = item.querySelector("ytd-thumbnail img")?.src || null;
                    const duration = item.querySelector(".yt-badge-shape__text")?.innerText?.trim() || null;

                    const metaSpans = item.querySelectorAll(".inline-metadata-item");
                    const views = metaSpans[0]?.innerText?.trim() || null;
                    const uploaded = metaSpans[1]?.innerText?.trim() || null;

                    results.push({
                        title,
                        url,
                        thumbnail: thumb,
                        duration,
                        views,
                        uploaded
                    });
                });
                return results;
            }
        """)
        
        print(f"Collected {len(videos)} videos.")
        
        with open(f"{CHANNEL}_videos.json", "w", encoding="utf-8") as f:
            json.dump(videos, f, indent=2, ensure_ascii=False)

        print("Saved:", f"{CHANNEL}_videos.json")

        await browser.close()

asyncio.run(scrape())

Step 3 — How the Script Works

A reliable scraper needs predictable logic. The process works like this:

1. Launch Chromium with Playwright

YouTube relies heavily on JavaScript for thumbnails, metadata, and infinite scroll. Playwright uses a real browser so everything loads properly.

2. Open the channel’s “Videos” tab

Example: https://www.youtube.com/@mr_scraper/videos
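If you want to point the scraper at other channels, the Videos-tab URL can be built from the handle. A tiny helper (the function name is our own, not part of the script above):

```python
def channel_videos_url(handle: str) -> str:
    """Build the Videos-tab URL from a channel handle like 'mr_scraper'."""
    return f"https://www.youtube.com/@{handle.lstrip('@')}/videos"
```

It accepts the handle with or without the leading @, so channel_videos_url("@mr_scraper") and channel_videos_url("mr_scraper") produce the same URL.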

3. Auto-scroll the page

YouTube loads videos in blocks as you scroll. The script scrolls multiple times to load additional content.
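The fixed 20-iteration loop works, but it can stop too early on large channels and waste time on small ones. A common alternative (a sketch, not part of the script above) is to keep scrolling until the page height stops growing. The helper below takes the height check and the scroll action as async callables so the stopping logic stands on its own:

```python
import asyncio

async def scroll_until_stable(get_height, scroll, max_rounds=30, settle_delay=2.0):
    """Scroll until the page height stops growing (or max_rounds is reached).

    get_height: async callable returning the current document height.
    scroll:     async callable that scrolls to the bottom of the page.
    """
    last = await get_height()
    for _ in range(max_rounds):
        await scroll()
        await asyncio.sleep(settle_delay)  # give YouTube time to load the next block
        current = await get_height()
        if current == last:  # nothing new loaded: we reached the end
            break
        last = current
    return last
```

With Playwright, get_height would be a lambda wrapping page.evaluate("document.documentElement.scrollHeight") and scroll a lambda wrapping page.evaluate("window.scrollTo(0, document.documentElement.scrollHeight)").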

4. Extract video metadata

Each video card contains:

Data         Selector
Title        #video-title
URL          #video-title-link (href)
Views        .inline-metadata-item[0]
Upload date  .inline-metadata-item[1]
Thumbnail    ytd-thumbnail img
Duration     .yt-badge-shape__text

5. Save as JSON

Example output (one entry from the JSON array):

{
  "title": "STOP. Using AI Right now",
  "url": "https://www.youtube.com/watch?v=qw4fDU18RcU",
  "views": "4M views",
  "uploaded": "3 weeks ago",
  "thumbnail": "https://i.ytimg.com/vi/...jpg",
  "duration": "12:03"
}
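The scraped fields are strings exactly as YouTube displays them. If you need numeric view counts for analysis, you can normalize strings like "4M views" after loading the JSON. A small sketch (the function name is our own, not part of the script):

```python
import re

def parse_views(text):
    """Turn a display string like '4M views' or '1.2K views' into an int."""
    if not text:
        return None
    m = re.match(r"([\d.,]+)\s*([KMB]?)", text.strip(), re.IGNORECASE)
    if not m:
        return None
    value = float(m.group(1).replace(",", ""))
    scale = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
    return int(round(value * scale[m.group(2).upper()]))
```

For example, parse_views("1.2K views") returns 1200 and parse_views("875 views") returns 875; entries that came back null simply stay None.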

Troubleshooting Tips

YouTube updates its UI often. If the scraper breaks:

  • Thumbnails return null → use img#img
  • Titles don’t match → use getAttribute("title")
  • Scrolling stops early → increase scroll loops
  • Views missing → inspect metadata structure with DevTools
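When something does break, a quick way to see which selector failed is to count null fields in the saved JSON. A sketch (assumes the output file from Step 2; the function name is our own):

```python
import json
from collections import Counter

def null_field_report(videos):
    """Count, per field, how many scraped entries came back null/empty."""
    counts = Counter()
    for video in videos:
        for field, value in video.items():
            if not value:
                counts[field] += 1
    return dict(counts)

# Example: load the file produced by the scraper and print the report.
# with open("mr_scraper_videos.json") as f:
#     print(null_field_report(json.load(f)))
```

If, say, every entry reports a null thumbnail but titles are fine, you know exactly which selector to re-inspect in DevTools.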

Final Thoughts

Scraping YouTube channels with Python is simple once you understand how the page loads dynamic content. Playwright handles JavaScript execution, scrolling, and DOM interaction, making the scraper reliable and flexible.

This method works for:

  • Channel homepages
  • Videos tab
  • Shorts tab
  • Playlists
  • Metadata collection for analysis or automation
