How to Web Scrape a Table in Python: From Static HTML to Dynamic Pages
Web scraping tables is one of the most practical ways to collect structured data from websites—whether it’s financial statistics, sports results, academic records, or product lists. In this guide, we’ll explore how to web scrape a table in Python, using both simple and advanced methods, with examples tailored to real-world use cases.
1. The Quick Way: Using pandas.read_html()
The easiest method for scraping tables is with pandas.read_html(), which automatically detects and converts HTML tables into Pandas DataFrames.
```python
import pandas as pd

url = "https://en.wikipedia.org/wiki/Demographics_of_India"

# read_html returns a list of DataFrames, one per matching <table>
tables = pd.read_html(url, match="Population distribution")
df = tables[0]
print(df.head())
```
- This method uses BeautifulSoup and lxml under the hood.
- The `match` parameter helps target a specific table.

Pros: Extremely fast and simple. Cons: Only works on static HTML tables.
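Once the table is in a DataFrame, exporting it takes one line; the filename here is just an example:

```python
# Save the scraped table for later analysis (filename is arbitrary)
df.to_csv("india_population.csv", index=False)
```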
2. More Control: BeautifulSoup + Requests
If you need finer control or want to clean the data during extraction, combining requests with BeautifulSoup is a reliable approach.
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "https://datatables.net/examples/styling/stripe.html"
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors

soup = BeautifulSoup(resp.text, "html.parser")
table = soup.find("table", class_="stripe")

# Collect one list of cell values per body row
rows = []
for tr in table.tbody.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    rows.append(cells)

# Use the <th> elements in the header as column names
headers = [th.get_text(strip=True) for th in table.thead.find_all("th")]
df = pd.DataFrame(rows, columns=headers)
print(df.head())
```
This is helpful when:
- The table is nested inside custom HTML structures.
- You want to customize how rows and columns are parsed, as sketched below.
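As a minimal sketch of that second point, cell values can be normalized while they are extracted. The `clean_cell` helper and the currency formatting it assumes are hypothetical, and `table` is the element found in the example above:

```python
def clean_cell(text: str) -> str:
    # Hypothetical cleanup: trim whitespace, drop a "$" prefix and thousands separators
    return text.strip().lstrip("$").replace(",", "")

rows = []
for tr in table.tbody.find_all("tr"):
    rows.append([clean_cell(td.get_text()) for td in tr.find_all("td")])
```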
3. Scraping Dynamic Tables with Selenium
If the table is loaded dynamically using JavaScript (AJAX), then a static HTML parser won’t work. In this case, you can use Selenium to load and render the page as a browser would.
```python
from io import StringIO

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import pandas as pd

driver = webdriver.Chrome()  # Selenium 4 fetches a matching ChromeDriver automatically
driver.get("https://example.com/dynamic_table")

# Wait until the JavaScript-rendered table is present in the DOM
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "myTable")))

soup = BeautifulSoup(driver.page_source, "html.parser")
table = soup.find("table", id="myTable")

# StringIO avoids pandas' deprecation warning for literal-HTML input
df = pd.read_html(StringIO(str(table)))[0]
driver.quit()
print(df.head())
```
Pros: Can handle JavaScript-heavy websites. Cons: Slower, requires browser drivers like ChromeDriver.
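If you don't need to watch the browser, Chrome can also run headless, which is lighter on resources. A minimal sketch using Selenium's standard options API (the `--headless=new` flag applies to recent Chrome versions):

```python
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")  # run Chrome without opening a window
driver = webdriver.Chrome(options=options)
```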
4. Accessing Hidden APIs Behind Tables
Sometimes the table content is not hardcoded into the HTML but fetched from an API in the background. This is actually a more efficient way to extract data:
- Open DevTools → Network → XHR/Fetch
- Locate the API URL used to load table data
- Use `requests.get()` to retrieve the JSON data
```python
import requests
import pandas as pd

api = "https://www.levantineceramics.org/vessels/datatable.json"
data = requests.get(api, timeout=10).json()

# DataTables-style endpoints return the rows under the "data" key
df = pd.DataFrame(data["data"])
print(df.head())
```
Pros: Fast and clean. Cons: Requires inspecting the site’s network calls.
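Many DataTables-style endpoints also accept paging parameters. The `start` and `length` names below follow that common convention, but they are an assumption here; confirm the real parameter names in the DevTools request:

```python
# Hypothetical paging parameters; verify the actual names in DevTools
params = {"start": 0, "length": 100}
data = requests.get(api, params=params, timeout=10).json()
```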
5. Scalable Scraping with Scrapy
If you're building a large-scale scraper or need asynchronous performance, Scrapy is a powerful Python framework for crawling and extracting data.
```python
import scrapy

class TableSpider(scrapy.Spider):
    name = "table_spider"
    start_urls = ["https://example.com/page_with_table"]

    def parse(self, response):
        # Yield one item per table row; header-only rows yield None values
        for row in response.xpath('//table//tr'):
            yield {
                'column1': row.xpath('td[1]/text()').get(),
                'column2': row.xpath('td[2]/text()').get(),
            }
```
Pros: Great for multiple pages, built-in pipelines. Cons: More complex setup and learning curve.
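To try the spider, save it as `table_spider.py` and run `scrapy runspider table_spider.py -o rows.csv`. It can also be launched from a plain Python script; here is a minimal sketch using Scrapy's CrawlerProcess:

```python
from scrapy.crawler import CrawlerProcess

# Run TableSpider in-process and export the yielded items to CSV
process = CrawlerProcess(settings={"FEEDS": {"rows.csv": {"format": "csv"}}})
process.crawl(TableSpider)
process.start()  # blocks until the crawl finishes
```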
Comparison Table
| Need | Method | Pros | Cons |
|---|---|---|---|
| Simple HTML tables | `pandas.read_html()` | Fast and beginner-friendly | Only works on static content |
| Custom structure | BeautifulSoup + requests | High control, clean data | More code required |
| JavaScript tables | Selenium | Can render dynamic content | Slower, heavier setup |
| Background API | Direct API request | Fast and efficient | Requires DevTools inspection |
| Large-scale scraping | Scrapy | Scalable and async | Advanced setup |
Responsible Scraping
Before scraping, always:
- Check `robots.txt` and the site’s Terms of Service
- Use rate limiting to avoid overloading the server
- Add headers like `User-Agent` to mimic a browser
- Use proxies or headless browsing to avoid blocks
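A minimal sketch of the rate-limiting and header advice, assuming a generic requests-based scraper; the URLs, User-Agent string, and delay are placeholders:

```python
import time
import requests

headers = {"User-Agent": "Mozilla/5.0 (compatible; table-scraper/1.0)"}
urls = ["https://example.com/page1", "https://example.com/page2"]  # placeholders

for url in urls:
    resp = requests.get(url, headers=headers, timeout=10)
    # ... parse resp.text here ...
    time.sleep(2)  # pause between requests so the server isn't overloaded
```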
No-Code Scraping with MrScraper
If coding isn’t your thing—or you need to extract tables from difficult or protected websites—use MrScraper.
MrScraper is a visual, AI-powered web scraping tool that makes it easy to:
- Extract tables with just a few clicks
- Scrape JavaScript-rendered pages
- Export to CSV or JSON
- Use proxy rotation and CAPTCHA bypass automatically
Whether you're scraping product lists, public records, or movie data, MrScraper handles the hard part for you—no code required.
Conclusion
Learning how to web scrape a table in Python opens up a world of possibilities for data analysis, automation, and research. Whether you’re scraping a static table from Wikipedia or a dynamic one from an e-commerce site, Python offers flexible tools to make the job easier.
And for those who want the simplest, most efficient solution, MrScraper helps you collect structured data from any website—without touching a line of code.
Ready to scrape your first table? Try MrScraper today.