1. Introduction

Web scraping is the process of extracting data from websites. In Python, we commonly use libraries like requests (to fetch web pages) and BeautifulSoup (to parse and extract information from HTML).


2. Installing Required Libraries

Before scraping, install the libraries:

pip install requests beautifulsoup4

3. Basic Web Scraping Example

import requests
from bs4 import BeautifulSoup

# Step 1: Fetch a web page
url = "https://quotes.toscrape.com/"
response = requests.get(url)

# Step 2: Parse the HTML content
soup = BeautifulSoup(response.text, "html.parser")

# Step 3: Extract quotes and authors
quotes = soup.find_all("span", class_="text")
authors = soup.find_all("small", class_="author")

for q, a in zip(quotes, authors):
    print(f"{q.get_text()} - {a.get_text()}")
Output Example:
“The world as we have created it is a process of our thinking.” - Albert Einstein
“It is our choices, Harry, that show what we truly are.” - J.K. Rowling
...

4. Commonly Used BeautifulSoup Methods

  • soup.find(tag) → Finds the first occurrence of a tag
  • soup.find_all(tag) → Finds all occurrences of a tag
  • element.get_text() → Extracts the text content inside an element
  • element['attribute'] → Gets the value of an attribute (e.g., href)
  • soup.select("css_selector") → Finds elements using CSS selectors

5. Extracting Links Example

links = soup.find_all("a")
for link in links:
    href = link.get("href")
    text = link.get_text(strip=True)
    print(f"Text: {text} -> Link: {href}")

6. Scraping with CSS Selectors

# Example: Get all quotes using CSS selectors
quotes = soup.select("span.text")
for q in quotes:
    print(q.text)
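
Selectors can also express nesting in a single string, and select_one returns just the first match (similar to find). The div.quote and a.tag selectors below assume the markup quotes.toscrape.com uses for its quote blocks.

# Tag links that appear inside quote blocks (descendant selector)
for tag_link in soup.select("div.quote a.tag"):
    print(tag_link.get_text(), "->", tag_link.get("href"))

# select_one returns only the first match, much like find()
first_author = soup.select_one("div.quote small.author")
print(first_author.get_text())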

7. Handling Pagination (Multiple Pages)
page = 1
while True:
    url = f"https://quotes.toscrape.com/page/{page}/"
    response = requests.get(url)
    
    if "No quotes found!" in response.text:
        break
    
    soup = BeautifulSoup(response.text, "html.parser")
    quotes = soup.find_all("span", class_="text")
    
    for q in quotes:
        print(q.get_text())
    
    page += 1
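
An alternative stop condition is to follow the site's own "Next" button and finish when it disappears. The li.next class below is an assumption about how quotes.toscrape.com marks that button, and the time.sleep call adds a polite pause between requests.

import time

import requests
from bs4 import BeautifulSoup

page = 1
while True:
    response = requests.get(f"https://quotes.toscrape.com/page/{page}/")
    soup = BeautifulSoup(response.text, "html.parser")

    for q in soup.find_all("span", class_="text"):
        print(q.get_text())

    # Stop when the page no longer shows a "Next" link (assumed markup: <li class="next">)
    if soup.find("li", class_="next") is None:
        break

    page += 1
    time.sleep(1)  # short delay so we don't hammer the server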

8. Best Practices for Web Scraping

  • ✅ Always check the website’s robots.txt rules.
  • ✅ Avoid overloading servers (use delays with time.sleep; a short sketch of this and the robots.txt check follows this list).
  • ✅ Consider using APIs if available instead of scraping.
  • ✅ Be respectful and ethical when scraping.
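
As a minimal sketch of the first two points: the standard library's urllib.robotparser can read a site's robots.txt before you fetch anything, and time.sleep spaces out requests. The quotes.toscrape.com URL is simply the example site used throughout this post.

import time
import urllib.robotparser

import requests

# Check robots.txt before scraping
robots = urllib.robotparser.RobotFileParser()
robots.set_url("https://quotes.toscrape.com/robots.txt")
robots.read()

url = "https://quotes.toscrape.com/"
if robots.can_fetch("*", url):       # "*" = rules that apply to any user agent
    response = requests.get(url, timeout=10)
    print(response.status_code)
    time.sleep(1)                    # pause before making the next request
else:
    print("robots.txt disallows scraping this URL")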

9. Summary

  • requests → Fetches web pages.
  • BeautifulSoup → Parses and extracts data from HTML.
  • Methods like .find(), .find_all(), .select(), and .get_text() help extract elements.
  • Pagination lets you scrape multiple pages in a loop.
  • Always follow best practices and respect website policies.
