Static Web Content Scraping with Requests and Beautiful Soup

For scraping static websites, Requests and Beautiful Soup are my go-to libraries.

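Note that this approach only sees the HTML the server returns. If the data you want is loaded dynamically, for example rendered by JavaScript or fetched from an API after the page loads, it won't be in that HTML and this method won't work.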
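One quick way to tell whether a page is static enough for this approach is to check whether your selector matches anything in the raw HTML. The sketch below is only an illustration: `has_static_match` is a hypothetical helper, and both HTML strings are made-up samples rather than real pages.

```python
from bs4 import BeautifulSoup


def has_static_match(html: str, selector: str) -> bool:
    """Return True if `selector` matches an element in the raw HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(selector) is not None


# Hypothetical markup: the first page ships its content in the HTML,
# the second only ships an empty container that a script fills in later.
static_page = '<div id="app"><h1 class="avatar__title">Joshua</h1></div>'
dynamic_page = '<div id="app"></div><script src="bundle.js"></script>'

print(has_static_match(static_page, ".avatar__title"))   # True
print(has_static_match(dynamic_page, ".avatar__title"))  # False
```

If the check fails for a selector you can see in the browser's dev tools, the element is probably being added after the page loads.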

import requests
from bs4 import BeautifulSoup

url = "https://konekoya.github.io"

response = requests.get(url, timeout=10)
response.raise_for_status()  # Fail early on 4xx/5xx responses
html_content = response.text
soup = BeautifulSoup(html_content, "html.parser")

# We can use CSS selectors
el = soup.select_one(".avatar__title")
print(el.get_text())  # Joshua

# Or look an element up by its attributes (attrs must be a dict, not a set)
img = soup.find("img", {"class": "avatar__img"})
print(img["alt"]) # Joshua's Picture
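Both `select_one` and `find` return `None` when nothing matches, so calling `.get_text()` or indexing the result directly raises an error on pages that don't have the element. Here is a minimal defensive sketch. `extract_avatar` is a hypothetical helper, and the sample markup is made up to mirror the class names used above.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_avatar(html: str) -> dict[str, Optional[str]]:
    """Pull the avatar title and image alt text, using None for anything missing."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.select_one(".avatar__title")
    img = soup.find("img", class_="avatar__img")
    return {
        "title": title.get_text(strip=True) if title is not None else None,
        # Tag.get() returns None instead of raising when the attribute is absent
        "alt": img.get("alt") if img is not None else None,
    }


# Hypothetical sample markup, not the real page
sample = '<h1 class="avatar__title"> Joshua </h1><img class="avatar__img" alt="Joshua\'s Picture">'

print(extract_avatar(sample))
print(extract_avatar("<p>No avatar here</p>"))
```

The first call returns both values and the second returns `None` for each, so the caller can decide how to handle a missing element instead of hitting an exception.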

More example code can be found in the official docs.