
Scraped Rotten Tomatoes. Built a currency converter using exchangeratesapi.io. Used the GitHub API, iTunes API, and EDAMAM API for cool and fun stuff.


vishnukanduri/Web-scraping-and-API-in-Python


Web-scraping-and-API-in-Python

This repository covers the following topics in depth:

  • How to use APIs such as the EDAMAM API, GitHub API, and iTunes API.

    • Initial setup and registration
    • Passing parameters
    • Testing invalid input
    • Investigating output
    • Structuring and exporting data
    • Sending GET and POST requests
    • Pagination
    • Extracting results from multiple pages
  • Building a currency converter using an exchange rates API.

    • Extracting data on currency exchange rates
    • Handling JSON
    • Obtaining historical exchange rates
    • Extracting data from a time period
  • How to download files with requests.

    • Naive downloading
    • Streaming the download to a file
    • Writing to a file
  • Using the Beautiful Soup library.

    • Making a GET request and soup
    • Exporting the HTML to a file
    • Searching and Navigating HTML tree
    • Extracting the text
    • Extracting data from HTML tree and nested tags
    • Searching by attributes
    • Processing links and multiple links at once
    • Scraping multiple pages automatically
  • Scraping Rotten Tomatoes

    • Choosing a parser: html.parser or lxml
    • Finding an element containing all the data
    • Extracting the title, year and score of each movie (including preprocessing and cleaning)
    • Extracting adjusted score, synopsis, critics consensus (plus 2 ways of text processing), directors and cast info
    • Representing the data in structured form and exporting the data
  • Scraping HTML tables with the help of pandas

    • Extracting tables with Beautiful Soup
    • Using Pandas to extract tables
  • Exploring the requests-html library.

    • Searching for elements and text
    • Using CSS selectors to select elements by ID, class, tag name, and other attributes
    • Combining different filters together into a compound selector
    • Incorporating tag hierarchy
    • Scraping data generated by JavaScript via asynchronous sessions
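
As a rough illustration of the currency converter topic listed above, here is a minimal sketch of the "Handling JSON" step. It is not the repository's actual code: the payload shape (`base`, `date`, `rates`), the sample numbers, and the `convert` helper are all assumptions made for the example, and only the standard library is used so it runs offline.

```python
import json

# Illustrative payload only: the shape (base / date / rates) is an assumption
# about what an exchange rates API returns, and the numbers are made up.
SAMPLE_RESPONSE = """
{"base": "EUR", "date": "2020-01-01", "rates": {"USD": 1.25, "GBP": 0.8}}
"""


def convert(amount, source, target, payload):
    """Convert `amount` from `source` to `target` (hypothetical helper).

    Rates in `payload` are assumed to be quoted against payload["base"].
    """
    rates = dict(payload["rates"])
    rates[payload["base"]] = 1.0  # the base currency is worth 1 unit of itself
    if source not in rates or target not in rates:
        raise ValueError(f"unsupported currency pair: {source} -> {target}")
    # Go through the base currency: source -> base -> target.
    return amount / rates[source] * rates[target]


payload = json.loads(SAMPLE_RESPONSE)
eur_to_usd = convert(100, "EUR", "USD", payload)
usd_to_gbp = convert(125, "USD", "GBP", payload)
print(eur_to_usd, usd_to_gbp)
```

In a live script the payload would presumably come from an HTTP response body (for example, the parsed JSON of a `requests` GET call) rather than a hard-coded string; the conversion logic stays the same either way.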