
clarkbk/archive-org-scraper


Purpose

Scrape a Webshots user's photos from the Internet Archive (archive.org).

Images are saved to an /output directory inside the project, with one folder per Webshots album. Each folder also contains a text file with that album's metadata.
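The layout described above could be produced along these lines. This is a minimal sketch, not the project's actual code: the function name `save_album`, the shapes of `images` and `metadata`, the `metadata.txt` file name, and the JSON encoding are all assumptions made for illustration.

```python
import json
from pathlib import Path


def save_album(output_dir, album_name, images, metadata):
    """Write one album folder containing its images and a metadata text file.

    images:   dict mapping file name -> raw image bytes (hypothetical shape)
    metadata: dict of album details (hypothetical shape)
    """
    album_dir = Path(output_dir) / album_name
    # Create /output/<album> if it does not exist yet.
    album_dir.mkdir(parents=True, exist_ok=True)

    for filename, data in images.items():
        (album_dir / filename).write_bytes(data)

    # File name and JSON format are assumptions; the README only says
    # "a text file with album metadata".
    metadata_path = album_dir / "metadata.txt"
    metadata_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return album_dir
```

For example, `save_album("output", "holiday-2004", {"001.jpg": b"..."}, {"title": "Holiday 2004"})` would create `output/holiday-2004/001.jpg` and `output/holiday-2004/metadata.txt`.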

Configuration

Install requirements and set environment variables.

In the command line (mkvirtualenv is provided by virtualenvwrapper):

$ mkvirtualenv archive-org-scraper
$ pip install -r requirements.txt
$ touch .env

In .env:

source ~/.virtualenvs/archive-org-scraper/bin/activate
export WEBSHOTS_USER=…
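On the Python side, the exported variable can be read with the standard library. The sketch below is an assumption about how a script might consume `WEBSHOTS_USER`, not a copy of what run.py does; the helper name `get_webshots_user` is hypothetical.

```python
import os


def get_webshots_user(env=None):
    """Return the configured Webshots username, or raise a clear error.

    env defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    user = env.get("WEBSHOTS_USER", "").strip()
    if not user:
        # Fail early with a hint instead of scraping with an empty username.
        raise RuntimeError("WEBSHOTS_USER is not set; source .env before running")
    return user
```

Failing early here gives a more useful message than a later error from a request built with an empty username.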

Use

Source the required environment variables from .env, then run the scraper:

$ python run.py
