Cheapo Tumblr Backup

A scraper for Tumblr blogs that consist mostly of text.

What you need

A working Python 2.x (or 3.x) installation.
A Tumblr API key. This utility uses the consumer key (see below).
The API URL for the blog you want to scrape.
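As a rough illustration of how the consumer key and the blog's API URL fit together, here is a minimal Python 3 sketch. It assumes that "API URL" means the blog's base endpoint in the Tumblr API v2 and that posts are fetched from its /posts path with the consumer key passed as the api_key query parameter. The function names and the example blog are hypothetical and are not taken from scrape.py, so check the script for the exact URL form it expects.

```python
from urllib.parse import urlencode

# Assumed root of the Tumblr API v2 blog endpoints.
API_ROOT = "https://api.tumblr.com/v2/blog"


def blog_api_url(blog_identifier):
    """Return the assumed API base URL for a blog such as 'example.tumblr.com'."""
    return "{}/{}".format(API_ROOT, blog_identifier)


def posts_request_url(api_url, api_key, offset=0, limit=20):
    """Build a request URL for one page of posts.

    The consumer key is sent as the api_key query parameter;
    offset and limit select which page of posts is returned.
    """
    query = urlencode({"api_key": api_key, "offset": offset, "limit": limit})
    return "{}/posts?{}".format(api_url.rstrip("/"), query)
```

For example, posts_request_url(blog_api_url("example.tumblr.com"), "YOUR_CONSUMER_KEY") produces a URL for the first 20 posts of a hypothetical blog.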

Running

Use pip to install the packages listed in requirements.txt: pip install -r requirements.txt. Ideally, do this inside a virtualenv so the packages are not installed globally, where they may conflict with other Python software.
Create a config.yml file in the same directory as the scrape.py script. It should contain the keys:
- api_key: Must be a quoted string equalling the API consumer key you got from Tumblr.
- url: The API URL for the blog you want to scrape. This key is optional and can be overridden with the script's --user option.
Run the scrape.py script. It will run for a while and generate a large HTML file called posts.html containing all the text content of your posts. The content of all photo posts will be dumped into the same directory.
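The config.yml file from the steps above can be written by hand. As a sketch of what it should contain, the following Python 3 helper renders the two keys with quoted values. The names render_config and write_config are hypothetical and not part of this repository. The sketch uses json.dumps only to produce a double-quoted string, on the assumption that for typical keys and URLs this is also a valid YAML quoted string.

```python
import json


def render_config(api_key, url=None):
    """Return the text of a config.yml with a quoted api_key and an optional url."""
    # json.dumps yields a double-quoted, escaped string literal.
    lines = ["api_key: {}".format(json.dumps(api_key))]
    if url is not None:
        # url is optional; the README says --user can override it.
        lines.append("url: {}".format(json.dumps(url)))
    return "\n".join(lines) + "\n"


def write_config(path, api_key, url=None):
    """Write the rendered config to path, e.g. 'config.yml' next to scrape.py."""
    with open(path, "w") as handle:
        handle.write(render_config(api_key, url))
```

Calling write_config("config.yml", "YOUR_CONSUMER_KEY") in the directory that holds scrape.py would create a file whose only line is api_key: "YOUR_CONSUMER_KEY".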
