Scraper

A starter project for scraping similar data from multiple sources using Node, Cheerio, and Request, and saving the results in a MongoDB instance.

Prerequisites

  • Node & NPM
  • A MongoDB server instance (specify its url in config/)
  • An empty Github repo for your version of the scraper
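
The project reads its settings from config/. A minimal sketch of what a config file there might contain, assuming the scraper looks up a single MongoDB connection string (the filename, key name, and default URL below are illustrative assumptions, not confirmed by the project):

  // config/index.js (hypothetical example; match it to the actual config layout)
  module.exports = {
    // Connection string for your MongoDB server instance.
    mongoUrl: process.env.MONGO_URL || 'mongodb://localhost:27017/scraper',
  };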

Install

> git clone https://github.com/elnaz/scraper
> cd scraper
> git remote set-url origin git@github.com:YOUR_USERNAME/YOUR_SCRAPER_PROJECT.git
> git push origin master
> npm i

Usage

> npm start

Note: For legal reasons, the example source, /lib/sources/example.js, is fake, so the project won't work when you first clone it. To add your own sources, see below.

Adding a source

Let's say you need to scrape people from multiple sources. For each source:

  1. Create a file with the source's name in the /lib/sources/ directory.
  2. In /lib/sources/source-name.js (sketched after this list),
  • Define and export a URL constant pointing at the source's web page.
  • Define and export a parsePeople function that takes a Cheerio selector $, uses it to select the data you want to scrape about each person on the page, and returns an array of parsed person objects.
  3. Require the new source in the SOURCES array of /lib/index.js.
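
A minimal sketch of such a source module, assuming the page lists each person in a .person element with .name and .title children (the URL, selectors, and field names below are illustrative assumptions):

  // /lib/sources/example-site.js (hypothetical source module)
  // URL of the page this source scrapes.
  const URL = 'https://example.com/people';

  // Receives a Cheerio selector $ already loaded with the page's HTML
  // and returns an array of person objects parsed from it.
  function parsePeople($) {
    return $('.person')
      .map((i, el) => ({
        name: $(el).find('.name').text().trim(),
        title: $(el).find('.title').text().trim(),
      }))
      .get();
  }

  module.exports = { URL, parsePeople };

Registering it in /lib/index.js would then look something like this, assuming SOURCES is a plain array of required source modules:

  const SOURCES = [
    require('./sources/example-site'),
  ];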
