Crawler: a simple web crawler | SpiderWeb

Crawler is a simple crawl mechanism with no major optimisations. I built it in about 5 hours; sorry I couldn't invest more time due to my ongoing work. That said, there are plenty of optimisations I can make if needed.

Stack

  • Backend is built with Django (just the APIs; I used Django only because it was requested, even though the requirements were simple enough for Flask. In any case, I learned that Django has changed a lot). A minimal endpoint sketch follows this list.

  • Frontend is built on top of Vue 2 (one of my personal favourites, that's all).

  • Spectre.css (https://github.com/picturepan2/spectre) for design.
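For a rough idea of how an API-only Django backend for this crawler might look, here is a minimal sketch of a crawl endpoint. The view name, query parameter, and response shape are illustrative assumptions rather than the project's actual code, and it assumes the requests and beautifulsoup4 packages are installed.

# Hypothetical Django view for a crawl endpoint; names and response shape
# are assumptions for illustration, not this repository's actual code.
import requests
from bs4 import BeautifulSoup
from django.http import JsonResponse

def crawl(request):
    """Fetch a page and return the links found on it as JSON."""
    url = request.GET.get("url")
    if not url:
        return JsonResponse({"error": "missing 'url' parameter"}, status=400)

    resp = requests.get(url, timeout=10)
    soup = BeautifulSoup(resp.text, "html.parser")
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return JsonResponse({"url": url, "links": links})

A view like this would be wired up in urls.py in the usual way, e.g. path("api/crawl/", crawl).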

Installation Instructions

  • Install the requirements file and run the server:
pip install -r requirements.txt
python manage.py runserver

#todo

  • Implementation of a webhook for live crawling updates.
  • Redis caching to avoid duplicate crawling (a rough sketch follows this list).
  • Proxy round-robin to avoid requests getting blocked.
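As a rough sketch of the planned Redis de-duplication, a Redis set can record every URL already crawled; the key name and client configuration below are assumptions for illustration, not existing project code.

# Sketch of Redis-based URL de-duplication; key name and connection
# settings are illustrative assumptions.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def should_crawl(url: str) -> bool:
    # SADD returns 1 when the URL is newly added to the set and 0 if it was
    # already present, so only first-time URLs get crawled.
    return r.sadd("crawled:urls", url) == 1

The same idea extends naturally to proxy round-robin: keep a list of proxies and rotate through it per request (for example with itertools.cycle) so no single IP absorbs all the traffic.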

Thank You
