Crawler: a simple web crawler | SpiderWeb

Crawler is a simple crawl mechanism with no major optimisations. I built it in about 5 hours; sorry I couldn't invest more time due to my ongoing work. That said, there are plenty of optimisations I can make if needed.

Stack

  • Backend is built with Django (just the APIs; I used Django only because it was requested, even though the requirements were simple enough for Flask. In any case, I learned that Django has changed a lot). A minimal endpoint sketch follows this list.

  • Frontend is built on top of Vue 2 (one of my personal favourites, that's all).

  • Spectre.css (https://github.com/picturepan2/spectre) for design.
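For a rough idea of how an API-only Django backend for this crawler might look, here is a minimal sketch of a crawl endpoint. The view name, query parameter, and response shape are illustrative assumptions rather than the project's actual code, and it assumes the requests and beautifulsoup4 packages are installed.

# Hypothetical Django view for a crawl endpoint; names and response shape
# are assumptions for illustration, not this repository's actual code.
import requests
from bs4 import BeautifulSoup
from django.http import JsonResponse

def crawl(request):
    """Fetch a page and return the links found on it as JSON."""
    url = request.GET.get("url")
    if not url:
        return JsonResponse({"error": "missing 'url' parameter"}, status=400)

    resp = requests.get(url, timeout=10)
    soup = BeautifulSoup(resp.text, "html.parser")
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return JsonResponse({"url": url, "links": links})

A view like this would be wired up in urls.py in the usual way, e.g. path("api/crawl/", crawl).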

Installation Instructions

  • Install the requirements file and run the server:
pip install -r requirements.txt
python manage.py runserver

#todo

  • Implementation of a webhook for live crawling updates.
  • Redis caching to avoid duplicate crawling (a rough sketch follows this list).
  • Proxy round-robin to avoid requests getting blocked.
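As a rough sketch of the planned Redis de-duplication, a Redis set can record every URL already crawled; the key name and client configuration below are assumptions for illustration, not existing project code.

# Sketch of Redis-based URL de-duplication; key name and connection
# settings are illustrative assumptions.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def should_crawl(url: str) -> bool:
    # SADD returns 1 when the URL is newly added to the set and 0 if it was
    # already present, so only first-time URLs get crawled.
    return r.sadd("crawled:urls", url) == 1

The same idea extends naturally to proxy round-robin: keep a list of proxies and rotate through it per request (for example with itertools.cycle) so no single IP absorbs all the traffic.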

Thank You
