Robots.txt parser / generator
Front-end workflow to start a new project with Eleventy and Webpack.
Generates a robots.txt file.
A tool for debugging robots.txt
This repository contains Google's robots.txt parser and matcher as a C++ library (compliant with C++17).
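For context, the matching behavior such a parser implements comes from the Robots Exclusion Protocol (RFC 9309): among all Allow/Disallow rules whose path prefix matches the URL, the longest match wins, and on a tie Allow beats Disallow. Below is a minimal TypeScript sketch of that precedence rule, assuming plain prefix rules (no `*` wildcards or `$` anchors); it makes no claim about the library's actual C++ API.

```typescript
interface Rule {
  allow: boolean; // true for Allow, false for Disallow
  path: string;   // path prefix, e.g. "/private/"
}

// Longest-match precedence per RFC 9309; on equal length, Allow wins.
function isAllowed(rules: Rule[], urlPath: string): boolean {
  let best: Rule | undefined;
  for (const rule of rules) {
    if (!urlPath.startsWith(rule.path)) continue;
    if (
      best === undefined ||
      rule.path.length > best.path.length ||
      (rule.path.length === best.path.length && rule.allow && !best.allow)
    ) {
      best = rule;
    }
  }
  return best ? best.allow : true; // no matching rule: allowed by default
}

// Example: /private/ is disallowed, but its /private/pub/ subtree is allowed.
const rules: Rule[] = [
  { allow: false, path: "/private/" },
  { allow: true, path: "/private/pub/" },
];
console.log(isAllowed(rules, "/private/notes.html"));     // false
console.log(isAllowed(rules, "/private/pub/index.html")); // true
```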
🚫🤖 Override /robots.txt to disallow all web crawlers, regardless of the settings stored in the database. Compatible with Liferay 7.0, 7.1, 7.2, 7.3, and 7.4.
A Python crawler that disregards robots.txt rules and downloads disallowed resources.
🌐 A Chrome extension that displays the contents of a website's robots.txt and sitemap.xml files.
Sitemaps and Robots.txt for websites around the world.
Fully native robots.txt parsing component without any dependencies.
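To illustrate what dependency-free parsing involves, here is a minimal TypeScript sketch that splits a robots.txt body into user-agent groups; the type and function names are made up for illustration, and real parsers additionally handle wildcards, byte-order marks, and size limits.

```typescript
interface Group {
  userAgents: string[];
  rules: { allow: boolean; path: string }[];
}

function parseRobotsTxt(body: string): Group[] {
  const groups: Group[] = [];
  let current: Group | null = null;
  for (let line of body.split(/\r?\n/)) {
    line = line.split("#")[0].trim(); // strip comments and whitespace
    const colon = line.indexOf(":");
    if (colon < 0) continue;
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      if (current === null || current.rules.length > 0) {
        current = { userAgents: [], rules: [] };
        groups.push(current);
      }
      current.userAgents.push(value.toLowerCase());
    } else if ((field === "allow" || field === "disallow") && current) {
      // An empty Disallow value places no restriction, so skip it.
      if (value !== "") {
        current.rules.push({ allow: field === "allow", path: value });
      }
    }
  }
  return groups;
}
```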
A simple-to-use, multi-threaded web crawler written in C with libcURL and Lexbor.
Robots.txt parser and generator - Work in progress
Robots Scanner
A ready-to-use template to quickly start selling a domain with minimal setup.
Optimizes your site's robots.txt to reduce server load and CO2 footprint by blocking unnecessary crawlers while allowing major search engines and specific tools.
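The output of such an optimizer might look roughly like the robots.txt below. The blocked tokens are examples of real crawler user agents (GPTBot, AhrefsBot, SemrushBot); which crawlers count as "unnecessary" depends on the site.

```
# Major search engines get full access.
User-agent: Googlebot
User-agent: Bingbot
Allow: /

# Crawlers this site gains nothing from (example list).
User-agent: GPTBot
User-agent: AhrefsBot
User-agent: SemrushBot
Disallow: /

# Everyone else: allowed, but kept off expensive endpoints.
User-agent: *
Disallow: /search
```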
Scripts to create a robots.txt file from building blocks
Documents my master's-level thesis work on building a continuous, topical web crawler based on Mercator (1999).
The Robots.txt Generator tool helps you create a robots.txt file for your website.
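As a sketch of what such a generator does, the following hypothetical TypeScript function (the RobotsConfig type and buildRobotsTxt name are invented for illustration) serializes per-agent rules into a robots.txt body.

```typescript
interface RobotsConfig {
  groups: { userAgent: string; allow?: string[]; disallow: string[] }[];
  sitemap?: string;
}

function buildRobotsTxt(config: RobotsConfig): string {
  const lines: string[] = [];
  for (const g of config.groups) {
    lines.push(`User-agent: ${g.userAgent}`);
    for (const path of g.allow ?? []) lines.push(`Allow: ${path}`);
    for (const path of g.disallow) lines.push(`Disallow: ${path}`);
    lines.push(""); // blank line separates groups
  }
  if (config.sitemap) lines.push(`Sitemap: ${config.sitemap}`);
  return lines.join("\n");
}

// Example: block everything except the public docs, and point at the sitemap.
console.log(
  buildRobotsTxt({
    groups: [{ userAgent: "*", allow: ["/docs/"], disallow: ["/"] }],
    sitemap: "https://example.com/sitemap.xml",
  })
);
```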