clj-robots-parser

What

A Clojure(Script) library for parsing robots.txt files as specified by The Great Goog themselves. Because robots.txt is woefully underspecified in the "official" docs, this library tolerates anything it doesn't understand and extracts the data it does.

It can use the extracted data to query whether a given user-agent is allowed to crawl a given URL.
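
To make the idea concrete, here is a minimal, self-contained sketch of tolerant parsing followed by an allow/disallow query. It is not this library's API: the names `parse-line`, `parse-robots`, and `allowed?` are hypothetical, and the sketch assumes plain prefix matching (no `*` or `$` wildcards) and exact, case-insensitive user-agent matching. It follows the longest-match rule from Google's documentation, with Allow winning ties.

```clojure
(ns robots-sketch
  (:require [clojure.string :as str]))

;; Hypothetical helpers for illustration only; NOT the clj-robots-parser API.

(defn parse-line
  "Returns [directive value] for a line we understand, or nil otherwise."
  [line]
  (let [line    (str/trim (first (str/split line #"#" 2))) ; drop comments
        [_ k v] (re-matches #"(?i)(user-agent|allow|disallow)\s*:\s*(.*)" line)]
    (when k [(keyword (str/lower-case k)) v])))

(defn parse-robots
  "Returns a map of lower-cased user-agent -> vector of [type path] rules.
  Lines that are not understood are silently skipped."
  [text]
  (:groups
   (reduce
    (fn [{:keys [agents in-rules?] :as state} [k v]]
      (if (= k :user-agent)
        (let [agent (str/lower-case v)]
          (-> state
              ;; A user-agent line after rules starts a new group;
              ;; consecutive user-agent lines share one group.
              (assoc :agents (if in-rules? [agent] (conj agents agent))
                     :in-rules? false)
              (update-in [:groups agent] #(or % []))))
        ;; Allow/Disallow: attach the rule to every agent in the current group.
        (reduce (fn [s a] (update-in s [:groups a] (fnil conj []) [k v]))
                (assoc state :in-rules? true)
                agents)))
    {:groups {} :agents [] :in-rules? false}
    (keep parse-line (str/split-lines text)))))

(defn allowed?
  "True unless the most specific matching rule is a Disallow.
  Assumes prefix matching only; falls back to the * group."
  [groups user-agent path]
  (let [rules    (or (get groups (str/lower-case user-agent))
                     (get groups "*")
                     [])
        matching (filter (fn [[_ p]] (and (seq p) (str/starts-with? path p)))
                         rules)
        ;; Longest path wins; on equal length, Allow beats Disallow.
        best     (last (sort-by (fn [[t p]] [(count p) (if (= t :allow) 1 0)])
                                matching))]
    (or (nil? best) (= :allow (first best)))))

(def sample
  (str "User-agent: *\n"
       "Disallow: /private/\n"
       "Allow: /private/public/\n"
       "Frobnicate: yes\n")) ; unknown directive, tolerated and ignored

(allowed? (parse-robots sample) "somebot" "/private/x")        ;; => false
(allowed? (parse-robots sample) "somebot" "/private/public/x") ;; => true
```

A real parser also has to handle wildcards, `Sitemap` lines, and product-token matching of user-agents, all of which this sketch leaves out.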

Why

Why use Google's (much more stringent) documentation for handling robots.txt? In terms of SEO, Googlebot is the crawler you ought to care about the most.

isker/clj-robots-parser
