selectorlib

A library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages.

Free software: MIT license
Documentation: https://selectorlib.readthedocs.io.

Example

>>> from selectorlib import Extractor
>>> yaml_string = """
...     title:
...         css: "h1"
...         type: Text
...     link:
...         css: "h2 a"
...         type: Link
...     """
>>> extractor = Extractor.from_yaml_string(yaml_string)
>>> html = """
...     <h1>Title</h1>
...     <h2>Usage
...         <a class="headerlink" href="http://test">¶</a>
...     </h2>
...     """
>>> extractor.extract(html)
{'title': 'Title', 'link': 'http://test'}
