element>element selector does not work relative to an element #86

Open
frisi opened this issue Dec 7, 2018 · 4 comments

@frisi

frisi commented Dec 7, 2018

in version 1.0.3 i get an exception when using cssselect on an element to select its direct children with
element > element (see https://www.w3schools.com/cssref/sel_element_gt.asp)

>>> from lxml import html
>>> html.fromstring('<html><body><div class="parent"><div class="child"><div class="child"></div></div></div></body></html>')
<Element html at 0x7feadf137d08>
>>> tree=html.fromstring('<html><body><div class="parent"><div class="child"><div class="child"></div></div></div></body></html>')
>>> tree.cssselect('div.parent')
[<Element div at 0x7feadf137e10>]
>>> tree.cssselect('div.parent')[0].cssselect('> .child')
*** SelectorSyntaxError: Expected selector, got <DELIM '>' at 0>

in version 0.9.1 the following worked w/o raising an exception, but it returns an unexpected result, since the second div.child is not a direct child of div.parent

>>> tree=html.fromstring('<html><body><div class="parent"><div class="child"><div class="child"></div></div></div></body></html>')
# works but should return only one element
>>> tree.cssselect('div.parent')[0].cssselect('> .child')
[<Element div at 0x7fa6e973def0>, <Element div at 0x7fa6e973dfb0>]

the > combinator only works when the parent is given in the selector itself:

>>> tree.cssselect('div.parent > .child')
[<Element div at 0x7fa6e973de90>]

A) is it a regression that element.cssselect('> .child') raises an exception in recent versions?

B) is there a way to select a direct child given the parent element?

@frisi
Author

frisi commented Dec 13, 2018

@dangra @kmike since you made the most recent commits to this repo, i'm asking you.
can you shed some light on this? if not - who should i ask?
thank you!

@redapple
Contributor

@frisi, I'm really not sure how version 0.9.1 did it, but to me, something like > .child alone, i.e. a selector starting with >, is not a valid CSS3 selector.
This is one example where CSS3 falls short when working in specific subtrees of the document, and where XPath can be used directly.

Maybe there's something for your use-case in CSS4, around :scope elements, but I'm not sure I understand how it's supposed to work. See relative selectors.

Certain contexts may accept relative selectors, which are a shorthand for selectors that represent elements relative to a :scope element (i.e. an element that matches :scope). In a relative selector, “:scope ” (the :scope pseudo-class followed by a space) is implied at the beginning of each complex selector that does not already contain the :scope pseudo-class. This allows the selector to begin syntactically with a combinator. However, it must be absolutized before matching.

@frisi
Author

frisi commented Feb 12, 2019

thanks for your detailed explanation @redapple.
IIUC element.cssselect('> a') is not made/meant to work on subtrees.

i replaced the relative css selects with xpath (which - for me at least - is a lot less readable ;-):

broken = tree.cssselect('div.parent')[0].cssselect('> .child')
works = tree.cssselect('div.parent')[0].xpath('./*[contains(concat(" ", @class, " "), " child ")]')

is there a place in the documentation to add a note on this limitation?
if not, i'd say this issue can be closed.
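
For what it's worth, the readability of the XPath workaround above can be recovered by hiding it behind a small helper. The following is a minimal sketch, not part of lxml or cssselect: the name direct_children_with_class is hypothetical, and it only covers the simple "direct child with a given class" case. It passes the class name as an XPath variable (lxml's xpath() accepts variables as keyword arguments) and adds normalize-space() so that tabs or newlines inside the class attribute do not break the match.

```python
from lxml import html


def direct_children_with_class(element, class_name):
    """Return the direct children of `element` that carry `class_name`.

    Hypothetical helper: roughly what a relative '> .class_name' selector
    would mean if it were evaluated against `element`.
    """
    return element.xpath(
        './*[contains(concat(" ", normalize-space(@class), " "),'
        ' concat(" ", $cls, " "))]',
        cls=class_name,  # bound to $cls in the XPath expression
    )


tree = html.fromstring(
    '<html><body><div class="parent"><div class="child">'
    '<div class="child"></div></div></div></body></html>'
)
parent = tree.xpath('//div[@class="parent"]')[0]
children = direct_children_with_class(parent, "child")
print(len(children))  # the nested div.child is not a direct child of div.parent
```

The helper does not handle anything beyond a single class name; more complex selectors would still need hand-written XPath or a different approach.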

@rulatir

rulatir commented Mar 18, 2024

Are there really no better workarounds than dropping the entire selector to XPath manually? Perhaps there is some esoteric XPath way to take an existing expression and turn it into something that means "the same but only direct children of the scope element"?
