*:first-of-type and friends are not implemented yet #4

Open
SimonSapin opened this issue Apr 18, 2012 · 8 comments

Comments

@SimonSapin
Contributor

From the docs:

*:first-of-type, *:last-of-type, *:nth-of-type, *:nth-last-of-type, *:only-of-type. All of these work when you specify an element type, but not with *

@SimonSapin
Contributor Author

Actually, the current implementation is broken. The selector e ~ f:nth-child(3) is translated to XPath e/following-sibling::*[name() = 'f' and (position() = 3)] which is incorrect: it finds the 3rd element after e, not the third child of its parent.
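The mismatch can be reproduced by hand on a small tree. The following is only a sketch: it uses the standard library's ElementTree rather than cssselect or lxml, and the function names and the sample document are made up for illustration. One function applies the Selectors meaning of e ~ f:nth-child(3), the other evaluates the translated XPath step by step.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree: the third child is f#c3, the fourth is f#c4.
DOC = ET.fromstring('<root><e/><x/><f id="c3"/><f id="c4"/></root>')


def css_sibling_nth_child(parent, left, right, n):
    """left ~ right:nth-child(n): `right` elements that are the n-th child
    of `parent` and are preceded by a `left` sibling."""
    children = list(parent)
    matches = []
    for index, child in enumerate(children, start=1):
        if child.tag != right or index != n:
            continue
        if any(prev.tag == left for prev in children[:index - 1]):
            matches.append(child.get("id"))
    return matches


def xpath_as_translated(parent, left, right, n):
    """left/following-sibling::*[name() = right and position() = n],
    evaluated by hand: position() counts from the `left` element."""
    children = list(parent)
    matches = []
    for i, child in enumerate(children):
        if child.tag != left:
            continue
        following = children[i + 1:]
        if len(following) >= n and following[n - 1].tag == right:
            matches.append(following[n - 1].get("id"))
    return matches


css_result = css_sibling_nth_child(DOC, "e", "f", 3)
xpath_result = xpath_as_translated(DOC, "e", "f", 3)
print(css_result, xpath_result)
```

On this tree the two functions pick different elements: the Selectors reading picks the third child of the parent, while the translated XPath picks the third element after e.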

@beaumartinez

Hi Simon,

Do you know what the status of this is? Would these be easy to implement?

I wanted to use <element>:nth-of-type.

@SimonSapin
Contributor Author

I don’t know how easy it is. What’s needed is to find the correct XPath translation, if one exists in the general case.

@beaumartinez

I see. I found this Wikibook last night, I'm not sure how reliable it is though.

@SimonSapin
Contributor Author

//p[n] is a correct translation of p:nth-of-type(n) when it’s by itself, but not always when combined with other selectors. [n] in XPath indexes within the current scope, whereas the :nth* family of CSS pseudo-classes counts from the first child of the parent.

Example:

<div>
<p id="a"/><p id="b"/><p id="c"/><p id="d"/><p id="e"/>
</div>

In Selectors, #b ~ p:nth-of-type(3) would match #c, and #b ~ p:nth-of-type(2) would not match anything (counting from the first child of the <div>). In XPath, //*[@id="b"]/following-sibling::p[3] would match #e and //*[@id="b"]/following-sibling::p[2] would match #d, counting from the "current position".
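The two counting rules in this example can be checked with a small sketch. It uses only the standard library's ElementTree and evaluates both rules by hand, so it is not how cssselect or libxml work internally; the helper names are hypothetical, and n is assumed to be a positive integer.

```python
import xml.etree.ElementTree as ET

DIV = ET.fromstring(
    '<div><p id="a"/><p id="b"/><p id="c"/><p id="d"/><p id="e"/></div>'
)


def anchor_index(children, anchor_id):
    """Index of the child whose id attribute equals anchor_id."""
    return next(i for i, c in enumerate(children) if c.get("id") == anchor_id)


def selectors_match(parent, anchor_id, tag, n):
    """#anchor ~ tag:nth-of-type(n): the n-th `tag` child counted from the
    parent's first child, kept only if it comes after the anchor."""
    children = list(parent)
    start = anchor_index(children, anchor_id)
    same_type = [(i, c) for i, c in enumerate(children) if c.tag == tag]
    if n > len(same_type):
        return []
    index, candidate = same_type[n - 1]
    return [candidate.get("id")] if index > start else []


def xpath_match(parent, anchor_id, tag, n):
    """//*[@id=anchor]/following-sibling::tag[n]: the n-th `tag` sibling
    counted from the anchor."""
    children = list(parent)
    start = anchor_index(children, anchor_id)
    following = [c for c in children[start + 1:] if c.tag == tag]
    return [following[n - 1].get("id")] if n <= len(following) else []


for n in (2, 3):
    print(n, selectors_match(DIV, "b", "p", n), xpath_match(DIV, "b", "p", n))
```

The only difference between the two helpers is where counting starts, which is the part an XPath 1.0 positional predicate cannot express after a following-sibling step.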

I’m not convinced there even is a correct XPath translation of some Selectors.

This kind of thing has led me to believe that the entire premise of translating Selectors to XPath (or at least to XPath 1.0, which is what libxml implements) is flawed.

I’ve started work on cssselect2 which implements Selectors “for real” without XPath being involved, but it’s blocked on some design decisions that need to be made: Kozea/cssselect2#1

@beaumartinez

Thanks for the comprehensive answer @SimonSapin ! It's truly a shame if XPath 1.0 isn't flexible enough.

I'm keen to see how cssselect2 develops.

@redapple
Contributor

For the record, here's a tentative implementation using an XPath extension function with lxml: scrapy/parsel#73

@flip111

flip111 commented Jul 17, 2018

cssselect contributors, do you have any advice on this? There seem to be two solutions, cssselect2 and scrapy/parsel. Is either of them mature enough? Do I still need the cssselect package?

Mellthas added a commit to dompdf/dompdf that referenced this issue Nov 15, 2022
'Type' refers to the element name [1], so e.g. `p.class:first-of-type`
should select every first `p` child that also has the class `class`,
not every first `p.class` child.

Support in combination with the universal selector (`*:first-of-type`
or `:first-of-type` instead of e.g. `p:first-of-type`) is not
implemented. A proper translation to XPath 1.0 might not be possible; it
is not implemented in the Python cssselect library either [2].

[1] https://www.w3.org/TR/selectors-3/#nth-of-type-pseudo
[2] scrapy/cssselect#4
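
The distinction drawn in this commit message can be illustrated with a short sketch. It is not dompdf's or cssselect's code: it walks a hypothetical sample tree with the standard library's ElementTree, and the function names and the class name `note` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree: the first <p> has no class, the second <p> has class "note".
DOC = ET.fromstring(
    '<div><p id="p1"/><p id="p2" class="note"/><span id="s1" class="note"/></div>'
)


def has_class(element, name):
    """True if `name` is one of the whitespace-separated class tokens."""
    return name in element.get("class", "").split()


def first_of_type_with_class(parent, tag, cls):
    """tag.cls:first-of-type: the first `tag` child, kept only if it also
    carries the class."""
    for child in parent:
        if child.tag == tag:
            return [child.get("id")] if has_class(child, cls) else []
    return []


def first_child_with_tag_and_class(parent, tag, cls):
    """The misreading: the first child that is both `tag` and `.cls`."""
    for child in parent:
        if child.tag == tag and has_class(child, cls):
            return [child.get("id")]
    return []


correct = first_of_type_with_class(DOC, "p", "note")
misread = first_child_with_tag_and_class(DOC, "p", "note")
print(correct, misread)
```

Here the first reading matches nothing, because the first p lacks the class, while the misreading picks the second p.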