Support for custom selector engines (Querying nested shadow roots) #5405
Comments
Hey, so we now have an experimental API that lets you do this (on master). Roughly it looks like this:
// Custom query handler.
const doesNotHaveClass =
(element, className) => element.querySelectorAll(`:not(.${className})`);
// Register it.
puppeteer.__experimental_registerCustomQueryHandler('doesNotHaveClass',
doesNotHaveClass);
// Prepend queries with the name of the handler.
const elements = await page.$$('doesNotHaveClass/foo');
We have the following APIs:
__experimental_registerCustomQueryHandler(name: string, queryHandler: QueryHandler): void;
__experimental_unregisterCustomQueryHandler(name: string): void;
__experimental_customQueryHandlers(): Map<string, QueryHandler>;
__experimental_clearQueryHandlers(): void;
Where QueryHandler is (element, selector) => Element | Element[] | NodeListOf<Element>
Other points of note:
|
This does sound super interesting. For the official Aurelia i18n plugin we're making use of custom attributes, by default named t. An example would be something like this:
<span t="title">Title</span>
Additionally, next to the default textContent target, the user can override the target with this syntax:
<span t="[alt]title">Title</span>
So ideally we could forward multiple params to the custom query handler (something along these lines):
// Custom query handler.
const i18n =
(element, key, target) => element.querySelectorAll(`[t^='${target ? '[' + target + ']' : ''}${key}']`);
// Register it.
puppeteer.__experimental_registerCustomQueryHandler('i18n', i18n);
const elementsWithoutTarget = await page.$$('i18n/title');
const elementsWithTarget = await page.$$('i18n/title/alt');
There are many more opportunities, but essentially having multiple params available would open up many more use cases. |
@zewa666 These use cases can already be addressed by splitting the selector string into parts in your custom query handler:
const myQueryHandler = (element, selector) => {
  const params = splitIntoParameters(selector);
  return doStuff(element, params);
};
The reason we want to avoid handling this for you (in this case by splitting on |
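To make the suggestion above concrete, here is a minimal sketch of what the parameter splitting could look like for the i18n case. The helper names splitIntoParameters and buildI18nSelector and the "/" separator are assumptions chosen for illustration; the thread does not define them, and the sketch assumes the handler receives only the part of the query after the handler name.

```javascript
// Hypothetical helper: split "title/alt" into ["title", "alt"].
// The "/" separator is this handler's own convention, not something
// Puppeteer imposes on the selector string.
function splitIntoParameters(selector) {
  return selector.split('/').map((part) => part.trim());
}

// Build the attribute selector used in the i18n example from the thread.
// With a target, the attribute value is prefixed with "[target]".
function buildI18nSelector(key, target) {
  const prefix = target ? `[${target}]` : '';
  return `[t^='${prefix}${key}']`;
}

// Custom query handler in the (element, selector) shape described above.
const i18n = (element, selector) => {
  const [key, target] = splitIntoParameters(selector);
  return element.querySelectorAll(buildI18nSelector(key, target));
};
```

Keeping the split inside the handler means each handler can pick whatever separator suits its own selector syntax.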
Oh ok yeah that makes sense. I thought the / was a kind of convention (like xpath) and you had to distinguish params by that. In this case my call can be simply
Thanks for the clarification |
@zewa666 Exactly! The |
@paullewis How would you register a custom query handler that supports both |
I've also just run into this problem. I've got .$ working fine, but when I return an array of elements from a shadow-DOM-based query handler, it falls over. My handler is attached at the bottom of this comment. Ideally I could support both, and Puppeteer could choose which action to take based on whether the user wanted $$ or $. This is handled in Playwright by registering two functions, query and queryAll. What if the query handlers supported returning something like this:
Then Puppeteer could call the appropriate function, or alternatively allow something like this:
Where the second function is intended to return arrays or NodeLists. This works for my lib, which has two functions:
The first function works, but using the second doesn't, because the second returns an array.
const queryHandler = (element, selector) => {
// minified library guff to inject my code into the handler, scroll past
var querySelectorShadowDom=function(e){"use strict";function o(e,a,c){var t=c.querySelector(e);return document.head.createShadowRoot||document.head.attachShadow?!a&&t?t:h(e,",").reduce(function(e,t){if(!a&&e)return e;var l,d,i,o=h(t.replace(/^\s+/g,"").replace(/\s*([>+~]+)\s*/g,"$1")," ").filter(function(e){return!!e}),r=o.length-1,n=function(t,e){void 0===t&&(t=null);var n=[],o=function e(t){for(var o,r=0;o=t[r];++r)n.push(o),o.shadowRoot&&e(o.shadowRoot.querySelectorAll("*"))};e.shadowRoot&&o(e.shadowRoot.querySelectorAll("*"));return o(e.querySelectorAll("*")),t?n.filter(function(e){return e.matches(t)}):n}(o[r],c),u=(l=o,d=r,i=c,function(e){for(var t,o,r,n=d,u=e,a=!1;u&&(r=u).nodeType!==Node.DOCUMENT_FRAGMENT_NODE&&r.nodeType!==Node.DOCUMENT_NODE;){var c=u.matches(l[n]);if(c&&0===n){a=!0;break}c&&n--,t=i,o=u.parentNode,u=o&&o.host&&11===o.nodeType?o.host:o===t?null:o}return a});return a?e=e.concat(n.filter(u)):(e=n.find(u))||null},a?[]:null):a?c.querySelectorAll(e):t}function h(e,o){return e.match(/\\?.|^$/g).reduce(function(e,t){return'"'!==t||e.sQuote?"'"!==t||e.quote?e.quote||e.sQuote||t!==o?e.a[e.a.length-1]+=t:e.a.push(""):(e.sQuote^=1,e.a[e.a.length-1]+=t):(e.quote^=1,e.a[e.a.length-1]+=t),e},{a:[""]}).a}return e.querySelectorAllDeep=function(e,t){return void 0===t&&(t=document),o(e,!0,t)},e.querySelectorDeep=function(e,t){return void 0===t&&(t=document),o(e,!1,t)},e}({});
// my lib communicating with the new puppeteer api
return querySelectorShadowDom.querySelectorDeep(selector, element);
}
Incidentally, Playwright recently made their built-in CSS selector automatically work for shadow DOM: https://github.com/microsoft/playwright/releases/tag/v0.14.0 |
You would just register it and use it wherever you like. We don't make any distinction in the code about which function the handler is for. That said the implementation of the handler will either be doing something that expects a single element or a collection of elements, which will naturally lend it to either |
Just a note from a debugging session with @mathiasbynens: |
Been experimenting with the updated API in 5.2.0, and it's working great! Was able to implement the following with ease: QueryHandler implementation: https://github.com/Georgegriff/query-selector-shadow-dom/pull/36/files#diff-1297c36120ceed6b61d83df8075cc959 |
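For readers who want to see the two-function shape in miniature, below is a simplified sketch of a shadow-piercing handler object with queryOne and queryAll, the same shape passed to registerCustomQueryHandler in the next comment. This is not the algorithm used by query-selector-shadow-dom: collectDeep and shadowHandler are hypothetical names, only open shadow roots are visited, and each element is tested with matches on its own, so descendant combinators do not cross shadow boundaries.

```javascript
// Collect every element under `root`, descending into open shadow roots.
function collectDeep(root, out = []) {
  // If the starting node is itself a shadow host, look inside it first.
  if (root.shadowRoot) collectDeep(root.shadowRoot, out);
  for (const el of root.querySelectorAll('*')) {
    out.push(el);
    if (el.shadowRoot) collectDeep(el.shadowRoot, out);
  }
  return out;
}

// Handler object: queryOne returns a single element or null,
// queryAll returns an array of all matches.
const shadowHandler = {
  queryOne: (element, selector) =>
    collectDeep(element).find((el) => el.matches(selector)) || null,
  queryAll: (element, selector) =>
    collectDeep(element).filter((el) => el.matches(selector)),
};
```

Splitting the handler this way lets $ and $$ each call a function whose return type matches what they expect, which is the problem raised earlier in the thread.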
I've been trying to develop a custom query handler, but I'm having trouble accessing custom properties set on window. For example:
// Handler tries to access custom window property
puppeteer.registerCustomQueryHandler("meow", {
queryOne: (element, selector) => {
return element.querySelector((window as any).meow);
},
queryAll: (element, selector) => {
return element.querySelectorAll((window as any).meow);
},
});
// After page is created and navigated, set a global property
await page.evaluate(() => (window as any).meow = "body");
I would expect the custom query handler to always return the body, but |
We're marking this issue as unconfirmed because it has not had recent activity and we weren't able to confirm it yet. It will be closed if no further activity occurs within the next 30 days. |
We are closing this issue. If the issue still persists in the latest version of Puppeteer, please reopen the issue and update the description. We will try our best to accommodate it! |
What is this?
Other tools offer the ability to provide a custom engine for selecting elements in the DOM (example from Playwright).
I've seen #382, but I'm not sure it offers a mechanism that's really simple for end users. I'm happy to be wrong on this; I couldn't find any examples.
Playwright offers this. https://github.com/microsoft/playwright/blob/master/docs/api.md#selectorsregisterenginefunction-args
In the Selenium world it's things like this: https://chercher.tech/java/custom-locators-selenium-webdriver
Why is this useful?
Traditionally these custom locators would be used to provide the ability to select elements via XPath or jQuery selectors.
Why do I want this?
I maintain https://github.com/Georgegriff/query-selector-shadow-dom, which allows users to write CSS selectors that automatically pierce web component shadow roots, and it was trivial to add support in Playwright to use my library as a selector engine. Like so:
Registering this engine allows users to use click, waitForSelector and anything else that accepts a selector to use my library to automatically pierce shadow roots.
How is my engine implemented in Playwright?
Playwright defines this interface: https://github.com/microsoft/playwright/blob/master/docs/api.md#selectorsregisterenginefunction-args which accepts a Function/String
They will take your function, pass it into the browser context and handle the rest for you, so you can use the engine for click etc.
My library implements this interface: https://github.com/Georgegriff/query-selector-shadow-dom/blob/master/plugins/playwright/index.js (It does this a little strangely using a string, because I need to inject my library into the function scope.)
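The "inject my library into the function scope" idea can be sketched roughly as follows. A function that is shipped to the browser context cannot rely on variables closed over on the Node side, so one option is to concatenate the library source into the function body as text. Everything named here (librarySource, tinyLib, makeHandler) is a hypothetical stand-in for illustration, not the plugin's actual code and not a Playwright or Puppeteer API.

```javascript
// Stand-in for the real library bundle (hypothetical; a real plugin
// would normally read the built library file from disk as a string).
const librarySource = `
  var tinyLib = {
    queryDeep: function (selector, root) { return root.querySelector(selector); }
  };
`;

// Build a self-contained handler whose body carries the library with it,
// so it still works after being serialized and sent to another context.
function makeHandler(source) {
  return new Function(
    'element',
    'selector',
    `${source}\nreturn tinyLib.queryDeep(selector, element);`
  );
}

const handler = makeHandler(librarySource);
```

The trade-off is that the library is re-evaluated on every call, which is the price of keeping the function free of outside references.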