innerText implementation #1245
Comments
And what a pity that rangy |
So, innerText is not standard, and not implemented in at least one major engine (Firefox). Without a standard, I don't think we should implement it. Edit by @TimothyGu (April 24, 2023): this comment was accurate when it was written (in 2015!) but is no longer accurate, since a spec has been created. We welcome any contribution to implement that feature in jsdom. |
Looks like there's some movement in this whole thing with a draft spec here. See also all the references. There are no issues on the repo though, so I wonder how complete it already is / how quick progress will be. |
Firefox has implemented it: https://bugzilla.mozilla.org/show_bug.cgi?id=264412 WHATWG seems to approve: whatwg/compat#5 (comment) |
From the spec it seems like we can't implement |
Yeah, this is not really going to be implementable in jsdom anyway, without a lot of infrastructure work... nobody get their hopes up :(. |
As to layout support requirement: rocallahan/innerText-spec#2 |
…erText` usage to `textContent` based on [this discussion](jsdom/jsdom#1245). Added tests for many evaluators.
Is there any plan to implement it because of WHATWG adoption? |
Yeah... Although the spec requires a lot of stuff jsdom doesn't have, around CSS boxes :(. Not sure what to do. |
Is there any lib for this to plug along with jsdom? |
@domenic care to drop some knowledge on why this is such an infrastructure overhaul? We thought the 800lb gorilla in the room would leave low-key, but it looks like it's not going anywhere. As you know, I have been wrapping my head around the innards of jsdom. Where in the repo would be a good place for a jsdom newbie to start reviewing code? Thanks in advance 🙏 /cc @vsemozhetbyt |
The primary issue is the fact that
|
Still out of scope and no workaround? |
Apparently the spec says:
I think a workaround would be then to simply return |
We implement enough CSS that I don't think that applies. We just don't implement the layout parts... |
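For readers looking for the shape of the fallback discussed above, here is a minimal sketch, assuming the spec's getter steps return the same value as textContent when the element is not being rendered. The helper name installInnerTextFallback and the HasTextContent interface are hypothetical and not part of jsdom. It ignores layout entirely, so it only approximates what a browser would return.

```typescript
// Hypothetical shape: anything that exposes a textContent property.
interface HasTextContent {
  textContent: string | null;
}

// Hypothetical helper (not a jsdom API): defines an `innerText` accessor
// on the given prototype that simply defers to `textContent`.
function installInnerTextFallback(proto: object): void {
  Object.defineProperty(proto, "innerText", {
    configurable: true, // allows tests to remove or redefine it later
    get(this: HasTextContent): string {
      // No layout information is available, so return textContent as-is.
      return this.textContent ?? "";
    },
    set(this: HasTextContent, value: string) {
      // Browsers convert line breaks to <br> elements here; this sketch does not.
      this.textContent = value;
    },
  });
}
```

With jsdom this could presumably be called once per window, passing that window's HTMLElement.prototype, but treat that as an assumption to verify against your own setup.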
Hi guys, any news on this one? |
Just use headless chrome :) |
@domenic from that spec that @coreh mentioned:
https://html.spec.whatwg.org/multipage/rendering.html#being-rendered
If |
Not the nicest of ways to do it, but I had some cases where sometimes HTMLElement would work and sometimes it wouldn't. I couldn't figure out why, so I changed it to be even more hacky. I mocked it using jest spies. innerText.ts
Then you can call it using You can use it without sanitize-html, I've used that to strip out any html |
can we close this ticket? people seem pretty capable of / content to roll
their own solutions
…On Sun, Nov 5, 2023 at 04:10 Andrew Cartwright ***@***.***> wrote:
Not the nicest of ways to do it, but I had some cases where sometimes
HTMLElement would work and sometimes it wouldn't. I couldn't figure out why,
so I changed it to be even more hacky. I mocked it using jest spies.
innerText.ts
// Spies are kept at module scope so resetSpies can restore them later.
let spyGet;
let spySet;

export const createSpies = () => {
  // Define a placeholder accessor so jest.spyOn has a getter and setter to wrap.
  Object.defineProperty(Object.prototype, 'innerText', {
    get: () => {},
    set: () => {},
    configurable: true
  });

  // eslint-disable-next-line @typescript-eslint/ban-types
  spyGet = jest.spyOn(Object.prototype, 'innerText' as keyof Object, 'get');
  spyGet.mockImplementation(function () {
    // Approximate innerText: trim each line of textContent and drop blank lines.
    return this.textContent
      .split('\n')
      .filter((text: string) => text && !text.match(/^\s+$/))
      .map((text: string) => text.trim())
      .join('\n');
  });

  // eslint-disable-next-line @typescript-eslint/ban-types
  spySet = jest.spyOn(Object.prototype, 'innerText' as keyof Object, 'set');
  spySet.mockImplementation(function (value) {
    this.textContent = value;
  });
};

export const resetSpies = () => {
  spyGet?.mockRestore();
  spySet?.mockRestore();
};
Then you can call it using createSpies and resetSpies in your jest tests.
|
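The getter in the snippet above boils down to a small text normalization step. As a sketch, it can be pulled out into a plain function so it can be checked without Jest or a DOM. The name approximateInnerText is hypothetical, and the result is only a rough stand-in for real innerText, since it knows nothing about CSS or layout.

```typescript
// Hypothetical helper: mirrors the getter logic from the spy-based mock.
// It trims every line of the given textContent and drops lines that are
// empty or contain only whitespace.
function approximateInnerText(textContent: string): string {
  return textContent
    .split("\n")
    .filter((line) => line.trim().length > 0) // drop blank lines
    .map((line) => line.trim()) // strip indentation and trailing spaces
    .join("\n");
}

// Example: indentation and blank lines from pretty-printed markup are removed.
const sample = "  Hello \n   \n  world\n";
console.log(JSON.stringify(approximateInnerText(sample)));
```

Keeping the normalization separate from the spies makes it easier to swap in a different approximation later without touching the test setup.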
Can this please be documented as a limitation in the README? Speaking of which, is there a link that lists all the JSDOM limitations? If so, can this link be added to the README? If it's already there, it seems like it should be found on the "Unimplemented parts" section. EDIT: Here's my attempt at documenting this: #3631 |
See jsdom#1245 for context.
In my opinion, that's even scarier to think about. It's hard enough to troubleshoot when it's in your own project, imagine how hard it would be when you're degrees of separation away from the code that caused the problem. I'm not arguing for a breaking change, but maybe:
IMO, any one of these would improve the current situation. |
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 Task: 3754944
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 closes #3894 Task: 3754944 Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com>
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 Task: 3754944 X-original-commit: a11b5dc
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 closes #3915 Task: 3754944 X-original-commit: a11b5dc Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com> Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 Task: 3754944 X-original-commit: 91d7a19
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 closes #3919 Task: 3754944 X-original-commit: 91d7a19 Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com> Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 Task: 3754944 X-original-commit: ed7d3fa
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 closes #3922 Task: 3754944 X-original-commit: ed7d3fa Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com> Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 closes #3921 Task: 3754944 X-original-commit: ed7d3fa Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com> Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
The issue addressed in PR #3351 unfortunately used a `contentEditable` value that is not compatible with Firefox [1] This revision implements the first strategy explored by the aforementioned PR by removing the full format from the contenteditable span when we stop the edition. Testing required to define some properties as JSdom does not properly support the implementation of innerText (see [2]) [1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/contenteditable#browser_compatibility [2]: jsdom/jsdom#1245 closes #3920 Task: 3754944 X-original-commit: ed7d3fa Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com> Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
jsdom is a great tool for web scraping. However, `textContent` is a very inconvenient way to get readable text for html2text conversion.

There is a wonderful article about the usefulness of the neglected `innerText` in many cases: http://perfectionkills.com/the-poor-misunderstood-innerText/

The author suggests `getSelection().toString()` as a very slow workaround, but `getSelection` is not implemented in jsdom yet.

Could you consider implementing `innerText` in jsdom? The author has done a great exploration of it, and he has even added a simple spec at the end.