[Doc] Clarify wrapper behaviour of parse() #153

DanKaplanSES · 2021-09-18T04:21:31Z

Here is a failing test case:

    it("reproduces error", () => {
      const html = `
      <html>
        <body>
          <div>
            <p>b</p>
            <p>c</p>
          <div>
        </body>
      </html>`
      
      const root = parse(html)
      root.querySelector('div').childNodes.unshift(parse('<p>a</p>'));

      expect(root.querySelector("p:nth-of-type(2)").rawText).toEqual("b");
/*
    Expected: "b"
    Received: "c"

      28 |       root.querySelector('div').childNodes.unshift(parse('<p>a</p>'));
      29 |
    > 30 |       expect(root.querySelector("p:nth-of-type(2)").rawText).toEqual("b");
         |                                                              ^
      31 |     });
*/
    });

My workaround was to use childNodes[0].set_content(newContent + oldContent).

The reason I wrote this code in the first place is because I couldn't figure out from the documentation how to prepend content. I looked at the tests and noticed some of them were using childNodes[0] and figured, if you can index that array, you should be able to call other functions on it and have it work.

The text was updated successfully, but these errors were encountered:

DanKaplanSES · 2021-09-18T04:42:42Z

I found another bug using my workaround: it seems to nest another div around everything. I think there are a lot of bugs to find using similar patterns to the above.

taoqf · 2021-09-18T09:48:37Z

const root = parse(html);
root.querySelector('div').childNodes.unshift(parse('<p>a</p>').firstChild);

DanKaplanSES · 2021-09-18T20:04:31Z

I don't understand why I need to do that firstChild. To me, it suggests that everything named root in the documentation is not really the root.

Regardless, finding a part of HTML and modifying it (or its neighbors) seems like a common use case. Can the readme be updated to show examples of this use case?

I don't know if you would prefer another issue for this, or for me to change the title of this one. Let me know.

taoqf · 2021-09-22T01:00:49Z

The root node is not really a node, it is a container of all the root nodes.

nonara · 2021-09-22T21:30:21Z

The node in question is a wrapper node. It is a commonly used pattern for html parsers, however, I think it makes sense to add something to the readme which explains the behaviour of parse.

If you want to change the issue title, I'll reopen if @taoqf is amenable to my doing a PR that clarifies that bit in the readme.

taoqf closed this as completed Sep 18, 2021

nonara changed the title ~~nth-by-type is off by one after childNodes.unshift(x)~~ [Doc] Clarify wrapper behaviour of parse() Sep 29, 2021

nonara added the documentation label Sep 29, 2021

nonara reopened this Sep 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Doc] Clarify wrapper behaviour of parse() #153

[Doc] Clarify wrapper behaviour of parse() #153

DanKaplanSES commented Sep 18, 2021

DanKaplanSES commented Sep 18, 2021

taoqf commented Sep 18, 2021

DanKaplanSES commented Sep 18, 2021

taoqf commented Sep 22, 2021

nonara commented Sep 22, 2021

[Doc] Clarify wrapper behaviour of parse() #153

[Doc] Clarify wrapper behaviour of parse() #153

Comments

DanKaplanSES commented Sep 18, 2021

DanKaplanSES commented Sep 18, 2021

taoqf commented Sep 18, 2021

DanKaplanSES commented Sep 18, 2021

taoqf commented Sep 22, 2021

nonara commented Sep 22, 2021