Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Doc] Clarify wrapper behaviour of parse() #153

Open
DanKaplanSES opened this issue Sep 18, 2021 · 5 comments
Open

[Doc] Clarify wrapper behaviour of parse() #153

DanKaplanSES opened this issue Sep 18, 2021 · 5 comments

Comments

@DanKaplanSES
Copy link

Here is a failing test case:

    it("reproduces error", () => {
      const html = `
      <html>
        <body>
          <div>
            <p>b</p>
            <p>c</p>
          <div>
        </body>
      </html>`
      
      const root = parse(html)
      root.querySelector('div').childNodes.unshift(parse('<p>a</p>'));

      expect(root.querySelector("p:nth-of-type(2)").rawText).toEqual("b");
/*
    Expected: "b"
    Received: "c"

      28 |       root.querySelector('div').childNodes.unshift(parse('<p>a</p>'));
      29 |
    > 30 |       expect(root.querySelector("p:nth-of-type(2)").rawText).toEqual("b");
         |                                                              ^
      31 |     });
*/
    });

My workaround was to use childNodes[0].set_content(newContent + oldContent).

The reason I wrote this code in the first place is because I couldn't figure out from the documentation how to prepend content. I looked at the tests and noticed some of them were using childNodes[0] and figured, if you can index that array, you should be able to call other functions on it and have it work.

@DanKaplanSES
Copy link
Author

I found another bug using my workaround: it seems to nest another div around everything. I think there are a lot of bugs to find using similar patterns to the above.

@taoqf
Copy link
Owner

taoqf commented Sep 18, 2021

const root = parse(html);
root.querySelector('div').childNodes.unshift(parse('<p>a</p>').firstChild);

@taoqf taoqf closed this as completed Sep 18, 2021
@DanKaplanSES
Copy link
Author

I don't understand why I need to do that firstChild. To me, it suggests that everything named root in the documentation is not really the root.

Regardless, finding a part of HTML and modifying it (or its neighbors) seems like a common use case. Can the readme be updated to show examples of this use case?

I don't know if you would prefer another issue for this, or for me to change the title of this one. Let me know.

@taoqf
Copy link
Owner

taoqf commented Sep 22, 2021

The root node is not really a node, it is a container of all the root nodes.

@nonara
Copy link
Collaborator

nonara commented Sep 22, 2021

The node in question is a wrapper node. It is a commonly used pattern for html parsers, however, I think it makes sense to add something to the readme which explains the behaviour of parse.

If you want to change the issue title, I'll reopen if @taoqf is amenable to my doing a PR that clarifies that bit in the readme.

@nonara nonara changed the title nth-by-type is off by one after childNodes.unshift(x) [Doc] Clarify wrapper behaviour of parse() Sep 29, 2021
@nonara nonara reopened this Sep 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants