Skip to content

Commit

Permalink
Fix missing reply text when message body is parsed as HTML in [`linke…
Browse files Browse the repository at this point in the history
…dom`](https://github.com/WebReflection/linkedom) (SSR).

 - [`linkedom`](https://github.com/WebReflection/linkedom) is being used https://github.com/matrix-org/matrix-public-archive to server-side render (SSR) Hydrogen (`hydrogen-view-sdk`)
 - This is being fixed by using a explicit HTML wrapper boilerplate with `DOMParser` to get a matching result in the browser and `linkedom`.

Currently `parseHTML` is only used for HTML content bodies in events. Events with replies have content bodies that look like `<mx-reply>Hello</mx-reply> What's up` so they're parsed as HTML to strip out the `<mx-reply>` part.

Before | After
---  |  ---
![](https://user-images.githubusercontent.com/558581/153692011-2f0e7114-fcb4-481f-b217-49f461b1740a.png) | ![](https://user-images.githubusercontent.com/558581/153692016-52582fdb-abd9-439d-9dce-3f04da6959db.png)

Before:
```js
// Browser (Chrome, Firefox)
new DOMParser().parseFromString(`<div>foo</div>`, "text/html").body.outerHTML;
// '<body><div>foo</div></body>'

// `linkedom` ❌
new DOMParser().parseFromString(`<div>foo</div>`, "text/html").body.outerHTML;
// '<body></body>'
```

After (consistent matching output):

```js
// Browser (Chrome, Firefox)
new DOMParser().parseFromString(`<!DOCTYPE html><html><body><div>foo</div></body></html>`, "text/html").body.outerHTML;
// '<body><div>foo</div></body>'

// `linkedom`
new DOMParser().parseFromString(`<!DOCTYPE html><html><body><div>foo</div></body></html>`, "text/html").body.outerHTML;
// '<body><div>foo</div></body>'
```

`linkedom` goal is to be close to the current DOM standard, but [not too close](https://github.com/WebReflection/linkedom#faq). Focused on the streamlined cases for server-side rendering (SSR).

Here is some context around getting `DOMParser` to interpret things better. The conclusion was to only support the explicit standard cases with a `<html><body></body></html>` specified instead of adding the magic HTML document creation and massaging that the browser does.

 - WebReflection/linkedom#106
 - WebReflection/linkedom#108

 ---

Part of #653 to support server-side rendering Hydrogen for the [`matrix-public-archive`](https://github.com/matrix-org/matrix-public-archive) project.
  • Loading branch information
MadLittleMods committed Feb 12, 2022
1 parent 75e2618 commit dfed041
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/platform/web/parsehtml.js
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,6 @@ export function parseHTML(html) {
// If DOMPurify uses DOMParser, can't we just get the built tree from it
// instead of re-parsing?
const sanitized = DOMPurify.sanitize(html, sanitizeConfig);
const bodyNode = new DOMParser().parseFromString(sanitized, "text/html").body;
const bodyNode = new DOMParser().parseFromString(`<!DOCTYPE html><html><body>${sanitized}</body></html>`, "text/html").body;
return new HTMLParseResult(bodyNode);
}

0 comments on commit dfed041

Please sign in to comment.