Releases: fb55/htmlparser2

v9.1.0

05 Jan 11:06

Fixes

Features

v9.0.0

10 May 09:05

Breaking Changes

  • The tokenizer now uses the EntityDecoder from the entities module #1480
    • Parsing of entities in attributes is now aligned with the HTML spec, so some inputs will produce different results. For example, in <a href='&amp=boo'> the attribute value is no longer modified.
    • The ontextentity tokenizer callback now has an endIndex argument; if you use the tokenizer directly, verify that the indices you rely on are unchanged.
  • Stacks inside the parser have been reversed. #1511
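
The attribute rule mentioned above can be sketched in isolation. The HTML spec leaves a named character reference inside an attribute value untouched when it lacks a terminating semicolon and is followed by "=" or an ASCII alphanumeric. The snippet below is a minimal, self-contained illustration of that rule only; it does not use htmlparser2 or the entities module, and the function name decodeAttributeValue and the four-entry table are hypothetical.

```typescript
// Hypothetical, illustrative subset of entity names. This is NOT the table
// used by htmlparser2 or the entities module.
const LEGACY_ENTITIES: { [name: string]: string | undefined } = {
    amp: "&",
    lt: "<",
    gt: ">",
    quot: '"',
};

// Sketch of the HTML spec rule for character references in attribute values:
// a reference without a trailing ";" that is followed by "=" or an ASCII
// alphanumeric is kept as literal text.
function decodeAttributeValue(value: string): string {
    return value.replace(
        /&(amp|lt|gt|quot)(;?)/g,
        (match: string, name: string, semicolon: string, offset: number) => {
            if (semicolon === "") {
                const next = value.charAt(offset + match.length);
                if (next === "=" || /[A-Za-z0-9]/.test(next)) {
                    return match; // Leave the text untouched.
                }
            }
            const decoded = LEGACY_ENTITIES[name];
            return decoded === undefined ? match : decoded;
        },
    );
}

console.log(decodeAttributeValue("&amp=boo")); // "&amp=boo" (unchanged)
console.log(decodeAttributeValue("a&amp;b")); // "a&b"
```

A full decoder also needs the complete entity table, numeric references, and longest-match handling, all of which this sketch omits.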

Features

  • Added a createDocumentStream function, analogous to createDomStream (which is now deprecated) #1510

Full Changelog: v8.0.2...v9.0.0

v8.0.2

22 Mar 23:43

Bug Fixes

  • Reset tokenizer baseState after closing tag name by @KillyMXI in #1460

Other changes

  • Dependency version bumps
  • GitHub Workflows security hardening by @sashashura in #1365
  • refactor(lint): Add eslint-plugin-n and -unicorn by @fb55 in #1352
  • chore(test): Move from JSON tests to specs by @fb55 in #1354
  • docs(readme): Use GitHub Actions CI badge by @fb55 in #1374

New Contributors

Full Changelog: v8.0.1...v8.0.2

v8.0.1

29 Apr 15:44
  • Added missing WritableStream export in the package.json 6923fca

Full Changelog: v8.0.0...v8.0.1

v8.0.0

23 Apr 11:54

Breaking

  • The deprecated FeedHandler class has been removed #1166
    • See #1166 for how to migrate.
  • TypeScript >= 4.5 is now required; see #1242
  • The types from domhandler and domutils have changed, and the deprecated normalizeWhitespace option was removed #1164
  • The parser was updated to no longer concatenate strings. This led to several changes of internal interfaces. #1045
    • This reduces the memory overhead when parsing streams, and avoids copying memory.
    • Breaking if you were previously extending internals.

Features

  • htmlparser2 is now a dual CommonJS & ESM module #1165

Other changes

New Contributors

Full Changelog: v7.2.0...v8.0.0

v7.2.0

11 Nov 14:33

What's Changed

Fixes:

Docs

  • docs(readme): make parseDocument() example clearer by @cameronsteele in #998

Refactors:

  • Introduce sequences & fast forwarding by @fb55 in #1007
  • Emit text before entities once entity is confirmed by @fb55 in #1009

The refactors lead to a combined ~5% speed-up.

New Contributors

  • @cameronsteele made their first contribution in #998

Full Changelog: v7.1.2...v7.2.0

v7.1.2

11 Sep 18:51

Full Changelog: v7.1.1...v7.1.2

v7.1.1

29 Aug 13:11
  • Fixed a bug where implied close tags would be misreported (#933) 903fb43
  • Fixed endIndex of text events being off by 1 (#932) 78ef1b7

Full Changelog: v7.1.0...v7.1.1

v7.1.0

27 Aug 23:49

Features:

Fixes:

  • htmlparser2@7.0.0 changed how indices were computed. Unfortunately, a lot of edge cases weren't handled correctly. This version fixes them.
    • refactor: Fix how indices are computed, add attrib indices (#929) 28c162b
    • fix(parser): Fix indices for end, CDATA, add indices to tests (#928) 4e25252
    • fix(parser): Don't override position for implied opening tags (#917) fac221d
    • fix(parser): Index of closing tag was misaligned (#913) 04c411c
  • .pause would lead to data being wrongfully discarded (#927) 78af88d
  • The tokenizer would still emit some data after an error (#923) 08b2040
  • Issue in foreign content: The tag name foreignObject will always be lowercased in HTML e852205

Refactors:

  • refactor(feeds): Move getFeed to domutils (#931) f10dc03
  • refactor(tokenizer): Use explicit empty buffer if we have reached the end 9c30fe6
  • chore(tests): Add test for error without a listener 0eb0067
  • chore(tests): Use proxies to collect events (#920) a2b0bf3
  • chore(tests): Move stream tests into WritableStream.spec (#916) da67eba
  • refactor(tokenizer): Remove unused branches, improve test coverage (#914) a2eae51
  • docs(readme): Update benchmark results d45fc82

Full Changelog: v7.0.0...v7.1.0

v7.0.0

20 Aug 21:42

htmlparser2@7.0.0 changes a lot of internals, resulting in a 20% overall performance improvement in AndreasMadsen's htmlparser-benchmark.

Breaking changes:

  • Fixed how start & end index positions are calculated (#910) 5ab080e
    • Some indices, especially end indices, have changed. Most importantly, end indices will now always be greater than or equal to start indices (whoops!).
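
If your code slices the source text using reported indices, a small guard makes the invariant above explicit. The following is a minimal sketch, assuming end indices are inclusive (so the slice needs endIndex + 1). The IndexedEvent shape and the rawSlice helper are hypothetical names for illustration, and the index values in the example are written by hand, not captured from the parser.

```typescript
// Hypothetical shape for an event or node carrying source positions.
interface IndexedEvent {
    startIndex: number;
    endIndex: number; // Assumed inclusive in this sketch.
}

// Returns the source text covered by an event, rejecting ranges that
// violate the "endIndex >= startIndex" invariant.
function rawSlice(source: string, event: IndexedEvent): string {
    if (event.endIndex < event.startIndex) {
        throw new RangeError(
            "endIndex " + event.endIndex + " is smaller than startIndex " + event.startIndex,
        );
    }
    return source.slice(event.startIndex, event.endIndex + 1);
}

// Hand-written indices for the string "<p>hi</p>".
const source = "<p>hi</p>";
console.log(rawSlice(source, { startIndex: 0, endIndex: 2 })); // "<p>"
console.log(rawSlice(source, { startIndex: 3, endIndex: 4 })); // "hi"
```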

Features:

  • Added an isVoidElement method to the parser (#785) 00ce57a

Refactors:

  • Use a trie to decode HTML & XML entities in the tokenizer (#863) 9a47a55
    • Leads to large speed-ups when dealing with entities.
  • Iterate over char codes in the tokenizer (#894) f5aed75
    • Improved tokenizer performance by ~40%.
  • Use Map for openImpliesClose in the parser (#911) 39a8109
  • Moved logic of FeedHandler to a function (#912) 3a672ff
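
To illustrate the first two refactors, here is a minimal sketch of trie-based entity decoding that walks the input by char code. It is not htmlparser2's implementation: TrieNode, buildTrie, and decodeEntities are hypothetical names, and the three-entry table stands in for the full entity list. Comparing numbers from charCodeAt avoids allocating one-character strings, and the trie finds a match in a single pass over the candidate name.

```typescript
// Hypothetical trie node: children are keyed by UTF-16 char code.
interface TrieNode {
    children: { [code: number]: TrieNode | undefined };
    value?: string; // Decoded text if an entity name ends at this node.
}

// Builds a trie from [name, decoded] pairs (names exclude the leading "&").
function buildTrie(entities: Array<[string, string]>): TrieNode {
    const root: TrieNode = { children: {} };
    for (const [name, decoded] of entities) {
        let node = root;
        for (let i = 0; i < name.length; i++) {
            const code = name.charCodeAt(i);
            let next: TrieNode | undefined = node.children[code];
            if (next === undefined) {
                next = { children: {} };
                node.children[code] = next;
            }
            node = next;
        }
        node.value = decoded;
    }
    return root;
}

const AMPERSAND = 0x26; // "&"

// Replaces known entities, keeping the longest match found in the trie.
function decodeEntities(input: string, trie: TrieNode): string {
    let output = "";
    let textStart = 0;
    let i = 0;
    while (i < input.length) {
        if (input.charCodeAt(i) !== AMPERSAND) {
            i++;
            continue;
        }
        let node = trie;
        let matchEnd = -1;
        let matchValue = "";
        for (let j = i + 1; j < input.length; j++) {
            const next = node.children[input.charCodeAt(j)];
            if (next === undefined) break;
            node = next;
            if (node.value !== undefined) {
                matchEnd = j + 1;
                matchValue = node.value;
            }
        }
        if (matchEnd === -1) {
            i++; // Not a known entity; keep the "&" as text.
            continue;
        }
        output += input.slice(textStart, i) + matchValue;
        i = matchEnd;
        textStart = matchEnd;
    }
    return output + input.slice(textStart);
}

const sampleTrie = buildTrie([["amp;", "&"], ["lt;", "<"], ["gt;", ">"]]);
console.log(decodeEntities("a &lt; b &amp;&amp; c", sampleTrie)); // "a < b && c"
```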