HTMLReader handles numeric and named entity references in different ways #1

Open
Quutti opened this issue Oct 28, 2019 · 2 comments

Quutti commented Oct 28, 2019

function THtmlReader.ReadNumericEntityNode: Boolean;

ReadNumericEntityNode handles its input differently from ReadNamedEntityNode. A numeric entity is read as a TEXT_NODE, while a named entity is read as an ENTITY_REFERENCE_NODE. The two functions also trigger different events, so HTMLParser handles the two cases separately, which may cause problems when parsing HTML. E.g. &lt; and &#60; are handled in separate ways.
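
To make the expectation concrete, here is a minimal Free Pascal sketch. It is not HTMLp code: ResolveEntity and its tiny entity table are hypothetical names made up for illustration. It only shows that a named reference and a numeric reference can be resolved through one code path, since &lt; and &#60; both denote the character U+003C.

```pascal
program EntityDemo;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper, not part of HTMLp. Body is the text between
// '&' and ';' of an entity reference. Returns True and sets CodePoint
// when the reference is recognised.
function ResolveEntity(const Body: string; out CodePoint: Integer): Boolean;
begin
  CodePoint := 0;
  Result := False;
  if Body = '' then
    Exit;

  if Body[1] = '#' then
  begin
    // Numeric reference: '#60' is decimal, '#x3C' is hexadecimal.
    if (Length(Body) > 2) and (UpCase(Body[2]) = 'X') then
      Result := TryStrToInt('$' + Copy(Body, 3, MaxInt), CodePoint)
    else
      Result := TryStrToInt(Copy(Body, 2, MaxInt), CodePoint);
    // Reject zero and negative values.
    Result := Result and (CodePoint > 0);
  end
  else
  begin
    // Named reference: a deliberately tiny table, for illustration only.
    Result := True;
    if Body = 'lt' then CodePoint := 60
    else if Body = 'gt' then CodePoint := 62
    else if Body = 'amp' then CodePoint := 38
    else if Body = 'quot' then CodePoint := 34
    else Result := False;
  end;
end;

var
  Named, Numeric, Hex: Integer;
begin
  if ResolveEntity('lt', Named) and ResolveEntity('#60', Numeric)
     and ResolveEntity('#x3C', Hex) then
    WriteLn('named=', Named, ' decimal=', Numeric, ' hex=', Hex);
end.
```

All three forms resolve to code point 60 here, so a reader built this way could report a single node type for both kinds of reference.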

Do you guys know whether this is intended functionality or not? Does the HTML parsing spec state that these have to be parsed in different ways or something?

I can also provide a PR to fix this if needed.

Owner

ange007 commented Oct 30, 2019

Hello.
I am not the original author.
One of the original authors is @smsisko, but as far as I understand, he does not develop the library on GitHub (only on SourceForge, and the last version there is very old).

I just made a fork and reworked it for myself: https://github.com/ange007/HTMLp/tree/modern
Description: #2

But I’m ready to accept edits, both in the original branch and in my own, if that is of interest.


smsisko commented Oct 31, 2019

Hi, I haven't used that library in a long time. Back then I made a couple of changes with the original author for a problem I ran into, but that was about it. There wasn't a lot of activity, so I was added as a maintainer. Nowadays I don't often have an occasion to use Delphi (or Free Pascal), and I really haven't kept up to date with the HTML standard.
Sorry I can't be more helpful.
