New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Try and preserve the structure of the html during a diff #350
base: master
Are you sure you want to change the base?
Changes from 1 commit
ba831cf
c4c504c
b340f0a
66e89a8
f4c38c6
2c9aca8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -87,12 +87,20 @@ Whitespace is generally ignored for the diff but preserved during the diff: | |
second | ||
<ins>third</ins> </pre> | ||
|
||
Ensure we don't preserve the structure of the html during the diff: | ||
Ensure we preserve the html structure on doing the diff: | ||
|
||
>>> a = "<div id='first'>some old text</div><div id='last'>more old text</div>" | ||
>>> b = "<div id='first'>some old text</div><div id='middle'>and new text</div><div id='last'>more old text</div>" | ||
>>> print(htmldiff(a, b)) | ||
<div id="first"><ins>some old text</ins></div> <div id="middle"> <ins>and new</ins> <del>some old</del> text</div><div id="last">more old text</div> | ||
>>> pdiff(a, b) | ||
<div id="first"><ins>some old text</ins></div> <div id="middle"> | ||
<ins>and new</ins> <del>some old</del> text</div><div id="last">more | ||
old text</div> | ||
>>> a = "<div><p>Some text that will change</p><p>Some tags will be added</p></div>" | ||
>>> b = "<div><div><p>Some text that has changed a bit</p><p>All of this is new</p></div></div>" | ||
>>> pdiff(a, b) | ||
<div><div><p>Some text that <ins>has changed a bit</ins> </p> | ||
<p><ins>All of this is new</ins></p> </div> <del>will | ||
change</del><p><del>Some tags will be added</del></p> </div> | ||
Comment on lines
+98
to
+103
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This diff looks weird. Is it getting confused by the nested There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, that's correct. In this specific case, the rendered diff looks 'ok' to me (with the insertions preceding the deletions) but yeah, I imagine in larger context it might be less than ideal. I'm going to take another stab at this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. btw, fwiw, this isn't very different from the diff generated by the current code for the same input. The only difference being the closing current lxml diff: <div><div><p>Some text that <ins>has changed a bit</ins><p><ins>All of
this is new</ins></p> </p></div> <del>will change</del><p><del>Some
tags will be added</del></p> </div> diff generated with this patchset <div><div><p>Some text that <ins>has changed a bit</ins> </p>
<p><ins>All of this is new</ins></p> </div> <del>will
change</del><p><del>Some tags will be added</del></p> </div> |
||
|
||
The sixteen combinations:: | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition in this line probably doesn't work any more if
with_idx=True
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, wouldn't it ? Here
balanced
would be something like[(1, '<div>'), None, ...]
ifwith_idx==True
, so theif chunk is not None
check still does the right thing.