Try and preserve the structure of the html during a diff #350

Open
wants to merge 6 commits into
base: master

Commits on Sep 12, 2022

  1. Try and preserve the structure of the html during a diff

    The current `htmldiff` code has a bug whereby the generated diff changes
    the structure of the HTML:
    
    ```python
    >>> from lxml.html import diff
    >>> a = "<div id='first'>some old text</div><div id='last'>more old text</div>"
    >>> b = "<div id='first'>some old text</div><div id='middle'>and new text</div><div id='last'>more old text</div>"
    >>> diff.htmldiff(a, b)
    ('<div id="middle"> <div id="first"><ins>some old text</ins></div><ins>and new</ins> <del>some old</del> text</div><div id="last">more old text</div>')
    >>>
    
    ```
    
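    In the output above, the unchanged `first` div ends up nested inside the
    newly inserted `middle` div and its text is marked as inserted.

    This patchset is an attempt to fix that issue.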
    lonetwin committed Sep 12, 2022
    ba831cf
  2. Attempt to be more consistent with the rest of the code in terms of quotes and string interpolation.

    lonetwin committed Sep 12, 2022
    c4c504c
  3. b340f0a
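
The structural change described in the first commit can be checked mechanically. The sketch below uses only the standard library and compares the element skeleton of the new document with that of the buggy diff output quoted in the commit message, ignoring the `<ins>`/`<del>` markers. `Skeleton` and `skeleton` are hypothetical helper names for illustration and are not part of lxml; the sketch also assumes the input has no void elements such as `<br>`.

```python
from html.parser import HTMLParser


class Skeleton(HTMLParser):
    """Record (depth, tag, id) for each element, ignoring <ins>/<del> markers.

    Assumption: the input contains no void elements (<br>, <img>, ...),
    which would otherwise throw off the depth counter.
    """

    IGNORED = {"ins", "del"}

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        if tag in self.IGNORED:
            return
        self.nodes.append((self.depth, tag, dict(attrs).get("id")))
        self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.IGNORED:
            return
        self.depth -= 1


def skeleton(html):
    """Return the list of (depth, tag, id) tuples for an HTML fragment."""
    parser = Skeleton()
    parser.feed(html)
    parser.close()
    return parser.nodes


# The new document "b" and the buggy output, both copied from the commit message.
new_doc = (
    "<div id='first'>some old text</div>"
    "<div id='middle'>and new text</div>"
    "<div id='last'>more old text</div>"
)
buggy_diff = (
    '<div id="middle"> <div id="first"><ins>some old text</ins></div>'
    '<ins>and new</ins> <del>some old</del> text</div>'
    '<div id="last">more old text</div>'
)

print(skeleton(new_doc))     # three sibling divs at depth 0
print(skeleton(buggy_diff))  # 'first' is nested at depth 1 inside 'middle'
print(skeleton(new_doc) == skeleton(buggy_diff))
```

A structure-preserving diff of an insertion-only change would be expected to yield the same skeleton as the new document, so the final comparison would print `True`.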

Commits on Sep 13, 2022

  1. Slight cleanup

    Steve committed Sep 13, 2022
    66e89a8
  2. Use the same approach for unbalanced_end tags as we do for unbalanced_start

    ...also some minor cleanup.
    lonetwin committed Sep 13, 2022
    f4c38c6

Commits on Dec 15, 2022

  1. htmldiff: Avoid incorrectly identifying balanced/unbalanced tags in `merge_insert`
    
    In the recently revised implementation of `merge_insert`, we checked
    whether a tag belonged to the balanced or unbalanced tags list by looking up
    just the tag itself. This is unreliable, since the same tag might exist in
    both lists. Incorrect identification leads to the `<ins>` tags being skipped
    in some cases, resulting in an incorrect diff being rendered.
    
    This patchset fixes the issue described and adds a test to validate it.
    lonetwin committed Dec 15, 2022
    2c9aca8
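
The pitfall described in the last commit, identifying a tag by its text alone when the same text can be both balanced and unbalanced, can be illustrated in isolation. This is a minimal sketch and not lxml's actual `merge_insert` code: `unbalanced_positions` is a hypothetical helper, and tags are assumed to be plain strings without attributes.

```python
def unbalanced_positions(tokens):
    """Return the indices of tags in `tokens` that have no partner.

    Hypothetical helper for illustration; assumes tags look like
    "<name>" or "</name>" with no attributes.
    """
    open_stack = []   # indices of start tags still waiting for an end tag
    unbalanced = set()
    for i, tok in enumerate(tokens):
        if tok.startswith("</"):
            name = tok[2:-1]
            if open_stack and tokens[open_stack[-1]][1:-1] == name:
                open_stack.pop()      # closes the most recent open tag
            else:
                unbalanced.add(i)     # closes something outside this chunk
        elif tok.startswith("<"):
            open_stack.append(i)
    unbalanced.update(open_stack)     # start tags that were never closed
    return unbalanced


# The first "</div>" closes an element opened outside the chunk (unbalanced);
# the later "<div>" ... "</div>" pair is balanced.
tokens = ["</div>", "<div>", "new", "</div>"]

by_position = sorted(unbalanced_positions(tokens))

# Looking tags up by their text alone also flags the balanced "</div>".
unbalanced_text = {tokens[i] for i in by_position}
by_value = [i for i, tok in enumerate(tokens) if tok in unbalanced_text]

print(by_position)  # [0]
print(by_value)     # [0, 3]
```

Tracking positions rather than tag text keeps the two `</div>` occurrences distinct, which is the kind of distinction the commit message says the earlier lookup lost.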