[bugfix] solve crash when using inspect() on the "pyparsing" package #2294

Merged
merged 3 commits into from May 27, 2022

@olivierphi olivierphi commented May 25, 2022

Type of changes

⚠️ This PR introduces a (minor) breaking change, in order to fix a (minor) issue described with the screenshots below. 👇
But we can of course choose to keep the existing behaviour as-is instead 🙂

Checklist

  • I've run the latest black with default args on new code.
  • I've updated CHANGELOG.md and CONTRIBUTORS.md where appropriate. (N/A)
  • I've added tests for new code.
  • I accept that @willmcgugan may be pedantic in the code review.

Description

As reported in the linked issue, running this command made Rich crash:

python3 -c "import pyparsing; from rich import inspect; inspect(pyparsing, methods=True)"

It took me a while to understand what was going wrong, as the exception is raised in Text.divide(offsets), which is quite a complex part 🤯
But in the end I identified the culprits 🥷: special ASCII (control) characters in docstrings!

We were stripping them from the text stored by the Text class, but when caching the length of that text we used the length of the non-sanitised input. This causes issues in some cases, because the text we're splitting into several lines doesn't actually have the length we think it has.
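
A minimal sketch of the mismatch described above. It assumes a simplified stand-in for the Text class: the names `MiniText` and `strip_control_codes_sketch`, and the set of stripped code points, are illustrative assumptions rather than Rich's actual code.

```python
# Hypothetical, simplified stand-in for the behaviour described above.
# The stripped code points (bell, backspace, VT, FF, CR) are an assumption.
_STRIP_TABLE = {7: None, 8: None, 11: None, 12: None, 13: None}


def strip_control_codes_sketch(text: str) -> str:
    """Remove a few control characters from the text."""
    return text.translate(_STRIP_TABLE)


class MiniText:
    """Stores sanitised text together with a cached length."""

    def __init__(self, text: str, *, buggy: bool) -> None:
        sanitized_text = strip_control_codes_sketch(text)
        self.plain = sanitized_text
        # The bug: caching the length of the raw argument instead of
        # the length of the text that is actually stored.
        self._length = len(text) if buggy else len(sanitized_text)

    def __len__(self) -> int:
        return self._length


doc = "emulate the ``\b`` behavior"  # non-raw string: "\b" is a backspace
buggy = MiniText(doc, buggy=True)
fixed = MiniText(doc, buggy=False)

print(len(buggy), len(buggy.plain))  # cached length exceeds the stored text by one
print(len(fixed), len(fixed.plain))  # both lengths agree
```

Any code that computes offsets from the cached length can then point past the end of the stored text, which is consistent with the crash surfacing in Text.divide().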

Should we make control characters contained in docstrings visible in the inspection?

I first solved the bug by just making sure the cached length of the text is that of its sanitised version, but then I noticed that the result looked a bit odd with docstrings that contain such special characters, such as the one on the WordStart class of the pyparsing package.

  • When dumped with inspect we ended up with "To emulate the ```` behavior of regular expressions", with 4 backticks in a row and nothing between them:
    Screenshot from 2022-05-25 10-56-56

  • That's why I opted for another strategy: replacing these control codes with their "readable" equivalents, so \b is displayed as "\b":
    Screenshot from 2022-05-25 10-58-53
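
The two strategies can be compared with a small sketch. The table names, the helper names, and the exact set of control characters handled here are assumptions made for illustration; only the idea (strip versus replace with a readable escape) comes from the description above.

```python
# Illustrative translation tables; the code points covered are an assumption.
STRIP_TABLE = {7: None, 8: None, 11: None, 12: None, 13: None}
ESCAPE_TABLE = {7: "\\a", 8: "\\b", 11: "\\v", 12: "\\f", 13: "\\r"}


def strip_codes(text: str) -> str:
    """Drop the control characters entirely."""
    return text.translate(STRIP_TABLE)


def escape_codes(text: str) -> str:
    """Replace each control character with its readable escape sequence."""
    return text.translate(ESCAPE_TABLE)


# Non-raw string, so "\b" is a single backspace character (code point 8).
doc = "To emulate the ``\b`` behavior of regular expressions"

stripped = strip_codes(doc)
escaped = escape_codes(doc)

print(stripped)  # To emulate the ```` behavior of regular expressions
print(escaped)   # To emulate the ``\b`` behavior of regular expressions
```

Note that escaping makes the text one character longer for each control character, so any cached length still has to be taken from the transformed text.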

However, although it's rather unlikely, I appreciate that such a change in Rich's behaviour could break people's code, for example if they parse the output of rich.inspect.
I guess it depends on whether we consider the stripping of control codes in docstrings, when inspecting code, a feature or a bug? 🤔
If it's a feature, we may prefer to keep the existing behaviour for the moment, and potentially change it later on with a major version bump, or with an opt-in flag somewhere in Rich? 🙂

@@ -61,3 +61,6 @@ enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool"]
 [[tool.mypy.overrides]]
 module = ["pygments.*", "IPython.*", "commonmark.*", "ipywidgets.*"]
 ignore_missing_imports = true
+
+[tool.pytest.ini_options]
+testpaths = ["tests"]
Contributor Author

While working on this I noticed that any Python script with a name starting with test_, anywhere in the project's repo, was collected by Pytest. I think we'd better tell Pytest to stick to the contents of our tests/ folder, which should also make test collection faster as a side effect.

@@ -19,12 +19,6 @@ def _first_paragraph(doc: str) -> str:
     return paragraph


-def _reformat_doc(doc: str) -> str:
-    """Reformat docstring."""
-    doc = cleandoc(doc).strip()
Contributor Author

I noticed that we ended up calling cleandoc twice on our extracted docstrings. To remedy this I inlined the operation in a single _get_formatted_doc method, later in this same module.

if _doc is not None:
if not self.help:
Contributor Author

The process of calling cleandoc() on the docstring's content, and keeping only its first paragraph if self.help is not True, is now factorised into a single method rather than done twice in this class.
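
As a rough sketch of what such a factorised helper could look like: the standalone function below and its `full_help` parameter are hypothetical (the PR implements this as a method that reads self.help), and taking the first paragraph as "everything up to the first blank line" is an assumption.

```python
from inspect import cleandoc
from typing import Optional


def get_formatted_doc_sketch(doc: Optional[str], *, full_help: bool) -> Optional[str]:
    """Clean a docstring once, keeping only its first paragraph unless full_help is set."""
    if doc is None:
        return None
    # cleandoc() removes the common indentation and surrounding blank lines.
    doc = cleandoc(doc).strip()
    if not full_help:
        # Assumption: a paragraph ends at the first blank line.
        doc = doc.partition("\n\n")[0]
    return doc


example = """
    Summary line.

    Longer explanation
    spanning two lines.
    """

print(get_formatted_doc_sketch(example, full_help=False))  # Summary line.
print(get_formatted_doc_sketch(example, full_help=True))
```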

rich/control.py (resolved)
         self.style = style
         self.justify: Optional["JustifyMethod"] = justify
         self.overflow: Optional["OverflowMethod"] = overflow
         self.no_wrap = no_wrap
         self.end = end
         self.tab_size = tab_size
         self._spans: List[Span] = spans or []
-        self._length: int = len(text)
+        self._length: int = len(sanitized_text)
Contributor Author

That's the bugfix itself! 🙂 i.e. we should use the length of the sanitised text, rather than that of the text we received as an argument.

                 old_length = self._length
-                self._length = len(new_text)
+                self._length = len(sanitized_text)
Contributor Author

same bug here 🙂

                 offset = len(self)
-                text_length = len(text)
+                text_length = len(sanitized_text)
Contributor Author

...and here

expected_replacement
)

assert render(Something, methods=True) == expected
Contributor Author

To make sure that we don't have regressions on this bugfix, before trying to solve it I started by writing a test that reproduces the issue: this test was crashing before the fix, and passes now 🎈

@olivierphi olivierphi force-pushed the bugfix-inspect-with-pyparsing branch from 4b1893a to 49e2b1d Compare May 25, 2022 10:32
@olivierphi olivierphi marked this pull request as ready for review May 25, 2022 10:45
@olivierphi olivierphi force-pushed the bugfix-inspect-with-pyparsing branch 2 times, most recently from 2320989 to 74d7807 Compare May 27, 2022 08:24
Collaborator

@willmcgugan willmcgugan left a comment

Great. Just a few pedantic requests, and we should be good to merge.

rich/control.py Outdated

_CONTROL_MAKE_READABLE_TRANSLATE: Final = {
Collaborator

Can we call this CONTROL_ESCAPE?

rich/control.py Outdated
@@ -182,6 +198,22 @@ def strip_control_codes(
     return text.translate(_translate_table)


+def make_control_codes_readable(
Collaborator

I'd prefer escape_control_code. Would you mind changing the terminology from "make readable" elsewhere?

@@ -220,3 +210,12 @@ def safe_getattr(attr_name: str) -> Tuple[Any, Any]:
                 f"[b cyan]{not_shown_count}[/][i] attribute(s) not shown.[/i] "
                 f"Run [b][magenta]inspect[/]([not b]inspect[/])[/b] for options."
             )
+
+    def _get_formatted_doc(self, object_: Any) -> Optional[str]:
Collaborator

Would you mind adding a docstring?

…eadable version in docstrings

With this we allow the following:
- Such characters will be displayed in the data returned by "rich.inspect"
- We fix the crash that can happen in some circumstances in the `Text.divide()` method when some docstrings have such special characters
…eadable version

With this we allow the following:
- Such characters will be displayed in the data returned by "rich.inspect" when they are used in docstrings
- We fix the crash that can happen in some circumstances in the `Text.divide()` method when some docstrings have such special characters
@olivierphi olivierphi force-pushed the bugfix-inspect-with-pyparsing branch from ce0b2ff to 25d0c0e Compare May 27, 2022 13:57
@olivierphi
Contributor Author

@willmcgugan "escape control codes" is better terminology indeed! 🙂
Hopefully this commit addresses your comments: 25d0c0e

@codecov-commenter

codecov-commenter commented May 27, 2022

Codecov Report

Merging #2294 (25d0c0e) into master (14d47c9) will decrease coverage by 0.12%.
The diff coverage is 95.42%.

@@            Coverage Diff             @@
##           master    #2294      +/-   ##
==========================================
- Coverage   98.88%   98.76%   -0.13%     
==========================================
  Files          73       73              
  Lines        7629     7684      +55     
==========================================
+ Hits         7544     7589      +45     
- Misses         85       95      +10     
Flag Coverage Δ
unittests 98.76% <95.42%> (-0.13%) ⬇️

Impacted Files Coverage Δ
rich/_export_format.py 100.00% <ø> (ø)
rich/live.py 97.90% <40.00%> (-2.10%) ⬇️
rich/console.py 98.29% <96.03%> (-0.49%) ⬇️
rich/_inspect.py 100.00% <100.00%> (ø)
rich/control.py 100.00% <100.00%> (ø)
rich/syntax.py 99.30% <100.00%> (+0.02%) ⬆️
rich/terminal_theme.py 100.00% <100.00%> (ø)
rich/text.py 100.00% <100.00%> (ø)
rich/traceback.py 98.68% <0.00%> (-0.88%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5ccf4ed...25d0c0e.

@willmcgugan
Collaborator

👍


Successfully merging this pull request may close these issues.

IndexError: list index out of range when using inspect(pyparsing, all=True)
3 participants