Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__doc__ attribute references disabling all removal of literals #11

Open
jplloyd opened this issue May 6, 2020 · 2 comments
Open

__doc__ attribute references disabling all removal of literals #11

jplloyd opened this issue May 6, 2020 · 2 comments
Labels
defect Something isn't working

Comments

@jplloyd
Copy link

jplloyd commented May 6, 2020

When running pyminify --remove-literal-statements file.py, any reference to __doc__ anywhere in the processed module will disable removal of any literal.

The documentation states (emphasis mine):

"If the module uses the __doc__ name the module docstring will be retained"

Should the documentation be changed to clarify that no literals will be removed, or is the current behavior not intended?

Another thing to note is that only references to attributes named __doc__ will disable the removal, meaning that e.g.

"""module docstring"""

print(__doc__ + " surely no sane person does this?")

will strip the docstring and make the resulting code fail. This is a very contrived example, of course, and I hope it doesn't actually exist anywhere (but it probably does).

It may be that this issue should be split up into multiple issues, depending on the intended behavior, apologies in advance.


Side note: does it make sense to disable literal removal due to references to any attribute named __doc__? For all we know the reference may be to the docstring of something that is defined in another module, which itself had its docstrings stripped.

It seems to me that since docstring removal in the general case comes with some risks, and should be done with care (even if most of the time it won't have any practical impact), there may be some value in providing an additional flag that disables the __doc__ check entirely (with appropriately severe warnings to the user).

@dflook
Copy link
Owner

dflook commented May 7, 2020

Thanks for creating an issue!
The documentation is correct that __doc__ should only retain the module docstring - this isn't working and is a bug.

In addition to that, any usage of a __doc__ attribute disables all literal statement removal, which is intended but not documented. This is a documentation defect. The reason for this is to minimise the possibility of broken output. This is something I actually ran into a lot when testing python minifier.

I'm thinking about replacing --remove-literal-statements with separate options for:

  • removing non-docstring literal statements
  • removing module docstring (always or suppressed by __doc__ name)
  • removing other docstrings (always or suppressed by __doc__ attribute.

@dflook dflook added the defect Something isn't working label May 7, 2020
@jplloyd
Copy link
Author

jplloyd commented May 8, 2020

I'm thinking about replacing --remove-literal-statements with separate options for:

* removing non-docstring literal statements

* removing module docstring (always or suppressed by `__doc__` name)

* removing other docstrings (always or suppressed by `__doc__` attribute.

Sounds sensible. Ultimately docstring removal doesn't really fall inside the scope of minification (in python), since it can change execution semantics, so making users have to explicitly enable the potential self-foot-shooting should be the way to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants