Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addressing problems raised in sphinx_issues #10104 #863

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

hoangduytranuk
Copy link

branch

master

purpose

  • Addressed several issues with PYTHON_FORMAT pattern
  • Addressed issues with comment blocks

details

  • The PYTHON_FORMAT pattern wrongly found incidences of percentage sign, such as 50% or %28c in links, and flagging the message as python-format, which is exposed under the use of editors such as PoEdit as a wrong use of python-format feature, which many translators use. This editor is RECOMMENDED by Blender.org for use in the internationalisation projects.
  • Problem is solved by added look behind to see if it is not a space character and look ahead to see if the character ended the pattern is one of the allowed symbols - note: many will probably needed to manually enter to allow expansion of the use of '%s %d' etc. - I have only added characters that were found in one of the blender's po file vi.po file. One can download this file to test and simply use the following Python code:
from sphinx_intl import catalog as c
home = os.environ['HOME']
file_path = os.path.join(home, 'vi.po')
out_file = os.path.join(home, 'test_vi.po')
data = c.load_po(file_path)
c.dump_po(out_file, data, line_width=4096)

The original file doesn't contain any 'python-format', after the test, there should be 582 entries flagged as python-format, if load into the PoEditor, there should not be any PINK flagging indicating misuse of python-format.

  • Comment blocks are particularly good use for developers and translators as they are the quick reference to where the line originated from (which source files) and what comments developers had placed in to notify others. For this reason, several issues are address here:
    - Remove duplications in the entries
    - Sort entries alphabetically for easer spotting and locating the origins
    - Remove the wrapping by line length to the comment block as this SHOULD NOT be done. The observation that xgettext also performing wrapping is a fallacy as test shown using 'msgmerge --no-wrap' doesn't always wrap every lines, only a few incidences are found, indicating there is an error in the original xgettext code. This wrapping is particularly annoying for developers observing changes in the po files in a repository. Only physical changes to either msgid or msgstr should be observed, not the comment block, especially when there are no changes and the flagging of changes is created by the effects of wrapping functionality of the software.

link to the issues:

Message.locations duplicate unnecessary #10104

Copy link
Member

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beyond the inline comments:

  • can you clean up the commits and commit messages?
  • could you add tests that prove these changes do what they should?

self.locations = list(distinct(locations))
self.locations = list(set(locations))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this changed? Using set will lose the original order of locations.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for your comments. Please read my original post here (Message.locations duplicate unnecessary) for the original observations and tests performed. I just personally found that locations will need to be sorted in alphabetical order for user to easily observing and finding the file, especially when the location list is long. so to make it unique using 'set' would not affecting the performance and the correctness.

This is one example of location list in the Blender

#: ../../manual/about/contribute/guides/markup_guide.rst:143
#: ../../manual/addons/3d_view/3d_navigation.rst
#: ../../manual/addons/3d_view/math_vis_console.rst
#: ../../manual/addons/3d_view/measureit.rst
#: ../../manual/addons/3d_view/precision_drawing_tools/index.rst
#: ../../manual/addons/3d_view/stored_views.rst
#: ../../manual/addons/3d_view/vr_scene_inspection.rst
#: ../../manual/addons/add_curve/assign_shape_keys.rst
#: ../../manual/addons/add_curve/btracer.rst
#: ../../manual/addons/add_curve/curve_tools.rst
#: ../../manual/addons/add_curve/extra_objects.rst
#: ../../manual/addons/add_curve/ivy_gen.rst
#: ../../manual/addons/add_curve/sapling.rst
#: ../../manual/addons/add_curve/simplify_curves.rst
#: ../../manual/addons/add_mesh/ant_landscape.rst
#: ../../manual/addons/add_mesh/archimesh.rst
#: ../../manual/addons/add_mesh/boltfactory.rst
#: ../../manual/addons/add_mesh/discombobulator.rst
#: ../../manual/addons/add_mesh/geodesic_domes.rst
#: ../../manual/addons/add_mesh/mesh_extra_objects.rst
#: ../../manual/addons/animation/animall.rst
#: ../../manual/addons/animation/bone_selection_sets.rst
#: ../../manual/addons/animation/corrective_shape_keys.rst
#: ../../manual/addons/animation/turnaround_camera.rst
#: ../../manual/addons/camera/camera_rigs.rst
#: ../../manual/addons/development/dependency_graph.rst
#: ../../manual/addons/development/edit_operator.rst
#: ../../manual/addons/development/icon_viewer.rst
#: ../../manual/addons/development/is_key_free.rst
#: ../../manual/addons/interface/amaranth.rst
#: ../../manual/addons/interface/brush_menus.rst
#: ../../manual/addons/interface/collection_manager.rst
#: ../../manual/addons/interface/context_menu.rst
#: ../../manual/addons/interface/copy_attributes.rst
#: ../../manual/addons/interface/modifier_tools.rst
#: ../../manual/addons/interface/viewport_pies.rst
#: ../../manual/addons/lighting/dynamic_sky.rst
#: ../../manual/addons/lighting/sun_position.rst
#: ../../manual/addons/lighting/sun_position.rst:95
#: ../../manual/addons/lighting/trilighting.rst
#: ../../manual/addons/materials/material_library.rst
#: ../../manual/addons/materials/material_utils.rst
#: ../../manual/addons/mesh/3d_print_toolbox.rst
#: ../../manual/addons/mesh/auto_mirror.rst
#: ../../manual/addons/mesh/bsurfaces.rst
#: ../../manual/addons/mesh/edit_mesh_tools.rst
#: ../../manual/addons/mesh/f2.rst
#: ../../manual/addons/mesh/inset_straight_skeleton.rst
#: ../../manual/addons/mesh/looptools.rst
#: ../../manual/addons/mesh/relax.rst
#: ../../manual/addons/mesh/snap_utilities_line.rst
#: ../../manual/addons/mesh/tinycad.rst
#: ../../manual/addons/mesh/tissue.rst
#: ../../manual/addons/node/node_arrange.rst
#: ../../manual/addons/node/node_presets.rst
#: ../../manual/addons/node/node_wrangler.rst
#: ../../manual/addons/object/align_tools.rst
#: ../../manual/addons/object/bool_tools.rst
#: ../../manual/addons/object/carver.rst
#: ../../manual/addons/object/cell_fracture.rst
#: ../../manual/addons/object/color_rules.rst
#: ../../manual/addons/object/edit_linked_library.rst
#: ../../manual/addons/object/greasepencil_tools.rst
#: ../../manual/addons/object/real_snow.rst
#: ../../manual/addons/object/scatter_objects.rst
#: ../../manual/addons/object/skinify.rst
#: ../../manual/addons/paint/paint_palettes.rst
#: ../../manual/addons/render/copy_settings.rst
#: ../../manual/addons/render/povray.rst
#: ../../manual/addons/rigging/rigify/index.rst
#: ../../manual/addons/sequencer/power_sequencer.rst
#: ../../manual/addons/system/blend_info.rst
#: ../../manual/addons/system/blender_id.rst
#: ../../manual/addons/system/demo_mode.rst
#: ../../manual/addons/system/property_chart.rst
#: ../../manual/addons/system/ui_translations.rst
#: ../../manual/addons/uv/magic_uv.rst
#: ../../manual/addons/video_tools/refine_tracking.rst
#: ../../manual/animation/constraints/relationship/child_of.rst:42
#: ../../manual/editors/3dview/3d_cursor.rst:60
#: ../../manual/editors/3dview/sidebar.rst:98
#: ../../manual/grease_pencil/materials/properties.rst:164
#: ../../manual/grease_pencil/materials/properties.rst:189
#: ../../manual/modeling/geometry_nodes/input/object_info.rst:46
#: ../../manual/movie_clip/tracking/clip/sidebar/stabilization/panel.rst:55
#: ../../manual/physics/forces/force_fields/introduction.rst:116
#: ../../manual/render/shader_nodes/input/object_info.rst:33
#: ../../manual/render/shader_nodes/input/particle_info.rst:48
#: ../../manual/render/shader_nodes/input/point_info.rst:33
#: ../../manual/render/shader_nodes/vector/mapping.rst:23
#: ../../manual/scene_layout/object/editing/apply.rst:45
#: ../../manual/scene_layout/object/editing/transform/randomize.rst:32
#: ../../manual/scene_layout/object/properties/transforms.rst:33
#: ../../manual/scene_layout/object/types.rst:99
msgid "Location"
msgstr "Vị Trí (Location)"

The original document can be found here

Blender UI translation SVN Repository

You can just run

svn co https://svn.blender.org/svnroot/bf-translations/trunk/po

and take a look at the file. As a translator, I have occasionally observing changes from the 'blender.pot' file and I always wished this location data block is made distinct and sorted alphabetically.

@@ -276,6 +276,7 @@ def _process_comment(self, line):
self.locations.append((location[:pos], lineno))
else:
self.locations.append((location, None))
self.locations = list(set(self.locations))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, this should probably also use distinct. Also, can you explain what's happening here? Why is self.locations modified every time comments are processed?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, you're right, don't need to perform this here. When the message is created then do it there instead, once and for all. Thank you again.

Comment on lines 525 to 527
<<<<<<< HEAD
def _write_comment(comment, prefix='':
=======
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge conflict marker and syntax error.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why there is an error here, the code at my local appeared to be OK. I will make another push to ensure the correct code is placed.

else:
_width = 76
for line in wraptext(comment, _width):
>>>>>>> 67aa652a1c10bfdc92587cdae87fdab65d633b23
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge conflict marker.

Copy link

@ghost ghost May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are removed. There is no need for wrapping comments any more. Please refer back to the original post commented above

Comment on lines 532 to 533
has_comment = (bool(comment) and len(comment) > 0)
if not has_comment:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this simplifies to

Suggested change
has_comment = (bool(comment) and len(comment) > 0)
if not has_comment:
if not comment:

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this comment. I'm changing the code and pushing another commit.

Comment on lines +536 to +538
is_list_type = isinstance(comment, list)
comment_list = (list(comment) if not is_list_type else comment)
comment_list.sort()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what's happening here? Is it possible this reorders multiline comments into the wrong order?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I run the code in tests, there are times the comment came in as an instance of list, sometimes it is not. So to make sure they are all instance of list type, I test it first and ensure that it is a list type whatever the input is, and sort the list according ly. Please run this simple testing code lines:

import os
from sphinx_intl import catalog as c
home = os.environ['HOME']
home_base = os.path.join(home, "<where ever the po directory is>"
input_file = os.path.join(home_base, "blender.pot")
output_file = os.path.join(home_base, "test_output_blender.pot")

data = c.load_po(input_file)
c.dump_po(output_file. data, line_width=4096)

and load the output file into an editor (ie. vscode) and POEdit to test for python-format errors.

for comment in message.auto_comments:
_write_comment(comment, prefix='.')
_write_comment(message.user_comments)
_write_comment(message.auto_comments)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we no longer use the '.' prefix here?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this observation. I will be making a new commit for the corrected code.

@@ -35,7 +35,7 @@
(?:\.(?:\*|[\d]+))?
[hlL]?
)
([diouxXeEfFgGcrs%])
((?<!\s)[diouxXeEfFgGcrs%])(?=([\s\'\)\.\,\:\"\!\]\>\?]|$))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message(s) don't seem to explain why these changes are necessary.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please read the original post at

Message.locations duplicate unnecessary #10104

This is an extract of the original message:

By the way, in the past I posted a bug report mentioning the PYTHON_FORMAT problem, in that this re.Pattern causing the problem in recognizing this part "%50 'one letter'" (diouxXeEfFgGcrs%) as an ACCEPTABLE pattern, thus causing the flag "python_format" in the Message class to set, and the Catalog.dump_po will insert a "#, python-format" in the comment section of the message, causing applications such as PoEdit to flag up as a WRONG format for "python-format". The trick is to insert a look behind clause in the PYTHON_FORMAT pattern, as an example here:

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After all the above comments, and as you finished your preview, could you please give me a approval as what other changes I should make, before I make another push to the pull request branch. Thank you.

hoangduytranuk added 2 commits May 15, 2022 15:20
…RMAL behaviour user expected:

1. PYTHON_FORMAT will correctly, with some degrees of limitation (haven't found with sampled test file of Blender UI translation file, but that doesn't mean it will not be broken again) identify '%s' '%d' flags in the messages and flag it as 'python-format', which removed mistaken recognitions of percentage sign, ie. '%50' as a python-format, which POEditor will flag up as an ERROR in the message.
2. Locations and comments are sorted Alphabetically and removed duplicated redundant lines.
3. Comment lines are NO LONGER wrapped with provided line_width. There appeared to be an error in the xgettext code when running with --no-wrap flag. Message bodies (ie. msgid, msgstr) should be wrapped but NOT comments. The wrapping can cause unnecessary version changes appeared in the repository submissions.
…RMAL behaviour user expected:

1. PYTHON_FORMAT will correctly, with some degrees of limitation (haven't found with sampled test file of Blender UI translation file, but that doesn't mean it will not be broken again) identify '%s' '%d' flags in the messages and flag it as 'python-format', which removed mistaken recognitions of percentage sign, ie. '%50' as a python-format, which POEditor will flag up as an ERROR in the message.
2. Locations and comments are sorted Alphabetically and removed duplicated redundant lines.
3. Comment lines are NO LONGER wrapped with provided line_width. There appeared to be an error in the xgettext code when running with --no-wrap flag. Message bodies (ie. msgid, msgstr) should be wrapped but NOT comments. The wrapping can cause unnecessary version changes appeared in the repository submissions.

Resolved merge conflicts with previous changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants