Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file-contents-sorter Not working same way everytime #794

Open
teksturi opened this issue Jul 29, 2022 · 10 comments
Open

file-contents-sorter Not working same way everytime #794

teksturi opened this issue Jul 29, 2022 · 10 comments

Comments

@teksturi
Copy link

I get strange issue that file-contents-sorter hook does not work same way everytime. I made little test case which trickers the issue. If i run this 10 times it will sometimes pass and sometimes not. I did not have time to look what problem might be.

Add this to file_contents_sorter_test.py

(
    b'pre\nPre\n',
    ['--unique', '--ignore-case'],
    PASS,
    b'pre\nPre\n',
),

And run it with
for run in {1..10}; do pytest tests/file_contents_sorter_test.py; done
This gives me that about 30% pass and 70% fails

@asottile
Copy link
Member

yeah this set call breaks unique sorting -- it should probably just forbid the combination of unique and ignore case since the result is undefined https://github.com/pre-commit/pre-commit-hooks/blob/v4.3.0/pre_commit_hooks/file_contents_sorter.py#L36

@renegaderyu
Copy link

@asottile please review if you have time. I'm hoping this PR is simple enough and goes with the spirit of forbidding the combinations of options as you mentioned. Also, I'd appreciate if you could label w/ hacktoberfest-accepted so I can get a tree planted, thanks.

@asottile
Copy link
Member

I'm not going to review something which doesn't pass tests

@renegaderyu
Copy link

@asottile Apologies for not seeing the failing tests before asking. I think its ready now.

@XuehaiPan
Copy link

XuehaiPan commented Mar 27, 2023

yeah this set call breaks unique sorting -- it should probably just forbid the combination of unique and ignore case since the result is undefined v4.3.0/pre_commit_hooks/file_contents_sorter.py#L36

Could we respect the original word case if the uncased words are equal? For example, let Pre always proceed pre because it's upper case. This can be easily implemented with tuple-based sort keys.

def key(s):
    return (s.lower(), s)

ret = sorted(set(word_list), key=key)

Another option is to respect the original order (stable sort). This can use an "OrderedSet" rather than set.

def unique(iterable):
    return list(OrderedDict.fromkeys(iterable))  # can be replaced with `dict` for Python 3.7+

Both approaches I list above are deterministic.

@asottile
Copy link
Member

asottile commented Mar 27, 2023 via email

@Childcity

This comment was marked as off-topic.

@asottile

This comment was marked as off-topic.

@Childcity

This comment was marked as off-topic.

@asottile

This comment was marked as off-topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

5 participants