Skip to content

Commit

Permalink
Add check-spelling
Browse files Browse the repository at this point in the history
  • Loading branch information
jsoref committed Mar 18, 2024
1 parent 8943393 commit df949c3
Showing 1 changed file with 168 additions and 0 deletions.
168 changes: 168 additions & 0 deletions .github/workflows/spelling.yml
@@ -0,0 +1,168 @@
name: Check Spelling

# Comment management is handled through a secondary job, for details see:
# https://github.com/check-spelling/check-spelling/wiki/Feature%3A-Restricted-Permissions
#
# `jobs.comment-push` runs when a push is made to a repository and the `jobs.spelling` job needs to make a comment
# (in odd cases, it might actually run just to collapse a comment, but that's fairly rare)
# it needs `contents: write` in order to add a comment.
#
# `jobs.comment-pr` runs when a pull_request is made to a repository and the `jobs.spelling` job needs to make a comment
# or collapse a comment (in the case where it had previously made a comment and now no longer needs to show a comment)
# it needs `pull-requests: write` in order to manipulate those comments.

# Updating pull request branches is managed via comment handling.
# For details, see: https://github.com/check-spelling/check-spelling/wiki/Feature:-Update-expect-list
#
# These elements work together to make it happen:
#
# `on.issue_comment`
# This event listens to comments by users asking to update the metadata.
#
# `jobs.update`
# This job runs in response to an issue_comment and will push a new commit
# to update the spelling metadata.
#
# `with.experimental_apply_changes_via_bot`
# Tells the action to support and generate messages that enable it
# to make a commit to update the spelling metadata.
#
# `with.ssh_key`
# In order to trigger workflows when the commit is made, you can provide a
# secret (typically, a write-enabled github deploy key).
#
# For background, see: https://github.com/check-spelling/check-spelling/wiki/Feature:-Update-with-deploy-key

# Sarif reporting
#
# Access to Sarif reports is generally restricted (by GitHub) to members of the repository.
#
# Requires enabling `security-events: write`
# and configuring the action with `use_sarif: 1`
#
# For information on the feature, see: https://github.com/check-spelling/check-spelling/wiki/Feature:-Sarif-output

# Minimal workflow structure:
#
# on:
# push:
# ...
# pull_request_target:
# ...
# jobs:
# # you only want the spelling job, all others should be omitted
# spelling:
# # remove `security-events: write` and `use_sarif: 1`
# # remove `experimental_apply_changes_via_bot: 1`
# ... otherwise adjust the `with:` as you wish

on:
push:
branches:
- "**"
tags-ignore:
- "**"
pull_request_target:
branches:
- "**"
types:
- 'opened'
- 'reopened'
- 'synchronize'
issue_comment:
types:
- 'created'

jobs:
spelling:
name: Check Spelling
permissions:
contents: read
pull-requests: read
actions: read
security-events: write
outputs:
followup: ${{ steps.spelling.outputs.followup }}
runs-on: ubuntu-latest
if: ${{ contains(github.event_name, 'pull_request') || github.event_name == 'push' }}
concurrency:
group: spelling-${{ github.event.pull_request.number || github.ref }}
# note: If you use only_check_changed_files, you do not want cancel-in-progress
cancel-in-progress: true
steps:
- name: check-spelling
id: spelling
uses: check-spelling/check-spelling@prerelease
with:
suppress_push_for_open_pull_request: ${{ github.actor != 'dependabot[bot]' && 1 }}
checkout: true
check_file_names: 1
spell_check_this: check-spelling/spell-check-this@prerelease
post_comment: 0
use_magic_file: 1
report-timing: 1
warnings: bad-regex,binary-file,deprecated-feature,ignored-expect-variant,large-file,limited-references,no-newline-at-eof,noisy-file,non-alpha-in-dictionary,token-is-substring,unexpected-line-ending,whitespace-in-dictionary,minified-file,unsupported-configuration,no-files-to-check,unclosed-block-ignore-begin,unclosed-block-ignore-end
experimental_apply_changes_via_bot: 1
use_sarif: ${{ (!github.event.pull_request || (github.event.pull_request.head.repo.full_name == github.repository)) && 1 }}
extra_dictionary_limit: 20
extra_dictionaries: |
cspell:software-terms/dict/softwareTerms.txt
comment-push:
name: Report (Push)
# If your workflow isn't running on push, you can remove this job
runs-on: ubuntu-latest
needs: spelling
permissions:
actions: read
contents: write
if: (success() || failure()) && needs.spelling.outputs.followup && github.event_name == 'push'
steps:
- name: comment
uses: check-spelling/check-spelling@prerelease
with:
checkout: true
spell_check_this: check-spelling/spell-check-this@prerelease
task: ${{ needs.spelling.outputs.followup }}

comment-pr:
name: Report (PR)
# If you workflow isn't running on pull_request*, you can remove this job
runs-on: ubuntu-latest
needs: spelling
permissions:
actions: read
contents: read
pull-requests: write
if: (success() || failure()) && needs.spelling.outputs.followup && contains(github.event_name, 'pull_request')
steps:
- name: comment
uses: check-spelling/check-spelling@prerelease
with:
checkout: true
spell_check_this: check-spelling/spell-check-this@prerelease
task: ${{ needs.spelling.outputs.followup }}
experimental_apply_changes_via_bot: 1

update:
name: Update PR
permissions:
contents: write
pull-requests: write
actions: read
runs-on: ubuntu-latest
if: ${{
github.event_name == 'issue_comment' &&
github.event.issue.pull_request &&
contains(github.event.comment.body, '@check-spelling-bot apply')
}}
concurrency:
group: spelling-update-${{ github.event.issue.number }}
cancel-in-progress: false
steps:
- name: apply spelling updates
uses: check-spelling/check-spelling@prerelease
with:
experimental_apply_changes_via_bot: 1
checkout: true
ssh_key: "${{ secrets.CHECK_SPELLING }}"

2 comments on commit df949c3

@github-actions

This comment was marked as outdated.

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@check-spelling-bot Report

🔴 Please review

See the 📜action log or 📝 job summary for details.

Unrecognized words (7414)

Truncated, please see the log or artifact if available.

Some files were automatically ignored 🙈

These sample patterns would exclude them:

^\Qbindings/go/godoc-resources/jquery.js\E$
^\Qbindings/go/src/fdb/fdb_darwin.go\E$
^\Qbindings/go/src/fdb/tuple/testdata/tuples.golden\E$
^\Qbindings/python/fdb/libfdb_c.dylib.pth\E$
^\Qcontrib/crc32/crc32c-generated-constants.cpp\E$
^\Qcontrib/crc32/include/crc32/crc32_constants.h\E$
^\Qcontrib/sqlite/sqlite3.amalgamation.c\E$
^\Qfdbrpc/libcoroutine/asm.S\E$
^\Qfdbrpc/libcoroutine/Base.h\E$

You should consider adding them to:

.github/actions/spelling/excludes.txt

File matching is via Perl regular expressions.

To check these files, more of their words need to be in the dictionary than not. You can use patterns.txt to exclude portions, add items to the dictionary (e.g. by adding them to allow.txt), or fix typos.

Script unavailable

Truncated, please see the log or artifact if available.

Available 📚 dictionaries could cover words not in the 📘 dictionary
Dictionary Entries Covers Uniquely
cspell:python/src/python/python-lib.txt 2417 284 91
cspell:cpp/src/stdlib-cpp.txt 252 94 68
cspell:php/dict/php.txt 1689 218 63
cspell:cpp/src/stdlib-c.txt 278 112 47
cspell:python/src/common/extra.txt 741 74 40

Consider adding them (in .github/workflows/spelling.yml) in jobs:/spelling: for uses: check-spelling/check-spelling@prerelease in its with:

      with:
        extra_dictionaries: |
          cspell:python/src/python/python-lib.txt
          cspell:cpp/src/stdlib-cpp.txt
          cspell:php/dict/php.txt
          cspell:cpp/src/stdlib-c.txt
          cspell:python/src/common/extra.txt

To stop checking additional dictionaries, add (in .github/workflows/spelling.yml) for uses: check-spelling/check-spelling@prerelease in its with:

check_extra_dictionaries: ''
Forbidden patterns 🙅 (18)

In order to address this, you could change the content to not match the forbidden patterns (comments before forbidden patterns may help explain why they're forbidden), add patterns for acceptable instances, or adjust the forbidden patterns themselves.

These forbidden patterns matched content:

Should be ID

\bId\b

In English, duplicated words are generally mistakes

There are a few exceptions (e.g. "that that").
If the highlighted doubled word pair is in:

  • code, write a pattern to mask it.
  • prose, have someone read the English before you dismiss this error.
\s([A-Z]{3,}|[A-Z][a-z]{2,}|[a-z]{3,})\s\g{-1}\s

Should be cannot (or can't)

See https://www.grammarly.com/blog/cannot-or-can-not/

Don't use can not when you mean cannot. The only time you're likely to see can not written as separate words is when the word can happens to precede some other phrase that happens to start with not.
Can't is a contraction of cannot, and it's best suited for informal writing.
In formal writing and where contractions are frowned upon, use cannot.
It is possible to write can not, but you generally find it only as part of some other construction, such as not only . . . but also.

  • if you encounter such a case, add a pattern for that case to patterns.txt.
\b[Cc]an not\b

Should be macOS or Mac OS X or ...

\bMacOS\b

Should be nonexistent

\b[Nn]o[nt][- ]existent\b

Should be into

when not phrasal and when in order to would be wrong:
https://thewritepractice.com/into-vs-in-to/

\sin to\s(?!if\b)

Should be case-(in)sensitive

\bcase (?:in|)sensitive\b

Should be neither/nor -- or reword

\bnot\b[^.?!"/(]+\bnor\b

Should be preexisting

[Pp]re[- ]existing

Should be nonexistent

\bnon existing\b

Should be GitHub

(?<![&*.]|// |\btype )\bGithub\b(?![{)])

Should be greater than

\bgreater then\b

Should be its

\bit's(?= own\b)

Should be opt-in

(?<!\sfor)\sopt in\s

Should be or (more|less)

\bore (?:more|less)\b

Should be preemptively

[Pp]re[- ]emptively

Should be reentrant

[Rr]e[- ]entrant

Should be workaround

(?:(?:[Aa]|[Tt]he|ugly)\swork[- ]around\b|\swork[- ]around\s+for)
Pattern suggestions ✂️ (50)

You could add these patterns to .github/actions/spelling/patterns.txt:

# Automatically suggested patterns
# hit-count: 8078 file-count: 324
# hex digits including css/html color classes:
(?:[\\0][xX]|\\u|[uU]\+|#x?|%23)[0-9_a-fA-FgGrR]*?[a-fA-FgGrR]{2,}[0-9_a-fA-FgGrR]*(?:[uUlL]{0,3}|[iu]\d+)\b

# hit-count: 4560 file-count: 521
# IServiceProvider / isAThing
\b(?:I|isA)(?=(?:[A-Z][a-z]{2,})+(?:[A-Z]|\b))

# hit-count: 1919 file-count: 298
# https/http/file urls
(?:\b(?:https?|ftp|file)://)[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]

# hit-count: 1204 file-count: 81
# GitHub SHAs (markdown)
(?:\[`?[0-9a-f]+`?\]\(https:/|)/(?:www\.|)github\.com(?:/[^/\s"]+){2,}(?:/[^/\s")]+)(?:[0-9a-f]+(?:[-0-9a-zA-Z/#.]*|)\b|)

# hit-count: 508 file-count: 130
# libraries
\blib(?!rar(?:ies|y))(?=[a-z])

# hit-count: 398 file-count: 78
# Compiler flags (Unix, Java/Scala)
# Use if you have things like `-Pdocker` and want to treat them as `docker`
(?:^|[\t ,>"'`=(])-(?:(?:J-|)[DPWXY]|[Llf])(?=[A-Z]{2,}|[A-Z][a-z]|[a-z]{2,})

# hit-count: 372 file-count: 73
# Compiler flags (Windows / PowerShell)
# This is a subset of the more general compiler flags pattern.
# It avoids matching `-Path` to prevent it from being treated as `ath`
(?:^|[\t ,"'`=(])-(?:[DPL](?=[A-Z]{2,})|[WXYlf](?=[A-Z]{2,}|[A-Z][a-z]|[a-z]{2,}))

# hit-count: 354 file-count: 52
# in check-spelling@v0.0.22+, printf markers aren't automatically consumed
# printf markers
(?<!\\)\\[nrt](?=[a-z]{2,})

# hit-count: 309 file-count: 49
# version suffix <word>v#
(?:(?<=[A-Z]{2})V|(?<=[a-z]{2}|[A-Z]{2})v)\d+(?:\b|(?=[a-zA-Z_]))

# hit-count: 284 file-count: 55
# hex runs
\b[0-9a-fA-F]{16,}\b

# hit-count: 209 file-count: 80
# C network byte conversions
(?:\d|\bh)to(?!ken)(?=[a-z])|to(?=[adhiklpun]\()

# hit-count: 144 file-count: 24
# alternate printf markers if you run into latex and friends
(?<!\\)\\[nrt](?=[a-z]{2,})(?=.*['"`])

# hit-count: 142 file-count: 9
# microsoft
\b(?:https?://|)(?:(?:download\.visualstudio|docs|msdn2?|research)\.microsoft|blogs\.msdn)\.com/[-_a-zA-Z0-9()=./%]*

# hit-count: 104 file-count: 24
# Python string prefix / binary prefix
# Note that there's a high false positive rate, remove the `?=` and search for the regex to see if the matches seem like reasonable strings
(?<!['"])\b(?:B|BR|Br|F|FR|Fr|R|RB|RF|Rb|Rf|U|UR|Ur|b|bR|br|f|fR|fr|r|rB|rF|rb|rf|u|uR|ur)['"](?=[A-Z]{3,}|[A-Z][a-z]{2,}|[a-z]{3,})

# hit-count: 95 file-count: 11
# IPv6
\b(?:[0-9a-fA-F]{0,4}:){3,7}[0-9a-fA-F]{0,4}\b

# hit-count: 82 file-count: 3
# w3
\bw3\.org/[-0-9a-zA-Z/#.]+

# hit-count: 74 file-count: 11
# WWNN/WWPN (NAA identifiers)
\b(?:0x)?10[0-9a-f]{14}\b|\b(?:0x|3)?[25][0-9a-f]{15}\b|\b(?:0x|3)?6[0-9a-f]{31}\b

# hit-count: 56 file-count: 26
# base64 encoded content, possibly wrapped in mime
(?:^|[\s=;:?])[-a-zA-Z=;:/0-9+]{50,}(?:[\s=;:?]|$)

# hit-count: 54 file-count: 10
# uuid:
\b[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\b

# hit-count: 39 file-count: 23
# Wikipedia
\ben\.wikipedia\.org/wiki/[-\w%.#]+

# hit-count: 15 file-count: 8
# This does not cover multiline strings, if your repository has them,
# you'll want to remove the `(?=.*?")` suffix.
# The `(?=.*?")` suffix should limit the false positives rate
# printf
%(?:(?:(?:hh?|ll?|[jzt])?[diuoxn]|l?[cs]|L?[fega]|p)(?=[a-z]{2,})|(?:X|L?[FEGA]|p)(?=[a-zA-Z]{2,}))(?!%)(?=[_a-zA-Z]+(?!%)\b)(?=.*?['"])

# hit-count: 15 file-count: 2
# css url wrappings
\burl\([^)]+\)

# hit-count: 13 file-count: 7
# tar arguments
\b(?:\\n|)g?tar(?:\.exe|)(?:(?:\s+--[-a-zA-Z]+|\s+-[a-zA-Z]+|\s[ABGJMOPRSUWZacdfh-pr-xz]+\b)(?:=[^ ]*|))+

# hit-count: 7 file-count: 7
# assign regex
= /[^*].*?(?:[a-z]{3,}|[A-Z]{3,}|[A-Z][a-z]{2,}).*/[gi]?(?=\W|$)

# hit-count: 7 file-count: 7
# set arguments
\b(?:bash|sh|set)(?:\s+-[abefimouxE]{1,2})*\s+-[abefimouxE]{3,}(?:\s+-[abefimouxE]+)*

# hit-count: 7 file-count: 4
# stackexchange -- https://stackexchange.com/feeds/sites
\b(?:askubuntu|serverfault|stack(?:exchange|overflow)|superuser).com/(?:questions/\w+/[-\w]+|a/)

# hit-count: 6 file-count: 4
# base64 encoded content
([`'"])[-a-zA-Z=;:/0-9+]{3,}=\g{-1}

# hit-count: 6 file-count: 3
# lower URL escaped characters
%[0-9a-f][a-f](?=[a-z]{2,})

# hit-count: 6 file-count: 3
# Alternative printf
# %s
%(?:s(?=[a-z]{2,}))(?!%)(?=[_a-zA-Z]+(?!%)\b)(?=.*?['"])

# hit-count: 6 file-count: 1
# Compiler flags (linker)
,-B

# hit-count: 4 file-count: 3
# sha-... -- uses a fancy capture
(\\?['"]|&quot;)[0-9a-f]{40,}\g{-1}

# hit-count: 4 file-count: 2
# githubusercontent
/[-a-z0-9]+\.githubusercontent\.com/[-a-zA-Z0-9?&=_\/.]*

# hit-count: 4 file-count: 2
# curl arguments
\b(?:\\n|)curl(?:\.exe|)(?:\s+-[a-zA-Z]{1,2}\b)*(?:\s+-[a-zA-Z]{3,})(?:\s+-[a-zA-Z]+)*

# hit-count: 3 file-count: 1
# tput arguments -- https://man7.org/linux/man-pages/man5/terminfo.5.html -- technically they can be more than 5 chars long...
\btput\s+(?:(?:-[SV]|-T\s*\w+)\s+)*\w{3,5}\b

# hit-count: 2 file-count: 2
# While you could try to match `http://` and `https://` by using `s?` in `https?://`, sometimes there
# YouTube url
\b(?:(?:www\.|)youtube\.com|youtu.be)/(?:channel/|embed/|user/|playlist\?list=|watch\?v=|v/|)[-a-zA-Z0-9?&=_%]*

# hit-count: 2 file-count: 2
# Google Drive
\bdrive\.google\.com/(?:file/d/|open)[-0-9a-zA-Z_?=]*

# hit-count: 2 file-count: 2
# Google Groups
\bgroups\.google\.com(?:/[a-z]+/(?:#!|)[^/\s"]+)*

# hit-count: 2 file-count: 2
# latex (check-spelling >= 0.0.22)
\\\w{2,}\{

# hit-count: 2 file-count: 1
# shields.io
\bshields\.io/[-\w/%?=&.:+;,]*

# hit-count: 2 file-count: 1
# ANSI color codes
(?:\\(?:u00|x)1[Bb]|\x1b|\\u\{1[Bb]\})\[\d+(?:;\d+|)m

# hit-count: 1 file-count: 1
# mailto urls
mailto:[-a-zA-Z=;:/?%&0-9+@._]{3,}

# hit-count: 1 file-count: 1
# apple
\bdeveloper\.apple\.com/[-\w?=/]+

# hit-count: 1 file-count: 1
# Google Storage
\b[-a-zA-Z0-9.]*\bstorage\d*\.googleapis\.com(?:/\S*|)

# hit-count: 1 file-count: 1
# The leading `/` here is as opposed to the `\b` above
# ... a short way to match `https://` or `http://` since most urls have one of those prefixes
# Google Docs
/docs\.google\.com/[a-z]+/(?:ccc\?key=\w+|(?:u/\d+|d/(?:e/|)[0-9a-zA-Z_-]+/)?(?:edit\?[-\w=#.]*|/\?[\w=&]*|))

# hit-count: 1 file-count: 1
# bit.ly
\bbit\.ly/\w+

# hit-count: 1 file-count: 1
# vs devops
\bvisualstudio.com(?::443|)/[-\w/?=%&.]*

# hit-count: 1 file-count: 1
# oracle
\bdocs\.oracle\.com/[-0-9a-zA-Z./_?#&=]*

# hit-count: 1 file-count: 1
# regex choice
\(\?:[^)]+\|[^)]+\)

# hit-count: 1 file-count: 1
# configure flags
.* \| --\w{2,}.*?(?=\w+\s\w+)

# hit-count: 1 file-count: 1
# Non-English
[a-zA-Z]*[ÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź][a-zA-Z]{3}[a-zA-ZÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź]*|[a-zA-Z]{3,}[ÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź]|[ÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź][a-zA-Z]{3,}

Errors (7)

See the 📜action log or 📝 job summary for details.

❌ Errors Count
ℹ️ binary-file 1
ℹ️ candidate-pattern 115
❌ check-file-path 2798
❌ forbidden-pattern 452
ℹ️ large-file 1
ℹ️ minified-file 1
ℹ️ noisy-file 6

See ❌ Event descriptions for more information.

Please sign in to comment.