diff --git a/README.md b/README.md
index 196e523..4d2b129 100644
--- a/README.md
+++ b/README.md
@@ -11,36 +11,66 @@ Here is how it is typically run:
 
 python comment_spell_check.py --exclude Ancillary $SIMPLEITK_SOURCE_DIR/Code
 
-This command will recursively find all the '.h' files in a directory,
+This command will recursively find all the \'.h\' files in a directory,
 extract the C/C++ comments from the code, and run a spell checker on them.
-The **'--exclude'** flag tells the script to ignore any file that has
-'Ancillary' in its full path name. This flag will accept any
+The **\'\-\-exclude\'** flag tells the script to ignore any file that has
+\'Ancillary\' in its full path name. This flag will accept any
 regular expression.
 
-In addition to pyenchant's English dictionary, we use the words in
+In addition to pyenchant\'s English dictionary, we use the words in
 **additional_dictionary.txt**. These words are proper names and
 technical terms harvest by hand from the SimpleITK and ITK code bases.
 
 If a word is not found in the dictionaries, we try two additional checks.
 
 1. If the word starts with some known prefix, the prefix is removed
-...and the remaining word is checked against the dictionary. The prefixes
-...used by default are **'sitk'**, **'itk'**, and **'vtk'**. Additional
-...prefixes can be specified with the **'--prefix'** command line argument.
+   and the remaining word is checked against the dictionary. The prefixes
+   used by default are **\'sitk\'**, **\'itk\'**, and **\'vtk\'**. Additional
+   prefixes can be specified with the **\'\-\-prefix\'** command line argument.
 
 2. We attempt to split the word by capitalization and check each
-...sub-word against the dictionary. This method is an attempt to detect
-...camel-case words such as 'GetArrayFromImage', which would get split into
-...'Get', 'Array', 'From', and 'Image'. Camel-case words are very commonly
-...used for code elements.
+   sub\-word against the dictionary. This method is an attempt to detect
+   camel-case words such as \'GetArrayFromImage\', which would get split into
+   \'Get\', \'Array\', \'From\', and \'Image\'. Camel-case words are very commonly
+   used for code elements.
 
-The script can also process other file types. With the **'--suffix'**
+The script can also process other file types. With the **\'\-\-suffix\'**
 option, the following file types are available: Python (.py), C/C++
 (.c/.cxx), CSharp (.cs), Text (.txt), reStructuredText(.rst), Markdown (.md),
 Ruby (.ruby), R (.R), and Java (.java). Note that reStructuredText files are
 treated as standard text. Consequentially, all markup keywords that are not
 actual words will need to be added to the additional/exception dictionary.
 
+## Disabling Spell Checking
+
+Spell checking can be disabled for sections of code by using special
+
+comments. The following comments will disable spell checking until
+the corresponding end comment is found.
+```
+// spell-check-disable
+
+// This comment will not be spell checked.
+
+// spell-check-enable
+```
+
+Note that for C-style, multi-line comments, the disable and enable
+comments must be in separate comments. If the disable command
+is found in a multi-line comment, spell checking will be
+disabled for the entire multi-line comment.
+
+```
+/*
+spell-check-disable
+spell-check-enable
+This comment will NOT be spell checked
+*/
+/* spell-check-enable */
+/* This comment WILL be spell checked */
+```
+
+
 ## Dictionary notes
 
 We use [PySpellChecker](https://github.com/barrust/pyspellchecker) as the
diff --git a/comment_spell_check/comment_spell_check.py b/comment_spell_check/comment_spell_check.py
index adf0445..01415fd 100755
--- a/comment_spell_check/comment_spell_check.py
+++ b/comment_spell_check/comment_spell_check.py
@@ -268,7 +268,21 @@ def spell_check_file(
     bad_words = []
     line_count = 0
 
+    disable_spell_check = False
+
     for c in clist:
+        if "spell-check-disable" in c.text().lower():
+            disable_spell_check = True
+            logger.info(" Spell checking disabled")
+            continue
+
+        if "spell-check-enable" in c.text().lower():
+            disable_spell_check = False
+            logger.info(" Spell checking enabled")
+
+        if disable_spell_check:
+            continue
+
         mistakes = spell_check_comment(spell_checker, c, prefixes=prefixes)
         if len(mistakes) > 0:
             logger.info("\nLine number %s", c.line_number())
diff --git a/tests/example.h b/tests/example.h
index da3d44b..db959f0 100644
--- a/tests/example.h
+++ b/tests/example.h
@@ -25,6 +25,10 @@
 // With node id's.
 // With the itemIndex'th where itemIndex is a variable name.
 
+// spell-check-disable
+// Some comment with a misspelled word: definately
+// spell-check-enable
+
 #include
 
 int test_int;
diff --git a/tests/test_comment_spell_check.py b/tests/test_comment_spell_check.py
index 67ceec5..ae6121f 100644
--- a/tests/test_comment_spell_check.py
+++ b/tests/test_comment_spell_check.py
@@ -109,7 +109,7 @@ def test_url(self):
         """URL test"""
         url = (
             "https://raw.githubusercontent.com/SimpleITK/SimpleITK/"
-            "refs/heads/master/.github/workflows/additional_dictionary.txt"
+            "refs/heads/main/.github/workflows/additional_dictionary.txt"
         )
         runresult = subprocess.run(
             [