IntelliJ Platform Plugin SDK Help

Spell Checking

Spell checking verifies the correctness of natural-language text within code. Language plugins can customize it by implementing SpellcheckingStrategy and registering the implementation in the com.intellij.spellchecker.support extension point.

SpellcheckingStrategy

SpellcheckingStrategy adjusts the spell checking behavior for PSI elements of a custom language by providing methods to define:

  1. Which PSI elements should be checked by this strategy.

  2. How to extract the text from PSI elements.

  3. How the text is broken into single words.

The class already contains a default strategy for spell checking of basic parts such as comments, identifiers and plain text. If you don't need anything else, you can just inherit from this class and register it.

If you need to check spelling for some specific elements in your language, then override getTokenizer() and use isMyContext() to determine if a PSI element should be checked by your strategy. The getTokenizer() method returns an instance of Tokenizer and is explained below.

Tokenizer

The tokenize() method of Tokenizer defines which portions of a PSI element need to be spell-checked by feeding them into the TokenConsumer. In the simplest case, the whole PSI element is consumed and its entire text is split into words and checked for spelling. For these simple cases, SpellcheckingStrategy already contains predefined tokenizers:

  • SpellcheckingStrategy.TEXT_TOKENIZER for simple text elements.

  • SpellcheckingStrategy.EMPTY_TOKENIZER for elements that don't require checking.

  • myCommentTokenizer for comments.

  • myXmlAttributeTokenizer for XML attributes.

However, there are situations where only fragments of the PSI element are textual content. In these cases, tokenize() can extract the correct text ranges and feed them sequentially into the TokenConsumer. If elements in your language require such special handling, define a tokenizer by deriving from Tokenizer and implementing tokenize() with the logic you need.

Example: MethodNameTokenizerJava
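
To illustrate the "only fragments are textual content" case, here is a minimal, platform-independent sketch of the range-extraction step. It is not the Tokenizer API: the class QuotedTextRanges and its Range type are hypothetical names, and the assumption is a language in which only the content of double-quoted literals is natural-language text. A real tokenize() implementation would compute ranges like these from the PSI element's text and then feed them to the TokenConsumer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the IntelliJ Platform.
final class QuotedTextRanges {

    // Half-open [start, end) offsets into the element text.
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Returns the ranges of the content between double quotes, without the quotes.
    static List<Range> find(String elementText) {
        List<Range> ranges = new ArrayList<>();
        int i = 0;
        while (i < elementText.length()) {
            if (elementText.charAt(i) != '"') {
                i++;
                continue;
            }
            int start = i + 1;
            int j = start;
            while (j < elementText.length() && elementText.charAt(j) != '"') {
                // A backslash escapes the next character, so skip both.
                j += elementText.charAt(j) == '\\' ? 2 : 1;
            }
            if (j >= elementText.length()) {
                break; // unterminated literal: report nothing for it
            }
            if (j > start) {
                ranges.add(new Range(start, j)); // empty literals are skipped
            }
            i = j + 1;
        }
        return ranges;
    }
}
```

Reporting offsets instead of substrings keeps the link to the original element text, so that a highlighted typo can be mapped back to the right place in the editor.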

Splitter

In Tokenizer.tokenize(), the consumeToken() method can take an instance of Splitter as the second argument. The Splitter defines how the text is broken into words, which is not always as simple as splitting at white space. Consider, for instance, identifiers or variables that follow camel-case or snake-case naming and must be separated differently so that their individual parts can be spell-checked. As an example, see how IdentifierSplitter splits identifiers into separate words.

A custom language can define special splitting rules for elements by deriving from Splitter and implementing the logic for obtaining words from the passed text in the split() method.
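
The kind of splitting logic described above can be sketched in plain Java. This is not the Splitter API and not how IdentifierSplitter is implemented: IdentifierWords is a hypothetical, self-contained helper that only shows one plausible rule set, assuming that non-letter characters separate words and that an uppercase letter starts a new word when it follows a lowercase letter or precedes one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the IntelliJ Platform.
final class IdentifierWords {

    // Splits camelCase, PascalCase, and snake_case identifiers into words.
    static List<String> split(String identifier) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < identifier.length(); i++) {
            char c = identifier.charAt(i);
            if (!Character.isLetter(c)) {
                // Underscores, digits, and other non-letters end the current word.
                flush(current, words);
                continue;
            }
            boolean hasNext = i + 1 < identifier.length();
            // "fooBar": uppercase after lowercase starts a word.
            // "HTTPResponse": uppercase before lowercase starts a word ("Response").
            boolean startsNewWord = current.length() > 0
                    && Character.isUpperCase(c)
                    && (Character.isLowerCase(identifier.charAt(i - 1))
                        || (hasNext && Character.isLowerCase(identifier.charAt(i + 1))));
            if (startsNewWord) {
                flush(current, words);
            }
            current.append(c);
        }
        flush(current, words);
        return words;
    }

    private static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString());
            current.setLength(0);
        }
    }
}
```

With these rules, parseHTTPResponse is split into parse, HTTP, and Response, and max_retry_count into max, retry, and count. A real Splitter would additionally report the text range of each word rather than the word itself.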

Suppressing Spellchecking

Custom languages that support suppressing inspections via annotations or comments can derive from SuppressibleSpellcheckingStrategy to make spell checking suppressible. The implementation overrides isSuppressedFor() to check whether a spell check warning is suppressed for the passed element, and getSuppressActions() to add quick fix actions that suppress warnings.

Example: XmlSpellcheckingStrategy

Providing Dictionaries

BundledDictionaryProvider

Some custom languages may have a distinct fixed set of words or key identifiers. These words can be provided in additional dictionaries from BundledDictionaryProvider. Implement getBundledDictionaries() to return paths to the word dictionaries (*.dic files) and register it with the com.intellij.spellchecker.bundledDictionaryProvider extension point.

Example: PythonBundledDictionaryProvider

RuntimeDictionaryProvider

RuntimeDictionaryProvider allows providing dynamic dictionaries generated at runtime, e.g., downloaded from a server or created from project sources on the fly. Register the implementation in the com.intellij.spellchecker.dictionary.runtimeDictionaryProvider extension point.

Example: PyPackagesDictionary

Last modified: 08 April 2024