Allow diacritics (#1535)
* Allow usage of letters with diacritics or strokes in enum values and filenames.

Closes #1530
paul-dingemans committed Jul 18, 2022
1 parent 07ce058 commit b882d09
Showing 8 changed files with 123 additions and 34 deletions.
16 changes: 8 additions & 8 deletions CHANGELOG.md
@@ -6,7 +6,7 @@ This project adheres to [Semantic Versioning](https://semver.org/).

### API Changes & RuleSet providers

If you are not an API user nor a RuleSet provider, then you can safely skip this section. Otherwise, please read below carefully and upgrade your usage of ktlint. In this and coming releases, we are changing and adapting important parts of our API in order to increase maintainability and flexibility for future changes. Please avoid skipping releases as that will make it harder to migrate.
If you are not an API consumer nor a RuleSet provider, then you can safely skip this section. Otherwise, please read below carefully and upgrade your usage of ktlint. In this and coming releases, we are changing and adapting important parts of our API in order to increase maintainability and flexibility for future changes. Please avoid skipping releases as that will make it harder to migrate.

#### Rule lifecycle hooks / deprecate RunOnRootOnly visitor modifier

@@ -20,19 +20,24 @@ The Rule class now offers new life cycle hooks:

The "visit" life cycle hook will be removed in Ktlint 0.48. In KtLint 0.47 the "visit" life cycle hook will be called *only* when hook "beforeVisitChildNodes" is not overridden. It is recommended to migrate to the new lifecycle hooks in KtLint 0.47. Please create an issue, in case you need additional assistance to implement the new life cycle hooks in your rules.

#### Format callback

The callback function provided as parameter to the format function is now called for all errors regardless of whether the error has been autocorrected. Existing consumers of the format function should now explicitly check the `autocorrected` flag in the callback result and handle it appropriately (in most cases this means ignoring the callback results for which `autocorrected` has value `true`).

### Added

### Fixed

* Fix cli argument "--disabled_rules" ([#1520](https://github.com/pinterest/ktlint/issue/1520)).
* A file which contains a single top level declaration of type function does not need to be name after the function but only needs to adhere to the PascalCase convention. `filename` ([#1521](https://github.com/pinterest/ktlint/issue/1521)).
* A file which contains a single top level declaration of type function does not need to be named after the function but only needs to adhere to the PascalCase convention. `filename` ([#1521](https://github.com/pinterest/ktlint/issue/1521)).
* When a glob is specified then ensure that it matches files in the current directory and not only in subdirectories of the current directory ([#1533](https://github.com/pinterest/ktlint/issue/1533)).
* Disable/enable IndentationRule on blocks in middle of file. (`indent`) [#631](https://github.com/pinterest/ktlint/issues/631)
* Allow usage of letters with diacritics in enum values and filenames (`enum-entry-name-case`, `filename`) ([#1530](https://github.com/pinterest/ktlint/issue/1530)).

### Changed

* Print an error message and return with non-zero exit code when no files are found that match with the globs ([#629](https://github.com/pinterest/ktlint/issue/629)).
* Invoke callback on `format` function for all errors including errors that are autocorrected ([#1491](https://github.com/pinterest/ktlint/issues/1491))

### Removed

@@ -75,16 +80,12 @@ If your project did not run with the `experimental` ruleset enabled before, you

### API Changes & RuleSet providers

If you are not an API consumer nor a RuleSet provider, then you can safely skip this section. Otherwise, please read below carefully and upgrade your usage of ktlint. In this and coming releases, we are changing and adapting important parts of our API in order to increase maintainability and flexibility for future changes. Please avoid skipping releases as that will make it harder to migrate.
If you are not an API user nor a RuleSet provider, then you can safely skip this section. Otherwise, please read below carefully and upgrade your usage of ktlint. In this and coming releases, we are changing and adapting important parts of our API in order to increase maintainability and flexibility for future changes. Please avoid skipping releases as that will make it harder to migrate.

#### Lint and formatting functions

The lint and formatting functions no longer accept parameters of type `Params` but only `ExperimentalParams`. Also, the VisitorProvider parameter has been removed. Because of this, your integration with KtLint breaks. Based on feedback with ktlint 0.45.x, we now prefer to break at compile time instead of trying to keep the interface backwards compatible. Please raise an issue in case you need help converting to the new API.

#### Format callback

The callback function provided as parameter to the format function is now called for all errors regardless of whether the error has been autocorrected. Existing consumers of the format function should now explicitly check the `autocorrected` flag in the callback result and handle it appropriately (in most cases this means ignoring the callback results for which `autocorrected` has value `true`).

#### Use of ".editorconfig" properties & userData

The interface `UsesEditorConfigProperties` provides method `getEditorConfigValue` to retrieve a named `.editorconfig` property for a given ASTNode. When implementing this interface, the value `editorConfigProperties` needs to be overridden. Previously it was not checked whether a retrieved property was actually recorded in this list. Now, retrieval of unregistered properties results in an exception.
@@ -147,7 +148,6 @@ An AssertJ style API for testing KtLint rules ([#1444](https://github.com/pinter
- Fix indentation of property getter/setter when the property has an initializer on a separate line `indent` ([#1335](https://github.com/pinterest/ktlint/issues/1335))
- When `.editorconfig` setting `indentSize` is set to value `tab` then return the default tab width as value for `indentSize` ([#1485](https://github.com/pinterest/ktlint/issues/1485))
- Allow suppressing all rules or a list of specific rules in the entire file with `@file:Suppress(...)` ([#1029](https://github.com/pinterest/ktlint/issues/1029))
- Invoke callback on `format` function for all errors including errors that are autocorrected ([#1491](https://github.com/pinterest/ktlint/issues/1491))


### Changed
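
The "Format callback" note in the CHANGELOG.md changes above asks consumers of the format function to check the `autocorrected` flag themselves. A minimal sketch of that filtering follows. The `LintError` data class, the `format` function and `collectUnresolved` are hypothetical stand-ins written for this illustration and are not ktlint's real API; only the idea of a callback that receives an error plus an autocorrected flag is taken from the changelog text.

```kotlin
// Hypothetical stand-in for illustration only; this is not ktlint's LintError type.
data class LintError(val line: Int, val col: Int, val ruleId: String, val detail: String)

// Hypothetical format function that mimics the behaviour described in the changelog:
// the callback is invoked for every error, and the Boolean tells whether it was autocorrected.
fun format(errors: List<Pair<LintError, Boolean>>, cb: (LintError, Boolean) -> Unit) {
    errors.forEach { (error, autocorrected) -> cb(error, autocorrected) }
}

// Consumer side: keep only the errors that still need manual attention.
fun collectUnresolved(errors: List<Pair<LintError, Boolean>>): List<LintError> {
    val unresolved = mutableListOf<LintError>()
    format(errors) { error, autocorrected ->
        if (!autocorrected) {
            unresolved.add(error)
        }
    }
    return unresolved
}

fun main() {
    val errors = listOf(
        LintError(1, 1, "indent", "Unexpected indentation") to true,
        LintError(2, 5, "filename", "File name should conform PascalCase") to false
    )
    collectUnresolved(errors).forEach { println("${it.line}:${it.col} ${it.detail} (${it.ruleId})") }
}
```

A consumer that previously assumed every callback invocation was an unresolved error would, under the new behaviour, start reporting already-fixed errors unless it adds a check like the one in `collectUnresolved`.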
@@ -1,6 +1,7 @@
package com.pinterest.ktlint.ruleset.standard

import com.pinterest.ktlint.core.Rule
import com.pinterest.ktlint.ruleset.standard.internal.regExIgnoringDiacriticsAndStrokesOnLetters
import org.jetbrains.kotlin.com.intellij.lang.ASTNode
import org.jetbrains.kotlin.com.intellij.psi.impl.source.tree.CompositeElement
import org.jetbrains.kotlin.psi.KtEnumEntry
@@ -10,7 +11,7 @@ import org.jetbrains.kotlin.psi.KtEnumEntry
*/
public class EnumEntryNameCaseRule : Rule("enum-entry-name-case") {
internal companion object {
val regex = Regex("[A-Z]([A-Za-z\\d]*|[A-Z_\\d]*)")
val regex = "[A-Z]([A-Za-z\\d]*|[A-Z_\\d]*)".regExIgnoringDiacriticsAndStrokesOnLetters()
}

override fun beforeVisitChildNodes(
@@ -12,6 +12,7 @@ import com.pinterest.ktlint.core.ast.ElementType.TYPEALIAS
import com.pinterest.ktlint.core.ast.ElementType.TYPE_REFERENCE
import com.pinterest.ktlint.core.ast.children
import com.pinterest.ktlint.core.ast.isRoot
import com.pinterest.ktlint.ruleset.standard.internal.regExIgnoringDiacriticsAndStrokesOnLetters
import java.nio.file.Paths
import org.jetbrains.kotlin.com.intellij.lang.ASTNode
import org.jetbrains.kotlin.com.intellij.lang.FileASTNode
@@ -151,7 +152,7 @@ public class FilenameRule : Rule("filename") {
private fun String.shouldMatchPascalCase(
emit: (offset: Int, errorMessage: String, canBeAutoCorrected: Boolean) -> Unit
) {
if (!pascalCaseRegEx.matches(this)) {
if (!this.matches(pascalCaseRegEx)) {
emit(0, "File name '$this.kt' should conform PascalCase", false)
}
}
@@ -168,7 +169,7 @@ public class FilenameRule : Rule("filename") {
?.let { TopLevelDeclaration(elementType, it) }

private companion object {
val pascalCaseRegEx = Regex("""^[A-Z][A-Za-z\d]*$""")
val pascalCaseRegEx = "^[A-Z][A-Za-z\\d]*$".regExIgnoringDiacriticsAndStrokesOnLetters()
val NON_CLASS_RELATED_TOP_LEVEL_DECLARATION_TYPES = listOf(OBJECT_DECLARATION, TYPEALIAS, PROPERTY)
}
}
@@ -0,0 +1,11 @@
package com.pinterest.ktlint.ruleset.standard.internal

/**
* Transforms a string containing regular expression ranges like "A-Z" and "a-z" to a RegEx which checks whether a
* unicode character has an uppercase versus a lowercase mapping to a letter. This function intends to keep the original
* expression more readable
*/
internal fun String.regExIgnoringDiacriticsAndStrokesOnLetters() =
replace("A-Z", "\\p{Lu}")
.replace("a-z", "\\p{Ll}")
.toRegex()
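
To make the effect of this helper concrete, the sketch below copies it (without the `internal` modifier, so the snippet stands alone) and applies it to the PascalCase pattern used by the `filename` rule. The sample strings are made up for this illustration. It also shows two consequences of swapping `A-Z` and `a-z` for the Unicode categories `\p{Lu}` and `\p{Ll}`: uppercase and lowercase letters from any script are accepted, while a letter written as a base character plus a separate combining mark is not, because the combining mark is neither `Lu` nor `Ll`.

```kotlin
// Copy of the helper from this commit, minus `internal`, so the example is self-contained.
fun String.regExIgnoringDiacriticsAndStrokesOnLetters(): Regex =
    replace("A-Z", "\\p{Lu}")
        .replace("a-z", "\\p{Ll}")
        .toRegex()

fun main() {
    val pascalCase = "^[A-Z][A-Za-z\\d]*$".regExIgnoringDiacriticsAndStrokesOnLetters()

    // The ranges are rewritten textually; the rest of the expression is untouched.
    println(pascalCase.pattern)

    // Precomposed letter with diacritic (U+00E9): accepted.
    println("Caf\u00E9".matches(pascalCase))

    // Plain 'e' followed by combining acute accent (U+0301): rejected.
    println("Cafe\u0301".matches(pascalCase))

    // Greek capital omega (U+03A9) is also an uppercase letter, so this is accepted too.
    println("\u03A9mega".matches(pascalCase))

    // Still has to start with an uppercase letter.
    println("caf\u00E9".matches(pascalCase))
}
```

Because the substitution is a plain string replacement, it only fires on the literal substrings `A-Z` and `a-z`; other ranges such as `A-F` would pass through unchanged.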
@@ -61,4 +61,16 @@ class EnumEntryNameCaseRuleTest {
LintViolation(4, 5, "Enum entry name should be uppercase underscore-separated names like \"ENUM_ENTRY\" or upper camel-case like \"EnumEntry\"")
)
}

@Test
fun `Issue 1530 - Given enum values containing diacritics are allowed`() {
val code =
"""
enum class SomeEnum {
ŸÈŚ_THÎS_IS_ALLOWED_123,
ŸèśThîsIsAllowed123,
}
""".trimIndent()
enumEntryNameCaseRuleAssertThat(code).hasNoLintViolations()
}
}
@@ -174,6 +174,14 @@ class FilenameRuleTest {
.hasLintViolationWithoutAutoCorrect(1, 1, "File '$UNEXPECTED_FILE_NAME' contains a single class and possibly also extension functions for that class and should be named same after that class 'Foo.kt'")
}

@Test
fun `Issue 1530 - Given a file which name should match PascalCase then this name may also contain letters with diacritics`() {
val code = "// some code"
fileNameRuleAssertThat(code)
.asFileWithPath("ŸëšThïsĮsÂllòwed123.kt")
.hasNoLintViolations()
}

private companion object {
const val NON_PASCAL_CASE_NAME = "nonPascalCaseName.kt"
const val UNEXPECTED_FILE_NAME = "UnexpectedFileName.kt"
@@ -0,0 +1,38 @@
package com.pinterest.ktlint.ruleset.standard.internal

import org.assertj.core.api.Assertions.assertThat
import org.junit.jupiter.params.ParameterizedTest
import org.junit.jupiter.params.provider.ValueSource

class RemoveDiacriticsFromLettersTest {
@ParameterizedTest(name = "Original character: {0}, expected result: {1}")
@ValueSource(
strings = [
"àáâäæãåā",
"çćč",
"èéêëēėę",
"îïíīįì",
"ł",
"ñń",
"ôöòóœøōõ",
"ßśš",
"ûüùúū",
"ÿ",
"žźż",
"ÀÁÂÄÆÃÅĀ",
"ÇĆČ",
"ÈÉÊËĒĖĘ",
"ÎÏÍĪĮÌ",
"Ł",
"ÑŃ",
"ÔÖÒÓŒØŌÕ",
"ŚŠ",
"ÛÜÙÚŪ",
"Ÿ",
"ŽŹŻ"
]
)
fun `Given a letter with a diacritic then remove it`(original: String) {
assertThat(original.matches("[A-Za-z]*".regExIgnoringDiacriticsAndStrokesOnLetters())).isTrue
}
}
64 changes: 41 additions & 23 deletions ktlint/src/main/kotlin/com/pinterest/ktlint/internal/FileUtils.kt
@@ -15,6 +15,7 @@ import java.nio.file.Paths
import java.nio.file.SimpleFileVisitor
import java.nio.file.attribute.BasicFileAttributes
import kotlin.system.exitProcess
import kotlin.system.measureTimeMillis
import mu.KotlinLogging
import org.jetbrains.kotlin.util.prefixIfNot

@@ -74,33 +75,50 @@ internal fun FileSystem.fileSequence(
}
}

Files.walkFileTree(
rootDir,
object : SimpleFileVisitor<Path>() {
override fun visitFile(
filePath: Path,
fileAttrs: BasicFileAttributes
): FileVisitResult {
if (negatedPathMatchers.none { it.matches(filePath) } &&
pathMatchers.any { it.matches(filePath) }
) {
result.add(filePath)
logger.debug {
"""
Start walkFileTree for rootDir: '$rootDir'
include:
${pathMatchers.map { " - $it" }}
exclude:
${negatedPathMatchers.map { " - $it" }}
""".trimIndent()
}
val duration = measureTimeMillis {
Files.walkFileTree(
rootDir,
object : SimpleFileVisitor<Path>() {
override fun visitFile(
filePath: Path,
fileAttrs: BasicFileAttributes
): FileVisitResult {
if (negatedPathMatchers.none { it.matches(filePath) } &&
pathMatchers.any { it.matches(filePath) }
) {
logger.debug { "- File: $filePath: Include" }
result.add(filePath)
} else {
logger.debug { "- File: $filePath: Ignore" }
}
return FileVisitResult.CONTINUE
}
return FileVisitResult.CONTINUE
}

override fun preVisitDirectory(
dirPath: Path,
dirAttr: BasicFileAttributes
): FileVisitResult {
return if (Files.isHidden(dirPath)) {
FileVisitResult.SKIP_SUBTREE
} else {
FileVisitResult.CONTINUE
override fun preVisitDirectory(
dirPath: Path,
dirAttr: BasicFileAttributes
): FileVisitResult {
return if (Files.isHidden(dirPath)) {
logger.debug { "- Dir: $dirPath: Ignore" }
FileVisitResult.SKIP_SUBTREE
} else {
logger.debug { "- Dir: $dirPath: Traverse" }
FileVisitResult.CONTINUE
}
}
}
}
)
)
}
logger.debug { "Results: include ${result.count()} files in $duration ms" }

return result.asSequence()
}
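
The FileUtils.kt change above wraps the existing `Files.walkFileTree` call in `measureTimeMillis` and adds debug logging. A reduced, standalone sketch of the same pattern follows, assuming a single include glob and `println` in place of the logger. The function name `findFiles` and its parameters are hypothetical and do not appear in ktlint.

```kotlin
import java.nio.file.FileSystems
import java.nio.file.FileVisitResult
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.SimpleFileVisitor
import java.nio.file.attribute.BasicFileAttributes
import kotlin.system.measureTimeMillis

// Hypothetical helper: collects files below rootDir that match includeGlob,
// skips hidden directories, and reports how long the walk took.
fun findFiles(rootDir: Path, includeGlob: String): List<Path> {
    val matcher = FileSystems.getDefault().getPathMatcher("glob:$includeGlob")
    val result = mutableListOf<Path>()
    val duration = measureTimeMillis {
        Files.walkFileTree(
            rootDir,
            object : SimpleFileVisitor<Path>() {
                override fun visitFile(filePath: Path, fileAttrs: BasicFileAttributes): FileVisitResult {
                    if (matcher.matches(filePath)) {
                        result.add(filePath)
                    }
                    return FileVisitResult.CONTINUE
                }

                override fun preVisitDirectory(dirPath: Path, dirAttr: BasicFileAttributes): FileVisitResult {
                    return if (Files.isHidden(dirPath)) {
                        FileVisitResult.SKIP_SUBTREE
                    } else {
                        FileVisitResult.CONTINUE
                    }
                }
            }
        )
    }
    println("Results: include ${result.size} files in $duration ms")
    return result
}

fun main() {
    // Walk the current working directory and list Kotlin sources.
    findFiles(FileSystems.getDefault().getPath("").toAbsolutePath(), "**.kt").forEach(::println)
}
```

Timing the whole walk in one block, as the commit does, keeps the measurement separate from the per-file decisions, so the debug output can report both the individual include/ignore choices and the total cost.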