
Checkstyle GSoC 2023 Project Ideas

Roman Ivanov edited this page Apr 20, 2024 · 22 revisions

Checkstyle participated in GSoC 2023: https://summerofcode.withgoogle.com/archive/2023/organizations/checkstyle

Selected:



Medium size projects:


Project Name: Auto-fix Module

Skills required: intermediate Java

Project type: new feature implementation.

Project goal: implement a new module and test it on real projects

Project size: large

Mentors: Roman Ivanov, Daniel Mühlbachler, Vyom Yadav

Description: Checkstyle is known as a tool that raises numerous minor issues. There are so many of them, and they are so minor, that it is hard to find the time and an engineer to fix them. Most of the issues are easy to fix, but navigating to the right part of the code and making the fix takes time that engineers could spend on something more valuable. Auto-fix functionality could significantly simplify introducing Checkstyle to a project, as it would do the most tedious work automatically.

The majority of Checkstyle violations target code formatting. IDE formatting settings are often out of sync with the Checkstyle configuration. An IDE can fix the code itself as part of its auto-formatting, and Checkstyle should be able to do the same. Each Check that targets code formatting should have built-in “Fix” functionality that converts code with a violation into compliant code without any user interaction. Such functionality is in high demand among users.

Within the scope of this project, it is required to review the existing auto-fix functionality in plugins and tools to learn what challenges they face and to collect the full list of requirements for such a task. Auto-fix for formatting Checks should be implemented as a special Module that takes all reported violations and fixes those that support auto-fix. If the resulting functionality proves easy to maintain and can be reused by Checkstyle plugins, API changes can be proposed for the core library so that any plugin can reuse it.
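One challenge any such Module faces is that applying one fix can shift the positions of later ones. The sketch below illustrates a common way around this: apply single-line edits from the end of the file backwards. The `Fix` type and the `apply` method are hypothetical names invented for this illustration; they are not part of the Checkstyle API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: none of these types exist in Checkstyle. */
final class AutoFixSketch {

    /** A single-line text edit attached to a reported violation (assumed shape). */
    static final class Fix {
        final int line;           // 1-based line number
        final int startColumn;    // 0-based, inclusive
        final int endColumn;      // 0-based, exclusive
        final String replacement; // text that replaces [startColumn, endColumn)

        Fix(int line, int startColumn, int endColumn, String replacement) {
            this.line = line;
            this.startColumn = startColumn;
            this.endColumn = endColumn;
            this.replacement = replacement;
        }
    }

    /** Returns a fixed copy of the lines; assumes fixes do not overlap. */
    static List<String> apply(List<String> lines, List<Fix> fixes) {
        List<String> result = new ArrayList<>(lines);
        List<Fix> ordered = new ArrayList<>(fixes);
        // Bottom-to-top, right-to-left: an applied edit never shifts
        // the columns of an edit that is still waiting to be applied.
        ordered.sort(Comparator.comparingInt((Fix f) -> f.line)
                .thenComparingInt(f -> f.startColumn)
                .reversed());
        for (Fix fix : ordered) {
            String text = result.get(fix.line - 1);
            result.set(fix.line - 1, text.substring(0, fix.startColumn)
                    + fix.replacement
                    + text.substring(fix.endColumn));
        }
        return result;
    }
}
```

In a real Module, each formatting Check would have to supply the edit for its own violations; how that would be exposed is exactly the API question this project is meant to answer.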

More details at https://github.com/checkstyle/checkstyle/issues/7427

Links to similar tools: https://docs.openrewrite.org/tutorials/automatically-fix-checkstyle-violations, https://github.com/solven-eu/cleanthat

AI auto-fix for Checkstyle: https://link.springer.com/article/10.1007/s10664-021-10107-0

Auto-fix in Eclipse: https://github.com/checkstyle/eclipse-cs/pull/566/files#diff-13e277cb135ea2a474dad0b4ac46b5cb020f9c03a2eb6676b15de010f8aec369R549


Project Name: Optimization of distance between methods in single Java class

Skills required: basic Java, good analytical abilities, good background in mathematics.

Project type: new feature implementation.

Project goal: to make quality practices automated and publicly available.

Project size: large

Mentors: Roman Ivanov, Baratali Izmailov

Description:

This task is an ambitious attempt to improve code readability by minimizing the jumps and scrolling a user needs in a source file to look at the details of a method implementation when they encounter the method's first usage.

It is required to analyse a lot of code and find a model that minimizes the distance between a method's first usage and its declaration in the same file, while respecting users' preferences to keep overloaded and overridden methods grouped together. Other preferences may appear during the investigation of open-source projects.

The first step has already been done by our team: we created a web service, methods-distance, that calculates distances between methods and builds a DSM matrix to ease analysis. We already use it in our project.

As a second step, it is required to take the matrix of distances between methods and optimize it with some empirical algorithm that lets the user define the expected model of a class through arguments. This will allow the algorithm to be used as a Check that enforces code structure automatically at build time.
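To make the optimization target concrete, here is a minimal sketch of one possible cost function. It assumes distance is measured in declaration positions rather than source lines, and that call relations are already known; the class and method names are hypothetical and are not taken from methods-distance.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical cost function; not the model used by methods-distance. */
final class MethodDistanceSketch {

    /**
     * Sums, for every caller and each of its callees, how many declaration
     * positions separate the two methods in the given ordering.
     * Methods missing from the ordering (for example, external calls) are skipped.
     */
    static int totalDistance(List<String> declarationOrder,
                             Map<String, List<String>> calls) {
        int total = 0;
        for (Map.Entry<String, List<String>> entry : calls.entrySet()) {
            int callerIndex = declarationOrder.indexOf(entry.getKey());
            for (String callee : entry.getValue()) {
                int calleeIndex = declarationOrder.indexOf(callee);
                if (callerIndex >= 0 && calleeIndex >= 0) {
                    total += Math.abs(calleeIndex - callerIndex);
                }
            }
        }
        return total;
    }
}
```

An optimizer would search for the ordering with the lowest cost, with extra penalty terms for preferences such as keeping overloaded methods adjacent. For a class where `run` calls `load` and `save`, and `load` calls `parse`, the ordering `run, load, parse, save` scores 5 while `run, parse, save, load` scores 7.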

Results of the project:

  • an article with all the details of the analysis and the algorithm;
  • a new Checkstyle Check with the optimization algorithm, to share the algorithm with the whole Java community.

Proof of necessity: we have a number of PRs where contributors put new methods at arbitrary places in a class, although a better place is close to the first usage. Example #1, Example #2, Example #3, ....


Project Name: Reconcile formatters of Eclipse, NetBeans and IntelliJ IDEA IDEs by Checkstyle config.

Skills required: basic Java.

Project type: new feature implementation, analysis of existing IDE features.

Project goal: to make well-known quality practices publicly available.

Project size: large

Mentors: Roman Ivanov, Pavel Bludov

Description:

Using different IDEs within the same team is already a serious problem, as different IDEs format code based on their own rules and configurations. Unwanted formatting changes creep into the code and complicate the code-review process. The problem becomes more acute when a project uses a static analysis tool like Checkstyle, which has a wide range of code formatting Checks.

It is required to make it possible to use the same Checkstyle config in IDEs without conflicts with the IDEs' internal formatters. This will let team members choose their IDE independently while keeping the same format and code style throughout the team.

The main focus of this project is the analysis of the formatting abilities of IDEs (indentation, import order, declaration order, separator/operator wrap, ...). Existing Checkstyle Rules should be updated to work in a similar and non-conflicting way.

Results of the project:

  • create IDE configurations for the Checkstyle project so that the Checkstyle team can use them to auto-format code to conform with the checkstyle_check.xml file used by Continuous Integration.
  • create Checkstyle config that follows default Eclipse formatting + inspection rules
  • create Checkstyle config that follows default IntelliJ IDEA formatting + inspection rules
  • create Checkstyle config that follows default NetBeans formatting + inspection rules

Proof of necessity: mail-list post #1, mail-list post #2, mail-list post #3, discussion #1


Project Name: OpenJDK Code Convention coverage

Skills required: basic Java.

Project type: new feature implementation.

Project goal: to make well-known quality practices publicly available.

Project size: large

Mentors: Roman Ivanov, Pavel Bludov, Vyom Yadav

Description:

The OpenJDK Code Convention was one of the first guidelines on how to write Java code. It is marked as outdated (because of the date of its last update), but the best practices described there do not have an expiration date. The new OpenJDK Java Style Guidelines are close to their final version and will most likely succeed the OpenJDK Code Convention. However, a number of Apache projects still follow the OpenJDK rules, so the community needs both configurations.

The OpenJDK Code Convention is already partly covered by Checkstyle, where it is known as the Sun Code Convention. Many validation Rules have been added and changed in Checkstyle since Sun's configuration was created in 2004.

During the project, it is required to review both documents in detail and prove publicly that Checkstyle covers all guideline rules. Missing functionality needs to be created, and blocking bugs need to be fixed. The page "OpenJdk Java Style Checkstyle Coverage" needs to be updated, and a new page, "New OpenJDK's Java Style Checkstyle Coverage", needs to be created. Both pages need to be formatted in the same way as Google's Java Style Checkstyle Coverage.

Proof of necessity: javadoc issues on GitHub; results of an open survey; requests from users for OpenJDK coverage support.


Project Name: Coverage of Documentation Comments Style Guide

Skills required: basic Java.

Project type: new feature implementation.

Project goal: to make well-known quality practices publicly available.

Project size: large

Mentors: Roman Ivanov, Pavel Bludov

Description:

The project will mainly focus on automating Documentation Comments (javadoc) guidelines with Checkstyle Checks. Reliable comment parsing was a major improvement made in Checkstyle during GSoC 2014, and its results need to be reused to reliably implement automation of Javadoc best practices.

A separate configuration file with the newly created Checks needs to be created, because documentation best practices do not make sense for all projects. Javadoc validation matters only for library projects that need to publish their documentation online.

The result of this project will be a configuration file with the maximum possible coverage of the Documentation Comments style guide. The report should look like Google's Java Style Checkstyle Coverage. If there is time left, we can focus on coverage of the guidelines from https://blog.joda.org/2012/11/javadoc-coding-standards.html

Proof of necessity: javadoc issues on GitHub.


Project Name: Spellcheck of Identifiers by English dictionary

Skills required: intermediate Java.

Project type: new feature implementation.

Project goal: implement spell checking for all identifiers in Java code.

Project size: large

Mentors: Roman Ivanov, Andrei Paikin

Description:

The correct spelling of words in code is very important, since a typo in the name of a method that is part of an API can cause serious problems. Mistakes in names also make reading code frustrating and misleading, especially when a one-letter typo forces a developer to read the javadoc or even the implementation of the method. The two most popular IDEs (Eclipse and IntelliJ IDEA) already have spell-check abilities. It would be beneficial for Checkstyle to have the same functionality, usable in any Continuous Integration system through the Command Line Interface or as part of a build tool (Maven, Ant, Gradle, ...), with a wide range of options to customize it to users' needs.

Features of existing spell-checkers need to be analyzed: IntelliJ IDEA Spellchecking, Eclipse Spelling. There are a number of open-source projects that do spell-checking, and it is OK to reuse them if the license is compatible. Examples: https://code.google.com/archive/p/bspell/ , http://www.softcorporation.com/products/spellcheck/, ... https://github.com/giraciopide/shellcheck-maven-plugin
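Identifiers are rarely single dictionary words, so a spell-checking Check would first have to split them. The following sketch shows one way to do that for camelCase and UPPER_SNAKE_CASE names and to look the parts up in a word set. The class, the method names, and the tiny dictionary in the usage note are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch; a real Check would plug in a proper dictionary. */
final class IdentifierSpellcheckSketch {

    /** Splits on underscores, digits, lower-to-upper boundaries and acronym ends. */
    static List<String> splitWords(String identifier) {
        List<String> words = new ArrayList<>();
        String boundary =
                "[_\\d]+|(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])";
        for (String part : identifier.split(boundary)) {
            if (!part.isEmpty()) {
                words.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }

    /** Returns the lower-cased parts of the identifier that the dictionary lacks. */
    static List<String> unknownWords(String identifier, Set<String> dictionary) {
        List<String> unknown = new ArrayList<>();
        for (String word : splitWords(identifier)) {
            if (!dictionary.contains(word)) {
                unknown.add(word);
            }
        }
        return unknown;
    }
}
```

With a dictionary containing only `receive` and `message`, the identifier `recieveMessage` yields the single unknown word `recieve`. Abbreviations and domain terms would need a user-configurable allow list, which is one of the customization options the project should explore.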


Project Name: Generate xdoc web site files based on javadoc content of Modules. Automate verification of examples for all modules.

Skills required: intermediate Java

Project type: creation of new functionality

Project goal: organize documentation and automate its maintenance

Project size: medium

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav

Description: Checkstyle is an active project. Our user base is always requesting that existing functionality be expanded and brand new features be added. As these features are added to the core Checkstyle project, documentation must be updated so that users who were not involved in the request learn of their existence. Some changes can drastically alter the default behavior of a module. Documentation is extremely important in helping users understand how our modules work and how they can be configured to fit each person's needs without looking at the source code.

Documentation is mostly a manual process, and it is easy to forget to update it during the fix workflow. Undocumented functionality can go unnoticed for years, since users can only rely on documentation to know what exists. Even when the gap is caught and someone tries to fill it, some contributors are not aware of our best practices for writing such documentation.

We want to automate most of our documentation creation to avoid manual processes. Automation will ensure that all Checkstyle documentation follows a strict standard that we define. Besides ensuring that all configurable options are documented, it will help detect whether the current usage examples are sufficient or more are needed. It will ensure that the examples provided are valid (and compilable, if they are Java) and that they do or do not produce violations as expected for the configuration and check being described. For any new module, it will print out a template for the contributor to follow and fill in with the information specific to that check, such as descriptions.

As part of this project, students must ensure that all documentation verifications pass for the existing documentation and that xdoc/html content is generated automatically and does not need manual updates.
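Verifying that an example does or does not produce violations could work roughly as sketched below. The sketch assumes that examples mark offending lines with a trailing `// violation` comment and that the lines actually reported by the check are available as a sorted list; both the marker convention and the helper names are assumptions of this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of Checkstyle. */
final class ExampleMarkersSketch {

    /** Returns the 1-based numbers of lines carrying a "// violation" marker. */
    static List<Integer> expectedViolationLines(List<String> exampleLines) {
        List<Integer> expected = new ArrayList<>();
        for (int i = 0; i < exampleLines.size(); i++) {
            if (exampleLines.get(i).contains("// violation")) {
                expected.add(i + 1);
            }
        }
        return expected;
    }

    /**
     * An example is consistent when the lines reported by the check
     * (assumed sorted in ascending order) equal the marked lines.
     */
    static boolean isConsistent(List<String> exampleLines,
                                List<Integer> reportedLines) {
        return expectedViolationLines(exampleLines).equals(reportedLines);
    }
}
```

A documentation build could fail whenever `isConsistent` returns false, which would catch examples that drifted away from the behavior of the module they describe.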

Project Name: Regression Testing Tool and HTML Report Generator for Pull Requests

Skills required: basic Java, or Shell, or Groovy, or Scala; basic understanding of testing principles

Project type: creation of testing tools

Project goal: to enforce quality and ease the implementation of new Rules in the project

Project size: medium

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav

Description:

Checkstyle needs a tool that will handle check regression testing based on the changes in a given pull request. The tool will need to identify which check modules were changed in the pull request in order to test them and ensure that we do not introduce bugs or lose functionality. Based on which check modules have been changed in the pull request, the tool should generate testing configurations (checkstyle configuration files) and regression test the code in the pull request against the master branch of Checkstyle on a list of projects. Then, a check regression report (showing differences of violations, if any) needs to be generated and shared in the pull request.
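The core of such a report is a difference between two sets of violations. A minimal sketch, assuming each violation has already been rendered as a unique string (file, line and message) for both the master build and the pull request build; the class and method names are hypothetical:

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the comparison step of a regression report. */
final class ViolationDiffSketch {

    /** Returns the entries of {@code first} that are absent from {@code second}, sorted. */
    static Set<String> onlyIn(Set<String> first, Set<String> second) {
        Set<String> difference = new TreeSet<>(first);
        difference.removeAll(second);
        return difference;
    }
}
```

Calling `onlyIn(patch, master)` gives the violations introduced by the pull request, and `onlyIn(master, patch)` gives the ones that disappeared. Both lists matter: a vanished violation may be a fixed false positive or lost functionality.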

Each module could have a set of manually prepared configuration chunks that should be used if that module was changed; however, full automation is highly desirable, i.e. generation of configurations from the check module itself.
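As a rough illustration of the fully automated path, the sketch below derives module names from the file paths changed in a pull request and emits a default configuration for them. It assumes that check sources live under `src/main/java/com/puppycrawl/tools/checkstyle/checks/`, that a module is named after its class without the `Check` suffix, and that every changed module is a `TreeWalker` child, which does not hold for all modules. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the path layout and naming rule are assumptions. */
final class RegressionConfigSketch {

    private static final String CHECKS_DIR =
            "src/main/java/com/puppycrawl/tools/checkstyle/checks/";
    private static final String SUFFIX = "Check.java";

    /** Maps changed file paths to the names of the check modules they define. */
    static List<String> changedCheckModules(List<String> changedPaths) {
        List<String> modules = new ArrayList<>();
        for (String path : changedPaths) {
            String fileName = path.substring(path.lastIndexOf('/') + 1);
            // Abstract base classes are not modules themselves, so skip them.
            if (path.startsWith(CHECKS_DIR)
                    && fileName.endsWith(SUFFIX)
                    && fileName.length() > SUFFIX.length()
                    && !fileName.startsWith("Abstract")) {
                modules.add(fileName.substring(
                        0, fileName.length() - SUFFIX.length()));
            }
        }
        return modules;
    }

    /** Builds a configuration that runs each module with its default options. */
    static String toConfig(List<String> modules) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>\n")
           .append("<!DOCTYPE module PUBLIC\n")
           .append("    \"-//Checkstyle//DTD Checkstyle Configuration 1.3//EN\"\n")
           .append("    \"https://checkstyle.org/dtds/configuration_1_3.dtd\">\n")
           .append("<module name=\"Checker\">\n")
           .append("  <module name=\"TreeWalker\">\n");
        for (String module : modules) {
            xml.append("    <module name=\"").append(module).append("\"/>\n");
        }
        xml.append("  </module>\n</module>\n");
        return xml.toString();
    }
}
```

Default options alone exercise only one code path per check, so a real tool would also need to generate variants with non-default property values, possibly from the module's own metadata.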

Base Repo: https://github.com/checkstyle/regression-tool

Proof of necessity:


Project Name: Pitest Resolution

Project type: Resolving outstanding Pitest Issues

Project goal: to enforce quality and reduce technical debt

Project size: medium

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav

Description:

Checkstyle recently introduced a number of new mutators to our mutation testing suite, powered by pitest. PIT is a state-of-the-art mutation testing system, providing gold standard test coverage for Java and the JVM. Mutation testing ensures that we consider all conditions possible in our code (and even the removal of such code), helping to ensure that our code and tests are up to the highest standards. As we add new mutators, we use our own suppression system to help manage pitest violations.

For this project, Checkstyle needs these suppressions reviewed to determine whether new tests or test inputs can resolve them and show that the code is still functionally sound. This may require diving into the logic of the modules, either to help cover the mutation with a test or to reach a decision on what to do with the suppression.
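The hand-written illustration below shows what a surviving mutant looks like. The method `isTooLong` is a hypothetical piece of production code, and `isTooLongMutant` imitates what a conditional-boundary mutation does to it; pitest generates such mutants automatically at the bytecode level.

```java
/** Hand-written illustration of a mutant; not generated by pitest. */
final class BoundaryMutantSketch {

    /** Original (hypothetical) production code. */
    static boolean isTooLong(int length, int max) {
        return length > max;
    }

    /** The same method after a conditional-boundary mutation (> becomes >=). */
    static boolean isTooLongMutant(int length, int max) {
        return length >= max;
    }

    /** A test input kills the mutant only if original and mutant disagree on it. */
    static boolean killsMutant(int length, int max) {
        return isTooLong(length, max) != isTooLongMutant(length, max);
    }
}
```

Inputs such as `(10, 5)` or `(3, 5)` leave the mutant alive because both versions agree on them; only an input on the boundary, `(5, 5)`, tells them apart. Resolving a suppression usually means finding such a distinguishing test input, or concluding that none exists and the mutated code is equivalent or redundant.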

Helpful Links:


Project Name: Eliminate Maven Plugin Usage

Skills required: basic Java, Shell, Groovy

Project goal: remove all usages of maven-checkstyle-plugin in our tools

Project size: medium

Mentors: Roman Ivanov, Nick Mancuso, Vyom Yadav

Description:

Checkstyle is a library used by many other tools. We have become too dependent on one of them, maven-checkstyle-plugin, which we use in continuous integration testing and regression testing. This reliance keeps preventing Checkstyle from releasing breaking changes, as they also break all the usages in our testing. It has constantly required us to implement workarounds so as not to disturb our connection to and reliance on maven-checkstyle-plugin.

Checkstyle needs to break away and rely only on tools we maintain. Below is a list of connected issues which detail some of the areas that need to change in order to break away from this plugin.

Connected Issues:

Example of Plugin Issue: Upgrade XML logger to XML 1.1


Project Name: Update Google style config to most recent content of style guide and resolve known issues with modules that are in such config

Skills required: basic Java

Project goal: improve quality of google style guide implementation

Project size: large

Mentors: Roman Ivanov, Richard Veach

Description:

Checkstyle already has a good implementation of Google style. We have a coverage page that describes which version of the style guide we support. We need to review all changes like this in the Google style guide and update the config of Checks to support what is described. Additionally, we need to resolve defects and problems reported by our users about mismatches with the style guide. Conceptually, all issues reported for Modules/Checks that are present in the Google Style config need to be reviewed, labeled so that they can be easily found with the filter below, and fixed.

Connected Issues:


Project Name: Patch Suppression improvement

Skills required: basic Java

Project type: extension of existing feature implementation.

Project goal: implement new strategies for existing filter/suppression module or improve existing

Project size: large

Mentors: Roman Ivanov

Description: Introducing Checkstyle to a project can be challenging, especially when the project has a massive amount of code, is very active in development, and has no resources to start a new code cleanup process. It may require extensive effort, especially when there is legacy code from previous contributors; cleaning it up becomes a monotonous job that everyone tries to avoid. It is easy to say what code should look like, but it may be hard to actually enforce rules in an existing codebase.

For example, Guava does not follow Google style; it is easy to say what the code should look like but hard to assign somebody to fix ALL the problems left by previous contributors. It is a very boring activity that everyone will try to avoid. Good practice from OpenJDK actually discourages code changes without good reason.

A better approach is to leave existing code as is and validate only new code. Checkstyle already has a wide array of filter functionality that can suppress certain violations if the user classifies a violation as “won’t fix”. However, just setting up the initial suppressions still requires a huge effort to review all the violations or to organize a team for a special cleanup process.

This project was originally done at GSoC 2020, but while using it we found that Checkstyle violations still reach beyond the changed code, which creates an avalanche of changes and complicates its usage in real projects.

We need to focus on parsing patch files to get more precise locations of changes and to be able to skip a violation if its fix goes outside the changed lines. For example, when a user changes the line wrapping of a long method signature, we should not demand reducing the number of parameters or fixing names, as this would trigger changes in other parts of the code.

As proof of success for this project, it is required to get some open-source project on board to use Checkstyle and this new feature. It would be good to try collaborating with the Guava project one more time, or we can ask our friends in the Eclipse-CS, Spring, or HBase projects.
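The first building block is knowing exactly which lines a patch adds. The sketch below extracts those line numbers from a patch in unified diff format by reading hunk headers of the form `@@ -oldStart,oldCount +newStart,newCount @@`. It assumes a patch that touches a single file, and its class and method names are hypothetical; it is not the implementation from the GSoC 2020 project.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch for a single-file unified diff. */
final class PatchLinesSketch {

    /** Hunk header: "@@ -oldStart[,oldCount] +newStart[,newCount] @@". */
    private static final Pattern HUNK_HEADER =
            Pattern.compile("@@ -\\d+(?:,(\\d+))? \\+(\\d+)(?:,(\\d+))? @@.*");

    /** Returns the 1-based line numbers, in the new file, of all added lines. */
    static Set<Integer> addedLines(List<String> patchLines) {
        Set<Integer> added = new TreeSet<>();
        int oldRemaining = 0; // old-file lines still expected in the current hunk
        int newRemaining = 0; // new-file lines still expected in the current hunk
        int newLine = 0;      // number of the next line in the new file
        for (String line : patchLines) {
            if (oldRemaining <= 0 && newRemaining <= 0) {
                // Outside a hunk: only a hunk header is of interest;
                // file headers ("---", "+++") and other metadata are skipped.
                Matcher header = HUNK_HEADER.matcher(line);
                if (header.matches()) {
                    oldRemaining = count(header.group(1));
                    newLine = Integer.parseInt(header.group(2));
                    newRemaining = count(header.group(3));
                }
            } else if (line.startsWith("+")) {
                added.add(newLine++);
                newRemaining--;
            } else if (line.startsWith("-")) {
                oldRemaining--;
            } else if (!line.startsWith("\\")) {
                // Context line: present in both the old and the new file.
                newLine++;
                oldRemaining--;
                newRemaining--;
            }
        }
        return added;
    }

    /** An omitted count in a hunk header means 1. */
    private static int count(String group) {
        return group == null ? 1 : Integer.parseInt(group);
    }
}
```

A filter could then keep a violation only if its line is in the returned set. Deciding whether the fix for a violation would stay inside those lines is the harder, check-specific part that this project needs to investigate.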
