New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
License header check #364
License header check #364
Conversation
I found this discussion thread related to this header audit tool: https://lists.apache.org/thread/mcw0l9g71yzvcftkppnrxspz7skdgglz I am familiar with Apache Rat as well and would love to know its limitations. |
Apache Rate is mainly based on predefined rules and regular expressions to search for license text, which may not accurately recognize complex or non-standard license statements.
Rat cannot automatically determine whether a file indeed needs to contain a certain authorization statement, and in some cases, it may need to be manually reviewed and decided based on specific circumstances.
Rat may not be able to parse and check the content of all types of files, especially for some special encoding formats or binary files.
For large-scale projects or projects with special authorization requirements, configuring the Rate to adapt to all situations may be complex and require the writing of additional metadata or rule files.
When integrating Rat into the CI/CD process, additional work may be required to handle report results and ensure that violations can be detected and fixed in a timely manner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The project uses spotless-maven-plugin to enforce/check the LICENSE header as part of the build. Why do we need a separate CI action and additional tooling?
@zabetak Because some non Java files, |
@wuchunfu Spotless is not limited to Java as far as I know. Indeed the current configuration of the plugin only deals with Java files but this can change I believe. The current PR adds almost all non-java files to |
@zabetak I have tried several methods, but have not been successful. May I know which projects have been done in this way? Can you provide some example projects? |
@wuchunfu I don't have some sample open-source project in mind but it appears that the license-header step is a generic step that could be applied in a variety of formats according to the documentation. For instance for .md files adding the following configuration should do the trick:
It's probably necessary to create appropriate configurations for different file types and some things could obviously be grouped together. I see the same discussion spanning under #359 so not sure where is the best place to continue discussing. |
@zabetak I agree with your suggestion, thank you for your guidance. I will use |
@xtable-bot run azure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spotless plugin will fail if the header is missing, right? We are running that as part of our regular CI so we shouldn't need this anymore?
In some known rules, we can check through |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wuchunfu Can you add a quick section to the README or something that gives some quick pointers to how to update the spotless plugin if we come across a new file type we need to incorporate into that auto-formatter?
@the-other-tim-brown Thank you for your reminder. I will organize it and write it in readme later |
@the-other-tim-brown PTAL, thanks |
@@ -99,6 +99,36 @@ The custom hadoop configurations can be passed in with the `--hadoopConfig [cust | |||
The config in custom hadoop config file will override the default hadoop configurations. For an example | |||
of a custom hadoop config file, see [hadoop.xml](https://xtable.apache.org/docs/fabric#step-2-translate-source-table-to-delta-lake-format-using-apache-xtable-incubating). | |||
|
|||
# New file type spotless automatic formatting and modification operation | |||
|
|||
When a new type of file is merged into the warehouse, it is possible that the `spotless-maven-plugin` will not add the license header by default, and when it is detected by ci and prompted to add the `license header`, what we have to do is to modify the project. The modified part of `pom.xml` under the root path is as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a new type of file is merged into the warehouse, it is possible that the `spotless-maven-plugin` will not add the license header by default, and when it is detected by ci and prompted to add the `license header`, what we have to do is to modify the project. The modified part of `pom.xml` under the root path is as follows: | |
When a new type of file is added to the repo, it is possible that the `spotless-maven-plugin` will not add the license header by default. This is detected by the `License Check` GitHub Action. When this happens the developer should must modify part of `pom.xml` under the root path is as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wuchunfu here are a few reasons for my suggestion:
- Warehouse is a commonly used term in the data space for a Data Warehouse (like Snowflake and BigQuery). It would be best to make sure we avoid confusion.
- Adding the name of the check in CI can help people know what to look for.
@xtable-bot run azure |
@the-other-tim-brown PTAL, thanks |
@ashvina any feedback? |
@the-other-tim-brown PTAL, thanks |
I have no significant concerns with SkyWalking Eyes. In conjunction with Spotless, it could assist in maintaining code compliance. We might consider revisiting this plugin if the combination of the two begins to pose a problems or overheads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wuchunfu can you squash the commits for us?
@the-other-tim-brown done |
CI report:Bot commands@xtable-bot supports the following commands:
|
License header check