-
-
Notifications
You must be signed in to change notification settings - Fork 517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce "collection" projects for better usage of hierarchical view #2041 #3258
base: master
Are you sure you want to change the base?
Introduce "collection" projects for better usage of hierarchical view #2041 #3258
Conversation
Has there been any updates/progress on the feature in completion of the 4.11 release? |
Looking forward for this feature 👍 |
I messed up the branch trying to rebase onto master to resolve 1 conflicting file... ending up with >1300 changed files now that already are changed in master. no idea what git is doing... will try to fix it but I'm struggling a bit how to get back to a clean state |
0aa204e
to
ddc5c2b
Compare
Warning: I fixed a lot of conflicts and did not have time to test it. Will try to find time tomorrow to test it. I think I finally also fixed the branch. |
@nscuro I wonder why the unit test still fails. I don't see a change that could influence the parent of a project not being serialized? |
@rkg-mm FWIW there can be subtle changes in how serialization libraries (i.e. Jackson) handle certain fields, when those libraries are updated. Or how the persistence framework fetches fields implicitly. |
Yea so, the PRs backend & frontend seem to be working well. But I have no clue about this unit test and will need some help here. The test explicitly tests the now not working condition, so I guess its important that it works, but I don't see a change that causes the failure.
|
* @author Ralf King | ||
* @since 4.11.0 | ||
*/ | ||
public enum ProjectCollectionLogic { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add some documentation about this. In #2041 you mention a few use cases that these address, having those be part of the docs will definitely help others.
src/main/java/org/dependencytrack/persistence/ProjectQueryManager.java
Outdated
Show resolved
Hide resolved
@@ -538,6 +547,7 @@ public Project updateProject(Project transientProject, boolean commitIndex) { | |||
project.setDescription(transientProject.getDescription()); | |||
project.setVersion(transientProject.getVersion()); | |||
project.setClassifier(transientProject.getClassifier()); | |||
project.setCollectionLogic(transientProject.getCollectionLogic()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the collection logic from say AGGREGATE_DIRECT_CHILDREN
to HIGHEST_SEMVER_CHILD
can drastically change metric values on existing projects. There will be no indication as to if and when the logic was changed, which can make it tricky to explain why values changed so much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, but also it would be an intended change. Idk what could be done about this except add such change to the logfile, but that would only be helpful for deep analysis of such issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea, but at least on a first look the frontend chart lib doesn't support this. not sure how difficult this would be to add on top of it. Any experience with that? I never used that component.
From data perspective in backend it should not be too complicated.
edit: or would be adding it to the charts tooltip be enough? Also it seems there are options to style datapoints differently. Maybe we could give such points a different styling + add the info in the tooltip?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have too strong opinions on how we highlight it, but I do think that highlighting it one way or the other would be good.
Tooltip on it's own might be too hidden, but styling the datapoints sounds good.
src/main/java/org/dependencytrack/tasks/metrics/ProjectMetricsUpdateTask.java
Outdated
Show resolved
Hide resolved
/** | ||
* Find the child with the highest SemVer version and show only that child's data. | ||
* Ignores children not using SemVer versions. | ||
*/ | ||
HIGHEST_SEMVER_CHILD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How critical is the capability to choose the test child by version?
I'm asking because metrics updates will be done via stored procedures for DT v5.x, since we found calculating them in the app to be too inefficient. What this means is that certain semantics will be harder to achieve, one such example being versioning (it's possible to write functions for them of course).
My main concern is that SemVer will not be sufficient for everyone, there are so many versioning schemes out there, and it can become a true nightmare trying to support multiple of them. That is already true for Java, it will be even more true for SQL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its pretty important for versioned software.
Only alternative I could think of:
One needs logic in CI scripts to identify the latest SemVer version in Dtrack after creating a version and must auto-manage tags to always append a "highest-version" tag or similar to the latest version on its own and auto-remove it for all other versions. Then you can use the "aggregate by tag" logic to achieve the same. But this requires quite some logic in CI builds.
Or could we at least offer an API to identify the latest SemVer version of a project? That would not be called as often as Metrics update, hence doing it outside SQL would be ok. Then it would be a step easier to implement above logic.
Or even better: one could enable an optional feature that lets the project management APIs auto-tag the latest SemVer version automatically whenever projects are added/removed/reversioned. (Could also be a DB flag instead of a Tag)
Would this be an option? Then we don't lose any comfort AND the info could even be reused for other potential features, while not slowing down or making more complicated the metrics update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or could we at least offer an API to identify the latest SemVer version of a project?
When dealing with versioned software like this, wouldn't it be a fair assumption that BOMs of older versions would have been uploaded earlier than those of newer versions?
At the very least, the latest version should be somewhere among the most recently uploaded BOMs, no? So you could use this endpoint in CI to pull the last, say, 5
projects with name foo
and then do the version comparison as needed:
curl -H 'X-Api-Key: <API_KEY>' \
'http://localhost:8080/api/v1/project?name=foo&sortName=lastBomImport&sortOrder=desc&offset=0&limit=5'
To be clear, I'm not strictly against supporting SemVer in DT, I'm just trying to avoid us steering into a maintenance mess once people start asking for more versioning schemes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When dealing with versioned software like this, wouldn't it be a fair assumption that BOMs of older versions would have been uploaded earlier than those of newer versions?
Not necessarily.
Lets say you have 4 major versions
1.x.x, 2.x.x, 3.x.x, 4.x.x
and you support e.g. the last 3 of those due to business contracts (>=2.x.x) additionally you have active development on the latest version (4.x.x). In this case it can easily happen your upload order is for a while (first to latest):
4.1.0
4.1.1 (bugfix)
3.4.1 (backport of bugfix)
2.3.1 (backport of bugfix)
2.3.2 (additional fix in older version that doesn't apply to newer version)
...
At the very least, the latest version should be somewhere among the most recently uploaded BOMs, no? So you could use this endpoint in CI to pull the last, say, 5 projects with name foo and then do the version comparison as needed:
Likely, yes. I would also assume for our current use cases 5 would be enough. However, idk if that would be the case for everyone. Special cases (e.g. newly using Dtrack and importing existing SBOMs of a lot of SW versions) also not considered.
Not sure whats the best strategy here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also assume for our current use cases 5 would be enough. However, idk if that would be the case for everyone.
Nothing would prevent users to traverse the project list to the very end in order to determine the "latest" version.
Not sure whats the best strategy here...
My suggestion would be to investigate if and how the use case can be addressed with tools we already have available, including pulling all project versions and doing the "latest" determination on the client-side (i.e. CI scripts).
Based on that, figure out what the pains / limitations are with that approach, and how a server-side implementation could help resolve that. Are the identified limitations show-stoppers or merely annoyances?
If we are really sure we need a server-side implementation, we will need something that covers more than vanilla SemVer: Some organizations will version their software according to Maven, others may use .NET semantics, and so on. We also need to consider that organizations may track software they did not build (i.e. container images from vendors), and thus don't have control over versioning. The vers
specification lists a few versioning schemes here: https://github.com/package-url/purl-spec/blob/version-range-spec/VERSION-RANGE-SPEC.rst#some-of-the-known-versioning-schemes
We don't have to cover all cases from the very beginning, but the solutions should at least be extensible such that more can be added in the future.
More generally, my recommendation would be to decouple the version-awareness from this PR, as to not block this from getting merged.
I debugged the area in question multiple times now and still don't quite understand how this is happening. The parent is part of the response when I comment out these lines: https://github.com/DependencyTrack/dependency-track/pull/3258/files#diff-75cd29a0a085d84dff3cb4794242e0d1b400385f1acfae172346c3e6780629b0R582-R590 Something about those lines causes |
oh come on.. Idk what IntelliJ is doing but whenever I rebase this branch on your master I end up having all changes from master in my PR again. What am I doing wrong here? I don't get it... edit: ok fixed it again with some hacky workaround. still don't know what I do wrong here... |
…CollectionLogics. Modifies project metrics updates to consider collection project logics. Added several events to trigger project metrics updates to update collection projects if child projects change in a relevant way. Signed-off-by: Ralf King <rkg@mm-software.com>
Add test cases for new metrics update logic for collection projects. Signed-off-by: Ralf King <rkg@mm-software.com>
… all projects Signed-off-by: Ralf King <rkg@mm-software.com>
Signed-off-by: Ralf King <rkg@mm-software.com>
… wrongly influence numbers Signed-off-by: Ralf King <rkg@mm-software.com>
Signed-off-by: Ralf King <rkg@mm-software.com>
Signed-off-by: Ralf King <rkg@mm-software.com>
efe2c29
to
ecfc394
Compare
@rkg-mm How do you do the rebase? I usually run I never had this particular problem happen, but can imagine IntelliJ might do some unexpected things when merging/rebasing via UI. |
In IntelliJ UI I select the upstream master and select "Rebase (my branch) onto master" then push the changes to my remote. The strange thing happens here already, as suddenly my own remote has changes not recognized by my local branch and needs to either be merged (which I tried before and ended up with double changes on same files) or I select "rebase anyway" which avoids double merging. But in both cases I still end up with all those unwanted changes again. Maybe I should try the console attempt next time... |
* Add validation to prevent invalid states of collection projects (prevent collection project having Components or services), including several unit tests covering these scenarios. Signed-off-by: Ralf King <rkg@mm-software.com>
…ject actually changed. Signed-off-by: Ralf King <rkg@mm-software.com>
Fixed by your suggestion |
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferencesCodacy will stop sending the deprecated coverage status from June 5th, 2024. Learn more |
@rkg-mm We have to schedule this for 4.12 to allow us for sufficient time to review and test. |
no problem, I understand. Lets just hope the release cycle is a bit shorter this time :D |
Is there a set release cadence? or other logic for determining when releases are completed? |
There is not at the moment. We originally planned to do monthly releases, but ended up stretching that (by far...) yet again. A problem is that the core team has certain items they want (and have to work on) in a given release, while said core team also needs to review and test community contributions. It's a luxury problem, but it means that more time is needed to meet both our own, and the community's expectations. This particular PR was planned for inclusion in 4.11 for a long time, but it also touches on the very core of how DT behaves, in turn requiring more review and testing than other, smaller PRs. And of course contributors also don't always have time to answer our annoying requests. :) We're still in the process of figuring out how to best plan and size releases. Going forward, we're labelling issues with T-shirt sizes, according to their effort. Aim being to limit the number of high-effort items per release, enabling faster cycles. |
Description
This change introduces logic for "collection projects". Those are basically projects used as parent for other projects that shall not hold any own component or vulnerability data, but instead get calculated from child projects using different configurable aggregation logics.
Corresponding Frontend PR with Screenshots: DependencyTrack/frontend#658
Addressed Issue
#2041
#657
#2410
Additional Details
Checklist
This PR fixes a defect, and I have provided tests to verify that the fix is effective