Introduce "collection" projects for better usage of hierarchical view #2041 #3258

rkg-mm · 2023-12-02T16:18:16Z

Description

This change introduces logic for "collection projects". Those are basically projects used as parent for other projects that shall not hold any own component or vulnerability data, but instead get calculated from child projects using different configurable aggregation logics.

Corresponding Frontend PR with Screenshots: DependencyTrack/frontend#658

Addressed Issue

#2041
#657
#2410

Additional Details

Checklist

I have read and understand the contributing guidelines
~~This PR fixes a defect, and I have provided tests to verify that the fix is effective~~
This PR implements an enhancement, and I have provided tests to verify that it works as intended
This PR introduces changes to the database model, and I have added corresponding update logic
This PR introduces new or alters existing behavior, and I have updated the documentation accordingly

cheapshot2000 · 2024-03-25T21:52:24Z

Has there been any updates/progress on the feature in completion of the 4.11 release?

TWpgo · 2024-04-22T11:34:35Z

Looking forward for this feature 👍

rkg-mm · 2024-04-22T19:56:18Z

I messed up the branch trying to rebase onto master to resolve 1 conflicting file... ending up with >1300 changed files now that already are changed in master. no idea what git is doing... will try to fix it but I'm struggling a bit how to get back to a clean state

rkg-mm · 2024-04-23T01:02:48Z

Warning: I fixed a lot of conflicts and did not have time to test it. Will try to find time tomorrow to test it. I think I finally also fixed the branch.

rkg-mm · 2024-04-23T08:56:17Z

@nscuro I wonder why the unit test still fails. I don't see a change that could influence the parent of a project not being serialized?

nscuro · 2024-04-23T09:10:50Z

@rkg-mm FWIW there can be subtle changes in how serialization libraries (i.e. Jackson) handle certain fields, when those libraries are updated. Or how the persistence framework fetches fields implicitly.

rkg-mm · 2024-04-23T09:33:16Z

Yea so, the PRs backend & frontend seem to be working well. But I have no clue about this unit test and will need some help here. The test explicitly tests the now not working condition, so I guess its important that it works, but I don't see a change that causes the failure.

 Error:  Tests run: 40, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 74.367 s <<< FAILURE! - in org.dependencytrack.resources.v1.ProjectResourceTest
Error:  patchProjectParentTest(org.dependencytrack.resources.v1.ProjectResourceTest)  Time elapsed: 1.483 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: 
JSON documents are different:
Different keys found in node "", missing: "parent", expected: 
<{"active":true,"collectionLogic":"NONE","name":"DEF","parent":{"name":"GHI","uuid":"${json-unit.matches:parentProjectUuid}","version":"3.0"},"properties":[],"tags":[],"uuid":"${json-unit.matches:projectUuid}","version":"2.0"}> 
but was: 
<{"active":true,"collectionLogic":"NONE","name":"DEF","properties":[],"tags":[],"uuid":"ba033fa5-aa20-44ae-85de-6a33e5cf36a3","version":"2.0"}>

	at org.dependencytrack.resources.v1.ProjectResourceTest.patchProjectParentTest(ProjectResourceTest.java:733)

nscuro · 2024-04-28T13:01:51Z

src/main/java/org/dependencytrack/model/ProjectCollectionLogic.java

+ * @author Ralf King
+ * @since 4.11.0
+ */
+public enum ProjectCollectionLogic {


It would be good to add some documentation about this. In #2041 you mention a few use cases that these address, having those be part of the docs will definitely help others.

src/main/java/org/dependencytrack/persistence/ProjectQueryManager.java

nscuro · 2024-04-28T13:24:06Z

src/main/java/org/dependencytrack/persistence/ProjectQueryManager.java

@@ -538,6 +547,7 @@ public Project updateProject(Project transientProject, boolean commitIndex) {
        project.setDescription(transientProject.getDescription());
        project.setVersion(transientProject.getVersion());
        project.setClassifier(transientProject.getClassifier());
+        project.setCollectionLogic(transientProject.getCollectionLogic());


Changing the collection logic from say AGGREGATE_DIRECT_CHILDREN to HIGHEST_SEMVER_CHILD can drastically change metric values on existing projects. There will be no indication as to if and when the logic was changed, which can make it tricky to explain why values changed so much.

That's correct, but also it would be an intended change. Idk what could be done about this except add such change to the logfile, but that would only be helpful for deep analysis of such issue.

Fair.

Naive idea: We could add a Collection logic change flag to ProjectMetrics. We could use that in the metrics graph, in some similar way to what you can do in Grafana with annotations:

I like that idea, but at least on a first look the frontend chart lib doesn't support this. not sure how difficult this would be to add on top of it. Any experience with that? I never used that component.
From data perspective in backend it should not be too complicated.

edit: or would be adding it to the charts tooltip be enough? Also it seems there are options to style datapoints differently. Maybe we could give such points a different styling + add the info in the tooltip?

I don't have too strong opinions on how we highlight it, but I do think that highlighting it one way or the other would be good.

Tooltip on it's own might be too hidden, but styling the datapoints sounds good.

src/main/java/org/dependencytrack/tasks/metrics/ProjectMetricsUpdateTask.java

nscuro · 2024-04-28T13:45:49Z

src/main/java/org/dependencytrack/model/ProjectCollectionLogic.java

+    /**
+     * Find the child with the highest SemVer version and show only that child's data.
+     * Ignores children not using SemVer versions.
+     */
+    HIGHEST_SEMVER_CHILD


How critical is the capability to choose the test child by version?

I'm asking because metrics updates will be done via stored procedures for DT v5.x, since we found calculating them in the app to be too inefficient. What this means is that certain semantics will be harder to achieve, one such example being versioning (it's possible to write functions for them of course).

My main concern is that SemVer will not be sufficient for everyone, there are so many versioning schemes out there, and it can become a true nightmare trying to support multiple of them. That is already true for Java, it will be even more true for SQL.

I think its pretty important for versioned software.
Only alternative I could think of:
One needs logic in CI scripts to identify the latest SemVer version in Dtrack after creating a version and must auto-manage tags to always append a "highest-version" tag or similar to the latest version on its own and auto-remove it for all other versions. Then you can use the "aggregate by tag" logic to achieve the same. But this requires quite some logic in CI builds.
Or could we at least offer an API to identify the latest SemVer version of a project? That would not be called as often as Metrics update, hence doing it outside SQL would be ok. Then it would be a step easier to implement above logic.

Or even better: one could enable an optional feature that lets the project management APIs auto-tag the latest SemVer version automatically whenever projects are added/removed/reversioned. (Could also be a DB flag instead of a Tag)
Would this be an option? Then we don't lose any comfort AND the info could even be reused for other potential features, while not slowing down or making more complicated the metrics update.

Or could we at least offer an API to identify the latest SemVer version of a project?

When dealing with versioned software like this, wouldn't it be a fair assumption that BOMs of older versions would have been uploaded earlier than those of newer versions?

At the very least, the latest version should be somewhere among the most recently uploaded BOMs, no? So you could use this endpoint in CI to pull the last, say, 5 projects with name foo and then do the version comparison as needed:

curl -H 'X-Api-Key: <API_KEY>' \ 'http://localhost:8080/api/v1/project?name=foo&sortName=lastBomImport&sortOrder=desc&offset=0&limit=5'

To be clear, I'm not strictly against supporting SemVer in DT, I'm just trying to avoid us steering into a maintenance mess once people start asking for more versioning schemes.

When dealing with versioned software like this, wouldn't it be a fair assumption that BOMs of older versions would have been uploaded earlier than those of newer versions?

Not necessarily.
Lets say you have 4 major versions
1.x.x, 2.x.x, 3.x.x, 4.x.x
and you support e.g. the last 3 of those due to business contracts (>=2.x.x) additionally you have active development on the latest version (4.x.x). In this case it can easily happen your upload order is for a while (first to latest):
4.1.0
4.1.1 (bugfix)
3.4.1 (backport of bugfix)
2.3.1 (backport of bugfix)
2.3.2 (additional fix in older version that doesn't apply to newer version)
...

At the very least, the latest version should be somewhere among the most recently uploaded BOMs, no? So you could use this endpoint in CI to pull the last, say, 5 projects with name foo and then do the version comparison as needed:

Likely, yes. I would also assume for our current use cases 5 would be enough. However, idk if that would be the case for everyone. Special cases (e.g. newly using Dtrack and importing existing SBOMs of a lot of SW versions) also not considered.

Not sure whats the best strategy here...

I would also assume for our current use cases 5 would be enough. However, idk if that would be the case for everyone.

Nothing would prevent users to traverse the project list to the very end in order to determine the "latest" version.

Not sure whats the best strategy here...

My suggestion would be to investigate if and how the use case can be addressed with tools we already have available, including pulling all project versions and doing the "latest" determination on the client-side (i.e. CI scripts).

Based on that, figure out what the pains / limitations are with that approach, and how a server-side implementation could help resolve that. Are the identified limitations show-stoppers or merely annoyances?

If we are really sure we need a server-side implementation, we will need something that covers more than vanilla SemVer: Some organizations will version their software according to Maven, others may use .NET semantics, and so on. We also need to consider that organizations may track software they did not build (i.e. container images from vendors), and thus don't have control over versioning. The vers specification lists a few versioning schemes here: https://github.com/package-url/purl-spec/blob/version-range-spec/VERSION-RANGE-SPEC.rst#some-of-the-known-versioning-schemes

We don't have to cover all cases from the very beginning, but the solutions should at least be extensible such that more can be added in the future.

More generally, my recommendation would be to decouple the version-awareness from this PR, as to not block this from getting merged.

nscuro · 2024-04-28T14:07:38Z

@rkg-mm But I have no clue about this unit test and will need some help here. The test explicitly tests the now not working condition, so I guess its important that it works, but I don't see a change that causes the failure.

I debugged the area in question multiple times now and still don't quite understand how this is happening. The parent is part of the response when I comment out these lines: https://github.com/DependencyTrack/dependency-track/pull/3258/files#diff-75cd29a0a085d84dff3cb4794242e0d1b400385f1acfae172346c3e6780629b0R582-R590

Something about those lines causes parent to be unloaded. But they're literally just calling a setter, so I don't quite get why. The test succeeds when adding project.getParent(); after the lines mentioned above. This kind of "force loading" is used in other areas of the code base. I don't like it, but in this case I think it's OK to do it.

rkg-mm · 2024-05-05T20:46:36Z

oh come on.. Idk what IntelliJ is doing but whenever I rebase this branch on your master I end up having all changes from master in my PR again. What am I doing wrong here? I don't get it...

edit: ok fixed it again with some hacky workaround. still don't know what I do wrong here...

…CollectionLogics. Modifies project metrics updates to consider collection project logics. Added several events to trigger project metrics updates to update collection projects if child projects change in a relevant way. Signed-off-by: Ralf King <rkg@mm-software.com>

Add test cases for new metrics update logic for collection projects. Signed-off-by: Ralf King <rkg@mm-software.com>

… all projects Signed-off-by: Ralf King <rkg@mm-software.com>

Signed-off-by: Ralf King <rkg@mm-software.com>

… wrongly influence numbers Signed-off-by: Ralf King <rkg@mm-software.com>

Signed-off-by: Ralf King <rkg@mm-software.com>

nscuro · 2024-05-05T20:54:08Z

@rkg-mm How do you do the rebase? I usually run git pull --rebase upstream master in the command line and then use IntelliJ to resolve any conflicts in case there are any.

I never had this particular problem happen, but can imagine IntelliJ might do some unexpected things when merging/rebasing via UI.

rkg-mm · 2024-05-05T20:58:58Z

In IntelliJ UI I select the upstream master and select "Rebase (my branch) onto master" then push the changes to my remote. The strange thing happens here already, as suddenly my own remote has changes not recognized by my local branch and needs to either be merged (which I tried before and ended up with double changes on same files) or I select "rebase anyway" which avoids double merging. But in both cases I still end up with all those unwanted changes again. Maybe I should try the console attempt next time...

* Add validation to prevent invalid states of collection projects (prevent collection project having Components or services), including several unit tests covering these scenarios. Signed-off-by: Ralf King <rkg@mm-software.com>

…ject actually changed. Signed-off-by: Ralf King <rkg@mm-software.com>

rkg-mm · 2024-05-06T16:16:16Z

@rkg-mm But I have no clue about this unit test and will need some help here. The test explicitly tests the now not working condition, so I guess its important that it works, but I don't see a change that causes the failure.

I debugged the area in question multiple times now and still don't quite understand how this is happening. The parent is part of the response when I comment out these lines: https://github.com/DependencyTrack/dependency-track/pull/3258/files#diff-75cd29a0a085d84dff3cb4794242e0d1b400385f1acfae172346c3e6780629b0R582-R590

Something about those lines causes parent to be unloaded. But they're literally just calling a setter, so I don't quite get why. The test succeeds when adding project.getParent(); after the lines mentioned above. This kind of "force loading" is used in other areas of the code base. I don't like it, but in this case I think it's OK to do it.

Fixed by your suggestion

codacy-production · 2024-05-06T16:36:32Z

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation	Diff coverage
✅ +0.10% (target: -1.00%)	✅ 86.04% (target: 70.00%)

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`ff22d4f`)	22069	16756	75.93%
Head commit (`0cfdd8c`)	22259 (+190)	16923 (+167)	76.03% (+0.10%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#3258)	222	191	86.04%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings Change summary preferences

_{Codacy will stop sending the deprecated coverage status from June 5th, 2024. Learn more}

nscuro · 2024-05-07T14:12:39Z

@rkg-mm We have to schedule this for 4.12 to allow us for sufficient time to review and test.

rkg-mm · 2024-05-07T16:04:35Z

@rkg-mm We have to schedule this for 4.12 to allow us for sufficient time to review and test.

no problem, I understand. Lets just hope the release cycle is a bit shorter this time :D

cheapshot2000 · 2024-05-07T16:06:22Z

@rkg-mm We have to schedule this for 4.12 to allow us for sufficient time to review and test.

no problem, I understand. Lets just hope the release cycle is a bit shorter this time :D

Is there a set release cadence? or other logic for determining when releases are completed?

nscuro · 2024-05-07T16:18:09Z

Is there a set release cadence? or other logic for determining when releases are completed?

There is not at the moment. We originally planned to do monthly releases, but ended up stretching that (by far...) yet again. A problem is that the core team has certain items they want (and have to work on) in a given release, while said core team also needs to review and test community contributions. It's a luxury problem, but it means that more time is needed to meet both our own, and the community's expectations.

This particular PR was planned for inclusion in 4.11 for a long time, but it also touches on the very core of how DT behaves, in turn requiring more review and testing than other, smaller PRs. And of course contributors also don't always have time to answer our annoying requests. :)

We're still in the process of figuring out how to best plan and size releases. Going forward, we're labelling issues with T-shirt sizes, according to their effort. Aim being to limit the number of high-effort items per release, enabling faster cycles.

rkg-mm mentioned this pull request Dec 2, 2023

Introduce "collection" projects for better usage of hierarchical view #2041 DependencyTrack/frontend#658

Open

2 tasks

nscuro added this to the 4.11 milestone Dec 3, 2023

nscuro self-assigned this Jan 11, 2024

nscuro mentioned this pull request Jan 15, 2024

Add optional organization/group structure #657

Open

rkg-mm force-pushed the 2041-introduce-collection-projects branch from 0aa204e to ddc5c2b Compare April 23, 2024 00:50

nscuro reviewed Apr 28, 2024

View reviewed changes

rkg-mm added 7 commits May 5, 2024 22:49

Fix missing default.

b53b49c

Add test cases for new metrics update logic for collection projects. Signed-off-by: Ralf King <rkg@mm-software.com>

Add upgrade class to set default project collection logic to NONE for…

33c97f2

… all projects Signed-off-by: Ralf King <rkg@mm-software.com>

Fix unit test

8bbf63e

Signed-off-by: Ralf King <rkg@mm-software.com>

Exclude collection projects from portfolio metrics calculation to not…

a507c42

… wrongly influence numbers Signed-off-by: Ralf King <rkg@mm-software.com>

Updated copyright

e5a5eec

Signed-off-by: Ralf King <rkg@mm-software.com>

Add new Property to unit test data

ecfc394

Signed-off-by: Ralf King <rkg@mm-software.com>

rkg-mm force-pushed the 2041-introduce-collection-projects branch from efe2c29 to ecfc394 Compare May 5, 2024 20:53

rkg-mm added 2 commits May 6, 2024 18:09

* Fix failing unit test by enforcing parent loading

e0ea31c

* Add validation to prevent invalid states of collection projects (prevent collection project having Components or services), including several unit tests covering these scenarios. Signed-off-by: Ralf King <rkg@mm-software.com>

Only dispatch parent metrics update event when metrics of current pro…

0cfdd8c

…ject actually changed. Signed-off-by: Ralf King <rkg@mm-software.com>

nscuro modified the milestones: 4.11, 4.12 May 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce "collection" projects for better usage of hierarchical view #2041 #3258

Introduce "collection" projects for better usage of hierarchical view #2041 #3258

rkg-mm commented Dec 2, 2023 •

edited

cheapshot2000 commented Mar 25, 2024

TWpgo commented Apr 22, 2024

rkg-mm commented Apr 22, 2024

rkg-mm commented Apr 23, 2024 •

edited

rkg-mm commented Apr 23, 2024

nscuro commented Apr 23, 2024

rkg-mm commented Apr 23, 2024

nscuro Apr 28, 2024

nscuro Apr 28, 2024

rkg-mm May 5, 2024

nscuro May 23, 2024

rkg-mm May 24, 2024 •

edited

nscuro May 27, 2024

nscuro Apr 28, 2024

rkg-mm May 5, 2024 •

edited

nscuro May 23, 2024

rkg-mm May 24, 2024 •

edited

nscuro May 27, 2024

nscuro commented Apr 28, 2024

rkg-mm commented May 5, 2024 •

edited

nscuro commented May 5, 2024

rkg-mm commented May 5, 2024

rkg-mm commented May 6, 2024

codacy-production bot commented May 6, 2024

nscuro commented May 7, 2024

rkg-mm commented May 7, 2024

cheapshot2000 commented May 7, 2024

nscuro commented May 7, 2024

Introduce "collection" projects for better usage of hierarchical view #2041 #3258

Are you sure you want to change the base?

Introduce "collection" projects for better usage of hierarchical view #2041 #3258

Conversation

rkg-mm commented Dec 2, 2023 • edited

Description

Addressed Issue

Additional Details

Checklist

cheapshot2000 commented Mar 25, 2024

TWpgo commented Apr 22, 2024

rkg-mm commented Apr 22, 2024

rkg-mm commented Apr 23, 2024 • edited

rkg-mm commented Apr 23, 2024

nscuro commented Apr 23, 2024

rkg-mm commented Apr 23, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rkg-mm May 24, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rkg-mm May 5, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rkg-mm May 24, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nscuro commented Apr 28, 2024

rkg-mm commented May 5, 2024 • edited

nscuro commented May 5, 2024

rkg-mm commented May 5, 2024

rkg-mm commented May 6, 2024

codacy-production bot commented May 6, 2024

Coverage summary from Codacy

nscuro commented May 7, 2024

rkg-mm commented May 7, 2024

cheapshot2000 commented May 7, 2024

nscuro commented May 7, 2024

rkg-mm commented Dec 2, 2023 •

edited

rkg-mm commented Apr 23, 2024 •

edited

rkg-mm May 24, 2024 •

edited

rkg-mm May 5, 2024 •

edited

rkg-mm May 24, 2024 •

edited

rkg-mm commented May 5, 2024 •

edited