Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added new retention policy "onJobFailure" #1265

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

moraxy
Copy link

@moraxy moraxy commented Nov 25, 2022

This new retention policy onJobFailure prevents the deletion of a pod in case of a failed build result by the job that was using the pod. This is useful for debugging failed jobs because you are still able to open a terminal in any pod and investigate log files or similar, without having to set the retention policy to 'Always' or using workarounds like long timeouts.

The logic is comparatively simple: Use the runUrl pod annotation to query the particular run from Jenkins and check its result. Sometimes the result isn't available yet, in which case we wait up to 30 seconds.

To prevent having to import Jenkins classes into the test code, I'm only testing the utility methods.

This fixes JENKINS-688969

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

This new retention policy prevents the deletion of a pod in case of a failed build result by the job that was using the pod. This is useful for debugging failed jobs because you are still able to open a terminal in any pod and investigate log files or similar, without having to set the retention policy to 'Always' or using workarounds like long timeouts.
@moraxy moraxy requested a review from a team as a code owner November 25, 2022 03:15
@moraxy
Copy link
Author

moraxy commented Dec 8, 2022

Any update on this?

@Vlatombe Vlatombe added the enhancement Improvements label Dec 13, 2022
Copy link
Member

@Vlatombe Vlatombe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea, but it needs more work before it can be merged.

Also consider adding an integration test under KubernetesPipelineTest

package org.csanchez.jenkins.plugins.kubernetes.pod.retention;

import hudson.Extension;
import hudson.model.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please set your IDE to avoid wildcard imports.

Comment on lines +38 to +41
// small convenience function
private void LOG(Level level, String message) {
LOGGER.log(level, () -> MODULENAME + ": " + message);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded, the class name is already printed as part of the standard formatter.

@Override
public boolean shouldDeletePod(KubernetesCloud cloud, Pod pod) {
if (cloud == null || pod == null) {
LOG(Level.INFO, "shouldDeletePod called without actual cloud and pod");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Demote all this logging to FINE.

Comment on lines +55 to +59
Jenkins jenkins = Jenkins.getInstanceOrNull();
if (jenkins == null) {
LOG(Level.INFO, "Couldn't get the current Jenkins reference");
return true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Jenkins jenkins = Jenkins.getInstanceOrNull();
if (jenkins == null) {
LOG(Level.INFO, "Couldn't get the current Jenkins reference");
return true;
}
Jenkins jenkins = Jenkins.get();

Comment on lines +101 to +104
if (parts.length < 3) {
LOG(Level.INFO, "runUrl has unknown format: " + runUrl);
return null;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is way too simplistic. If you need to look up the run, I would advise instead to add new annotations (one for the item full name, another for the build id)

@Vlatombe
Copy link
Member

Vlatombe commented Feb 1, 2023

Conflict with #1304

@@ -6,6 +6,7 @@
<ol>
<li>Never - always delete the agent pod.</li>
<li>On Failure - keep the agent pod if it fails during the build.</li>
<li>On Job Failure - keep the agent pod if the build itself fails.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should rather be called On Build Failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in Jenkins terminology fails means FAILURE—the build did not run to normal completion. UNSTABLE (ran to completion but with test failures) would be considered “successful”. If you mean to use the current impl (checking for SUCCESS) then be clear that an unstable build will also retain the pod.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvements
Projects
None yet
3 participants