OCPBUGS-25331: UPSTREAM: <carry>: extend termination events #1827

Merged
merged 1 commit into from
May 29, 2024

@tkashem tkashem commented Dec 15, 2023

  • Set EventTime for the shutdown events.

  • Tie the shutdown events that follow to the UID of the first event (shutdown initiated). This gives us a more deterministic way to compute the shutdown duration from these events.

  • Move code snippets from the upstream file to an OpenShift-specific patch file, which reduces the chance of code conflicts.

Note for rebase: squash it into the following commit
cfbb6d6 UPSTREAM: <carry>: create termination events

@openshift-ci-robot added the backports/unvalidated-commits label on Dec 15, 2023
@openshift-ci-robot

@tkashem: the contents of this pull request could not be automatically validated.

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

@tkashem changed the title from "UPSTREAM: <carry>: extend termination events" to "[WIP] UPSTREAM: <carry>: extend termination events" on Dec 15, 2023
@openshift-ci bot added the do-not-merge/work-in-progress label on Dec 15, 2023
@openshift-ci bot added the vendor-update and approved labels on Dec 15, 2023
@tkashem changed the title from "[WIP] UPSTREAM: <carry>: extend termination events" to "OCPBUGS-25331: [WIP] UPSTREAM: <carry>: extend termination events" on Dec 15, 2023
@openshift-ci bot removed the do-not-merge/work-in-progress label on Dec 15, 2023
@openshift-ci-robot added the jira/severity-moderate, jira/valid-reference, and jira/invalid-bug labels on Dec 15, 2023
@openshift-ci-robot

@tkashem: This pull request references Jira Issue OCPBUGS-25331, which is invalid:

  • expected the bug to target the "4.16.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@tkashem
Author

tkashem commented Dec 15, 2023

/jira refresh

@openshift-ci-robot added the jira/valid-bug label and removed the jira/invalid-bug label on Dec 15, 2023
@openshift-ci-robot

@tkashem: This pull request references Jira Issue OCPBUGS-25331, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (anli@redhat.com), skipping review request.

@openshift-ci-robot

@tkashem: This pull request references Jira Issue OCPBUGS-25331, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (anli@redhat.com), skipping review request.

@tkashem
Author

tkashem commented Dec 16, 2023

/retest

Name: fmt.Sprintf("%v.%x", ref.Name, t.UnixNano()),
Namespace: ref.Namespace,
},
EventTime: t,
Member

Since we are creating corev1 Events, IMO we should mimic the behavior of the corev1 client implementation and set FirstTimestamp and LastTimestamp instead of EventTime: https://github.com/openshift/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/record/event.go#L482-L483

The events/v1 client implementation is actually the one setting EventTime: https://github.com/openshift/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/events/event_recorder.go#L102

Comment on lines +48 to +53
// when we emit the lifecycle events, we store the event ID of the first
// shutdown event "ShutdownInitiated" emitted so we can correlate it to
// the other shutdown events for a particular apiserver restart.
// This provides a more deterministic way to determine the shutdown
// duration for an apiserver restart
eventLock sync.Mutex
Member

What kind of correlation are we talking about? Normally, with the client-go implementation, similar events are aggregated into one, and with an up-to-date count and timestamps you can easily track the lifecycle of the event. So maybe we could reuse some of the client-go implementation.

Member

IIUC it's a correlation with events from other components, i.e. disruption events on the timeline, LB status, and so on.
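
For readers following the thread, here is a minimal sketch of the producer side described in the quoted code comment: remember the ID of the first "ShutdownInitiated" event under a mutex and hand it to every later shutdown event. Only `eventLock` and the "ShutdownInitiated" reason come from the diff; the `eventTracker` type, the `firstShutdownID` field, and the `correlationFor` method are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// eventTracker is a hypothetical holder for the state the quoted comment
// describes: the ID of the first shutdown event, guarded by eventLock.
type eventTracker struct {
	eventLock       sync.Mutex
	firstShutdownID string
}

// correlationFor returns the ID that an event with the given reason should
// be tied to. The "ShutdownInitiated" event starts a new restart, so its own
// ID is recorded and it gets no correlation ID itself.
func (t *eventTracker) correlationFor(reason, uid string) string {
	t.eventLock.Lock()
	defer t.eventLock.Unlock()
	if reason == "ShutdownInitiated" {
		t.firstShutdownID = uid
		return ""
	}
	return t.firstShutdownID
}

func main() {
	var tr eventTracker
	fmt.Printf("%q\n", tr.correlationFor("ShutdownInitiated", "uid-1"))
	// "LaterShutdownEvent" is a placeholder reason, not one from the patch.
	fmt.Printf("%q\n", tr.correlationFor("LaterShutdownEvent", "uid-2"))
}
```

The mutex matters because lifecycle events can be emitted from different goroutines during shutdown, so the stored ID must not be read while it is being written.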

@tkashem
Author

tkashem commented Mar 4, 2024

/test e2e-gcp

@tkashem
Author

tkashem commented Mar 5, 2024

/test e2e-gcp

- we tie the shutdown events with the UID of the first
  (shutdown initiated), this provides us with a more
  deterministic way to compute shutdown duration from
  these events

- move code snippets from the upstream file to openshift
  specific patch file, it reduces chance of code conflict

Note for rebase: squash it into the following commit
cfbb6d6 UPSTREAM: <carry>: create termination events
@tkashem
Author

tkashem commented Apr 16, 2024

/remove-label backports/unvalidated-commits
/label backports/validated-commits

@openshift-ci bot added the do-not-merge/hold label on Apr 16, 2024
@soltysh
Member

soltysh commented Apr 25, 2024

/retest-required

@vrutkovs
Member

/test unit

@vrutkovs
Member

/test e2e-aws-ovn-fips

@vrutkovs
Member

I don't see any places where oauth-apiserver or openshift-apiserver needs to be patched - they all seem to be using vendored kubernetes for shutdown events.

@tkashem shall we remove the hold and start vendoring new kube in openshift-apiserver/oauth-apiserver?

@vrutkovs
Member

4.16 has branched, so it's safe to land
/hold cancel
/lgtm

@openshift-ci bot removed the do-not-merge/hold label on May 20, 2024
@vrutkovs
Member

/cherrypick release-4.16

@openshift-cherrypick-robot

@vrutkovs: once the present PR merges, I will cherry-pick it on top of release-4.16 in a new PR and assign it to you.

@openshift-ci bot added the lgtm label on May 20, 2024
openshift-ci bot commented May 20, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tkashem, vrutkovs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 4a87b53 and 2 for PR HEAD 518f920 in total

@vrutkovs
Member

/refresh
/skip

@vrutkovs
Member

/retest

@vrutkovs
Member

/test e2e-gcp

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD b574b11 and 1 for PR HEAD 518f920 in total

@vrutkovs
Member

/test e2e-gcp

@vrutkovs
Member

/retest

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD f1cd924 and 0 for PR HEAD 518f920 in total

@openshift-ci-robot

/hold

Revision 518f920 was retested 3 times: holding

@openshift-ci bot added the do-not-merge/hold label on May 28, 2024
@vrutkovs
Member

/retest

openshift-ci bot commented May 29, 2024

@tkashem: all tests passed!

@vrutkovs
Member

/hold cancel

@openshift-ci bot removed the do-not-merge/hold label on May 29, 2024
@openshift-merge-bot merged commit e39a138 into openshift:master on May 29, 2024
19 checks passed
@openshift-ci-robot

@tkashem: Jira Issue OCPBUGS-25331: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-25331 has been moved to the MODIFIED state.

@openshift-cherrypick-robot

@vrutkovs: new pull request created: #1980

Labels
approved, backports/validated-commits, jira/severity-moderate, jira/valid-bug, jira/valid-reference, lgtm, vendor-update

6 participants