Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add paginated feature to list rules api #14017

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
199 changes: 185 additions & 14 deletions web/api/v1/api.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ package v1

import (
"context"
"crypto/sha1"
"errors"
"fmt"
"math"
Expand Down Expand Up @@ -168,6 +169,17 @@ type apiFuncResult struct {

type apiFunc func(r *http.Request) apiFuncResult

type listRulesPaginationRequest struct {
MaxAlerts int32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd just stick to either int or int64 for these fields here, less conversions / casts needed later on and no need to artificially limit the value range.

MaxRuleGroups int32
NextToken string
}

type parsePaginationError struct {
err error
parameter string
}

// TSDBAdminStats defines the tsdb interfaces used by the v1 API for admin operations as well as statistics.
type TSDBAdminStats interface {
CleanTombstones() error
Expand Down Expand Up @@ -1329,6 +1341,12 @@ type RuleDiscovery struct {
RuleGroups []*RuleGroup `json:"groups"`
}

// PaginatedRuleDiscovery has info for all rules with pagination.
type PaginatedRuleDiscovery struct {
RuleGroups []*RuleGroup `json:"groups"`
NextToken string `json:"nextToken"`
}

// RuleGroup has info for rules which are part of a group.
type RuleGroup struct {
Name string `json:"name"`
Expand Down Expand Up @@ -1363,6 +1381,29 @@ type AlertingRule struct {
Type string `json:"type"`
}

type AlertingRulePaginated struct {
// State can be "pending", "firing", "inactive".
State string `json:"state"`
Name string `json:"name"`
Query string `json:"query"`
Duration float64 `json:"duration"`
KeepFiringFor float64 `json:"keepFiringFor"`
Labels labels.Labels `json:"labels"`
Annotations labels.Labels `json:"annotations"`
AlertInfo *AlertsPaginated `json:"alertInfo"`
Health rules.RuleHealth `json:"health"`
LastError string `json:"lastError,omitempty"`
EvaluationTime float64 `json:"evaluationTime"`
LastEvaluation time.Time `json:"lastEvaluation"`
// Type of an alertingRule is always "alerting".
Type string `json:"type"`
}

type AlertsPaginated struct {
Alerts []*Alert `json:"alerts"`
HasMore bool `json:"hasMore"`
}

type RecordingRule struct {
Name string `json:"name"`
Query string `json:"query"`
Expand Down Expand Up @@ -1408,6 +1449,15 @@ func (api *API) rules(r *http.Request) apiFuncResult {
return invalidParamError(err, "exclude_alerts")
}

paginationRequest, parseErr := parseListRulesPaginationRequest(r)
if parseErr != nil {
return invalidParamError(parseErr.err, parseErr.parameter)
}

if paginationRequest != nil {
sort.Sort(GroupStateDescs(ruleGroups))
}

rgs := make([]*RuleGroup, 0, len(ruleGroups))
for _, grp := range ruleGroups {
if len(rgSet) > 0 {
Expand All @@ -1422,6 +1472,12 @@ func (api *API) rules(r *http.Request) apiFuncResult {
}
}

// Skip the rule group if the next token is set and hasn't arrived the nextToken item yet.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Skip the rule group if the next token is set and hasn't arrived the nextToken item yet.
// Skip the rule group if the next token is set and hasn't arrived at the nextToken item yet.

groupID := getRuleGroupNextToken(grp.File(), grp.Name())
if paginationRequest != nil && len(paginationRequest.NextToken) > 0 && paginationRequest.NextToken >= groupID {
continue
}

apiRuleGroup := &RuleGroup{
Name: grp.Name(),
File: grp.File(),
Expand Down Expand Up @@ -1453,20 +1509,47 @@ func (api *API) rules(r *http.Request) apiFuncResult {
if !excludeAlerts {
activeAlerts = rulesAlertsToAPIAlerts(rule.ActiveAlerts())
}
enrichedRule = AlertingRule{
State: rule.State().String(),
Name: rule.Name(),
Query: rule.Query().String(),
Duration: rule.HoldDuration().Seconds(),
KeepFiringFor: rule.KeepFiringFor().Seconds(),
Labels: rule.Labels(),
Annotations: rule.Annotations(),
Alerts: activeAlerts,
Health: rule.Health(),
LastError: lastError,
EvaluationTime: rule.GetEvaluationDuration().Seconds(),
LastEvaluation: rule.GetEvaluationTimestamp(),
Type: "alerting",
hasMore := false
if paginationRequest != nil && paginationRequest.MaxAlerts >= 0 && len(rule.ActiveAlerts()) > int(paginationRequest.MaxAlerts) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just reuse activeAlerts from above rather than calling the potentially costly rule.ActiveAlerts() again?

If I understand correctly, that would even be more correct, as excludeAlerts above may already have limited down the list of alerts to be returned to 0, so comparing to the original length of the full alert list wouldn't make sense anymore.

activeAlerts = activeAlerts[:int(paginationRequest.MaxAlerts)]
hasMore = true
}

if paginationRequest != nil {
enrichedRule = AlertingRulePaginated{
State: rule.State().String(),
Name: rule.Name(),
Query: rule.Query().String(),
Duration: rule.HoldDuration().Seconds(),
KeepFiringFor: rule.KeepFiringFor().Seconds(),
Labels: rule.Labels(),
Annotations: rule.Annotations(),
AlertInfo: &AlertsPaginated{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether it'd be more intuitive and understandable naming to call the field itself AlertsPaginated as well, just like its type. "info" is so generic :)

Alerts: activeAlerts,
HasMore: hasMore,
},
Health: rule.Health(),
LastError: lastError,
EvaluationTime: rule.GetEvaluationDuration().Seconds(),
LastEvaluation: rule.GetEvaluationTimestamp(),
Type: "alerting",
}
} else {
enrichedRule = AlertingRule{
State: rule.State().String(),
Name: rule.Name(),
Query: rule.Query().String(),
Duration: rule.HoldDuration().Seconds(),
KeepFiringFor: rule.KeepFiringFor().Seconds(),
Labels: rule.Labels(),
Annotations: rule.Annotations(),
Alerts: activeAlerts,
Health: rule.Health(),
LastError: lastError,
EvaluationTime: rule.GetEvaluationDuration().Seconds(),
LastEvaluation: rule.GetEvaluationTimestamp(),
Type: "alerting",
}
}
case *rules.RecordingRule:
if !returnRecording {
Expand Down Expand Up @@ -1497,6 +1580,15 @@ func (api *API) rules(r *http.Request) apiFuncResult {
rgs = append(rgs, apiRuleGroup)
}
}

if paginationRequest != nil {
paginatedRes := &PaginatedRuleDiscovery{RuleGroups: make([]*RuleGroup, 0, len(ruleGroups))}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, pre-allocating the RuleGroups slice here isn't needed for anything, as that slice is not appended to / written to anywhere, it's just completely overwritten below with a separately allocated value.

Suggested change
paginatedRes := &PaginatedRuleDiscovery{RuleGroups: make([]*RuleGroup, 0, len(ruleGroups))}
paginatedRes := &PaginatedRuleDiscovery{}

You could also just construct the struct right where you return it, then you can immediately set the fields to the values you already have by then:

return apiFuncResult{&PaginatedRuleDiscovery{ RuleGroups: returnGroups, NextToken: nextToken }, ...}

returnGroups, nextToken := TruncateGroupInfos(rgs, int(paginationRequest.MaxRuleGroups))
paginatedRes.RuleGroups = returnGroups
paginatedRes.NextToken = nextToken
return apiFuncResult{paginatedRes, nil, nil, nil}
}

res.RuleGroups = rgs
return apiFuncResult{res, nil, nil, nil}
}
Expand All @@ -1515,6 +1607,77 @@ func parseExcludeAlerts(r *http.Request) (bool, error) {
return excludeAlerts, nil
}

func parseListRulesPaginationRequest(r *http.Request) (*listRulesPaginationRequest, *parsePaginationError) {
var (
returnMaxAlert = int32(-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency:

Suggested change
returnMaxAlert = int32(-1)
returnMaxAlerts = int32(-1)

returnMaxRuleGroups = int32(-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably just use uint64 for these, even if we never expect to exceed an int32. Just feels more natural nowadays, avoids parsing issues when a user sets a ridiculously high limit, and you need to have one less cast (if changing to strconv.ParseUInt() below.

returnNextToken = ""
)

if r.URL.Query().Get("maxAlerts") != "" {
maxAlert, err := strconv.ParseInt(r.URL.Query().Get("maxAlerts"), 10, 32)
if err != nil || maxAlert < 0 {
return nil, &parsePaginationError{
err: fmt.Errorf("maxAlerts need to be a valid number and larger than 0: %w", err),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it says that maxAlerts needs to be larger than 0, but the check above only checks that it's at least 0. Which one is intended?

If you want to allow 0, then maybe strconv.ParseUInt() would be better. Then you can omit the explicit maxAlert < 0 check completely as that's covered by the function.

parameter: "maxAlerts",
}
}
returnMaxAlert = int32(maxAlert)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need two variable names for each parameter (e.g. maxAlert and returnMaxAlerts), at least after making the variable type adjustments I mentioned above. I'd just omit the ones with the return prefix, set the three variables to their default values at the top, and then overwrite them directly from the parsed parameters. That's a bit simpler to read as well I think.

}

if r.URL.Query().Get("maxRuleGroups") != "" {
maxRuleGroups, err := strconv.ParseInt(r.URL.Query().Get("maxRuleGroups"), 10, 32)
if err != nil || maxRuleGroups < 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

return nil, &parsePaginationError{
err: fmt.Errorf("maxRuleGroups need to be a valid number and larger than 0: %w", err),
parameter: "maxAlerts",
}
}
returnMaxRuleGroups = int32(maxRuleGroups)
}

if r.URL.Query().Get("nextToken") != "" {
returnNextToken = r.URL.Query().Get("nextToken")
}

if returnMaxRuleGroups >= 0 || returnMaxAlert >= 0 || returnNextToken != "" {
return &listRulesPaginationRequest{
MaxRuleGroups: returnMaxRuleGroups,
MaxAlerts: returnMaxAlert,
NextToken: returnNextToken,
}, nil
}

return nil, nil
}

func TruncateGroupInfos(groupInfos []*RuleGroup, maxRuleGroups int) ([]*RuleGroup, string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now reading this function again after adding the other comments about it, I think it'd make sense to write it in a different way that would make it simpler and more efficient:

  • Do a special-case early exit for the maxRuleGroups < 0 || len(groupInfos) <= maxRuleGroups case (just return groupInfos as is and "" for the token).
  • Then: Instead of looping over all groups and appending them individually, just return groupInfos[:maxRuleGroups] for the rule groups and the token for groupInfos[maxRuleGroups-1].

resultNumber := 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For simplicity I'd probably omit this extra tracking variable and just use len(returnGroupDescs) instead in both places where it's needed.

var returnPaginationToken string
returnGroupDescs := make([]*RuleGroup, 0, len(groupInfos))
for _, groupInfo := range groupInfos {
// Add the rule group to the return slice if the maxRuleGroups is not hit
if maxRuleGroups < 0 || resultNumber < maxRuleGroups {
returnGroupDescs = append(returnGroupDescs, groupInfo)
resultNumber++
continue
}

// Return the next token if there is more aggregation group
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Return the next token if there is more aggregation group
// Return the next token if there are more aggregation groups

if maxRuleGroups > 0 && resultNumber == maxRuleGroups {
returnPaginationToken = getRuleGroupNextToken(returnGroupDescs[maxRuleGroups-1].File, returnGroupDescs[maxRuleGroups-1].Name)
break
}
}
return returnGroupDescs, returnPaginationToken
}

func getRuleGroupNextToken(namespace, group string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still stick to file instead of namespace here, I'd rather not introduce the concept of a general namespace only for this function.

h := sha1.New()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorting by the hash effectively scrambles the rule group ordering in comparison to the normal / expected order that is returned without pagination and that the rule groups were written in by the user. Is there a good reason why we actually need the token to look like a hash value vs. just using the unhashed file + group name concatenation as a token that would preserve the original ordering?

h.Write([]byte(namespace + ";" + group))
return fmt.Sprintf("%x", h.Sum(nil))
}

type prometheusConfig struct {
YAML string `json:"yaml"`
}
Expand Down Expand Up @@ -1911,3 +2074,11 @@ func parseLimitParam(limitStr string) (limit int, err error) {

return limit, nil
}

type GroupStateDescs []*rules.Group

func (gi GroupStateDescs) Swap(i, j int) { gi[i], gi[j] = gi[j], gi[i] }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come gi vs. g or gsd?

func (gi GroupStateDescs) Less(i, j int) bool {
return getRuleGroupNextToken(gi[i].File(), gi[i].Name()) < getRuleGroupNextToken(gi[j].File(), gi[j].Name())
}
func (gi GroupStateDescs) Len() int { return len(gi) }