Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vault 2823 cc namespace #12393

Merged
merged 16 commits into from Sep 7, 2021
Merged
3 changes: 3 additions & 0 deletions changelog/12393.txt
@@ -0,0 +1,3 @@
```release-note: improvement
core: observe the client counts broken down by namespace for partial month client count
```
130 changes: 104 additions & 26 deletions vault/activity_log.go
Expand Up @@ -67,6 +67,11 @@ type segmentInfo struct {
entitySequenceNumber uint64
}

type clients struct {
distinctEntities uint64
nonEntityTokens uint64
}

// ActivityLog tracks unique entity counts and non-entity token counts.
// It handles assembling log fragments (and sending them to the active
// node), writing log segments, and precomputing queries.
Expand Down Expand Up @@ -143,6 +148,10 @@ type ActivityLog struct {

// for testing: is config currently being invalidated. protected by l
configInvalidationInProgress bool

// All known active entity count by namespace ID this month; use fragmentLock read-locked
// to check whether it already exists.
entityCountByNamespaceID map[string]uint64
}

// These non-persistent configuration options allow us to disable
Expand All @@ -164,17 +173,18 @@ func NewActivityLog(core *Core, logger log.Logger, view *BarrierView, metrics me
}

a := &ActivityLog{
core: core,
configOverrides: &core.activityLogConfig,
logger: logger,
view: view,
metrics: metrics,
nodeID: hostname,
newFragmentCh: make(chan struct{}, 1),
sendCh: make(chan struct{}, 1), // buffered so it can be triggered by fragment size
writeCh: make(chan struct{}, 1), // same for full segment
doneCh: make(chan struct{}, 1),
activeEntities: make(map[string]struct{}),
core: core,
configOverrides: &core.activityLogConfig,
logger: logger,
view: view,
metrics: metrics,
nodeID: hostname,
newFragmentCh: make(chan struct{}, 1),
sendCh: make(chan struct{}, 1), // buffered so it can be triggered by fragment size
writeCh: make(chan struct{}, 1), // same for full segment
doneCh: make(chan struct{}, 1),
activeEntities: make(map[string]struct{}),
entityCountByNamespaceID: make(map[string]uint64),
currentSegment: segmentInfo{
startTimestamp: 0,
currentEntities: &activity.EntityActivityLog{
Expand Down Expand Up @@ -531,7 +541,7 @@ func (a *ActivityLog) loadPriorEntitySegment(ctx context.Context, startTime time
// Or the feature has been disabled.
if a.enabled && startTime.Unix() == a.currentSegment.startTimestamp {
for _, ent := range out.Entities {
a.activeEntities[ent.EntityID] = struct{}{}
a.AddEntity(ent)
}
}
a.fragmentLock.Unlock()
Expand Down Expand Up @@ -571,7 +581,7 @@ func (a *ActivityLog) loadCurrentEntitySegment(ctx context.Context, startTime ti
}

for _, ent := range out.Entities {
a.activeEntities[ent.EntityID] = struct{}{}
a.AddEntity(ent)
}

return nil
Expand Down Expand Up @@ -684,6 +694,7 @@ func (a *ActivityLog) resetCurrentLog() {

a.fragment = nil
a.activeEntities = make(map[string]struct{})
a.entityCountByNamespaceID = make(map[string]uint64)
a.standbyFragmentsReceived = make([]*activity.LogFragment, 0)
}

Expand Down Expand Up @@ -1095,6 +1106,7 @@ func (a *ActivityLog) perfStandbyFragmentWorker() {
// clear active entity set
a.fragmentLock.Lock()
a.activeEntities = make(map[string]struct{})
a.entityCountByNamespaceID = make(map[string]uint64)
a.fragmentLock.Unlock()

// Set timer for next month.
Expand Down Expand Up @@ -1309,6 +1321,7 @@ func (a *ActivityLog) AddEntityToFragment(entityID string, namespaceID string, t
NamespaceID: namespaceID,
Timestamp: timestamp,
})
a.entityCountByNamespaceID[namespaceID] += 1
a.activeEntities[entityID] = struct{}{}
}

Expand Down Expand Up @@ -1353,7 +1366,7 @@ func (a *ActivityLog) receivedFragment(fragment *activity.LogFragment) {
}

for _, e := range fragment.Entities {
a.activeEntities[e.EntityID] = struct{}{}
a.AddEntity(e)
}

a.standbyFragmentsReceived = append(a.standbyFragmentsReceived, fragment)
Expand Down Expand Up @@ -1792,26 +1805,91 @@ func (c *Core) activeEntityGaugeCollector(ctx context.Context) ([]metricsutil.Ga

// partialMonthClientCount returns the number of clients used so far this month.
// If activity log is not enabled, the response will be nil
func (a *ActivityLog) partialMonthClientCount(ctx context.Context) map[string]interface{} {
func (a *ActivityLog) partialMonthClientCount(ctx context.Context) (map[string]interface{}, error) {
a.fragmentLock.RLock()
defer a.fragmentLock.RUnlock()

if !a.enabled {
// nothing to count
return nil
return nil, nil
}
byNamespace := make([]*ClientCountInNamespace, 0)
responseData := make(map[string]interface{})
totalEntities := 0
totalTokens := 0

clientCountTable := createClientCountTable(a.entityCountByNamespaceID, a.currentSegment.tokenCount.CountByNamespaceID)

entityCount := len(a.activeEntities)
var tokenCount int
for _, countByNS := range a.currentSegment.tokenCount.CountByNamespaceID {
tokenCount += int(countByNS)
queryNS, err := namespace.FromContext(ctx)
if err != nil {
return responseData, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are your thoughts on returning this empty responseData vs nil when there is an err? It seems to be a pretty common pattern to return nil with an error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same in other err != nil cases

}
clientCount := entityCount + tokenCount

responseData := make(map[string]interface{})
responseData["distinct_entities"] = entityCount
responseData["non_entity_tokens"] = tokenCount
responseData["clients"] = clientCount
for nsID, clients := range clientCountTable {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

ns, err := NamespaceByID(ctx, nsID, a.core)
if err != nil {
return responseData, err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be return nil, err.

}

if a.includeInResponse(queryNS, ns) {
var displayPath string
if ns == nil {
displayPath = fmt.Sprintf("deleted namespace %q", nsID)
} else {
displayPath = ns.Path
}

byNamespace = append(byNamespace, &ClientCountInNamespace{
NamespaceID: nsID,
NamespacePath: displayPath,
Counts: ClientCountResponse{
DistinctEntities: int(clients.distinctEntities),
NonEntityTokens: int(clients.nonEntityTokens),
Clients: int(clients.distinctEntities + clients.nonEntityTokens),
},
})
totalEntities += int(clients.distinctEntities)
totalTokens += int(clients.nonEntityTokens)

}
}

responseData["by_namespace"] = byNamespace
responseData["distinct_entities"] = totalEntities
responseData["non_entity_tokens"] = totalTokens
responseData["clients"] = totalEntities + totalTokens

return responseData, nil
}

//createClientCountTable maps the entitycount and token count to the namespace id
func createClientCountTable(entityMap map[string]uint64, tokenMap map[string]uint64) map[string]*clients {
//add distinct entity count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//add distinct entity count

clientCountTable := make(map[string]*clients)
for nsID, count := range entityMap {
if _, ok := clientCountTable[nsID]; !ok {
clientCountTable[nsID] = &clients{distinctEntities: 0, nonEntityTokens: 0}
}
clientCountTable[nsID].distinctEntities += count

}
//add non-entity token count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//add non-entity token count

for nsID, count := range tokenMap {
if _, ok := clientCountTable[nsID]; !ok {
clientCountTable[nsID] = &clients{distinctEntities: 0, nonEntityTokens: 0}
}
clientCountTable[nsID].nonEntityTokens += count

}
return clientCountTable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

}

func (a *ActivityLog) AddEntity(e *activity.EntityRecord) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this should be private - i didn't see any outside-package usage (or any reason to make it public)

if _, ok := a.activeEntities[e.EntityID]; !ok {
a.activeEntities[e.EntityID] = struct{}{}
a.entityCountByNamespaceID[e.NamespaceID] += 1
}

return responseData
}
83 changes: 80 additions & 3 deletions vault/activity_log_test.go
Expand Up @@ -1390,6 +1390,80 @@ func setupActivityRecordsInStorage(t *testing.T, base time.Time, includeEntities
return a, entityRecords, tokenRecords
}

//setupActivityRecordsInStorageRootNS creates entity and non-entity tokens only with root namespace
func setupActivityRecordsInStorageRootNS(t *testing.T, base time.Time, includeEntities, includeTokens bool) (*ActivityLog, []*activity.EntityRecord, map[string]uint64) {
t.Helper()

core, _, _ := TestCoreUnsealed(t)
a := core.activityLog
monthsAgo := base.AddDate(0, -3, 0)

var entityRecords []*activity.EntityRecord
if includeEntities {
entityRecords = []*activity.EntityRecord{
{
EntityID: "11111111-1111-1111-1111-111111111111",
NamespaceID: "root",
Timestamp: time.Now().Unix(),
},
{
EntityID: "22222222-2222-2222-2222-222222222222",
NamespaceID: "root",
Timestamp: time.Now().Unix(),
},
{
EntityID: "33333333-2222-2222-2222-222222222222",
NamespaceID: "root",
Timestamp: time.Now().Unix(),
},
}

testEntities1 := &activity.EntityActivityLog{
Entities: entityRecords[:1],
}
entityData1, err := proto.Marshal(testEntities1)
if err != nil {
t.Fatalf(err.Error())
}
testEntities2 := &activity.EntityActivityLog{
Entities: entityRecords[1:2],
}
entityData2, err := proto.Marshal(testEntities2)
if err != nil {
t.Fatalf(err.Error())
}
testEntities3 := &activity.EntityActivityLog{
Entities: entityRecords[2:],
}
entityData3, err := proto.Marshal(testEntities3)
if err != nil {
t.Fatalf(err.Error())
}

WriteToStorage(t, core, ActivityLogPrefix+"entity/"+fmt.Sprint(monthsAgo.Unix())+"/0", entityData1)
WriteToStorage(t, core, ActivityLogPrefix+"entity/"+fmt.Sprint(base.Unix())+"/0", entityData2)
WriteToStorage(t, core, ActivityLogPrefix+"entity/"+fmt.Sprint(base.Unix())+"/1", entityData3)
}

var tokenRecords map[string]uint64
if includeTokens {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

everything up to this point looks identical with setupActivityRecordsInStorage - it might be a good idea to break some of that out into a base/helper function, and then just add the namespace flavor on top for the namespace version of the function. this would result in more clear and maintainable code

tokenRecords = make(map[string]uint64)
tokenRecords["root"] = 4
tokenCount := &activity.TokenCount{
CountByNamespaceID: tokenRecords,
}

tokenData, err := proto.Marshal(tokenCount)
if err != nil {
t.Fatalf(err.Error())
}

WriteToStorage(t, core, ActivityLogPrefix+"directtokens/"+fmt.Sprint(base.Unix())+"/0", tokenData)
}

return a, entityRecords, tokenRecords
}

func TestActivityLog_refreshFromStoredLog(t *testing.T) {
a, expectedEntityRecords, expectedTokenCounts := setupActivityRecordsInStorage(t, time.Now().UTC(), true, true)
a.SetEnable(true)
Expand Down Expand Up @@ -2499,9 +2573,9 @@ func TestActivityLog_Deletion(t *testing.T) {
func TestActivityLog_partialMonthClientCount(t *testing.T) {
timeutil.SkipAtEndOfMonth(t)

ctx := context.Background()
ctx := namespace.RootContext(nil)
now := time.Now().UTC()
a, entities, tokenCounts := setupActivityRecordsInStorage(t, timeutil.StartOfMonth(now), true, true)
a, entities, tokenCounts := setupActivityRecordsInStorageRootNS(t, timeutil.StartOfMonth(now), true, true)

a.SetEnable(true)
var wg sync.WaitGroup
Expand All @@ -2520,7 +2594,10 @@ func TestActivityLog_partialMonthClientCount(t *testing.T) {

expectedClientCount := partialMonthEntityCount + partialMonthTokenCount

results := a.partialMonthClientCount(ctx)
results, err := a.partialMonthClientCount(ctx)
if err != nil {
t.Fatal(err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to call .Error() here if you don't want to

}
if results == nil {
t.Fatal("no results to test")
}
Expand Down
23 changes: 23 additions & 0 deletions vault/activity_log_testing_util.go
Expand Up @@ -36,6 +36,29 @@ func (c *Core) InjectActivityLogDataThisMonth(t *testing.T) (map[string]struct{}
return activeEntities, tokens
}

// InjectActivityLogDataThisMonth populates the in-memory client store
// with some entities and tokens for root namespace , overriding what was already there
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// with some entities and tokens for root namespace , overriding what was already there
// with some entities and tokens for root namespace, overriding what was already there

// It is currently used for API integration tests
func (c *Core) InjectActivityLogDataThisMonthRootNS(t *testing.T) (map[string]uint64, map[string]uint64) {
t.Helper()

entitiesByNS := map[string]uint64{
"root": 1,
}
tokens := map[string]uint64{
"root": 5,
}

c.activityLog.l.Lock()
defer c.activityLog.l.Unlock()
c.activityLog.fragmentLock.Lock()
defer c.activityLog.fragmentLock.Unlock()

c.activityLog.currentSegment.tokenCount.CountByNamespaceID = tokens
c.activityLog.entityCountByNamespaceID = entitiesByNS
return entitiesByNS, tokens
}

// Return the in-memory activeEntities from an activity log
func (c *Core) GetActiveEntities() map[string]struct{} {
out := make(map[string]struct{})
Expand Down
2 changes: 1 addition & 1 deletion vault/external_tests/activity/activity_test.go
Expand Up @@ -87,7 +87,7 @@ func TestActivityLog_MonthlyActivityApi(t *testing.T) {
validateClientCounts(t, resp, 0, 0)

// inject some data and query the API
entities, tokens := core.InjectActivityLogDataThisMonth(t)
entities, tokens := core.InjectActivityLogDataThisMonthRootNS(t)
expectedEntities := len(entities)
var expectedTokens int
for _, tokenCount := range tokens {
Expand Down
7 changes: 6 additions & 1 deletion vault/logical_system_activity.go
Expand Up @@ -142,7 +142,12 @@ func (b *SystemBackend) handleMonthlyActivityCount(ctx context.Context, req *log
return logical.ErrorResponse("no activity log present"), nil
}

results := a.partialMonthClientCount(ctx)
results, err := a.partialMonthClientCount(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i made this comment earlier, but it doesn't look like results has any info in it when err != nil - it might be more clear to show that results has nothing in it and put an empty Data object in the response in this error case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you have any thoughts on this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Go convention for funcs that return (x,error) is to populate only one of the two return values - I'd prefer if it were exactly one, but sometimes we do return (nil,nil), which tends to cause bugs because people are so used to the more common practice of always returning exactly one non-nil value.

This carries over to our handlers as well. If they return a non-nil error, they should be returning a nil response. The complication is that we can return errors in two ways. We can either return (nil, err) in which case the user will not an error message, they'll just get a 500 internal server error. Or we can return (logical.ErrorResponse("message"), nil) in which case we'll put "message" into the response's error field and return a 400 bad request. Or we can return (logical.ErrorReponse("message"), err), in which case they'll see "message" in the response's error field and the status code of the response will depend on err, see logical.RespondErrorCommon for specifics.

There's no general guidance about when to use these different approaches, though we should probably write some. One tip I can offer is that if the error is an "expected" error, e.g. the user provided bad input in their request, or asked to do something we don't want to allow, it's good to provide an explanation of the problem in the response's error message. If it's an internal failure that shouldn't have happened, that isn't predictable based on the inputs, then we usually tend more towards the (nil, err) return where they just a 500. This is because we don't know what appear in the error, and we might wind up leaking information that the user shouldn't see. Often in that case we'll log the error.

Anyway, in this case, when err != nil, I would simply return (nil, err).

if err != nil {
return &logical.Response{
Data: results,
}, err
}
if results == nil {
return logical.RespondWithStatusCode(nil, req, http.StatusNoContent)
}
Expand Down