Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add otel metrics for spiderpool #81

Closed
wants to merge 4 commits into from

Conversation

Icarus9913
Copy link
Collaborator

@Icarus9913 Icarus9913 commented Mar 29, 2022

Signed-off-by: Icarus9913 icaruswu66@qq.com

Add otel metrics module for spiderpool to monitor.

  • Test

metricController.RegisterMetric("Node_IP_Allocation_Counts", "The total counts of node IP allocations", metrics.COUNTER)
metricController.RegisterMetric("Node_IP_Allocation_Duration", "The duration of node IP allocations", metrics.HISTOGRAM)
metricController.SetMetricAlert("Node_IP_Allocation_Counts", int64(1), int64(1000))
metricController.SetMetricValue("Node_IP_Allocation_Counts", int64(3), map[string]string{"type": "total", "node": "node1"})

This comment was marked as outdated.

// TODO: Should we return the data encoded in JSON?
// GetMetricDatabase returns the current metricController metricInstance.
func (mc *metricController) GetMetricDatabase() map[string]*Metric {
return mc.metricInstance
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

此处返回database是否需要执行json.marshal,最后返回一个string字符串?
因为直接打印这个map[string]*Metric,这里有指针,而打印并不会去解析指针的值,而是直接输出地址。

}

// NewMetricController will create a MetricController instance.
func NewMetricController(meterName string) (MetricController, func(http.ResponseWriter, *http.Request), error) {

This comment was marked as outdated.

case HISTOGRAM:
m.Metric = metric.Must(mc.meter).NewFloat64Histogram(metricName, metric.WithDescription(description))
}
mc.metricInstance[metricName] = m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about a metricName already exists ?

}

// RegisterMetric registers metrics.
func (mc *metricController) RegisterMetric(metricName, description string, metricType MetricType) {

This comment was marked as outdated.

mi.AlertLimitUp = limitUp
mi.AlertLimitDown = limitDown
} else {
fmt.Printf("%s does not exist", metricName)

This comment was marked as outdated.

func (mc *metricController) SetMetricAlert(metricName string, limitUp, limitDown interface{}) {
if mi, ok := mc.metricInstance[metricName]; ok {
mi.AlertLimitUp = limitUp
mi.AlertLimitDown = limitDown

This comment was marked as outdated.

}

// RecordTimeMetric returns a function which will record the duration after being called.
func (mc *metricController) RecordTimeMetric(metricName string, attrList map[string]string) (fn func()) {

This comment was marked as outdated.


switch metricType {
case COUNTER:
m.Metric = metric.Must(mc.meter).NewInt64Counter(metricName, metric.WithDescription(description))

This comment was marked as outdated.

return
}

fmt.Printf("%s does not exist", metricName)

This comment was marked as outdated.

if m, ok := mc.metricInstance[metricName]; ok {
if m.Mtype != HISTOGRAM {
fmt.Printf("%s is not %v type to record duration!", metricName, HISTOGRAM)
return

This comment was marked as outdated.

Expect(err).NotTo(HaveOccurred())
})

It("use prometheus as exporter", func() {

This comment was marked as outdated.

@weizhoublue
Copy link
Collaborator

每个函数的健壮性不好,没有入参校验,错误了没有抛给外部
单元测试太简单,需要有正向和反向用例,甚至是并发竞争,且覆盖率100%

@Icarus9913
Copy link
Collaborator Author

metric 部分的设计,是否应该做成不对业务影响的模块,不耦合?对此,我只在新建接口时才有过一次err异常抛出。
指标收集部分的代码是插入在业务代码中,应该仅作为一个插入,而不应该对业务层有err介入,否则写业务会因为没有记录到指标而抛异常?我的想法是,中间键部分的代码,仅仅作为业务代码部门的某一两行的污染,若有异常应log指出。不应让其err影响到业务层?

processor "go.opentelemetry.io/otel/sdk/metric/processor/basic"
selector "go.opentelemetry.io/otel/sdk/metric/selector/simple"
)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be a default logger to use
var loger = log.Stdloger.(" metric")

metricController, httpHandle, err = metrics.NewMetricController(SpiderPoolMeter)
Expect(err).NotTo(HaveOccurred())
})

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is easy to use DescribeTable to cover more test case

@weizhoublue weizhoublue added this to the kinds of pkg module milestone Mar 30, 2022
@weizhoublue weizhoublue linked an issue Mar 30, 2022 that may be closed by this pull request
@weizhoublue
Copy link
Collaborator

weizhoublue commented Mar 30, 2022

不让外部感知 erro 不叫 “不耦合”,外部不要 call function call 才叫 不耦合
没见过哪个 开源的 module 能自己内部消化掉 error,整个 function call 都失败了,还完全不让外部感知
应该要 传递 error 出来 , 上层 业务 是否 忽略或处理,是上层的事,打印出错也可以上 层自己的logger 来控制 。
不需要 本pkg 操心 error 的处理,否则外部业务 完全丧失控制

if len(metricName) == 0 {
return metric.Int64Counter{}, fmt.Errorf("Failed to create metric Int64Counter, metric name is asked to be set.")
}
return metric.Must(meter).NewInt64Counter(metricName, metric.WithDescription(description)), nil

This comment was marked as outdated.

if len(metricName) == 0 {
return metric.Float64Histogram{}, fmt.Errorf("Failed to create metric Float64Histogram, metric name is asked to be set.")
}
return metric.Must(meter).NewFloat64Histogram(metricName, metric.WithDescription(description)), nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if description is empty, metric.WithDescription() will go wrong ?

if len(metricName) == 0 {
return metric.Float64Histogram{}, fmt.Errorf("Failed to create metric Float64Histogram, metric name is asked to be set.")
}
return metric.Must(meter).NewFloat64Histogram(metricName, metric.WithDescription(description)), nil

This comment was marked as outdated.

This comment was marked as outdated.

if len(metricName) == 0 {
return nil, fmt.Errorf("Failed to create metric Int64Counter, metric name is asked to be set.")
}
return meter.SyncInt64().Counter(metricName, instrument.WithDescription(description))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if len(metricName) == 0 {
return metric.Int64Counter{}, fmt.Errorf("Failed to create metric Int64Counter, metric name is asked to be set.")
}
return metric.Must(meter).NewInt64Counter(metricName, metric.WithDescription(description)), nil

This comment was marked as outdated.

This comment was marked as outdated.


close(c)
//Consistently(c, "60s").Should(BeClosed())
Eventually(c, "20s").Should(BeClosed())

This comment was marked as resolved.

http.HandleFunc("/metrics", httpHandle)
go func() {
_ = http.ListenAndServe(":2222", nil)
}()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

第一,这个 http server 启动后,在 it 中没有 进行 client 消费,没有起到真正的 作用,未能验证 metric真正的能输出正确的值
第二,,在 整工程 跑单元测试时,这个 协程泄露,没有回收
DeferCleanup(func() {
close http server
})

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

建议 if server.ListenAndServe() { printf " I am existing , info=%v ". %v} , 万一 本地端口本占用呢?或者 中途 挂了,或者 知道 afterall 通知推出了

SpiderPoolMeter = "spider_pool_meter"
MetricNodeIPAllocationCountsName = "Node_IP_Allocation_Counts"
MetricNodeIPAllocationDurationName = "Node_IP_Allocation_Duration"
)

This comment was marked as resolved.

MetricNodeIPAllocationDuration syncfloat64.Histogram
}

var _ = Describe("metrics", Label("unitest"), Ordered, func() {

This comment was marked as resolved.

@codecov
Copy link

codecov bot commented Apr 18, 2022

Codecov Report

Merging #81 (0a8b6d0) into main (1621ec6) will increase coverage by 3.94%.
The diff coverage is 92.68%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #81      +/-   ##
==========================================
+ Coverage   77.58%   81.52%   +3.94%     
==========================================
  Files           2        3       +1     
  Lines         116      157      +41     
==========================================
+ Hits           90      128      +38     
- Misses         19       21       +2     
- Partials        7        8       +1     
Flag Coverage Δ
unittests 81.52% <92.68%> (+3.94%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/metrics/metrics.go 92.68% <92.68%> (ø)

@Icarus9913 Icarus9913 closed this Apr 19, 2022
@Icarus9913 Icarus9913 reopened this Apr 19, 2022
@Icarus9913
Copy link
Collaborator Author

对于metrics的unitest记录某种指标是否成功记录/消费的验证,otel库目前缺少该功能。已有其他contributor提过该类的issue和pr,但是还没有merge。
open-telemetry/opentelemetry-go#2776

@weizhoublue weizhoublue reopened this Apr 19, 2022
@Icarus9913 Icarus9913 force-pushed the feat/wk/otel-metrics branch 2 times, most recently from 94e34bb to a64f45e Compare April 19, 2022 08:26
@weizhoublue
Copy link
Collaborator

有冲突了,解决下

@spidernet-io spidernet-io deleted a comment from weizhoublue Apr 21, 2022
Signed-off-by: Icarus9913 <icaruswu66@qq.com>
@Icarus9913 Icarus9913 force-pushed the feat/wk/otel-metrics branch 8 times, most recently from 1dbeba8 to c41344e Compare April 22, 2022 10:01

err = server.Serve(listener)
if nil != err && err != http.ErrServerClosed && err != net.ErrClosed {
logger.Error(err.Error())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ginkgo 用的是 BY() 或 FAIL() 或 GinkgoWriter.Printf , 这种日志才能追加到 测试报告中,
没有 使用 logger 或者 printf 的,单元测试成千上万的 日志,是没法看 的,也不能在测试报告中看到

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

解决下

}

// verify counts instrument
if strings.Contains(line, MetricNodeIPAllocationCountsName) && strings.Contains(line, OTEL_COUNTS_SIGNAL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

如果是能 把 json字符串 解析成 struct, 使用 struct 成员 来 值判断,会是更加 正道的 方法
尤其是 一些 validator 库的使用 ,能加速效率
https://github.com/go-playground/validator/blob/master/_examples/simple/main.go
https://github.com/gookit/validate/blob/master/README.zh-CN.md

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: Icarus9913 <icaruswu66@qq.com>
# Conflicts:
#	go.mod
#	go.sum
#	vendor/modules.txt
Signed-off-by: Icarus9913 <icaruswu66@qq.com>
@sonarcloud
Copy link

sonarcloud bot commented Apr 23, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@github-actions
Copy link

github-actions bot commented Jul 9, 2022

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the pr/stale This pull is inactive label Jul 9, 2022
@github-actions github-actions bot removed the pr/stale This pull is inactive label Jul 19, 2022
@weizhoublue weizhoublue closed this Sep 1, 2022
@Icarus9913 Icarus9913 deleted the feat/wk/otel-metrics branch March 10, 2023 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/not-ready not ready for merging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

otel metric , log
2 participants