Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[collectd 6] Initial implementation for epics plugin #4282

Draft
wants to merge 1 commit into
base: collectd-6.0
Choose a base branch
from

Conversation

matwey
Copy link
Contributor

@matwey matwey commented Feb 18, 2024

EPICS is Experimental Physics and Industrial Control System.

Reference: https://epics-controls.org/

Fixes: #4221

ChangeLog: epics plugin: Initial implementation for epics plugin for collectd 6.0

@collectd-bot collectd-bot added this to the 6.0 milestone Feb 18, 2024
Copy link
Contributor

@eero-t eero-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it would be better to have v5 version in the first commit, and apply v6 changes on top of that in second commit. As v5 plugin is already in, it would mean that one needs to review only the v5->v6 changes.

Copy link
Member

@octo octo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @matwey, thanks for the PR! I left some comments inline, the most severe is probably the casting and dereferencing of pointers, which I think is very likely to cause problems.

src/epics.c Outdated
bool thread_loop;
} epics_plugin = {.lock = PTHREAD_MUTEX_INITIALIZER};

static void free_pvs() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use this instead:

static void free_pvs(void) {
//                   ^^^^

The difference is that with () the compiler does not check the (unused) arguments being passed to the function. This is backwards compatibility behavior to way back when and should not be used.

src/epics.c Outdated

if (p->is_label) {
free(p->label);
} else if (p->metric_list.ptr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

free() handles NULL gracefully. This allows us to simplify this code to execute the non-label code unconditionally.

src/epics.c Outdated
struct pv *pvs;
size_t pvs_num;

metric_family_t *mfs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use families for global variables and struct fields and fams for local variables, instead of mfs. By consistently using the same name for common structures the codebase becomes much more readable.

src/epics.c Outdated
}

static int epics_config_metric_family(oconfig_item_t *ci, size_t mf_index) {
metric_family_t *fam = &epics_plugin.mfs[mf_index];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easier to just pass metric_family_t *fam as an argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mf_index is required to call epics_config_metric and epics_config_label. I decided to store family index instead of pointer in order to allow mfs reallocations. Otherwise, dangling pointers fix would be required on each mfs reallocation.

src/epics.c Show resolved Hide resolved
src/epics.c Outdated
Comment on lines 111 to 122
if (metric_type == METRIC_TYPE_COUNTER &&
(ch_type == DBR_SHORT || ch_type == DBR_LONG)) {
return DBR_LONG;
} else if (metric_type == METRIC_TYPE_GAUGE &&
(ch_type == DBR_FLOAT || ch_type == DBR_DOUBLE ||
ch_type == DBR_ENUM)) {
return DBR_DOUBLE;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting and then dereferencing a short* to a long*, or a float* to a double* will probably not work.

I think we could create two functions, one returning the value as an integer and the other returning the value as a double, e.g.:

static int64_t as_int64(evargs args) {
  switch (args.type) {
  case DBR_SHORT:
    return *((short const *) args.dbr);
  case DBR_LONG:
    return *((long const *) args.dbr);
  case DBR_DOUBLE:
    return *((double const *) args.dbr);
  case DBR_FLOAT:
    return *((float const *) args.dbr);
  // not sure what to do about DBR_ENUM
  }
  return 0;
}

static double as_double(evargs args) {
  // …
}

Then call one or the other, depending on the metric type:

static value_t as_value(evargs args, metric_type_t type) {
  switch (type) {
  case METRIC_TYPE_COUNTER:
    return (value_t){.counter = as_int64(args)};
  case METRIC_TYPE_GAUGE:
    return (value_t){.gauge = as_double(args)};
  // …
  }
  ERROR("epics plugin: invalid metric type: %d", type);
  return (value_t){0};
}

Finally, in the event handling you can then simply do this:

metric_list->ptr[i].value = as_value(args, epics_plugin.mfs[p->mf_index].type);

Copy link
Contributor Author

@matwey matwey Feb 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to simplify overall logic here. As far as I understand, there are two different types in Epics:

  1. DBF (Field type) is a type of the remote field,
  2. DBR (Request type) is a type requested from the remote side.
    The library performs conversions from actual DBF to requested DBR under the hood, including the conversions such as parsing from DBF_STRING to DBR_DOUBLE for instance.

So, as soon as we provide DBR_DOUBLE or DBR_LONG to ca_create_subscription as first argument, we will see only DBR_DOUBLE or DBR_LONG in handle_var_event and then dereferencing is safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code you mentioned should be

  if (metric_type == METRIC_TYPE_COUNTER &&
      (ch_type == **DBF**_SHORT || ch_type == **DBF**_LONG)) {
    return DBR_LONG;
  } else if (metric_type == METRIC_TYPE_GAUGE &&
             (ch_type == **DBF**_FLOAT || ch_type == **DBF**_DOUBLE ||
              ch_type == **DBF**_ENUM)) {
    return DBR_DOUBLE;
  }

Fortunately, DBF_SHORT and DBR_SHORT shares the same integer code. The code tries to notify the user that he wants to use unusual combination of DBR/DBF type, however, now I think it is for user to know what he wants.

src/epics.c Outdated
epics_plugin.pvs_num = 0;
}

static void free_mfs() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> free_families

Explicit empty argument list: (void)

src/epics.c Outdated
Comment on lines 518 to 520
for (struct pv *p = &epics_plugin.pvs[pv_i];
p != epics_plugin.pvs + epics_plugin.pvs_num && p->mf_index == i;
++p) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very hard to read. I think this would be much simpler:

// rename "pv_i" to "offset"
for (size_t j = 0; j < epics_plugin.pvs_num; j++) {
  size_t index = offset + j;
  struct pv *p = &epics_plugin.pvs[index];

  if (p->mf_index != i || !p->is_active || !p->is_label) {
    continue;
  }
  // …
}

@matwey matwey force-pushed the epics-6.0 branch 3 times, most recently from e2a2d99 to 17b6a76 Compare February 25, 2024 20:12
@octo
Copy link
Member

octo commented Feb 26, 2024

I tried building the plugin in the CI system, but without success:

  • A bug in the Debian package prevents the configure script from finding the EPICS package: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059706
  • Apparently libvirt introduced an out-of-bounds write that our unit tests pick up, causing "Debian unstalbe" to fail. This is unrelated to this PR though.

@octo
Copy link
Member

octo commented Feb 26, 2024

@matwey Looks like this is progressing nicely. Let me know when the PR is ready for a review :D

@matwey matwey force-pushed the epics-6.0 branch 2 times, most recently from cf6eff7 to e6daf11 Compare February 28, 2024 08:39
@matwey
Copy link
Contributor Author

matwey commented Feb 28, 2024

@matwey Looks like this is progressing nicely. Let me know when the PR is ready for a review :D

I am slowly progressing. Now I've found that metric_list_t is supposed to be a list of the same metric values at different time moments. I don't know why I wrongly decided this was purposed for vector-valued metrics. So I need extra time to redesign the plugin.

src/epics.c Outdated
Comment on lines 36 to 43
union {
metric_list_t metric_list;
char *label;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metric_list is array of pointers, label is a single pointer, i.e. they have different allocation and free handling.

By stuffing them to same union you save 8 bytes per pv struct. What's the max number of pv structs you expect there to be?

(Union should be named, so people reading code notice more clearly that its members overlap.)

src/epics.c Outdated
Comment on lines 65 to 66
free(p->metric_list.ptr);
p->metric_list.num = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is array of structs containing pointers to additional lists (labels, maybe also metadata). Freeing just the root array leaks the rest. => Use the appropriate metric functions instead.

Comment on lines +77 to +93
for (size_t i = 0; i < epics_plugin.families_num; ++i) {
metric_family_t *fam = &epics_plugin.families[i];

free(fam->name);
free(fam->help);
free(fam->unit);
metric_family_metric_reset(fam);
}

free(epics_plugin.families);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't duplicate existing functionality, use the corresponding function from metric.h instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metric_family_free cannot be applied here because it tries to free individual fam pointer which is incorrect.

Copy link
Contributor

@eero-t eero-t Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Designing the code so that it uses existing helper functions to allocate/update/free given objects, makes its maintenance easier. Only those helpers need to be updated; one does not need search, understand, update and test also all the other places that do their things slightly differently from the helpers...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am totally agree, but there is no suitable existing helper function, what do you suggest exactly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allocate pointers individually, so that they can be separately freed?

src/epics.c Outdated
}

p->family_index = family_index;
p->is_label = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bool:

Suggested change
p->is_label = 1;
p->is_label = true;

src/epics.c Outdated
return pthread_join(epics_plugin.thread_id, NULL);
}

static int epics_config_metric(oconfig_item_t *ci, struct pv *p,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not returning any error code, just -1 or 0, so return value could as well be a bool.

Same applies also to the other functions following this.

return -1;
}

p->family_index = family_index;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity and to be more symmetric with the other function:

Suggested change
p->family_index = family_index;
p->family_index = family_index;
p->is_label = false;


label_set_reset(&label_set);
for (struct pv *p = &epics_plugin.pvs[offset];
p != epics_plugin.pvs + epics_plugin.pvs_num; ++p) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case:

Suggested change
p != epics_plugin.pvs + epics_plugin.pvs_num; ++p) {
p < epics_plugin.pvs + epics_plugin.pvs_num; ++p) {

}

for (struct pv *p = &epics_plugin.pvs[offset];
p != epics_plugin.pvs + epics_plugin.pvs_num; ++p, ++offset) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case:

Suggested change
p != epics_plugin.pvs + epics_plugin.pvs_num; ++p, ++offset) {
p < epics_plugin.pvs + epics_plugin.pvs_num; ++p, ++offset) {

src/epics.c Outdated
Comment on lines 362 to 372
static int epics_config_metric_family(oconfig_item_t *ci, size_t family_index) {
metric_family_t *fam = &epics_plugin.families[family_index];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for passing and (storing to pv struct) an index to a global array instead of a metric_family_t pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, families are reallocated at each config function invocation. Storing the pointer would easily make it dangling.

src/epics.c Outdated
@@ -0,0 +1,575 @@
/**
* collectd - src/epics.c
* Copyright (C) 2022 Matwey V. Kornilov
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2022?

@eero-t
Copy link
Contributor

eero-t commented Feb 28, 2024

Now I've found that metric_list_t is supposed to be a list of the same metric values at different time moments.

It's array of new metrics for given metric family (family = metric type with specific name, unit, and resource attributes). You submit them as part of family (after which you reset i.e. free them), so they will have the same timestamp.

You use metric labels to differentiate different metric values within given metric family.

For example, in the Sysman plugin, for GPU frequency there are two metric families, frequency and frequency ratio (of max possible). Both of these families have separate metric values for frequency request (from SW) and actual frequency (set on HW by FW).

OpenTelemetry specs have large number of examples for metric (family) names, and their (metric) labels: https://opentelemetry.io/docs/specs/semconv/

@eero-t eero-t changed the title Initial implementation for epics plugin for collectd 6.0 [collectd 6] Initial implementation for epics plugin Feb 28, 2024
@matwey matwey force-pushed the epics-6.0 branch 2 times, most recently from 08c77ec to f7f4eb6 Compare March 6, 2024 12:13
EPICS is Experimental Physics and Industrial Control System.

Reference: https://epics-controls.org/

Signed-off-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants