This repository has been archived by the owner on Dec 23, 2023. It is now read-only.

App Engine / Google Cloud Monitoring metrics support issues #2070

Open
matthewblain opened this issue Jan 4, 2021 · 1 comment

matthewblain commented Jan 4, 2021

Please answer these questions before submitting a bug report.

What version of OpenCensus are you using?

0.28.1

What JVM are you using (java -version)?

Whatever App Engine is using. The latest release notes from Google say "Updated Java SDK to version 1.9.84."

What did you do?

Used OpenCensus to create metrics for Google Cloud Monitoring in a Java 8 App Engine Standard app. I am only using OpenCensus (at least for now) to push metrics using a View and a MeasureMap.

It appears to be working, but only after I spent a few hours tweaking various settings. This bug is a laundry list of small issues. I am mixing actual/expected here; some of these may be best forked off into their own issues, and others could be addressed all at once.

RPCs worked.

Actual: Data flowed just fine from App Engine to Google Cloud Monitoring.
Expected: The documentation at https://opencensus.io/integrations/google_cloud/google_cloud_appengine_standard/ says it would not work at all due to gRPC issues. This appears not to be the case.
Workaround: Ignore documentation.

Labels were incomplete/insufficient

Actual: Only label was opencensus_task with value java-1@localhost
Expected: Some sort of per-instance label. Simplest if the random number were more random. Best if it were to use something App Engine specific (see next section).
Workaround: Use setConstantLabels to set a variety of labels. I also added opencensus_task with value java@$instance_id, which would be sufficient on its own. I cannot quite tell whether this is necessary/useful, or whether I should remove opencensus_task now that there are good labels.

I am using the following labels:

("module_id", "App Engine Module ID"): modulesService.getCurrentModule()
("version_id", "App Engine Version ID"): modulesService.getCurrentVersion()
("instance_id", "App Engine Instance ID"): modulesService.getCurrentInstanceId()
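
For anyone following the same workaround, here is a minimal sketch of how those label values might be assembled. It uses plain strings only, so it runs without OpenCensus or the App Engine SDK on the classpath; `AppEngineLabels` and `constantLabels` are hypothetical names, not part of either library.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects the label values listed above as plain strings. */
final class AppEngineLabels {
    private AppEngineLabels() {}

    /**
     * Builds the label map. In a real app the three arguments would come from
     * modulesService.getCurrentModule(), getCurrentVersion(), and
     * getCurrentInstanceId().
     */
    static Map<String, String> constantLabels(
            String moduleId, String versionId, String instanceId) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("module_id", moduleId);
        labels.put("version_id", versionId);
        labels.put("instance_id", instanceId);
        // Replaces the default java-1@localhost value with a per-instance one.
        labels.put("opencensus_task", "java@" + instanceId);
        return Collections.unmodifiableMap(labels);
    }
}
```

Each entry would then presumably be wrapped as a `LabelKey`/`LabelValue` pair before being handed to setConstantLabels; check the exact types against the OpenCensus version in use.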

Resource type shows up as GCE VM

Actual: Resource type shows up as gce_instance. Various metadata is also blank.
Expected: Resource type shows up as gae_instance. Metadata shows up using App Engine values. Perhaps this needs a contrib module, or perhaps it can easily be read from OS environment variables and other system properties.
Workaround: Not sure yet. I imagine this can be solved by using StackdriverStatsConfiguration.setMonitoredResource.
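
An untested sketch of what reading those values from the environment could look like. Everything here is an assumption to verify: the environment variable names (`GOOGLE_CLOUD_PROJECT`, `GAE_SERVICE`, `GAE_VERSION`, `GAE_INSTANCE`) may not all be set on the Java 8 runtime, the label keys are my reading of the gae_instance resource type, and `GaeResourceLabels` is a hypothetical helper. The location is passed in explicitly because I do not know of an environment variable that carries it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: derives gae_instance resource labels from an environment map. */
final class GaeResourceLabels {
    /** Resource type the issue expects instead of gce_instance. */
    static final String RESOURCE_TYPE = "gae_instance";

    private GaeResourceLabels() {}

    /**
     * @param env      environment lookup, e.g. System.getenv(); variable names are assumed
     * @param location region or zone, supplied by the caller
     */
    static Map<String, String> fromEnvironment(Map<String, String> env, String location) {
        Map<String, String> labels = new LinkedHashMap<>();
        // Missing variables become empty strings rather than nulls.
        labels.put("project_id", env.getOrDefault("GOOGLE_CLOUD_PROJECT", ""));
        labels.put("module_id", env.getOrDefault("GAE_SERVICE", ""));
        labels.put("version_id", env.getOrDefault("GAE_VERSION", ""));
        labels.put("instance_id", env.getOrDefault("GAE_INSTANCE", ""));
        labels.put("location", location);
        return labels;
    }
}
```

Calling `GaeResourceLabels.fromEnvironment(System.getenv(), "us-central1")` would give a map that, together with `RESOURCE_TYPE`, could be used to build the monitored resource passed to setMonitoredResource.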

Uncertain whether there are any other concerns.

Actual: Seems to work
Expected: Will continue to work. But I'm concerned there may be some gotcha, for example losing data with an inappropriate exportInterval. (The defaults should generally be good here.)

Additional context

Simply documenting all of this, which I've started in the 'workarounds' above, may be sufficient. Alternatively, guidance on how to use OpenTelemetry instead would help.

@jsuereth
Contributor

Hey, since you didn't get any activity, just wanted to say thank you very much for the feedback!

Yes, sending metrics from AppEngine to GCM does seem to work. There are subtle bugs waiting in the weeds, which is one reason we don't recommend it yet. Most of them revolve around data loss on eviction, and around offering some AppEngine-specific setup to help ensure the defaults you use work in the environment.

Regarding a lot of your concerns:

  • Resource labelling and autodetection is still being worked on in OpenTelemetry, so likely not ready there yet. If you're using OpenCensus you'll have to do this by hand (sorry!)
  • Depending on how many instances you run, be careful of "high cardinality" labels, because these can tank the performance of your custom metric TimeSeries in dashboards and alerts.
  • We've seen some reports of issues with OpenCensus flushing its queues in serverless environments (e.g., Cloud Run). While this likely shouldn't be an issue in AppEngine, we can't offer any guarantees.
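
To make the label-count concern concrete, here is a rough sketch that computes an upper bound on the number of time series a single metric could produce, taken as the product of the distinct values each label can have. `SeriesEstimate` is a hypothetical helper, and the real count is usually lower, since an instance ID implies its module and version.

```java
/** Hypothetical helper: worst-case time series count for one metric. */
final class SeriesEstimate {
    private SeriesEstimate() {}

    /**
     * @param distinctValuesPerLabel number of distinct values for each label
     * @return product of the counts; throws ArithmeticException on overflow
     */
    static long upperBound(long... distinctValuesPerLabel) {
        long total = 1;
        for (long n : distinctValuesPerLabel) {
            // A label with no observed values still contributes a factor of 1.
            total = Math.multiplyExact(total, Math.max(n, 1));
        }
        return total;
    }
}
```

For example, `SeriesEstimate.upperBound(3, 5, 200)` (3 modules, 5 versions, 200 instance IDs) returns 3000. Because instance IDs churn as instances start and stop, the instance term tends to grow over time.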
