
OpenVersion (Versioning microservice with OpenTelemetry)

Demo project that integrates the OpenTelemetry-Go SDK and exports to GCP and Prometheus using the OTel Collector

Table of Contents

  1. Introduction
  2. Getting Started
  3. Running This
  4. Demo Project
  5. Extra info

Introduction

This repository explains what OpenTelemetry, the OpenTelemetry Collector, and OTLP (OpenTelemetry Protocol) are, then walks through a getting-started guide for integrating OpenTelemetry into your Go project and exporting your telemetry data to the backends supported by the core Collector project as well as additional backends contributed by the community. The repository also offers a mock microservice that collects metrics and traces.

What is OpenTelemetry

OpenTelemetry provides a single set of APIs, libraries, agents, and collector services to capture distributed traces and metrics from your application.

OpenTelemetry Website

What are traces, metrics, and logs?

  • Metrics:

    • Everything from operating systems to applications generates metrics, which at the least will include a name, a timestamp, and a field representing some value.

    • Most metrics will only tell you whether a resource is alive or dead, but if the target is valuable enough you’ll want to be able to ascertain what is actually going wrong with the system.

  • Traces:

    • With so many application interdependencies these days, operations will typically involve hops through multiple services (so-called spans). Traces, then, add critical visibility into the health of an application end-to-end.
  • Logs:

    • What’s more, logs tend to give more in-depth information about resources than metrics. So, if metrics showed the resource is dead, logs will help tell you why it died.

Explanation of Traces, Metrics, Logs

What is OpenTelemetry Collector

  • The OpenTelemetry Collector offers a vendor-agnostic implementation on how to receive, process, and export telemetry data. In addition, it removes the need to run, operate, and maintain multiple agents/collectors in order to support open-source telemetry data formats (e.g. Jaeger, Prometheus, etc.) sending to multiple open-source or commercial back-ends.

  • The OpenTelemetry Collector is an executable that receives telemetry data, optionally transforms it, and sends the data further.

OpenTelemetry Collector Repo

  • The name might suggest that this collector directly exports data in the OpenTelemetry format; this is not the case out of the box, but it is possible through OTLP exporters and receivers.

What is OTLP

  • OTLP is the protocol format in which telemetry data can be collected and then exported to a collector whose receiver understands this format and can translate it to other backends such as Prometheus, Jaeger, or GCP.

Why OpenTelemetry

  • To date, two open-source projects have dominated the cloud-native telemetry landscape: OpenTracing and OpenCensus—each with its own telemetric standard and its own substantial community. In a commendable spirit of cooperation, however, the two projects have decided to converge in order to achieve an important mission: a single standard for built-in, high-quality cloud-native telemetry.

  • Essentially OpenTelemetry converges the best of both projects into a single standard around which the entire cloud-native development community can rally.

Blog Explaining What Exactly is OpenTelemetry

Why OpenTelemetry Collector

  • Using just vanilla OpenTelemetry to export to multiple backends would require various vendor-specific exporters, which means more code, more dependencies, and a high risk of breaking your project when OpenTelemetry updates its API but the vendor-specific exporter is left unchanged. (This repo came into existence after a failed attempt to export OpenTelemetry data to Google Cloud Tracing: the poorly maintained GCP OpenTelemetry exporter depended on an old version of the OpenTelemetry SDK.)

Get Started

Understanding Open Telemetry Collector Architectures

  • There are two ways to run the collector:

    • As an agent that exposes different receivers for each backend and then, through a single exporter, forwards to another collector, which in turn sends to multiple exporters. (This uses fewer resources at the data-sending stage and allows deploying multiple collectors, at the cost of more complexity; see the config sketch after this list.)

      Agent Architecture

    • The other way is to deploy a standalone collector service that exposes a receiver with a specific protocol, collects all exported data, and then translates this data to the declared exporters (backends). (This uses more resources at the data-sending stage and allows only one collector, but it is easier to set up and is suitable for small- to medium-scale projects.)

      Collector Architecture

  • The usage of the word collector can be confusing. The first architecture essentially uses a collector instance as an agent that communicates with one or more collector instances acting as collectors. The second architecture directly uses a collector instance to collect and export to multiple backends.

In depth explanation of the Architecture
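A minimal sketch of an agent-mode configuration, with illustrative addresses (exact field names vary between Collector versions; newer releases nest insecure under a tls key): the agent receives data locally over OTLP and forwards everything to a central collector's OTLP receiver.

    receivers:
      otlp:
        endpoint: 0.0.0.0:55680             # local applications export here

    exporters:
      otlp:
        endpoint: central-collector:55680   # downstream collector's OTLP receiver
        insecure: true

    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp]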

Understanding the Collector Architecture

  • The Collector supports several popular open-source protocols for telemetry data receiving and sending as well as offering a pluggable architecture for adding more protocols.

  • Data receiving, transformation and sending is done using Pipelines. The Collector can be configured to have one or more Pipelines. Each Pipeline includes a set of Receivers that receive the data, a series of optional Processors that get the data from receivers and transform it and a set of Exporters which get the data from the Processors and send it further outside the Collector. The same receiver can feed data to multiple Pipelines and multiple pipelines can feed data into the same Exporter.

What are the pipelines

  • Pipeline defines a path the data follows in the Collector starting from reception, then further processing or modification and finally exiting the Collector via exporters.

  • Pipelines can operate on 2 telemetry data types: traces and metrics. The data type is a property of the pipeline defined by its configuration. Receivers, exporters and processors used in a pipeline must support the particular data type otherwise ErrDataTypeIsNotSupported will be reported when the configuration is loaded.

Pipelines in depth

Breaking down the pipeline architecture

  • A pipeline configuration follows this structure:

    service:
      pipelines: # section that can contain multiple subsections, one per pipeline
        traces:  # type of the pipeline
          receivers: [opencensus, jaeger, zipkin]
          processors: [tags, tail_sampling, batch, queued_retry]
          exporters: [opencensus, jaeger, stackdriver, zipkin]

  • The receivers, processors, and exporters that are used are declared above the service section in the following format:

    receivers:
      opencensus:
        endpoint: "0.0.0.0:55678"

    exporters:
      prometheus:
        endpoint: "0.0.0.0:8889"

    processors:
      batch:

  • Multiple pipelines can be declared in the same collector:

    service:
        pipelines:
            traces:
                receivers: [opencensus]
                exporters: [logging]
                processors: [batch, queued_retry]
            metrics:
                receivers: [opencensus]
                exporters: [logging,prometheus]
    
Receivers
  • Receivers typically listen on a network port and receive telemetry data. Usually one receiver is configured to send received data to one pipeline, however it is also possible to configure the same receiver to send the same received data to multiple pipelines. This can be done by simply listing the same receiver in the “receivers” key of several pipelines.
Prometheus Receiver
  • When Prometheus is used as a receiver, it acts as a drop-in replacement that scrapes the services declared under it, just as if they were declared inside a prometheus.yml file (as sketched below).

Prometheus Receiver
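A minimal sketch of a Prometheus receiver configuration (the job name and target address are illustrative); the embedded scrape_configs block uses the same syntax as prometheus.yml:

    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: 'openversion'
              scrape_interval: 10s
              static_configs:
                - targets: ['backend:8080']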

Exporters
  • Exporters typically forward the data they get to a destination on a network (but they can also send it elsewhere, e.g “logging” exporter writes the telemetry data to a local file).

  • Exporter is basically the backend you are trying to export to. Exporters are either Metrics, Traces, or Both.

    Supported trace exporters (sorted alphabetically):
        Jaeger
        OpenCensus
        OTLP
        Zipkin

    Supported metric exporters (sorted alphabetically):
        OpenCensus
        Prometheus
        OTLP

    Supported local exporters (sorted alphabetically):
        File
        Logging
  • Each exporter has a different configuration but follows the same declaration language. Exporters in Depth
Additional Exporters and Receivers
  • The official contributions package includes support for additional exporters and receivers.

  • To use the additional exporters from the contrib package:

    • Change OTELCOL_IMG to otel/opentelemetry-collector-contrib:latest in .env

    • In the otel-collector-config.yaml file, add your newly declared exporters and check their configuration settings in the GitHub repo (see the sketch below).

Additional Exporters
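A minimal sketch of the two changes (the exporter name below is a placeholder; every contrib exporter has its own configuration keys documented in the contrib repo):

    # .env
    OTELCOL_IMG=otel/opentelemetry-collector-contrib:latest

    # otel-collector-config.yaml (excerpt)
    exporters:
      examplecontrib:        # hypothetical name; replace with the contrib exporter you need
        # ...its specific settings go here

    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [examplecontrib]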

Build a collector that uses an OpenTelemetry Exporter and Receiver

  • This section is broken into three parts:

OTel (OpenTelemetry) collector config file

  • This file is where the collector pipelines are declared.

  • A full Collector config file with an OTLP (OpenTelemetry Protocol) receiver, Prometheus and Stackdriver exporters, some processors and extensions, and one metrics and one traces pipeline looks like this:

receivers:
  otlp:
    endpoint: 0.0.0.0:55678

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: promexample
    const_labels:
      label1: value1
    
  stackdriver:
    project: project-id-here
    metric_prefix: prefix
    number_of_workers: 3
    skip_create_metric_descriptor: true

  logging:   # console exporter; referenced by the logging entries in the pipelines below

processors:
  batch:
  queued_retry:

extensions:
  health_check:
  pprof:
    endpoint: :1888
  zpages:
    endpoint: :55679

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging,stackdriver]
      processors: [batch, queued_retry]
    metrics:
      receivers: [otlp]
      exporters: [logging,prometheus]

Docker compose file to run your service + collector

  • This is a docker-compose file that declares the following (a minimal sketch follows after the notes):

    • Collector
    • Your service
    • The exporter backends used, e.g. Prometheus, Jaeger
  • NOTES:

    • Prometheus needs a prometheus config file to be loaded, so be sure to declare such a file.
    • A .env file is needed with the docker image:tag of the collector and other required arguments.
    • Stackdriver requires the GOOGLE_APPLICATION_CREDENTIALS environment variable to be set inside your collector container. To do so, mount a local volume into the container and set the environment variable to the mounted location.
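A minimal sketch of such a compose file, with illustrative service names, ports, and paths (this is not the exact file shipped in the dev folder):

    version: "3.7"
    services:
      otel-collector:
        image: ${OTELCOL_IMG}                      # set in .env
        command: ["--config=/etc/otel-collector-config.yaml"]
        volumes:
          - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
          - /Users/zeyad/gs.json:/etc/gs.json      # Stackdriver credentials
        environment:
          - GOOGLE_APPLICATION_CREDENTIALS=/etc/gs.json
        ports:
          - "55678:55678"                          # OTLP receiver
          - "8889:8889"                            # Prometheus exporter endpoint

      prometheus:
        image: prom/prometheus:latest
        volumes:
          - ./prometheus.yaml:/etc/prometheus/prometheus.yml
        ports:
          - "9090:9090"

      backend:
        build: ..
        environment:
          - OTEL_RECIEVER_ENDPOINT=otel-collector:55678
        depends_on:
          - otel-collector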

Declare an OTLP exporter inside your service

  • In the main.go of your application, declare your exporter with the address of your collector (receiver). This address should be passed in via the docker-compose file.

    import (
        "fmt"
        "os"

        "go.opentelemetry.io/otel/exporters/otlp"
    )

    // Read the collector (receiver) address from the environment,
    // falling back to the exporter package's default host and port.
    collectorAddr, ok := os.LookupEnv("OTEL_RECIEVER_ENDPOINT")
    if !ok {
        collectorAddr = fmt.Sprintf("%s:%d", otlp.DefaultCollectorHost, otlp.DefaultCollectorPort)
    }
    exporter, err := otlp.NewExporter(otlp.WithAddress(collectorAddr), otlp.WithInsecure())
    
  • For tracing, you can then use this exporter to set a global trace provider, for example:
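A minimal sketch of that wiring, assuming the newer otel-go tracer-provider API (names shifted between SDK versions, so this may need adapting to the SDK version pinned in this repo):

    import (
        "go.opentelemetry.io/otel"
        sdktrace "go.opentelemetry.io/otel/sdk/trace"
    )

    // Register a global tracer provider that pushes spans to the OTLP exporter.
    tp := sdktrace.NewTracerProvider(
        sdktrace.WithBatcher(exporter),                // batch spans before export
        sdktrace.WithSampler(sdktrace.AlwaysSample()), // sample everything (fine for a demo)
    )
    otel.SetTracerProvider(tp)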

Running this

  • Before running, if you are using the Stackdriver exporter, be sure to change the credentials volume mount in the docker-compose file, e.g. /Users/zeyad/gs.json:/etc/gs.json

  • Move to the dev folder: cd dev

  • Run docker-compose up command.

  • After the containers are up, run cd ../internal/sqlfiles and then migrate --source=file://. --database=postgres://postgres:roottoor@localhost:5432/backend\?sslmode=disable up

    • migrate is an sql migration tool that sets the postgres DB with the scheme and the tables.
      • Check /internal/sqlfiles/README.md for details on downloading sql migration tool
  • If everything is successful, you will start seeing your stats show up in your deployed exporters.

Demo Project

Understanding the Concept

  • This project serves as an example of a microservice that holds the current version of different applications. It is thought of as a service that other services query to get the latest information about a specific application or about themselves.

  • The application stores its information primarily in Postgres, with a Redis cache layer in front of it.

  • The application is made to showcase OTEL Tracing and Metrics functionalities.

    • The tracing entry point is set in the telemetry middleware, where a span is created and the context is propagated all the way down until the service and the stores have finished their queries (a sketch of the pattern follows below).
    • Metrics depend on 4 different instruments, 2 of which are set in the middleware to capture traffic count and weight. The other 2 are part of the service and stores, where errors are counted and response times are measured.
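A minimal sketch of that middleware pattern (not the repo's actual middleware; identifiers are illustrative and the newer otel-go tracer API is assumed):

    import (
        "net/http"

        "go.opentelemetry.io/otel"
        "go.opentelemetry.io/otel/attribute"
    )

    // telemetryMiddleware starts one span per request and propagates its context
    // down to the handlers, service, and stores through the request context.
    func telemetryMiddleware(next http.Handler) http.Handler {
        tracer := otel.Tracer("openversion/api")
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ctx, span := tracer.Start(r.Context(), r.Method+" "+r.URL.Path)
            defer span.End()
            span.SetAttributes(attribute.String("http.method", r.Method))
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }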

Core packages

  • api package:

    • api.go:
      • Where the routes are added to router and also the Middleware for Tracing and Metrics collection is declared.
    • versionapi.go:
      • Where the endpoints are declared and their respective handlers are initialized by making calls to the underlying (version) service.
  • cmd/backend package:

    • main.go:
      • Initializes all the services, stores, the API, and the HTTP server
    • telemetry.go:
      • Initializes the OTEL exporter for traces and the pusher for metrics, and sets both as the global providers.
  • dev package:

    • contains files related to docker-compose, environment variables, and a requests file to test your server endpoints.
  • version package:

    • store package:

      • contains two implementations of the store interface used by the version service:
        • redisstore is used as a cache layer
        • postgres store is the main store
    • service.go:

      • contains the structure of the service, the declaration of the structs it depends on, and the service methods.
    • version.go:

      • contains the declaration of the Application struct.

Extras

Collecting Traces

  • From exporter-side:

    • Per the config package of the OTel sdk/trace package:

      • // DefaultMaxAttributesPerSpan is default max number of attributes per span
        DefaultMaxAttributesPerSpan = 32
        // DefaultMaxLinksPerSpan is default max number of links per span
        DefaultMaxLinksPerSpan = 32
        
        
    • From examples found on open-telemetry go repository on github:

      • use sdktrace.AlwaysSample sampler to sample all traces.
      • In a production application, use sdktrace.ProbabilitySampler with a desired probability.
  • From collector-side:

How to use spans optimally

Collecting Metrics

  • The OpenTelemetry metrics API has three instruments for reporting metrics, each defined by the method you call to send a metric event (a short code sketch follows at the end of this section). They are:

    • Counters, which you Add a value to. These are good for values that you'd like to think of as a rate, or changes in a quantity.

    • Gauges, which you Set the value of. You can think of these as either a car's odometer (a monotonic gauge, it never decreases) or a car's speedometer (a non-monotonic gauge, as it can go up and down.)

    • Measures, to which you Record a value. These are useful to build histograms or summaries: metric projections that let you calculate averages of many values.

    • Understanding Metrics in OTel

    • In-depth dive into the OpenTelemetry Metrics API Specification

    • RED, USE, 4 Golden Signals for collecting Metrics

      • We chose RED for this specific demo as it mostly targets metrics related to APIs rather than the underlying infrastructure, where USE would be a better fit.
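A minimal sketch of the counter and measure pattern these signals rely on, assuming the newer otel-go metric API (where measures surface as histograms) and illustrative instrument names:

    import (
        "context"

        "go.opentelemetry.io/otel"
        "go.opentelemetry.io/otel/attribute"
        "go.opentelemetry.io/otel/metric"
    )

    // recordRequest bumps a request counter and records a response-time measurement.
    func recordRequest(ctx context.Context, route string, durationMs float64) {
        meter := otel.Meter("openversion/metrics")

        // Counter: a value you Add to (request rate).
        requests, _ := meter.Int64Counter("http.requests.total")
        requests.Add(ctx, 1, metric.WithAttributes(attribute.String("route", route)))

        // Histogram (the "measure" instrument): a value you Record (latency distribution).
        latency, _ := meter.Float64Histogram("http.request.duration_ms")
        latency.Record(ctx, durationMs, metric.WithAttributes(attribute.String("route", route)))
    }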