
ADOT EKS addon (v0.92.1-eksbuild.1) - Using otlphttp exporters when using the Add-on configuration schema #2685

Open
bgarcial opened this issue Mar 14, 2024 · 2 comments

bgarcial commented Mar 14, 2024

Describe the question
I want to use the ADOT EKS add-on to deploy the aws-otel collector and send logs to an AWS OpenSearch Ingestion (OSIS) pipeline endpoint.
For that I need an OTLP exporter, but when I configure the JSON/YAML schema and reference the otlp exporter, the add-on operation says it is not allowed. I am also in doubt about which OTLP exporter to use:

  • The documentation here says that when sending to an OSIS pipeline endpoint, otlphttp should be used. I checked this repo and otlphttp is supported, as it is referenced in this list.

It also requires setting up the sigv4auth extension, but the extensions attribute is not allowed in the ADOT configuration schema that the EKS add-on provides.

  • On the other hand, there is other linked documentation which states that for OpenSearch/Data Prepper the otlp/data-prepper exporter should be used. I have the feeling this is outdated or does not apply to my case, as I am using AWS OSIS and not the open-source Data Prepper product. Am I right?

With both otlphttp and otlp/data-prepper I get an error saying those exporters are "not defined in the schema and the schema does not allow additional properties ...":

  1. otlphttp exporter case

When executing the aws eks create-addon command I get this error:

aws eks create-addon \
    --cluster-name "jupiter-test" \
    --addon-name adot \
    --configuration-values file://configuration-values-otlphttp.yml \
    --resolve-conflicts=PRESERVE

An error occurred (InvalidParameterException) when calling the CreateAddon operation: ConfigurationValue provided in request is not supported:
 Yaml schema validation failed with error: [$.collector.containerLogs.extensions: is not defined in the schema and the schema does not allow additional properties,
 $.collector.containerLogs.service: is not defined in the schema and the schema does not allow additional properties,
 $.collector.containerLogs.exporters.otlphttp: is not defined in the schema and the schema does not allow additional properties]

This is my configuration-values-otlphttp.yml:

admissionWebhooks:
  namespaceSelector: {}
  objectSelector: {}
affinity: {}
collector:
  containerLogs:
    extensions:
      sigv4auth:
        region: "eu-west-1"
        service: "osis"

    exporters:
      otlphttp:
        endpoint: "https://aws-otel-logs-pipeline-4chd6i53fppoxbgvrljzeislvq.eu-west-1.osis.amazonaws.com"
        compression: none
        auth:
          authenticator: sigv4auth

    service:
      pipelines:
        logs:
          exporters: [otlphttp]
    resources:
      limits:
        cpu: 1000m
        memory: 750Mi
      requests:
        cpu: 300m
        memory: 512Mi
    serviceAccount:
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::218610432265:role/eks-logs-test-ingesting-logs-from-otel-to-osis
kubeRBACProxy:
  resources:
    limits:
      cpu: 500m
      memory: 128Mi
    requests:
      cpu: 5m
      memory: 64Mi
manager:
  env:
    ENABLE_WEBHOOKS: "true"
  resources:
    limits:
      cpu: 1000m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 64Mi
nodeSelector: {}
replicaCount: 1
tolerations: []

Here, I want to highlight two things:

A. The extensions: attribute. I know it is a kind of "high level" collector configuration, and according to the docs it looks like this:

extensions:
  sigv4auth:
    assume_role:
      arn: "arn:aws:iam::123456789012:role/aws-service-role/access"
      sts_region: "us-east-1"

But I don't know where to place it in the add-on configuration schema.
I am aware it should not be collector.containerLogs.extensions, hence the error, but it seems the provided schema (which I copied from the AWS ADOT console wizard) does not allow it anywhere.
Does the ADOT EKS add-on allow customizing the extensions attribute at all?

The same happens with the service attribute, where the defined exporter is referenced. I am aware it should not be under .collector.containerLogs.service, but where should it go?

B. For .collector.containerLogs.exporters.otlphttp I was expecting this to work, since the documentation linked above says otlphttp is allowed and that is its place according to the schema. Yet, judging by this error message, the ADOT add-on does not seem to allow the exporter:

$.collector.containerLogs.exporters.otlphttp: is not defined in the schema and the schema does not allow additional properties]

When I use the awscloudwatchlogs exporter in the config schema, the add-on deployment works: I get the aws-otel operator and aws-otel collector deployed, up and running, collecting logs and sending them to CloudWatch (having previously assigned the permissions to the IAM role of the service account).
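
Roughly, the working shape I mean is something like this (the awscloudwatchlogs keys follow the upstream exporter options; whether the add-on schema accepts exactly these keys is an assumption on my side, and the log group/stream names are placeholders):

collector:
  containerLogs:
    exporters:
      awscloudwatchlogs:
        # keys as in the upstream awscloudwatchlogs exporter; values below are placeholders
        log_group_name: "/aws/eks/jupiter-test/container-logs"
        log_stream_name: "adot-collector"
        region: "eu-west-1"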

Is the adot eks addon only intended to work with the awscloudwatchlogs exporter when it comes to managing the collector deployment?

Steps to reproduce if your question is related to an action

  • Have an eks cluster available.
  • Install cert-manager from its Helm chart. These commands are enough (the repo has to be added first):
helm repo add jetstack https://charts.jetstack.io && helm repo update
helm install \
        cert-manager jetstack/cert-manager \
        --namespace cert-manager \
        --create-namespace \
        --version v1.14.2 \
        --set installCRDs=true
  • Create an IAM policy allowing "osis:Ingest", to be attached to the IAM role, and take its ARN to pass to the --attach-policy-arn flag below when creating the IAM service account (a sketch of this policy is shown right after these steps).

  • Create the IAM service account. This command will create the IAM role that the service account will reference:

eksctl create iamserviceaccount \
    --name adot-col-container-logs \
    --namespace opentelemetry-operator-system \
    --cluster "jupiter-test" \
    --role-name "jupiter-eks-logs-test-ingesting-logs-from-otel-to-osis" \
    --attach-policy-arn arn:aws:iam::218610432265:policy/OpenSearchIngestion-From-AWS-Open-Telemetry-EKS-addon \
    --approve \
    --override-existing-serviceaccounts \
    --region eu-west-1
  • Create an OSIS pipeline and take its endpoint.

  • Deploy the adot eks addon:

aws eks create-addon \
    --cluster-name "jupiter-test" \
    --addon-name adot \
    --configuration-values file://configuration-values-otlphttp.yml \
    --resolve-conflicts=PRESERVE   # configuration-values-otlphttp.yml is the otlphttp config above, which fails

When using the config for cloudwatch exporter the deployment works as I mentioned.
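
For completeness, the IAM policy from the earlier step can be created roughly like this (the pipeline ARN is an assumption based on my pipeline name and account; adjust it to your own):

aws iam create-policy \
    --policy-name OpenSearchIngestion-From-AWS-Open-Telemetry-EKS-addon \
    --policy-document '{
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "osis:Ingest",
          "Resource": "arn:aws:osis:eu-west-1:218610432265:pipeline/aws-otel-logs-pipeline"
        }
      ]
    }'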

What did you expect to see?
I would expect the following:

  • The otlphttp exporter can be configured as .collector.containerLogs.exporters.otlphttp, similar to the cloudwatch exporter here.

  • The add-on config schema allows the service and extensions attributes to be set when deploying the ADOT EKS add-on.

Are these outcomes realistic when configuring them through the add-on configuration values file?

Or do I need to deploy the collector independently, like this?
If so, that means the collector won't be managed by the ADOT add-on but by me, right?

Environment
Describe any aspect of your environment.
If this is related to a deployment of the ADOT Collector, please provide your Collector config file.

My basic collector configurations were provided above in the description of the problem.

Additional context

I run into the same situation when using the otlp/data-prepper exporter.

Another alternative I've tried is to live-edit the generated ConfigMap once it is deployed, removing the cloudwatch exporter and adding otlphttp. But when I try that, the collector config does not update and the configuration I set through the EKS add-on remains. I guess that is because it is managed by AWS EKS (via CloudFormation), so my edits are reverted?

I would appreciate your thoughts on this situation :)


bgarcial commented Mar 15, 2024

I've managed to find a workaround and deploy the adot eks addon without the collector in this way:

admissionWebhooks:
  namespaceSelector: {}
  objectSelector: {}
affinity: {}
collector: {}
kubeRBACProxy:
  resources:
    limits:
      cpu: 500m
      memory: 128Mi
    requests:
      cpu: 5m
      memory: 64Mi
manager:
  env:
    ENABLE_WEBHOOKS: "true"
  resources:
    limits:
      cpu: 1000m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 64Mi
nodeSelector: {}
replicaCount: 1
tolerations: []
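
Rolling this out is then just an add-on configuration update (a sketch; the file name is whatever you saved the values above as):

aws eks update-addon \
    --cluster-name "jupiter-test" \
    --addon-name adot \
    --configuration-values file://configuration-values-no-collector.yml \
    --resolve-conflicts=PRESERVE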

and then I customized the aws-otel collector deployment separately. Taking this as a reference, my collector YAML file ended up like this, using the normal config attributes the collector ConfigMap expects, such as extensions, processors and the others:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: adot-col-container-logs
  namespace: opentelemetry-operator-system
spec:
  mode: daemonset
  # image:  public.ecr.aws/aws-observability/aws-otel-collector:v0.38.1
  serviceAccount: adot-col-container-logs
  securityContext:
    runAsUser: 0
  volumeMounts:
    - name: varlogpods
      mountPath: /var/log/pods
      readOnly: true
    - name: varlibdockercontainers
      mountPath: /var/lib/docker/containers
      readOnly: true
  volumes:
    - name: varlogpods
      hostPath:
        path: /var/log/pods
        # type: ""
    - name: varlibdockercontainers
      hostPath:
        path: /var/lib/docker/containers
        # type: ""
  env:
    - name: "K8S_CLUSTER_NAME"
      value: "jupiter-test"
    - name: "K8S_NODE_NAME"
      valueFrom:
        fieldRef:
          fieldPath: "spec.nodeName"
    - name: "K8S_POD_NAME"
      valueFrom:
        fieldRef:
          fieldPath: "metadata.name"
    - name: "K8S_NAMESPACE"
      valueFrom:
        fieldRef:
          fieldPath: "metadata.namespace"
  podAnnotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8888'
  config: |
    extensions:
      sigv4auth:
        region: "eu-west-1"
        service: "osis"
    
    receivers:    
      filelog:
        include:
          - /var/log/pods/*/*/*.log
        # exclude:
          # Exclude logs from all containers named otel-collector
          # - /var/log/pods/*/otel-collector/*.log
        start_at: beginning
        include_file_path: true
        include_file_name: false
        operators:
          # Find out which format is used by kubernetes
          - type: router
            id: get-format
            routes:
              - output: parser-docker
                expr: 'body matches "^\\{"'
              - output: parser-crio
                expr: 'body matches "^[^ Z]+ "'
              - output: parser-containerd
                expr: 'body matches "^[^ Z]+Z"'
          # Parse CRI-O format
          - type: regex_parser
            id: parser-crio
            regex:
              '^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*)
              ?(?P<log>.*)$'
            output: extract_metadata_from_filepath
            timestamp:
              parse_from: attributes.time
              layout_type: gotime
              layout: '2006-01-02T15:04:05.999999999Z07:00'
          # Parse CRI-Containerd format
          - type: regex_parser
            id: parser-containerd
            regex:
              '^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*)
              ?(?P<log>.*)$'
            output: extract_metadata_from_filepath
            timestamp:
              parse_from: attributes.time
              layout: '%Y-%m-%dT%H:%M:%S.%LZ'
          # Parse Docker format
          - type: json_parser
            id: parser-docker
            output: extract_metadata_from_filepath
            timestamp:
              parse_from: attributes.time
              layout: '%Y-%m-%dT%H:%M:%S.%LZ'
          - type: move
            from: attributes.log
            to: body
          # Extract metadata from file path
          - type: regex_parser
            id: extract_metadata_from_filepath
            regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
            parse_from: attributes["log.file.path"]
            cache:
              size: 128 # default maximum amount of Pods per Node is 110
          # Rename attributes
          - type: move
            from: attributes.stream
            to: attributes["log.iostream"]
          - type: move
            from: attributes.container_name
            to: resource["k8s.container.name"]
          - type: move
            from: attributes.namespace
            to: resource["k8s.namespace.name"]
          - type: move
            from: attributes.pod_name
            to: resource["k8s.pod.name"]
          - type: move
            from: attributes.restart_count
            to: resource["k8s.container.restart_count"]
          - type: move
            from: attributes.uid
            to: resource["k8s.pod.uid"]
            
    processors:
      batch:
      filter/namespace_exclude:
        logs:
          exclude:
            match_type: strict
            resource_attributes:
            - key: k8s.namespace.name
              value: amazon-cloudwatch
    

    exporters:
      logging:
        verbosity: detailed
        sampling_initial: 5
        sampling_thereafter: 200
      otlphttp:
        endpoint: "https://aws-otel-logs-pipeline-xxxxxxxx.eu-west-1.osis.amazonaws.com/aws-otel-logs-pipeline"
        auth:
          authenticator: sigv4auth
        compression: none
      

    service:
      extensions: [sigv4auth]
      pipelines:
        logs:
          receivers: [filelog]
          processors: [batch,filter/namespace_exclude]
          exporters: [otlphttp,logging]


---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: adot-col-log-reader
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - pods/log
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: adot-col-log-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: adot-col-log-reader
subjects:
- kind: ServiceAccount
  name: adot-col-container-logs
  namespace: opentelemetry-operator-system
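
To apply and sanity-check the custom collector, something like this is enough (the manifest file name is hypothetical):

kubectl apply -f adot-col-container-logs.yaml
kubectl get opentelemetrycollectors -n opentelemetry-operator-system
kubectl get pods -n opentelemetry-operator-system   # the collector daemonset pods should show up here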

My main concern with this approach was that, by deploying the collector separately from the ADOT EKS add-on, the collector wouldn't be managed by the add-on. But that is not the case: it is still under the scope of the ADOT add-on, as the deployment shows here:
[screenshot: the collector deployment shown as managed by the ADOT add-on]

So far the latest version of the add-on (v0.92.1-eksbuild.1) uses these versions by default:

  • operator public.ecr.aws/aws-observability/adot-operator:0.92.1
  • collector public.ecr.aws/aws-observability/aws-otel-collector:v0.37.0

But there are already newer versions of both of them.

I guess the addon on the control plane will find the best time to update progressively, if I am not mistaken.
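
To see which add-on versions are currently available for a given cluster version, something like this works (a sketch; replace the Kubernetes version with your cluster's):

aws eks describe-addon-versions \
    --addon-name adot \
    --kubernetes-version 1.28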

Some references:
https://opentelemetry.io/docs/kubernetes/collector/components/#filelog-receiver
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/filelogreceiver/README.md

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale label May 19, 2024