
After applying a JMX filtering pattern, it takes 23 seconds to scrape the metrics; is this expected behaviour? #955

Closed
sreelu27 opened this issue May 7, 2024 · 6 comments

Comments

@sreelu27

sreelu27 commented May 7, 2024

I am trying to scrape metrics using the JMX filtering patterns shown below, and we are seeing a scrape time of 23 seconds.

The configuration we are using is:
Total number of topics: 60
Partitions per topic: 180
Total number of partitions: 180 * 60 = 10800
PVC memory: 50 Gi
Live traffic: 50000
Replica count: 6 (6 brokers)

The default scrape timeout is 10 seconds, but with the patterns below the scrape takes 23 seconds.

Is this the expected behavior? Is there any workaround we can apply to reduce the scrape time?

We are using JMX Exporter version 0.20.0.

kafka_metrics_config.yaml: |
  lowercaseOutputName: false
  rules:
  - pattern : kafka.controller<type=(ControllerStats|KafkaController|), name=(.+)><>(Count|Value|OneMinuteRate)
  - pattern : kafka.server<type=KafkaServer, name=(.+)><>(Value)
  - pattern : kafka.server<type=ReplicaManager, name=(IsrExpandsPerSec|IsrShrinksPerSec|FailedIsrUpdatesPerSec)><>(OneMinuteRate)
  - pattern : kafka.network<type=RequestMetrics, name=(.+), request=(.+)><>(Count)
  - pattern : kafka.producer<type=producer-metrics, client-id=(.+)><>(outgoing-byte-rate|record-error-total|network-io-total|network-io-rate)
  - pattern : kafka.producer<type=producer-node-metrics, client-id=(.+), node-id=(.+)><>(outgoing-byte-total|outgoing-byte-rate)
  - pattern : kafka.producer<type=producer-topic-metrics, client-id=(.+), topic=(.+)><>(byte-rate)
  - pattern : java.lang<type=OperatingSystem><>(ProcessCpuTime|OpenFileDescriptorCount|AvailableProcessors|ProcessCpuLoad|FreeSwapSpaceSize|CommittedVirtualMemorySize|TotalSwapSpaceSize|TotalPhysicalMemorySize|FreePhysicalMemorySize|SystemCpuLoad|SystemLoadAverage)
  - pattern : kafka.server<type=replica-fetcher-metrics, broker-id=(.+), fetcher-id=(.+)><>(network-io-total)
    name: kafka_server_replica_fetcher_metrics_network_io_total
    labels:
      broker-id: $1
      fetcher-id: $2
  - pattern : kafka.server<type=replica-fetcher-metrics, broker-id=(.+), fetcher-id=(.+)><>(incoming-byte-rate)
    name: kafka_server_replica_fetcher_metrics_incoming_byte_rate
    labels:
      broker-id: $1
      fetcher-id: $2
  - pattern : kafka.server<type=replica-fetcher-metrics, broker-id=(.+), fetcher-id=(.+)><>(connection-count)
    name: kafka_server_replica_fetcher_metrics_connection_count
    labels:
      broker-id: $1
      fetcher-id: $2
  - pattern : kafka.server<type=replica-fetcher-metrics, broker-id=(.+), fetcher-id=(.+)><>(incoming-byte-total)
    name: kafka_server_replica_fetcher_metrics_incoming_byte_total
    labels:
      broker-id: $1
      fetcher-id: $2
  - pattern : kafka.server<type=forwarding-metrics, BrokerId=(.+)><>(network-io-rate)
    name: kafka_server_forwarding_metrics_network_io_rate
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=forwarding-metrics, BrokerId=(.+)><>(outgoing-byte-rate)
    name: kafka_server_forwarding_metrics_outgoing_byte_rate
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=forwarding-metrics, BrokerId=(.+)><>(incoming-byte-rate)
    name: kafka_server_forwarding_metrics_incoming_byte_rate
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=forwarding-metrics, BrokerId=(.+)><>(outgoing-byte-total)
    name: kafka_server_forwarding_metrics_outgoing_byte_total
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=socket-server-metrics><>(network-io-rate)
    name: kafka_Server_socket_server_metrics_network_io_rate
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(io-wait-time-ns-total)
    name: kafka_server_socket_server_metrics_io_wait_time_ns_total
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=replica-fetcher-metrics, broker-id=(.+), fetcher-id=(.+)><>(io-wait-time-ns-avg)
    name: kafka_server_replica_fetcher_metrics_io_wait_time_ns_avg
    labels:
      broker-id: $1
      fetcher-id: $2
  - pattern : kafka.server<type=forwarding-metrics, BrokerId=(.+)><>(network-io-total)
    name: kafka_server_forwarding_metrics_network_io_total
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(network-io-total)
    name: kafka_server_socket_server_metrics_network_io_total
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(network-io-rate)
    name: kafka_server_socket_server_metrics_network_io_rate
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=alter-partition-metrics, BrokerId=(.+)><>(network-io-total)
    name: kafka_server_alter_partition_metrics_network_io_total
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=alter-partition-metrics, BrokerId=(.+)><>(outgoing-byte-total)
    name: kafka_server_alter_partition_metrics_outgoing_byte_total
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=alter-partition-metrics, BrokerId=(.+)><>(outgoing-byte-rate)
    name: kafka_server_alter_partition_metrics_outgoing_byte_rate
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=alter-partition-metrics, BrokerId=(.+)><>(incoming-byte-rate)
    name: kafka_server_alter_partition_metrics_incoming_byte_rate
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=alter-partition-metrics, BrokerId=(.+)><>(network-io-rate)
    name: kafka_server_alter_partition_metrics_network_io_rate
    labels:
      BrokerId: $1
  - pattern : kafka.server<type=txn-marker-channel-metrics><>(network-io-total)
    name: kafka_Server_txn_marker_channel_metrics_network_io_total
  - pattern : kafka.server<type=txn-marker-channel-metrics><>(network-io-rate)
    name: kafka_Server_txn_marker_channel_metrics_network_io_rate
  - pattern : kafka.server<type=txn-marker-channel-metrics><>(io-time-ns-total)
    name: kafka_server_txn_marker_channel_metrics_io_time_ns_total
  - pattern : kafka.server<type=txn-marker-channel-metrics><>(outgoing-byte-total)
    name: kafka_server_txn_marker_channel_metrics_outgoing_byte-total
  - pattern : kafka.server<type=txn-marker-channel-metrics><>(incoming-byte-rate)
    name: kafka_server_txn_marker_channel_metrics_incoming_byte_rate
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(connection-count)
    name: kafka_server_socket_server_metrics_connection_count
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(failed-authentication-total)
    name: kafka_server_socket_server_metrics_failed_authentication_total
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(outgoing-byte-rate)
    name: kafka_server_socket_server_metrics_outgoing_byte_rate
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(incoming-byte-rate)
    name: kafka_server_socket_server_metrics_incoming_byte_rate
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(outgoing-byte-total)
    name: kafka_server_socket_server_metrics_outgoing_byte_total
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=socket-server-metrics, listener=(.+), networkProcessor=(.+)><>(request-total)
    name: kafka_server_socket_server_metrics_request_total
    labels:
      listener: $1
      networkProcessor: $2
  - pattern : kafka.server<type=BrokerTopicMetrics, name=(Bytes(.+)PerSec|MessagesInPerSec|TotalFetchRequestsPerSec|TotalProduceRequestsPerSec|ReplicationBytes(.+)PerSec|ReassignmentBytes(.+)PerSec)><>(MeanRate|OneMinuteRate|Count)
  - pattern : kafka.server<type=BrokerTopicMetrics, name=(Bytes(.+)PerSec|MessagesInPerSec|TotalFetchRequestsPerSec|TotalProduceRequestsPerSec), topic=(.+)><>(MeanRate|OneMinuteRate|Count)

Could you please help us out here? @dhoard

Thanks,
Sree

@dhoard
Collaborator

dhoard commented May 7, 2024

@sreelu27 The exporter's metrics response time is a function of your exporter rules and the execution time of the Kafka MBeans.

Your only options are to change your exporter rules or increase your Prometheus (calling code) scrape interval.
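
For illustration, here is a sketch of what trimming the exporter configuration could look like. The whitelistObjectNames list and the collapsed rule below are assumptions (illustrative values, not confirmed for this setup): restricting which ObjectNames are queried keeps the exporter from walking the tens of thousands of per-partition MBeans that a 10800-partition cluster exposes, and folding several per-attribute rules into one rule with an attribute capture group reduces the number of regexes evaluated for every MBean attribute.

lowercaseOutputName: false
# Illustrative: only query the MBean domains/types that the rules actually use,
# instead of every MBean the broker registers. Extend this list with the other
# types (socket-server-metrics, forwarding-metrics, etc.) your rules match.
whitelistObjectNames:
  - "kafka.controller:*"
  - "kafka.network:type=RequestMetrics,*"
  - "kafka.server:type=KafkaServer,*"
  - "kafka.server:type=ReplicaManager,*"
  - "kafka.server:type=BrokerTopicMetrics,*"
  - "kafka.server:type=replica-fetcher-metrics,*"
  - "java.lang:type=OperatingSystem"
rules:
  # Illustrative: one rule instead of five for replica-fetcher-metrics. $3 is the
  # captured attribute name; hyphens are expected to be sanitized to underscores
  # in the output name (verify the resulting metric names before relying on them).
  - pattern : kafka.server<type=replica-fetcher-metrics, broker-id=(.+), fetcher-id=(.+)><>(network-io-total|incoming-byte-rate|incoming-byte-total|connection-count|io-wait-time-ns-avg)
    name: kafka_server_replica_fetcher_metrics_$3
    cache: true   # matches on names only (no value matching), so caching the rule match should be safe
    labels:
      broker-id: $1
      fetcher-id: $2

Whether this brings the scrape back under 10 seconds depends on which MBeans dominate the collection time, so it is worth testing against a single broker first.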

@sreelu27
Author

sreelu27 commented May 8, 2024

Thank you @dhoard for the quick response. Will definitely try the suggestion.

@sreelu27
Author

Hi @dhoard,

After testing with a minimal set of patterns and minimal filtering, we are getting a better scrape time.

I do have a follow-up question: if we add more patterns in the future, the scrape time may increase again, and even with the minimal patterns, adding more filtering could also lead to the scrape timeout issue.

So, could you please share whether you have any plans to handle this scenario/use case in the future?

Thanks,
Sree

@dhoard
Collaborator

dhoard commented May 15, 2024

@sreelu27 The Kafka MBean execution time is what is causing the slow scrape time. The Kafka project would have to improve MBean performance.

We can't solve this in the exporter code.
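
Given that, the remaining knob is the Prometheus-side one mentioned earlier in this thread: raising the scrape timeout (default 10s) well above the observed ~23s collection time so the scrape completes instead of being cut off. The job name, target, and values below are illustrative only, a sketch rather than a recommended configuration:

scrape_configs:
  - job_name: kafka-jmx-exporter          # illustrative job name
    scrape_interval: 60s                  # give slow collections room between scrapes
    scrape_timeout: 45s                   # must not exceed scrape_interval; comfortably above ~23s
    static_configs:
      - targets: ["kafka-broker-0:9404"]  # illustrative broker host and exporter port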

@sreelu27
Author

Thank you for your response. Much appreciated @dhoard

@dhoard
Collaborator

dhoard commented May 16, 2024

Closing as resolved.

@dhoard dhoard closed this as completed May 16, 2024