Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/awscloudwatchlogsexporter] Improve performance of the awscloudwatchlogs exporter #26692

Merged

Conversation

rapphil
Copy link
Contributor

@rapphil rapphil commented Sep 14, 2023

Description: Adds support to the to parallelism in the awscloudwatchlogs exporter by leveraging the exporter helper.

In this PR, we are adding support to the num_consumers configuration in the sending_queue. This will allow users to specify the number of consumers that will consume from the sending_queue in parallel.

It is possible and straightforward to use this approach because CloudWatch logs no longer requires that you use a token to control access to the stream that you are writing to. You can write to the same stream in parallel.

To achieve this, this PR does the following:

  • Create Pusher that is able to push to multiple streams at the same time.
  • Move lifecycle of the Pusher to the function that is used to consume from the sending queue. This allows you to safely send to multiple streams at the same time without any resource contention since each call to consume logs will not share resources with others that are happening in parallel (one exception is the creation of log streams).

Besides that I analyzed the code and removed other limitations:

** How to review this PR: **

The first 3 commits in this PR were used to refactor the code before making the real changes. Please use the commits to simplify the review process.

Fixes #26360

Testing:

  • Unit tests were added.
  • Tested locally sending logs to cloudwatch logs.

Documentation: Documentation was added describing the new parameters.

StreamKey uniquely identify an cloudwatch logs tream and is used to show
where an event should be submited to.

Signed-off-by: Raphael Silva <rapphil@gmail.com>
The idea is that we will identify that an event belongs to a specific stream

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Properly initialize a cloudwatch.Event in the awsemf exporter, setting the
destination log stream of the event

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Add support to multiple consumers in the cloudwatchlogs expoters. This will
allow requests to be sent in parallel to cloudwatch logs

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Cloudwatch logs no longer limits to 5 request per second per log stream.
Instead the limit is per account.

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Sequence tokens are no longer necessary to make PutLogEvents calls to
CloudWatch logs. This changes removes all the logic for managing this token and
instead use a "Noop"

Signed-off-by: Raphael Silva <rapphil@gmail.com>
This optmization will reduce the memroy and CPU usage a bit
because we will not need to keep a big buffer of events that
were translated and insted they can be sent directly to the pusher.

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: 'enhancement'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe breaking is ok? since the config is changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it is considered breaking in this case. The config is extended but the default behavior stays the same. @rapphil can you confirm this statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct, we are extending the configuration and keeping the previous behaviour unchanged. That is why I don't consider this a breaking change.

exporter/awscloudwatchlogsexporter/exporter.go Outdated Show resolved Hide resolved
internal/aws/cwlogs/pusher.go Outdated Show resolved Hide resolved
exporter/awscloudwatchlogsexporter/config.go Outdated Show resolved Hide resolved
exporter/awscloudwatchlogsexporter/config.go Outdated Show resolved Hide resolved
internal/aws/cwlogs/cwlog_client.go Outdated Show resolved Hide resolved
exporter/awscloudwatchlogsexporter/exporter.go Outdated Show resolved Hide resolved
exporter/awscloudwatchlogsexporter/exporter.go Outdated Show resolved Hide resolved
exporter/awscloudwatchlogsexporter/factory.go Outdated Show resolved Hide resolved
Copy link
Contributor

@bryan-aguilar bryan-aguilar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nits

# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: 'enhancement'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it is considered breaking in this case. The config is extended but the default behavior stays the same. @rapphil can you confirm this statement?

exporter/awscloudwatchlogsexporter/exporter.go Outdated Show resolved Hide resolved
exporter/awscloudwatchlogsexporter/exporter.go Outdated Show resolved Hide resolved
exporter/awscloudwatchlogsexporter/exporter.go Outdated Show resolved Hide resolved
internal/aws/cwlogs/pusher.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 9, 2023

Codecov Report

Attention: 20 lines in your changes are missing coverage. Please review.

Files Coverage Δ
exporter/awscloudwatchlogsexporter/config.go 82.35% <100.00%> (+15.68%) ⬆️
exporter/awscloudwatchlogsexporter/factory.go 71.42% <100.00%> (+3.00%) ⬆️
exporter/awsemfexporter/emf_exporter.go 92.24% <100.00%> (-0.51%) ⬇️
exporter/awscloudwatchlogsexporter/exporter.go 84.07% <92.59%> (+7.66%) ⬆️
exporter/awsemfexporter/metric_translator.go 96.17% <82.35%> (-0.80%) ⬇️
internal/aws/cwlogs/cwlog_client.go 90.90% <75.00%> (-2.85%) ⬇️
internal/aws/cwlogs/pusher.go 91.70% <85.71%> (+1.12%) ⬆️

... and 7 files with indirect coverage changes

📢 Thoughts on this report? Let us know!.

@fatsheep9146
Copy link
Contributor

@rapphil there are still some checks are not passed

@rapphil
Copy link
Contributor Author

rapphil commented Oct 10, 2023

@fatsheep9146 those failures were unrelated and they were fixed just by merging from main again. thank you for your review.

@bryan-aguilar bryan-aguilar added the ready to merge Code review completed; ready to merge by maintainers label Oct 10, 2023
@codeboten codeboten merged commit 8bb5533 into open-telemetry:main Oct 11, 2023
88 checks passed
@github-actions github-actions bot added this to the next release milestone Oct 11, 2023
@rapphil rapphil deleted the rapphil-parallel-awscloudwatchlogsexporter branch October 12, 2023 16:52
JaredTan95 pushed a commit to openinsight-proj/opentelemetry-collector-contrib that referenced this pull request Oct 18, 2023
…oudwatchlogs exporter (open-telemetry#26692)

Adds support to the to parallelism in the
awscloudwatchlogs exporter by leveraging the [exporter
helper](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md).

In this PR, we are adding support to the `num_consumers` configuration
in the `sending_queue`. This will allow users to specify the number of
consumers that will consume from the sending_queue in parallel.

It is possible and straightforward to use this approach because
CloudWatch logs [no longer requires that you use a token to control
access to the stream that you are writing
to](https://aws.amazon.com/about-aws/whats-new/2023/01/amazon-cloudwatch-logs-log-stream-transaction-quota-sequencetoken-requirement/).
You can write to the same stream in parallel.

To achieve this, this PR does the following:
* Create Pusher that is able to push to multiple streams at the same
time.
* Move lifecycle of the Pusher to the function that is used to consume
from the sending queue. This allows you to safely send to multiple
streams at the same time without any resource contention since each call
to consume logs will not share resources with others that are happening
in parallel (one exception is the creation of log streams).

Besides that I analyzed the code and removed other limitations:
* locks that were not necessary
* Limiter that was used to limit the number of requests per stream to 5
per second. [The TPS is much higher now and is per
account.](https://aws.amazon.com/about-aws/whats-new/2023/01/amazon-cloudwatch-logs-log-stream-transaction-quota-sequencetoken-requirement/)

** How to review this PR: **

The first 3 commits in this PR were used to refactor the code before
making the real changes. Please use the commits to simplify the review
process.

**Link to tracking Issue:** open-telemetry#26360

**Testing:**

- Unit tests were added.
- Tested locally sending logs to cloudwatch logs.

**Documentation:** Documentation was added describing the new
parameters.

---------

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this pull request Nov 12, 2023
…oudwatchlogs exporter (open-telemetry#26692)

Adds support to the to parallelism in the
awscloudwatchlogs exporter by leveraging the [exporter
helper](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md).

In this PR, we are adding support to the `num_consumers` configuration
in the `sending_queue`. This will allow users to specify the number of
consumers that will consume from the sending_queue in parallel.

It is possible and straightforward to use this approach because
CloudWatch logs [no longer requires that you use a token to control
access to the stream that you are writing
to](https://aws.amazon.com/about-aws/whats-new/2023/01/amazon-cloudwatch-logs-log-stream-transaction-quota-sequencetoken-requirement/).
You can write to the same stream in parallel.

To achieve this, this PR does the following:
* Create Pusher that is able to push to multiple streams at the same
time.
* Move lifecycle of the Pusher to the function that is used to consume
from the sending queue. This allows you to safely send to multiple
streams at the same time without any resource contention since each call
to consume logs will not share resources with others that are happening
in parallel (one exception is the creation of log streams).

Besides that I analyzed the code and removed other limitations:
* locks that were not necessary
* Limiter that was used to limit the number of requests per stream to 5
per second. [The TPS is much higher now and is per
account.](https://aws.amazon.com/about-aws/whats-new/2023/01/amazon-cloudwatch-logs-log-stream-transaction-quota-sequencetoken-requirement/)

** How to review this PR: **

The first 3 commits in this PR were used to refactor the code before
making the real changes. Please use the commits to simplify the review
process.

**Link to tracking Issue:** open-telemetry#26360

**Testing:**

- Unit tests were added.
- Tested locally sending logs to cloudwatch logs.

**Documentation:** Documentation was added describing the new
parameters.

---------

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exporter/awscloudwatchlogs awscloudwatchlogs exporter exporter/awsemf awsemf exporter internal/aws ready to merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[exporter/cloudwatchlogs] Add support concurrency in the awscloudwatchlogs exporter
5 participants