Support gzip compression for node exporter collector #2337

Merged

Contributor

@alisabzevari alisabzevari commented Jul 8, 2021

Fixes #1637.

I am not sure this is the right approach to address this issue. Please let me know if it is wrong.

Which problem is this PR solving?

  • This PR adds an option to compress the request body (using gzip) in the collector exporter for Node.js.

Short description of the changes

  • A new config option named compress has been added to CollectorExporterNodeConfigBase. It is an optional boolean. By default, the request body is not compressed.
  • The main change is in util.ts, which now pipes the body through a gzip instance. The code is inspired by here.
  • I have added only one test.
  • I have replaced the sinon stub for the request in the tests with a PassThrough stream. With PassThrough we can work with a real stream (readable and writable), which gives the tests more control.
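
As a rough illustration of the approach described in this list, the sketch below pipes a body through a gzip stream into a PassThrough that stands in for the HTTP request, so a test can inspect the bytes that were written. The names writeBody and collect are hypothetical and are not taken from the PR's util.ts.

```typescript
import { PassThrough, Readable } from 'stream';
import * as zlib from 'zlib';

// Hypothetical helper: writes `body` to `req`, gzipping it first when `compress` is true.
function writeBody(
  req: NodeJS.WritableStream,
  body: string,
  compress = false
): void {
  // Wrap the serialized body in a readable stream so it can be piped.
  const source = Readable.from([Buffer.from(body)]);
  if (compress) {
    source.pipe(zlib.createGzip()).pipe(req);
  } else {
    source.pipe(req);
  }
}

// Test-side helper: collects everything that reaches the fake request.
function collect(stream: NodeJS.ReadableStream): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const chunks: Buffer[] = [];
    stream.on('data', (chunk: Buffer) => chunks.push(chunk));
    stream.on('end', () => resolve(Buffer.concat(chunks)));
    stream.on('error', reject);
  });
}

// A PassThrough is both writable and readable, so it can stand in for the request.
const fakeRequest = new PassThrough();
const written = collect(fakeRequest);
writeBody(fakeRequest, JSON.stringify({ hello: 'world' }), true);
written.then(buf => {
  // Decompressing what was "sent" should give back the original JSON.
  console.log(zlib.gunzipSync(buf).toString());
});
```

Compared with a stub, the PassThrough lets the test read back exactly what the exporter wrote and decompress it.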

TODO

  • The Content-Length header should be correct when compression is enabled.
  • Update the readme to reflect the new config.
  • Compression is meant to be configurable through environment variables (e.g., OTEL_EXPORTER_OTLP_COMPRESSION) as well, according to spec.

@codecov

codecov bot commented Jul 8, 2021

Codecov Report

Merging #2337 (3863d48) into main (d8fbedd) will increase coverage by 0.29%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #2337      +/-   ##
==========================================
+ Coverage   92.35%   92.64%   +0.29%     
==========================================
  Files         128      142      +14     
  Lines        4249     5101     +852     
  Branches      868     1050     +182     
==========================================
+ Hits         3924     4726     +802     
- Misses        325      375      +50     
Impacted Files Coverage Δ
...async-hooks/src/AsyncLocalStorageContextManager.ts
...sync-hooks/src/AbstractAsyncHooksContextManager.ts
...ontext-async-hooks/src/AsyncHooksContextManager.ts
packages/opentelemetry-web/src/types.ts 100.00% <0.00%> (ø)
packages/opentelemetry-web/src/utils.ts 94.90% <0.00%> (ø)
...kages/opentelemetry-exporter-collector/src/util.ts 100.00% <0.00%> (ø)
...mentation-xml-http-request/src/enums/EventNames.ts 100.00% <0.00%> (ø)
...ntelemetry-web/src/enums/PerformanceTimingNames.ts 100.00% <0.00%> (ø)
...-instrumentation-fetch/src/enums/AttributeNames.ts 100.00% <0.00%> (ø)
...ackages/opentelemetry-web/src/WebTracerProvider.ts 100.00% <0.00%> (ø)
... and 10 more

@alisabzevari alisabzevari force-pushed the gzip-compression-collector-exporter branch from 62c11b4 to 508f7e8 Compare July 8, 2021 18:37
@vmarchaud vmarchaud added the enhancement New feature or request label Jul 11, 2021
@alisabzevari alisabzevari force-pushed the gzip-compression-collector-exporter branch from 30d2300 to 446c71b Compare July 12, 2021 09:59
httpAgentOptions?: http.AgentOptions | https.AgentOptions;
}

export enum CompressionAlgorithm {
Contributor

General comment: why define the enum values as strings?
Why not just a numeric value, as the "NAME" will still be available in the resulting code, and if required we can always look it up in the config via uppercasing... Just a thought.

Member

Generally I advise using strings to avoid problems when the set of available values changes (numeric values can end up shared by different members across versions).

Member

As far as I know, normal enums are hard to use for non-TypeScript users.

Member

@Flarna I know that's true for const enums but I'm not sure it's the case for regular enums. Do you have a particular case you're thinking of?

Member

I think I know what you mean after thinking a little more. You're not thinking they can't do CompressionAlgorithm.GZIP but that 1 doesn't mean anything to them and passing a string 'gzip' is more clear?

Member

Normal enums are exported as objects, so a JS user can use CompressionAlgorithm.GZIP. A JS user can also use the underlying constant (e.g. gzip).
The main difference is documentation. A TypeScript user can't do much wrong, as the editor shows the allowed values and the compiler checks again.

For JS users, it needs to be documented that an object named CompressionAlgorithm is exported, including the properties (GZIP and NONE) that refer to the constants to use.

This works fine, but JS APIs often document the actual values to use instead of referring to an object holding the constants (that would be gzip/none here).

Standard const enums are really bad, as JS users then have to use the numeric values.
const enums with strings as values would require JS users to use these strings in their code.
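
To make the trade-off in this thread concrete, here is a minimal sketch comparing a string-valued enum with a numeric one. The member names and values (GZIP/NONE, gzip/none) follow the comments above; NumericCompression and isGzip are hypothetical and exist only for the comparison.

```typescript
// String-valued enum, with member names and values as discussed in this thread.
enum CompressionAlgorithm {
  NONE = 'none',
  GZIP = 'gzip',
}

// Numeric variant, shown only for comparison (hypothetical).
enum NumericCompression {
  NONE,
  GZIP,
}

// Hypothetical consumer of the config value.
function isGzip(compression: CompressionAlgorithm): boolean {
  return compression === CompressionAlgorithm.GZIP;
}

// A TypeScript caller uses the enum member and gets editor and compiler support.
const fromTs = isGzip(CompressionAlgorithm.GZIP);

// A plain JS caller can pass the documented string directly.
// The cast only stands in for the type check that JS does not have.
const fromJs = isGzip('gzip' as CompressionAlgorithm);

// At runtime a string enum member is just its string value.
const stringValue: string = CompressionAlgorithm.GZIP;

// A numeric member is a bare number; the name is only reachable via the reverse mapping.
const numericValue: number = NumericCompression.GZIP;
const numericName: string = NumericCompression[numericValue];

console.log(fromTs, fromJs, stringValue, numericValue, numericName);
```

With the string enum, the value a JS user writes in a config ('gzip') is self-describing, while the numeric variant would require passing 1 or importing the enum object.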

obecny
obecny previously requested changes Jul 12, 2021
Member

@obecny obecny left a comment

I think we should not depend on zlib directly. We should create a config param that accepts a defined object with two functions (zip, unzip) as an interface. This can be compatible with zlib. This way people can use any library in any version. zlib is quite popular, and people might already have it in their repo in a different version. That's why I think we should define an interface that allows more freedom of choice and does not create a strict dependency on a third party.

@alisabzevari
Contributor Author

zlib is a part of nodejs. It is not a 3rd party lib.

@obecny
Member

obecny commented Jul 12, 2021

zlib is a part of nodejs. It is not a 3rd party lib.

Oh you are absolutely right, :)

@obecny
Member

obecny commented Jul 13, 2021

I thought the compression concerns should be hidden from the user in this function. What is the valid use-case for the user to set the Content-Encoding?

This is the general approach: if the user is able to pass headers in the config, those headers should take precedence over other things. There can be edge cases we are not aware of, and we should not prevent the user from overriding things if for any reason they want to do that.

@dyladan
Member

dyladan commented Jul 13, 2021

I thought the compression concerns should be hidden from the user in this function. What is the valid use-case for the user to set the Content-Encoding?

This is the general approach: if the user is able to pass headers in the config, those headers should take precedence over other things. There can be edge cases we are not aware of, and we should not prevent the user from overriding things if for any reason they want to do that.

Not sure I agree. Sure, the user can set headers, but that doesn't mean we can't override them. The content-length header is a good example of one that the user could technically pass to the headers object but we would be incorrect not to overwrite. Since the user can't affect the encoding, I'm not sure it makes sense to preserve the value they set for the content-encoding header.

@obecny
Member

obecny commented Jul 13, 2021

I thought the compression concerns should be hidden from the user in this function. What is the valid use-case for the user to set the Content-Encoding?

This is the general approach: if the user is able to pass headers in the config, those headers should take precedence over other things. There can be edge cases we are not aware of, and we should not prevent the user from overriding things if for any reason they want to do that.

Not sure I agree. Sure, the user can set headers, but that doesn't mean we can't override them. The content-length header is a good example of one that the user could technically pass to the headers object but we would be incorrect not to overwrite. Since the user can't affect the encoding, I'm not sure it makes sense to preserve the value they set for the content-encoding header.

Well, in Linux, if you want to run rm -fr / as root, no one will prevent you from doing that. The same applies here: if you want to override it, you know what you are doing and you should be able to do that. We are providing an API and SDK that should be as flexible as possible. If you are writing a final product, you can take care of such things, making sure your product will not fail, and you can decide to what extent you want to correct user behaviour. But here we should, IMHO, give users as much freedom as possible.
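
The two positions in this exchange come down to the order in which header objects are merged. The sketch below shows both orders side by side. The function names, the HeaderMap type, and the sample values are hypothetical, and it assumes both sides spell header names with the same casing.

```typescript
type HeaderMap = Record<string, string>;

// Position 1: headers passed in the user's config take precedence.
function userHeadersWin(computed: HeaderMap, user: HeaderMap): HeaderMap {
  return { ...computed, ...user };
}

// Position 2: the exporter overwrites the headers it is responsible for.
function exporterHeadersWin(computed: HeaderMap, user: HeaderMap): HeaderMap {
  return { ...user, ...computed };
}

// Illustrative values only: what the exporter derived from the compressed body...
const computedHeaders: HeaderMap = {
  'Content-Encoding': 'gzip',
  'Content-Length': '42',
};
// ...and what a user happened to put in the config.
const userHeaders: HeaderMap = {
  'Content-Encoding': 'identity',
  'X-Custom': 'abc',
};

const lenient = userHeadersWin(computedHeaders, userHeaders);
const strict = exporterHeadersWin(computedHeaders, userHeaders);

console.log(lenient['Content-Encoding'], strict['Content-Encoding']);
```

In the first order the request would announce identity while carrying a gzip body, which is the mismatch the Content-Length argument warns about. In the second order the user's unrelated X-Custom header still survives.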

Member

@Flarna Flarna left a comment

I miss an update of the README to explain the new options.
This would also be a good place to explain that it may be a bad idea for the user to set e.g. a Content-Encoding or Content-Length header.

@vreynolds
Contributor

Might be a concern for another PR: compression is also meant to be configurable through environment variables (e.g., OTEL_EXPORTER_OTLP_COMPRESSION), according to the spec.
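
A minimal sketch of how such an environment variable could be taken into account follows. The resolveCompression helper, its precedence rule (explicit config first, then the environment, then no compression), and the fallback for unrecognized values are all assumptions for illustration, not behavior implemented in this PR.

```typescript
enum CompressionAlgorithm {
  NONE = 'none',
  GZIP = 'gzip',
}

// Shaped like process.env, so the function can be exercised without touching the real environment.
type Env = Record<string, string | undefined>;

// Hypothetical resolver: explicit config wins, then the env var, then no compression.
function resolveCompression(
  configured: CompressionAlgorithm | undefined,
  env: Env
): CompressionAlgorithm {
  if (configured !== undefined) {
    return configured;
  }
  const raw = env['OTEL_EXPORTER_OTLP_COMPRESSION'];
  const normalized = raw?.trim().toLowerCase();
  if (normalized === 'gzip') {
    return CompressionAlgorithm.GZIP;
  }
  // Unset or unrecognized values fall back to no compression in this sketch.
  return CompressionAlgorithm.NONE;
}

console.log(resolveCompression(undefined, { OTEL_EXPORTER_OTLP_COMPRESSION: 'gzip' }));
```

In an exporter, the second argument would typically be process.env.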

@alisabzevari
Contributor Author

I thought the compression concerns should be hidden from the user in this function. What is the valid use-case for the user to set the Content-Encoding?

This is the general approach: if the user is able to pass headers in the config, those headers should take precedence over other things. There can be edge cases we are not aware of, and we should not prevent the user from overriding things if for any reason they want to do that.

Not sure I agree. Sure, the user can set headers, but that doesn't mean we can't override them. The content-length header is a good example of one that the user could technically pass to the headers object but we would be incorrect not to overwrite. Since the user can't affect the encoding, I'm not sure it makes sense to preserve the value they set for the content-encoding header.

Well, in Linux, if you want to run rm -fr / as root, no one will prevent you from doing that. The same applies here: if you want to override it, you know what you are doing and you should be able to do that. We are providing an API and SDK that should be as flexible as possible. If you are writing a final product, you can take care of such things, making sure your product will not fail, and you can decide to what extent you want to correct user behaviour. But here we should, IMHO, give users as much freedom as possible.

If sendWithHttp accepts an option to compress the body, the function should take full responsibility for handling the compression correctly. Otherwise, it will be a leaky abstraction.
I can suggest two alternatives here:

  1. Remove the compression from sendWithHttp completely and move this responsibility to whichever class has complete control over the HTTP headers. I don't recommend this, though, because it would lead to a lot of duplicated implementation in the exporter classes.
  2. sendWithHttp stays out of compression concerns if compression == CompressionAlgorithm.NONE. In this case, the class calling sendWithHttp should be able to set the headers and compress the body itself, and sendWithHttp should not override the headers. Otherwise, sendWithHttp will override the needed headers to make sure compression works as promised.
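
The second alternative could look roughly like the sketch below. prepareRequest is a hypothetical stand-in for the relevant part of sendWithHttp, and it uses zlib.gzipSync only to keep the example short, whereas the PR pipes the body through a gzip stream. It also assumes header names are spelled with the casing shown.

```typescript
import * as zlib from 'zlib';

enum CompressionAlgorithm {
  NONE = 'none',
  GZIP = 'gzip',
}

type HeaderMap = Record<string, string>;

interface PreparedRequest {
  headers: HeaderMap;
  body: Buffer;
}

// Hypothetical helper illustrating alternative 2; not the PR's sendWithHttp.
function prepareRequest(
  body: string,
  headers: HeaderMap,
  compression: CompressionAlgorithm
): PreparedRequest {
  const raw = Buffer.from(body);
  if (compression === CompressionAlgorithm.NONE) {
    // Stay out of the way: the caller's headers and body pass through untouched.
    return { headers: { ...headers }, body: raw };
  }
  // Compression was requested, so this function owns the related headers.
  const compressed = zlib.gzipSync(raw);
  return {
    headers: {
      ...headers,
      'Content-Encoding': 'gzip',
      'Content-Length': String(compressed.length),
    },
    body: compressed,
  };
}

const prepared = prepareRequest(
  JSON.stringify({ hello: 'world' }),
  { 'Content-Encoding': 'identity' },
  CompressionAlgorithm.GZIP
);
console.log(prepared.headers);
```

Because the compressed size is known before the headers are written, this variant also covers the Content-Length item from the TODO list.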

@dyladan
Member

dyladan commented Jul 23, 2021

Retrigger CLA

@dyladan dyladan closed this Jul 23, 2021
@dyladan dyladan reopened this Jul 23, 2021
@obecny obecny requested a review from rauno56 as a code owner July 26, 2021 19:15
Successfully merging this pull request may close these issues.

Gzip OTLP/JSON request bodies
7 participants