Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(instrumentation-winston): add log sending to Logs API #1837

Merged
merged 62 commits into from Mar 14, 2024

Conversation

hectorhdzg
Copy link
Member

@hectorhdzg hectorhdzg commented Nov 30, 2023

Fixes #1558

Winston instrumentation will emit recordLog using Logs API whenever configured on, adding similar configuration options as the ones added for Bunyan in #1713

@hectorhdzg hectorhdzg requested a review from a team as a code owner November 30, 2023 00:18
Copy link

codecov bot commented Nov 30, 2023

Codecov Report

Merging #1837 (5e0291e) into main (dfb2dff) will increase coverage by 0.00%.
Report is 12 commits behind head on main.
The diff coverage is 82.71%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1837   +/-   ##
=======================================
  Coverage   90.97%   90.98%           
=======================================
  Files         146      145    -1     
  Lines        7492     7454   -38     
  Branches     1502     1488   -14     
=======================================
- Hits         6816     6782   -34     
+ Misses        676      672    -4     
Files Coverage Δ
packages/winston-transport/src/utils.ts 100.00% <100.00%> (ø)
.../winston-transport/src/OpenTelemetryTransportV3.ts 93.75% <93.75%> (ø)
...try-instrumentation-winston/src/instrumentation.ts 69.02% <74.00%> (+11.41%) ⬆️

... and 4 files with indirect coverage changes

attributes[key] = splat[key];
}
}
const timestamp = Date.now();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the logs API the comment states that timestamp should be in nanoseconds.

The spec also specifies nanoseconds

I wonder if the logs API timestamp field should be a HrTime instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that make sense, updated to use hrTime and convert it to nanoseconds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the LogRecord API allows for HrTime to be passed in, but looks like the types don't support it yet.

I've opened a PR for it: open-telemetry/opentelemetry-js#4345

VERSION,
Object.assign({}, DEFAULT_CONFIG, config)
);
this._logger = logs.getLogger(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this works with the example in readme. sdk.start() sets the global log provider but this constructor is ran before it so this._logger will always be noop no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, this is related to open-telemetry/opentelemetry-js#3622, same issue apply to LoggerProvider now, I can take care of it in open-telemetry/opentelemetry-js#4314, in the meantime will update the documentation to use registerInstrumentations instead of NodeSDK

Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this 🥇

Overall, LGTM. Added a few minor comments.

@@ -41,7 +41,8 @@
"access": "public"
},
"peerDependencies": {
"@opentelemetry/api": "^1.3.0"
"@opentelemetry/api": "^1.3.0",
"@opentelemetry/api-logs": "^0.45.1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is also installed as a regular dependency, I think we should not list it as peer dependency.
This is also what bunyan is currently doing.

Now that I think about it again, perhaps they should both use peer dependency to be consistent with "@opentelemetry/api" usage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I believe we should update Bunyan as well, I can create a different PR so people are aware specific changes are happening for that instrumentation, some issues in the readme that were brough up by @mhennoch should apply to bunyan as well

attributes[key] = splat[key];
}
}
const timestamp = hrTimeToNanoseconds(hrTime());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense for each log record producer to set theobservedTimestamp and timestamp upon creation.

timestamp

From the specs it reads that:

Field: Timestamp
Description: Time when the event occurred measured by the origin clock, i.e. the time at the source. This field is optional, it may be missing if the source timestamp is unknown.

Since the timestamp is known, I would expect that we set timestamp in LogRecord below.

Regardless of this PR, it seems that timestamp is optional both in the data model spec and in otlp, but is not optional in the sdk where it is null coalesed into now when unset. Is this behaviour following the spec?

observedTimestamp

The spec for data model says:

For events that originate in OpenTelemetry (e.g. using OpenTelemetry Logging SDK) this timestamp is typically set at the generation time and is equal to Timestamp.

So I guess we should not set it when calling the API? or set both timestamp and observedTimestamp to the same value in the logRecord?

Currently each time is captured at a different location thus they will probably not be equal, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it make sense to send the timestamp if the actual log had some information about it, not applicable in this case, I'm updating this to not send time parameter and let the SDK handle it.

Comment on lines 237 to 242
let msgArguments = 0;
for (let i = 1; i < args.length; i++) {
if (typeof args[i] === 'string') {
msgArguments++;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You can transform this code to a no-side-effects, shorter version like so:

Suggested change
let msgArguments = 0;
for (let i = 1; i < args.length; i++) {
if (typeof args[i] === 'string') {
msgArguments++;
}
}
const msgArguments = args.slice(1).filter(a => typeof a === 'string').length

or (if the first argument is always string):

Suggested change
let msgArguments = 0;
for (let i = 1; i < args.length; i++) {
if (typeof args[i] === 'string') {
msgArguments++;
}
}
const msgArguments = args.filter(a => typeof a === 'string').length - 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this code a little bit, so it doesn't apply anymore, let me know if some other code can be transformed as well.

Comment on lines 248 to 249
const values = args.slice(2, msgArguments + 1);
logRecord['msg'] = util.format(args[1], ...values);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't run it myself, but it seems winston describes that this behavior is not always applied, and if it does, the arguments might be non-string:

The log method provides the string interpolation using util.format. It must be enabled using format.splat().

and:

logger.log('info', 'test message %d', 123);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code, util.format support multiple types so passing them as they are now, FYI this code is only applicable to Winston 2 https://github.com/winstonjs/winston/tree/2.x, string interpolation in newer version is handled before write is called.

Comment on lines 230 to 235
for (let i = args.length - 1; i >= 0; i--) {
if (typeof args[i] === 'object') {
logRecord['meta'] = args[i];
break;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can do it in one go like so:

logRecord['meta'] = args.findLast(a => typeof a === 'object')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would need to cast args to an array to use that, I usually don't risk casting for args out of my control, also added some other logic to calculate the index to quickly determine the string interpolation parameters.

injectRecord(spanContext, record);
instrumentation._callHook(span, record);
return original.apply(this, args);
args.splice(insertAt, 0, record);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are manipulating the args here, and then use the updated value below for the log bridge. Is it ok?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the span information would be duplicated in this case I guess if Logger is also adding correlation information into the LogRecords

@martinkuba
Copy link
Contributor

I think it would be better to implement this as a Winston transport rather than an instrumentation, similar to how it has been implemented for Bunyan as a stream. The Logs Bridge API was intended for log appenders, not instrumentations.

@hectorhdzg
Copy link
Member Author

@martinkuba my initial approach was using Winston transports, and the problem I found was related to having a direct dependency to winston 3x and 2x to make it work, transports require to extend a specific class in winston is not just to follow some interface, so that would mean people installing @opentelemetry-instrumentation-winston will get winston 3x and winston 2x installed as well, I believe this is undesirable and that is why I decided to use existing patch in write/log methods instead. Related to log appenders or instrumentations, there was some discussion about this, we can create another new appender package just for Winston, Bunyan and others, but I believe the agreement was to use the instrumentation instead, I don't have a problem creating a new appender project here, but is just some maintenance overhead in my opinion and possible causing users confusion.

@martinkuba
Copy link
Contributor

@hectorhdzg That makes sense. I just wanted to highlight that the Logs API was not intended to be used in instrumentations (per my interpretation of the spec). Maybe it's fine to approach it this way, but I am also thinking in terms of allowing (or encouraging) instrumentations to generate logs. The PR linked below is related.
open-telemetry/opentelemetry-js#4314

I see that Java has different packages for different versions of Log4j appenders
https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/log4j

@hectorhdzg
Copy link
Member Author

@martinkuba I understand appender term makes sense as is what the specification mentions, I created a discussion issue to talk about this in a single place, as this conversation has been happening in several places. I don't have strong opinions about one or the other, I just want to allow this functionality and move this forward.
open-telemetry/opentelemetry-js#4355

maxdml added a commit to dbos-inc/dbos-transact that referenced this pull request Dec 11, 2023
This PR updates the telemetry subsystem to support 1) formatting log
entries to OTLP and 2) export logs/traces to configurable OTLP backends.

For instance, with a service that accepts both traces and logs (like the
OTel collector), the DBOS SDK can be configure to export both signals to
this service:
```yaml
telemetry:
    OTLPExporters:
        - tracesEndpoint: 'http://localhost:4318/v1/traces'
        - logsEndpoint: 'http://localhost:4318/v1/logs'
```
Note this setup works out of the box with Jaeger and the OTel collector
(see screenshots bellow.)
Also note `opentelemetry-js` logs API is experimental, so I fixed the
version we use to `0.41.2`

**Details**
- Use a wrapper around the Winston logger everywhere.
- The wrapper adds a new OTLP transport to the wrapped winston logger if
it detects configured exporters
- This transport simply formats an OTel `LogRecord` and push it to our
collector
- Massively simplify our typing by only using opentelemetry `LogRecord`
and `Span`

**Others**
- Metadata are injected to all Winston transports and the transport
logic decides whether to include them or not

**Thoughts**
- There exists a neater way to integrate the active span context to each
OTLP log records, which we can consider in the future. Talk to me for
details or see PR comments about having a context manager
- The Winston folks are actively working on an OTLP transport/exporter
themselves, see
[here](open-telemetry/opentelemetry-js-contrib#1558)
and
[there](open-telemetry/opentelemetry-js-contrib#1837).
In the future we might want to use it.

**Tests**

☑️ Unit tests: ideally we would be able to intercept outgoing HTTP
requests and expect the payload to conform our expectations. This seems
to be quite some work so pushing that for later.

✅ Tested with DBOS cloud integration tests.

✅ Logs w/o metadata:
![Screenshot 2023-12-11 at 11 57
42](https://github.com/dbos-inc/dbos-ts/assets/3437048/093efe7a-6ff2-4ec0-bf6b-f4d9e30f7ed9)


✅ Logs w/ metadata:
![Screenshot 2023-12-11 at 12 01
02](https://github.com/dbos-inc/dbos-ts/assets/3437048/8a8ed3ff-37a7-4632-aee4-d140bfb0a549)


✅ Logs & traces exported to a local OTel collector
![Screenshot 2023-12-10 at 14 23
25](https://github.com/dbos-inc/dbos-ts/assets/3437048/292acc9e-a7f6-47e5-bb58-f7216d8afcfd)

✅ Traces exported to a local Jaeger:
<img width="1784" alt="jaeger"
src="https://github.com/dbos-inc/dbos-ts/assets/3437048/89b6e8dc-a2da-43c6-9f4f-821ed31c2159">
@hectorhdzg
Copy link
Member Author

Updated the code to use Winston transports instead, we discussed log instrumentations in SIG meeting a few weeks ago, and the consensus was to provide a log appender class, a winston transport in this case, and also allow auto instrumentation injecting the transport automatically, please take a look when you have a chance. @martinkuba @blumamir @seemk @mhennoch

@Flarna
Copy link
Member

Flarna commented Feb 10, 2024

As far as I remember transports in winston are not sync always. They are even batched in some scenarios like high load or transport chain setup,... because internal streams are used.

So the traceId/spanId added by transport is not that one from log writing in user app - it's that one from transport.
Not sure what the log spec wants here as both might have a valid usecase.

If the traceId/spanId from actual log statement is needed the enrichment by log sdk using getCurrentSpan can't work as it is too late.

@pichlermarc pichlermarc self-requested a review February 12, 2024 12:50
Copy link
Member

@pichlermarc pichlermarc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like all comments are now addressed. Thanks for sticking with us through the long review process @hectorhdzg.

@trentm I see there's still a changes requested review from your side - would you mind having another look to check if your comments have been addressed? 🙂


constructor(options?: Transport.TransportStreamOptions) {
super(options);
this._logger = logs.getLogger('@opentelemetry/winston', VERSION);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer @opentelemetry/winston-transport, I think that's the one that makes the most sense given that this can be used as a stand-alone

@hectorhdzg
Copy link
Member Author

@trentm let me know if there is any concern pending

Copy link
Contributor

@trentm trentm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really sorry for the block and long delay. I have a couple small changes that I think should be done (e.g. the package-lock.json vestigial entry), a few nits (i.e. feel free to ignore "nits" from me at your discretion), and then other Qs can be deferred to separate issues.

packages/winston-transport/README.md Outdated Show resolved Hide resolved
package-lock.json Outdated Show resolved Hide resolved
packages/winston-transport/package.json Outdated Show resolved Hide resolved
packages/winston-transport/src/utils.ts Outdated Show resolved Hide resolved
logger: Logger
): void {
const { message, msg, level, meta, ...splat } = record;
const attributes = Object.assign(meta ?? {}, {});
Copy link
Contributor

@trentm trentm Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attributes[key] = splat[key];
}
}
const logRecord: LogRecord = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This is where instrumentation-bunyan intentionally drops the trace_id, span_id, and trace_flags fields that might have been added by the enabled "log correlation" functionality.

However, to not hold up this PR, this could be deferred to a separate issue/PR along with the outstanding question -- #1837 (comment) -- about the LogRecord traceId, spanId, and traceFlags fields being incorrectly set in cases where there is stream buffering.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #2010 to follow up on this

@trentm
Copy link
Contributor

trentm commented Mar 14, 2024

@hectorhdzg I see you resolved a number of the conversations above, but I don't see any new commits pushed to your feature branch. Did you perhaps forget to push up your local changes? (Sorry if I'm jumping the gun. I'm trying to not be the slow reviewer here again. :)

Update: Ah, the latest comment update was only minutes ago, so I think I'm jumping the gun.

@hectorhdzg
Copy link
Member Author

@trentm sorry I usually start resolving whenever I update locally, and push changes after, everything should be up to date now.

Copy link
Contributor

@trentm trentm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@hectorhdzg hectorhdzg merged commit 358345f into open-telemetry:main Mar 14, 2024
17 checks passed
@dyladan dyladan mentioned this pull request Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Winston format/transport for OpenTelemetry
8 participants