Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(trace): host-controlled tracing #92

Merged
merged 13 commits into from Jun 24, 2023
Merged

feat(trace): host-controlled tracing #92

merged 13 commits into from Jun 24, 2023

Conversation

hawkw
Copy link
Contributor

@hawkw hawkw commented Jun 24, 2023

Currently, the tracing collector in the Allwinner D1 platform impl is
hardcoded to the DEBUG level. This is unfortunate, as it means that we
will always do the work of collecting and formatting DEBUG traces, even
when no one is listening on the UART tracing port. It would be more
efficient if we only formatted traces when a crowtty instance has
subscribed to the UART tracing port. Additionally, because trace
metadata is sent when the callsite is initially registered, a crowtty
instance that attaches after the board has started running may miss some
trace metadata. Finally, there is no way to change the trace level on
the fly, as it's hard-coded in the kernel.

This branch changes the mnemos-trace-proto wire protocol to include a
host to target message to select a tracing level. Now, the board can
start up with all tracing disabled, and crowtty can send the desired
tracing level once it connects. This way, we don't collect or format
traces at all when crowtty isn't connected. In order to ensure
crowtty waits to send the tracing level message only once the board's
tracing code has been initialized, we add a periodic "heartbeat" message
sent by the board on the tracing port while tracing is idle. This is
used by the host which is listening for traces to detect the presence of
the tracing collector on the debug target. The same message is used to
ack a successful request to select the tracing level.

When the tracing level is selected, the target rebuilds tracing's
callsite cache, which means all metadata for callsites which have
previously been hit will be sent to the host. This ensures that
crowtty always has the metadata for all tracing spans and events that
are enabled, even if the target encountered them before crowtty
connected.

Making this work required fixing an upstream tracing bug,
tokio-rs/tracing#2634, where calling into code
that includes tracing diagnostics from inside a
Collect::register_callsite method would cause a deadlock.
Additionally, I had to add code to the collector for tracking whether
it's inside its own send method, in order to temporarily disable bbq
tracing. This is because the collector's use of bbq creates an
infinite loop when the TRACE level is enabled.

Check this out:
image

@netlify
Copy link

netlify bot commented Jun 24, 2023

Deploy Preview for merry-scone-cc7a60 ready!

Name Link
🔨 Latest commit 318c479
🔍 Latest deploy log https://app.netlify.com/sites/merry-scone-cc7a60/deploys/64974feaa23c7f0008e197fa
😎 Deploy Preview https://deploy-preview-92--merry-scone-cc7a60.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@hawkw
Copy link
Contributor Author

hawkw commented Jun 24, 2023

In a subsequent branch, I intend to also add a host -> target heartbeat, so that the target can detect when crowtty has disconnected, and turn tracing off again.

@hawkw hawkw requested a review from jamesmunns June 24, 2023 19:07
Copy link
Contributor

@jamesmunns jamesmunns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks fine to me to merge! Left some comments, most are "some day maybe tweak this", take them or leave them :)

max_level: LevelFilter,
max_level: AtomicU8,

/// Tracks whether we are inside of the collector's `send_event` method, so
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it


// we probably won't use 256 whole bytes of cobs yet since all the host
// -> target messages are quite small
let mut cobs_buf: CobsAccumulator<16> = CobsAccumulator::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're gunna want some kind of config soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely, yeah

return tracing_core_02::Interest::never();
}

// Due to the fact that the collector uses `bbq` internally, we must
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little icky, because this is a platform-specific impl detail - melpo doesn't use bbqueue for the tracing collection.

Might be worth instead to::

  • Provide bbqueue methods that don't emit traces (and use them in lichee-rv)
  • Start an "integration guide" that tells people to use something that doesn't emit traces when implementing tracing

I don't think this is a today blocker, but if we merge this as-is it might be good to throw up an issue to fix this someday.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this collector does use bbqueue --- so tif we are ever using the collector in this file, we are using bbqueue.

}
}

const fn u8_to_level(level: u8) -> LevelFilter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to have an invalid level be an error (and ignore the "change log level command" instead), otherwise if we receive a garbage message that HAPPENS to look like a command, we have a 251/256 chance of just shutting off all logging. Tho I guess we don't really need to log at all to a connection that is sending us garbage. Hm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these functions are used internally for converting the AtomicU8 to/from a LevelFilter, not for parsing the log level sent to us from the host. that gets deserialized using postcard, so if we get an invalid value, we will error.

};

#[derive(serde::Serialize, serde::Deserialize)]
pub enum TraceEvent<'a> {
/// Sent by the target periodically when not actively tracing, to indicate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to mark this non-exhaustive and add things to the end instead, to avoid breaking wire format, once we care about that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i figured we didn't currently care about stability for the crate API or for the wire format, but we can do that in the future.

source/kernel/src/trace.rs Outdated Show resolved Hide resolved
@hawkw hawkw merged commit be8ff93 into main Jun 24, 2023
11 checks passed
@hawkw hawkw deleted the eliza/trace-hostmsg branch June 24, 2023 21:04
@jamesmunns jamesmunns added area: tools & build Related to host developer tools, including tracing, Crowtty and build processes area: protocols Related to communication protocols, including SerMux and Traceproto. platform: D1 Specific to the Allwinner D1 hardware platform labels Jul 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: protocols Related to communication protocols, including SerMux and Traceproto. area: tools & build Related to host developer tools, including tracing, Crowtty and build processes platform: D1 Specific to the Allwinner D1 hardware platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants