feat: improve plain progress #7272

jedevc · 2024-05-03T15:25:45Z

With the changes as part of #6835 (in v0.11.0), all old TUIs were removed, including the old --progress=plain output. With #7069 (in v0.11.1), a limited version of the plain TUI was restored. However, as noted in #7137 (comment) and #7223 (comment), this new plain progress has issues - it only reports spans that have logs, and loses a lot of additional information.

This PR attempts to strike a balance between the v0.10.X plain output and the v0.11.X plain output - unfortunately, the old behavior cannot be perfectly restored due to the architectural changes in the move to OpenTelemetry.

Specifically, some guiding ideas I'm trying to stick to:

We should share as much code as possible between plain + tty output. We want to ensure that the visibility of spans is the same between both outputs, and that the verbosity options create roughly the same effects for both.
We should attempt to organize the logs by function call as much as possible - this is the direction of the tty output, as well as dagger cloud - being aligned keeps the outputs somewhat similar.
We should keep a familiar style to the old plain logs, with numerical prefixes, similar colors, etc.
We should attempt to default to linearly readable logs in the case of linear progress (e.g. a linear series of calls in the CLI should display linearly in the progress). We should aim to show only the interesting information, no more and no less. More complex parallel pipelines are out of scope; there is no way to display these neatly in a log format.

jedevc · 2024-05-03T15:28:44Z

As a sneak preview of what I'm building towards at the moment:

gerhard · 2024-05-03T16:24:01Z

Looking good @jedevc 💪

jedevc · 2024-05-07T16:30:54Z

Okay, continuing to make progress here, did some refactoring, but also started reworking the output a bit.

One of the main problems with plain progress is that we lose a lot of context for when anything appears. This was always kind of a problem, but it's now really quite annoying. So ideally, what we should do is try and provide some amount of context for each vertex.

One idea I've been playing around with is to show each vertex with its parent call (if there is one):

The merging between parts isn't quite right, but the grouping does seem to help improve readability. Unlike the pretty progress, it's not really doable to nest these indefinitely, since this will really clog up the output - but showing the most recent parent call does really help to understand where something comes from.
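The grouping idea above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Span` type, its fields, `renderWithParent`, and the bracketed header format are all hypothetical. It prints each vertex under a header for its most recent parent call only, and re-prints that header whenever the output switches back to a different parent, rather than nesting indefinitely.

```go
package main

import (
	"fmt"
	"strings"
)

// Span is a hypothetical, simplified stand-in for a progress vertex.
type Span struct {
	ID       string
	ParentID string // empty when there is no parent call
	Name     string
}

// renderWithParent prints each span under its most recent parent call.
// A parent header is only emitted when the context changes, so consecutive
// children of the same call stay grouped together.
func renderWithParent(spans []Span) string {
	byID := make(map[string]Span, len(spans))
	for _, s := range spans {
		byID[s.ID] = s
	}

	var out strings.Builder
	current := "" // ID of the call whose context was printed last
	for _, s := range spans {
		parent, ok := byID[s.ParentID]
		if s.ParentID == "" || !ok {
			// Top-level call: print flush left and make it the context.
			fmt.Fprintf(&out, "%s\n", s.Name)
			current = s.ID
			continue
		}
		if parent.ID != current {
			// Context switch: remind the reader where this vertex comes from.
			fmt.Fprintf(&out, "[%s]\n", parent.Name)
			current = parent.ID
		}
		fmt.Fprintf(&out, "  %s\n", s.Name)
	}
	return out.String()
}

func main() {
	fmt.Print(renderWithParent([]Span{
		{ID: "1", Name: "Container.from"},
		{ID: "2", ParentID: "1", Name: "resolve image"},
		{ID: "3", Name: "Container.withExec"},
		{ID: "4", ParentID: "3", Name: "exec echo hi"},
		{ID: "5", ParentID: "1", Name: "pull layers"},
	}))
}
```

Only one level of parent is shown on purpose: a deeper chain would need one header per ancestor, which is the clogging problem described above.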

jedevc · 2024-05-17T15:57:50Z

Okay, so a lot of work from this got spun out into separate PRs: #7347 #7371 #7385

I also got some more time to focus on this one for a bit yesterday and today - I refactored most of the progress, and so now we have a lot of nice shared code between the plain and the pretty output. Here's the current state:

There are a few more steps to take early next week before this is done:

Introduce a small delay buffer for context switching (similar to what streaming_output.go was doing) - this will help for grouping, as well as avoiding out-of-order issues caused by logs and spans received in interesting orders.
Add some vertical spacing for visual separation
Tidy up any specifically nasty hacks
Get some user feedback
Tidy commit history
Check tests

jpadams · 2024-05-21T16:25:41Z

@jedevc I like the way this renders a lot. It's likely not part of this PR, but I'd still like to see fewer (or none) of the HTTP HEAD/POST/PUT calls. I rebased this PR on top of main and rebuilt the engine, but I still get the HTTP output, so that's something to look into.

sipsma

Code LGTM! Obviously need to solve the test failure mystery, but otherwise good to merge

sipsma · 2024-05-22T16:08:29Z

dagql/idtui/frontend.go

+			}
+			depth--
+			indent(out, depth)
+			depth-- //nolint:ineffassign


wtf, silly linter... 😄

vito

First off, this looks great! I love that it keeps all the info and coloring from the TUI, and it doesn't feel too insanely verbose. The refactor is very sensible too.

Purely UI feedback, but we can iterate, it's already better than what we have:

I mentioned spacing previously, and I think keeping the top-level items separate but keeping the nested items flush is probably the best we can do. Besides maybe double-spacing at the top and single-spacing at the inner levels. Maybe worth trying, but dunno if it'd be better.

That being said it seems like sometimes calls "break out" into separate sections instead of being nested properly. Maybe a bug? (See the extra "Apko.wolfi" header below)

Could use a blank line here above "Container evaluated." to make it feel more final.

Since we're already omitting args on the "DONE" line maybe we should omit the return type too? Seems slightly noisy.

The spacing on the numbers feels a little awkward but I get that it keeps things aligned. Maybe the number could be right-aligned so it's flush with the :? Maybe 0-padded?

MAYBE the top-level inline args are indented one level too far? Not sure if intentional, but it means the children align a little differently than when it's not a multiline code block.

That's all I got for now! Again nothing blocking.
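The two numbering options raised above (right-aligned versus 0-padded) can be compared with a small sketch. The helper names and the `"N: "` prefix shape are assumptions for illustration, not the format used in this PR; the only real API involved is the standard `fmt` width syntax.

```go
package main

import "fmt"

// rightAligned pads the number with spaces on the left, so the digits end
// flush against the colon.
func rightAligned(n, width int) string {
	return fmt.Sprintf("%*d: ", width, n)
}

// zeroPadded pads the number with leading zeros to the same width.
func zeroPadded(n, width int) string {
	return fmt.Sprintf("%0*d: ", width, n)
}

func main() {
	for _, n := range []int{7, 12} {
		fmt.Printf("%q %q\n", rightAligned(n, 2), zeroPadded(n, 2))
	}
}
```

Either option needs the maximum width up front, which a streaming log does not know, so a fixed width (or growing it as numbers get larger) would have to be chosen.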

vito · 2024-05-22T16:36:26Z

cmd/dump-id/main.go

@@ -10,6 +10,8 @@ import (
 )

 func main() {
+	// XXX: check that this still works!


Not a big deal if it doesn't, it's already a bit nerfed from TUI refactoring. It used to give you a deep dump of the ID, but now it abbreviates into things like Container.withExec which isn't as useful.

vito · 2024-05-22T16:40:36Z

dagql/idtui/frontend.go

-	}
-	return view
+	// ConnectedToEngine is called when the CLI connects to an engine.
+	// TODO: remove this


Mostly curious, any particular reason to remove it? The thinking was that a plain UI would just print this, but a pretty TUI might show it somewhere onscreen

It gets refactored away in https://github.com/dagger/dagger/pull/7385/files#diff-375c41f6a94e797edee687b8fa151865e2192d8aa9088acaffa174d0de2c3578R284

We can just use slog here (at least for now).

vito · 2024-05-22T16:40:53Z

dagql/idtui/frontend.go

-	case tea.WindowSizeMsg:
-		fe.SetWindowSize(msg)
-		return fe, nil
+type Frontend interface {



jedevc · 2024-05-23T08:31:43Z

Rebased onto main after #7385 was merged.


These now strike a balance between the old and the new plain progress, while trying to take as much advantage of our new OTEL tooling as possible. These new logs use the span database, sharing the logic with the pretty progress as well (but handling logs entirely differently, we don't need/want fancy vterm handling for these). Signed-off-by: Justin Chadwell <me@jedevc.com>

We don't want to display these at all for the new TUI. Signed-off-by: Justin Chadwell <me@jedevc.com>

This got missed somewhere during a rebase. Signed-off-by: Justin Chadwell <me@jedevc.com>

jedevc · 2024-05-23T16:53:26Z

Right. I think I have a handle on what's going on. Compare the failing check on this PR here and the passing check on my work-in-progress PR here.

There's one key difference here: 100aaf2 (#7450)

Here's my best guess at what's going on: somewhere, there's an unexpected layer of output buffering that is preventing data from being flushed from the output stream. This seems to manifest as blocking something critical in a Python test. If we just become more chatty in our logs... we flush more data.

Alongside the hard evidence of the different PRs:

This explains why this PR is a "regression" - the new plain progress is a lot less chatty, so we might flush less.
This would explain why enabling --debug "fixes" the issue - we re-enable a ton of chatty output, so we would flush more.

From the python subprocess side, we do set bufsize=0 explicitly - this should disable buffering. I wonder if there's still maybe something to pull on here though, because we don't see this in any other language SDK.
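The hypothesis above can be illustrated in isolation. The sketch below is not a reproduction of the failing test and says nothing about where such a buffer might actually sit; `delivered` and the sizes are made up for the example. It shows the general mechanism with Go's `bufio.Writer`: when nothing calls `Flush`, a quiet producer's output can sit in the buffer indefinitely, while a chatty producer fills the buffer and forces data through.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// delivered reports how many bytes reach the underlying sink when `writes`
// messages of `size` bytes each pass through a buffered writer that is
// never explicitly flushed.
func delivered(bufSize, writes, size int) int {
	var sink bytes.Buffer
	w := bufio.NewWriterSize(&sink, bufSize)
	msg := strings.Repeat("x", size)
	for i := 0; i < writes; i++ {
		// Errors are ignored: writes to a bytes.Buffer do not fail.
		w.WriteString(msg)
	}
	// Deliberately no w.Flush(): this models the "unexpected layer".
	return sink.Len()
}

func main() {
	quiet := delivered(64, 2, 10)   // 20 bytes total, below the buffer size
	chatty := delivered(64, 20, 10) // 200 bytes total, overflows the buffer
	fmt.Println("quiet producer delivered:", quiet)
	fmt.Println("chatty producer delivered:", chatty)
}
```

If something like this is in play, the quiet case would leave a reader on the other side waiting for bytes that were written but never delivered, which matches the observation that `--debug` makes the problem go away.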

helderco · 2024-05-23T17:07:34Z

> From the python subprocess side, we do set bufsize=0 explicitly - this should disable buffering. I wonder if there's still maybe something to pull on here though, because we don't see this in any other language SDK.

Interesting! Yeah, the runtime also disables it for modules:

dagger/sdk/python/runtime/main.go

Line 160 in 643aa61

WithEnvVariable("PYTHONUNBUFFERED", "1").

But CI doesn't:

dagger/ci/sdk_python.go

Lines 181 to 183 in 643aa61

    
	base := dag.Container().
		From(fmt.Sprintf("python:%s-slim", version)).
		WithEnvVariable("PIPX_BIN_DIR", "/usr/local/bin").

It's an easy change to put there.


jedevc · 2024-05-23T17:21:11Z

Hm, that doesn't seem to have worked.

Something weird here is also that the extra chattiness I added was for stderr - while I think we do a lot of our essential session communication over stdout. Still not quite sure what this could be though.

jedevc force-pushed the plain-progress branch from 2aff510 to ac406cd Compare May 7, 2024 15:28

jedevc mentioned this pull request May 9, 2024

fix: attach cli args to initialize parent #7347

Merged

jedevc force-pushed the plain-progress branch from e64261d to c21fadc Compare May 13, 2024 15:32

jedevc mentioned this pull request May 14, 2024

feat: mark [internal] buildkit prefixes as internal #7371

Merged

jedevc force-pushed the plain-progress branch from 70c95c5 to 842c1d7 Compare May 14, 2024 16:08

jedevc mentioned this pull request May 15, 2024

🐞 Dagger CLI v0.11.4 deadlocks when running dagger call --source=.:default test all #7387

Open

4 tasks

jedevc force-pushed the plain-progress branch 3 times, most recently from 7dab49c to d3e1667 Compare May 16, 2024 17:11

jedevc modified the milestones: v0.12.0, v0.11.5 May 16, 2024

jedevc force-pushed the plain-progress branch 2 times, most recently from 598b59d to 869181f Compare May 17, 2024 15:50

jedevc force-pushed the plain-progress branch 4 times, most recently from e22fcc9 to c0a4166 Compare May 20, 2024 17:23

jedevc marked this pull request as ready for review May 20, 2024 17:25

jedevc requested review from vito, helderco and sipsma May 20, 2024 17:25

jedevc force-pushed the plain-progress branch from c0a4166 to ee18d0f Compare May 21, 2024 13:28

jpadams force-pushed the plain-progress branch from ee18d0f to aa355c2 Compare May 21, 2024 23:00

jedevc force-pushed the plain-progress branch from aa355c2 to aadc351 Compare May 22, 2024 11:44

sipsma approved these changes May 22, 2024


jedevc force-pushed the plain-progress branch from 2742fbc to 6cf5631 Compare May 22, 2024 16:28

vito approved these changes May 22, 2024


jedevc added 3 commits May 23, 2024 09:09

chore: refactor progress type parsing

aebcdf6

Signed-off-by: Justin Chadwell <me@jedevc.com>

chore: refactor out frontend opts

35994db

Signed-off-by: Justin Chadwell <me@jedevc.com>

chore: convert Frontend into interface

af0d676

Signed-off-by: Justin Chadwell <me@jedevc.com>

jedevc force-pushed the plain-progress branch from 6cf5631 to 4d0f83e Compare May 23, 2024 08:31

jedevc added 3 commits May 23, 2024 09:36

chore: split out Frontend into pretty and plain

f62f7c1

Signed-off-by: Justin Chadwell <me@jedevc.com>

chore: split out log collection from db

3c71799

Signed-off-by: Justin Chadwell <me@jedevc.com>

chore: refactor formatting and rendering out of frontend

7614d8a

Signed-off-by: Justin Chadwell <me@jedevc.com>

jedevc force-pushed the plain-progress branch 11 times, most recently from 2aaaccf to acb5e49 Compare May 23, 2024 13:29

jedevc added 2 commits May 23, 2024 14:30

fix: restore old engine progress callbacks

c2f2cec

We don't want to display these at all for the new TUI. Signed-off-by: Justin Chadwell <me@jedevc.com>

jedevc mentioned this pull request May 23, 2024

DNM: try and debug failing plain progress + python #7450

Draft

fix: ensure we keep setting log profile

7353924

This got missed somewhere during a rebase. Signed-off-by: Justin Chadwell <me@jedevc.com>

hack: set PYTHONUNBUFFERED during python test run

f41b707

Signed-off-by: Justin Chadwell <me@jedevc.com>
