build some kind of log comparer #1270

Open
SethTisue opened this issue Oct 30, 2020 · 6 comments

@SethTisue
Member

we want something that allows us to assess the impact of a change in scala/scala that changes what warnings are emitted without actually making any projects fail

there is a short term desire to assess the impact of @dwijnand's big exhaustivity PR (scala/scala#9140) before we release 2.13.4. something janky for one-time use would be okay

but then also we'd like to have something that we can use over and over again

@SethTisue SethTisue self-assigned this Oct 30, 2020
@SethTisue
Member Author

an implementation challenge here is that the community build is prone to intermittent failures. when we do a community build run that rebuilds everything, if we're lucky the whole thing might run to the end, but it's more typical that at least one repo fails, which then also usually means that some downstream repos don't run at all. so we have to do at least one followup run, and then reviewing the logs for every project involves combining logs from multiple runs

I'm imagining a script that we can give multiple run numbers; it would pull down all the split logs from those runs and prefer the later logs over the earlier ones, unless the later log looks like this:

[airframe] Found cached project build, uuid c65436708c310dc3868207781240be26dcd403bd

in which case obviously it's the earlier log that's useful
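
A minimal sketch of that selection rule, assuming the per-project logs have already been downloaded into one `{project: log_text}` dict per run. The names `pick_logs` and `CACHED` are hypothetical, and the regex is inferred only from the single example line above, so it may need adjusting against real logs.

```python
import re

# Assumption: a log is "cache-only" if it contains a line shaped like the
# example above. The exact message format is inferred from that one line.
CACHED = re.compile(
    r"^\[[^\]]+\] Found cached project build, uuid [0-9a-f]+", re.MULTILINE
)

def pick_logs(runs):
    """Merge per-project logs from several runs.

    runs: list of {project: log_text} dicts, ordered earliest to latest.
    Returns {project: log_text}, preferring the latest log unless that
    log only reports a cached build and an earlier log is available.
    """
    chosen = {}
    for logs in runs:  # later runs overwrite earlier ones by default
        for project, text in logs.items():
            if project in chosen and CACHED.search(text):
                continue  # cache hit: the earlier log is the useful one
            chosen[project] = text
    return chosen
```

With two runs where `airframe` was rebuilt in the first and served from cache in the second, `pick_logs` would keep the first run's `airframe` log while still taking the second run's log for any project that failed the first time.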

note that I already built the log splitter a while back, so at e.g. https://scala-ci.typesafe.com/job/scala-2.13.x-jdk15-integrate-community-build/lastSuccessfulBuild/artifact/logs/ we have per-project logs, so that's good

@SethTisue
Member Author

just straight-up diffing the logs is likely to produce a lot of noise, because there will be all sorts of uninteresting differences: different ordering due to timing, different time stamps and timings, different Scala SHAs for the different runs turning up in the output, and so on and so forth

so a general purpose log comparer might be tricky to build. it isn't obvious to me without trying it how bad it might be. like, maybe there will be a ton of noise, but most of the noise will come from the same handful of causes? if so, we might get usable results just by 1) sorting each log to eliminate order differences, 2) having some regexes to identify lines we don't care about at all, and 3) if necessary, having some further regexes for removing parts of lines that aren't important while still allowing us to see diffs in the rest of the line
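
The three steps could be sketched roughly as follows. Every pattern in `DROP` and `SCRUB` is an illustrative guess rather than something taken from real community build output, and `normalize` and `compare` are hypothetical names.

```python
import difflib
import re

# Step 2: lines we don't care about at all (illustrative guess).
DROP = [re.compile(r"^\[info\] (Resolving|Loading) ")]

# Step 3: parts of lines to blank out while keeping the rest (illustrative guesses).
SCRUB = [
    (re.compile(r"\b[0-9a-f]{40}\b"), "<sha>"),              # commit SHAs
    (re.compile(r"\d{2}:\d{2}:\d{2}"), "<time>"),            # clock times
    (re.compile(r"\b\d+(\.\d+)? ?(ms|s)\b"), "<duration>"),  # elapsed timings
]

def normalize(log_text):
    """Drop uninteresting lines, scrub volatile fragments, then sort (step 1)."""
    lines = []
    for line in log_text.splitlines():
        if any(p.search(line) for p in DROP):
            continue
        for pattern, placeholder in SCRUB:
            line = pattern.sub(placeholder, line)
        lines.append(line)
    return sorted(lines)

def compare(before, after):
    """Return unified-diff lines between two normalized logs (empty if equal)."""
    return list(
        difflib.unified_diff(
            normalize(before), normalize(after), "before", "after", lineterm=""
        )
    )
```

Sorting discards the context around each line, so this only suits the case described here, where the goal is spotting lines that appear or disappear rather than reading the log in order.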

the easiest thing is if we know what we're looking for. e.g. for the exhaustivity PR, we're mainly looking for the presence or absence of certain specific identifiable warnings, so we can just zero in on the lines where a certain regex or handful of regexes match. that might be enough for 2.13.4 and then we could make it more general purpose later?
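
When the warnings of interest are known in advance, a narrower sketch is to count regex matches in each log and report only the counts that changed. The pattern below is an assumption about the warning wording and should be replaced with text copied from actual compiler output; `PATTERNS`, `count_matches`, and `changed_counts` are hypothetical names.

```python
import re
from collections import Counter

# Assumed wording; copy the real message text from compiler output before use.
PATTERNS = {
    "not_exhaustive": re.compile(r"match may not be exhaustive"),
}

def count_matches(log_text, patterns=PATTERNS):
    """Count how many lines match each named pattern."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in patterns.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

def changed_counts(before, after, patterns=PATTERNS):
    """Return {pattern name: (count before, count after)} where the counts differ."""
    b = count_matches(before, patterns)
    a = count_matches(after, patterns)
    return {name: (b[name], a[name]) for name in patterns if b[name] != a[name]}
```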

@dwijnand
Member

Interestingly (or not) my instincts are again to go upstream in dbuild and build something at that level, rather than post-process the unstructured jumble of differently-ordered log lines. For example, the "Found cached project build" path could be told to cache and redisplay the warnings that were displayed when the sources were actually compiled.

For the exhaustivity warnings, yeah, something niche like taking all the lines with "warning:", still separated per project, and diffing them against each other would give us something to study.
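
One possible shape for that per-project comparison, assuming each run is available as a `{project: log_text}` dict. It treats the warning lines as a multiset so that ordering differences between runs don't matter; `warning_lines` and `diff_warnings` are hypothetical names.

```python
from collections import Counter

def warning_lines(log_text):
    """Multiset of lines containing "warning:", with surrounding whitespace stripped."""
    return Counter(
        line.strip() for line in log_text.splitlines() if "warning:" in line
    )

def diff_warnings(before_logs, after_logs):
    """Compare warnings per project.

    before_logs / after_logs: {project: log_text}.
    Returns {project: (removed, added)} for projects present in both runs
    whose warning lines differ.
    """
    report = {}
    for project in sorted(before_logs.keys() & after_logs.keys()):
        before = warning_lines(before_logs[project])
        after = warning_lines(after_logs[project])
        removed, added = before - after, after - before
        if removed or added:
            report[project] = (
                sorted(removed.elements()),
                sorted(added.elements()),
            )
    return report
```

Lines that embed timestamps or SHAs would show up as both removed and added, so in practice the lines would probably need some scrubbing before being compared.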

What I'm looking for is true false positives and too many pseudo-false positives. By the latter I mean things like: how common is it for projects to pattern match on value classes or enumerations, which aren't understood by the exhaustivity analysis? And are there other similar cases?

@SethTisue
Member Author

returning this to the back burner — maybe it'll happen the next time we are feeling the pain of lacking it

@lrytz
Member

lrytz commented Aug 26, 2021

I did it manually at some point: scala/scala#9672 (comment)

@SethTisue
Member Author

just as a reminder to myself and whoever else, Jenkins offers an "all files in zip" link on the "Build Artifacts" page of a run, which is convenient for downloading and grepping
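
For grepping such a download without unpacking it, a small sketch using only the Python standard library might look like this. It assumes the zip has already been saved locally and that the logs are text that decodes as UTF-8 (undecodable bytes are replaced); `grep_zip` is a hypothetical name.

```python
import re
import zipfile

def grep_zip(zip_path, pattern):
    """Yield (member name, line number, line) for each matching line in the archive."""
    regex = re.compile(pattern)
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            # Assumption: log files are UTF-8 text; bad bytes are replaced.
            text = archive.read(name).decode("utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    yield name, number, line
```

For example, `for name, number, line in grep_zip("archive.zip", r"warning:")` would walk every matching line across all per-project logs, where `archive.zip` stands for whatever filename the download was saved under.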
