Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reformatting baseline XML causes baselines to fail #3316

Closed
colintheshots opened this issue Dec 23, 2020 · 20 comments
Closed

Reformatting baseline XML causes baselines to fail #3316

colintheshots opened this issue Dec 23, 2020 · 20 comments

Comments

@colintheshots
Copy link

Expected Behavior

XML should be mostly whitespace-agnostic. If the IDE reformats the XML file to pretty-print it, the baselines should still function.

Observed Behavior

Every baseline which gets pretty-printed onto multiple lines fails to function.

Steps to Reproduce

Generate a baseline for a huge project with ./gradlew detektBaseline
Allow the IDE or its git commit screen to reformat code.
Attempt to run ./gradlew detekt
The Gradle build fails with code smells wherever reformatting occurred.

Context

I was attempting to integrate Detekt into our Android Studio project and couldn't figure out why the detekt baseline kept breaking after I checked in code. It was that the Git Commit dialog inside Android Studio had Reformat Code checked. I was pulling my hair out as it worked each time before checking in.

Your Environment

  • Version of detekt used: 1.15.0
  • Version of Gradle used (if applicable): 6.6
  • Operating System and version: Mac Big Sur 11.0.1
  • Link to your project (if it's a public repository):
@schalkms
Copy link
Member

Thanks for experimenting and reporting this bug to this repo.
Ughh, this bug sounds bad. May I ask if you could provide the two different files for debugging purposes?
For me this seems to be an issue with whitespace characters.
Anyhow, this shouldn't happen as you mentioned.

@BraisGabin
Copy link
Member

I reproduced this issue in my end. It happens when you format this:

<ID>something</ID>

to this

<ID>
  something
</ID>

But, are they really the same xml? We use javax.xml to read our xmls so it doesn't seem likely that we have a parser problem. I can't find it in the standard or any trustfull source but I think that your formatter is changing the xml in a wrong way. https://stackoverflow.com/a/10923392/842697

In android, for example, this changes aren't important but that seems an Android choice. They decided to add a trim after reading the xml value.

We could add that trim too but I don't think that we should.

@schalkms
Copy link
Member

@BraisGabin okay, that's a different story (if true). Why is Android Studio then adding line breaks? Does it consider the max line length whilst reformatting the code?

@BraisGabin
Copy link
Member

I just checked and both IntelliJ and AS reformat long lines of xmls... maybe we should configure something in our parser so it's a bit smarter?

@schalkms schalkms removed the bug label Dec 26, 2020
@schalkms
Copy link
Member

maybe we should configure something in our parser so it's a bit smarter?

You would need to place a .trim() for each read XML innerText between two elements.

@BraisGabin
Copy link
Member

No really. This is an issue too:

<ID>fun foo</ID>

and

<ID>
  fun
  foo
</ID>

And this is kind of format is done by IntelliJ and AS too.

@schalkms
Copy link
Member

Okay, that's weird too. I'm not sure, whether handling this edge case should be done inside the core baseline parser. The mentioned case could happen for function signatures in the baseline xml file. Let's consider the following code. This could easily be reformatted.

<ID>ComplexMethod:DetektExtension.kt$DetektExtension$fun convertToArguments(): MutableList&lt;String&gt;</ID>

To my knowledge this is the first report about having such an issue with the baseline xml in detekt.

@BraisGabin
Copy link
Member

Yes. It's really weird. For that reason I was thinking if the formater could have a "flag" or something so we don't need to implement the "unformat" code.

@schalkms
Copy link
Member

For that reason I was thinking if the formater could have a "flag" or something so we don't need to implement the "unformat" code.

Just to make sure that I could correctly follow. With formatter you mean the IDEA one, right?

@BraisGabin
Copy link
Member

My bad! I didn't use the correct word there. I mean: "For that reason I was thinking if javax.xml could have a "flag" or something so we don't need to implement the "unformat" code."

@schalkms
Copy link
Member

schalkms commented Dec 26, 2020

"For that reason I was thinking if javax.xml could have a "flag" or something so we don't need to implement the "unformat" code."

Thanks for the update. This clears out some confusion.
That seems to be possible with setIgnoringElementContentWhitespace(), even though I haven't tried it yet.

@BraisGabin
Copy link
Member

I'll try it and create pr if it works. If not, I'll update here with the new information.

@github-actions
Copy link

github-actions bot commented Nov 7, 2021

This issue is stale because it has been open 90 days with no activity. Please comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Nov 7, 2021
@cortinico
Copy link
Member

I'll try it and create pr if it works. If not, I'll update here with the new information.

What's the status here? Should we add a "help wanted" to this?

@BraisGabin
Copy link
Member

BraisGabin commented Nov 7, 2021

Yes. I did nothing about this at the end.

@abhimaandunzo
Copy link

yeah this is irritating

@BraisGabin
Copy link
Member

I implemented #4335. I know that it doesn't fix the core issue here but it's better than actual behaviour.

It doesn't fix the core issue because I'm not sure that this is a detekt's issue.

<ID>fun foo</ID>

and

<ID>
  fun
  foo
</ID>

are sematically different. or at least that's what I understand reading the xml specification. If someone can show me something in the specification of xml that says the opposite we could try to fix it. But I think that this is not an issue in detekt. It is an issue in IntelliJ breaking the xml sematics. So I would vote to close this issue as soon as #4335 is merged.

I think that xml is not the best format to store the baselines and we should consider to change it for 2.0. But that's another topic.

@mtrakal
Copy link

mtrakal commented Aug 22, 2022

Android studio cause issue (as you wrote before) when I have set line breaks and refactor while commit:

After call gradlew detektBaseline:
image

After reformat code by Android studio:
image

Changes of course break Detekt. It would be nice when we can force not to format detekt baseline file.

@BraisGabin
Copy link
Member

Clearly this is annoying. But this is an Android Studio's issue. They are changing the meanning of the xml. They are breaking the standard. We tried to fix this once but, for detekt, an space before </ID> has meanning so we end up breaking things.

I'm going to close this as "wont-fix" because it's AS fault. But I think that for detekt 2.0 we should stop ussing xml for this. A plain file with one line per issue should be more that enough.

@BraisGabin BraisGabin closed this as not planned Won't fix, can't repro, duplicate, stale Aug 22, 2022
@cortinico
Copy link
Member

Clearly this is annoying. But this is an Android Studio's issue

@mtrakal I would suggest opening an issue on AS issue tracker as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants