Spaces removed when decoding strings with escaped elements #129

Open
mxcl opened this issue Sep 10, 2019 · 7 comments

@mxcl

mxcl commented Sep 10, 2019

I am getting this kind of XML from an ancient SOAP server:

<aResponse>&lt;uesb2b:response xmlns:uesb2b=&quot;http://services.b2b.ues.ut.uhg.com/types/plans/&quot; xmlns=&quot;http://services.b2b.ues.ut.uhg.com/types/plans/&quot;&gt;&#xd;
  &lt;uesb2b:st cd=&quot;GA&quot; /&gt;&#xd;
  &lt;uesb2b:obligId val=&quot;01&quot; /&gt;&#xd;
  &lt;uesb2b:shrArrangementId val=&quot;00&quot; /&gt;&#xd;
  &lt;uesb2b:busInsType val=&quot;CG&quot; /&gt;&#xd;
  &lt;uesb2b:metalPlans typ=&quot;Array&quot;
…

When I decode with a struct like:

struct Response: Decodable {
    let aResponse: String
}

Using:

let decoder = XMLDecoder()
decoder.shouldProcessNamespaces = true
let rsp = try decoder.decode(Response.self, from: data)
print(rsp.aResponse)

I get:

<uesb2b:response xmlns:uesb2b="http://services.b2b.ues.ut.uhg.com/types/plans/"xmlns="http://services.b2b.ues.ut.uhg.com/types/plans/">
<uesb2b:st cd="GA"/>
<uesb2b:obligId val="01"/>
<uesb2b:shrArrangementId val="00"/>
<uesb2b:busInsType val="CG"/>
<uesb2b:metalPlans typ="Array"arrayTyp="metalPlan[110]">
<uesb2b:metalPlan cd="AUWJ"rx="286A"level="S"min="0"max="0"/>
<uesb2b:metalPlan cd="AUWK"rx="286A"level="G"min="0"max="0"/><uesb2b:metalPlan …

(Newlines added for legibility.) You can see that the spaces on either side of the &quot;s in the attribute-heavy nodes have been removed (see the last two lines).

This makes the inner XML invalid.

Happy to fix the bug, just point me at the right code, thanks.

@mxcl
Author

mxcl commented Sep 30, 2019

Ping on a pointer. This is a real problem for us.

@MaxDesiatov
Collaborator

Hi @mxcl, sorry for the delay. Does setting trimValueWhitespaces to false on the XMLDecoder instance resolve the issue for you?

MaxDesiatov added the "question" label (Further information is requested) on Oct 5, 2019
@MaxDesiatov
Collaborator

I've added a test for this in #137, but it didn't require any changes in the library code, just disabling the trimValueWhitespaces flag. Please let me know if anything's missing.
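
As a minimal sketch of the workaround described above, the decoder from the original report can be configured as follows. The <Response> wrapper element and the shortened payload are assumptions made for illustration, not the reporter's actual data, and the decoding part only compiles in when the XMLCoder package is available to the build.

```swift
import Foundation
#if canImport(XMLCoder)
import XMLCoder
#endif

struct Response: Decodable {
    let aResponse: String
}

// Hypothetical, shortened stand-in for the SOAP payload: escaped inner XML
// wrapped in an assumed <Response> root element.
let xml = """
<Response><aResponse>&lt;st cd=&quot;GA&quot; rx=&quot;286A&quot; /&gt;</aResponse></Response>
"""
let data = Data(xml.utf8)

#if canImport(XMLCoder)
let decoder = XMLDecoder()
decoder.shouldProcessNamespaces = true
// Disable trimming so whitespace next to escaped entities is kept.
decoder.trimValueWhitespaces = false

do {
    let rsp = try decoder.decode(Response.self, from: data)
    print(rsp.aResponse)
} catch {
    print("Decoding failed: \(error)")
}
#endif
```

Note that with trimming disabled, any leading or trailing whitespace in element content is also kept, so callers may need to trim the decoded string themselves.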

MaxDesiatov added a commit that referenced this issue Oct 6, 2019
Add a test case previously reported in #129
@mxcl
Author

mxcl commented Oct 7, 2019

I'll check. I apologize that I didn't see this property before.

@mxcl
Author

mxcl commented Oct 7, 2019

Can confirm this fixes it, thanks.

Not sure if the feature is working as intended, since it was trimming from the middle of my strings.

@MaxDesiatov
Collaborator

MaxDesiatov commented Oct 7, 2019

We trim every string chunk that Foundation's XMLParser passes us in its delegate function, which is usually the whole content of an XML element. Oddly enough, XMLParser splits strings into separate chunks at every escaped entity (such as &quot;), and I agree this is unexpected behavior. I'll keep the issue open for a bit until I make it accumulate those chunks and trim the whole element content as expected.
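
The difference between trimming each chunk and trimming the accumulated content can be illustrated with plain strings. This is a sketch, not XMLCoder's implementation: the chunk boundaries below are hypothetical (the real ones depend on how XMLParser splits the text), and trimEachChunk and trimWholeContent are names invented for this example.

```swift
import Foundation

// Hypothetical chunks, roughly how a parser might split
// `cd=&quot;GA&quot; rx=&quot;286A&quot;` at each escaped entity.
let chunks = ["cd=", "\"", "GA", "\"", " rx=", "\"", "286A", "\""]

// Trimming every chunk separately drops the space in front of `rx`,
// because that space sits at the edge of its chunk.
func trimEachChunk(_ chunks: [String]) -> String {
    chunks
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .joined()
}

// Accumulating first and trimming once only touches the outer edges
// of the element content, so interior spaces survive.
func trimWholeContent(_ chunks: [String]) -> String {
    chunks.joined().trimmingCharacters(in: .whitespacesAndNewlines)
}

print(trimEachChunk(chunks))    // cd="GA"rx="286A"
print(trimWholeContent(chunks)) // cd="GA" rx="286A"
```

The first result reproduces the symptom from the report (attributes run together), while the second keeps the separating space.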

MaxDesiatov added the "bug" label (Something isn't working) and removed the "question" label (Further information is requested) on Oct 7, 2019
MaxDesiatov assigned MaxDesiatov and unassigned mxcl on Oct 7, 2019
bwetherfield pushed a commit to bwetherfield/XMLCoder that referenced this issue Oct 16, 2019
Add a test case previously reported in CoreOffice#129
@ethan-kusters

Looks like this is already being addressed, but I just noticed the same issue with single quote values as well.

For example, the value:

<title>What does ‘being verified’ actually mean?</title>

was returned as:

What does'being verified' actually mean?

Setting trimValueWhitespaces to false fixed the problem for me as well. Thank you so much for all your work here!

arjungupta0107 pushed a commit to salido/XMLCoder that referenced this issue Jun 26, 2020
Add a test case previously reported in CoreOffice#129