Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Decoding of Empty String #145

Merged
merged 23 commits into from Nov 13, 2019

Conversation

bwetherfield
Copy link
Collaborator

Prior to this fix, we found that decoding an empty string (represented as a null element), <string/> was confusing the decoder. I can't seem to find any issues logged. Perhaps @jsbean has an idea where these might be hiding...?

I'm also building up to a PR that addresses the array-of-empty-elements issue, but I am trying to disentangle the changes we merged into a separate branch!

@bwetherfield bwetherfield changed the title Fixes Decoding of Empty String [WIP] Fixes Decoding of Empty String Oct 16, 2019
@bwetherfield bwetherfield changed the title [WIP] Fixes Decoding of Empty String Fixes Decoding of Empty String Nov 2, 2019
@bwetherfield
Copy link
Collaborator Author

This one is now ready for your review, @MaxDesiatov ! Thanks


XCTAssertThrowsError(try decoder.decode(Container.self, from: xmlData))
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test pass? If so, I'd prefer to keep it in the test suite

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've played around with this, and it does work if the element is renamed to, say, notValue, but I'm encountering an ambiguity with the treatment of the value intrinsic. Would you be content with this if we pushed through the proposed change from value to xmlValue as the labeling of the intrinsic?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or even limiting the intrinsic to "")

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"" and "xmlValue" are both fine, "xmlValue" (or everything that starts with "xml" for that matter) is not a valid element or attribute, at least according to my understanding of the XML standard:

Names beginning with the string "xml", or with any string which would match (('X'|'x') ('M'|'m') ('L'|'l')), are reserved for standardization in this or future versions of this specification.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, isn't decoding the value of <container /> element to an empty string is the expected behaviour here? I think I also relied on this in CoreXLSX in some of its behavior if I remember correctly, but this needs to be confirmed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might seem a bit self-centered, but if it doesn't break CoreXLSX, I'm probably fine with the change, hope MusicXML test suite would pass as well. Just wondering how many other apps and libraries we could break with this change and what would be possible workarounds in that case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hoping we can perhaps circumvent these concerns by changing the use of the "value" coding key, per my latest PR! It does mean downstream users would have to introduce = "" additions, but since this change is probably on the way anyway, I wonder if now is as good a time as any...?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwetherfield Will this test be added back?

Copy link
Collaborator

@MaxDesiatov MaxDesiatov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good, thanks for discovering this edge case @bwetherfield! I'm a bit worried about the removed test function though, hope it's easy to add it back and then it's ready to be merged 👍

@MaxDesiatov MaxDesiatov changed the title Fixes Decoding of Empty String Fix Decoding of Empty String Nov 2, 2019
MaxDesiatov pushed a commit that referenced this pull request Nov 11, 2019
## Overview
Restricts the special case usage of `"value"` OR `""` as a coding key, as implemented in #73 to just `""`. `"value"` resumes functioning as a normal coding key. 

## Example
As noted in #145, `"value"` is starting to collide awkwardly in one or two cases where it may be _implicit_ per the special case usage 
```swift
<value-containing>I am implicitly labeled</value-containing>
``` 
or _explicit_, as in 
```swift
<value-containing>
    <value>I am explicitly labeled</value>
</value-containing>
``` 
With the change proposed, the latter example will be unambiguously captured by
```swift
struct ValueContaining: Codable {
    let value: String
}
```
while the former is equivalent to
```swift
struct ValueContaining: Codable {
    let value: String

    enum CodingKeys: String, CodingKey {
        case value = ""
    }
}
```
## Source Compatability

This is a **breaking change** for downstream users of the special `"value"` coding key. All such uses should be replaced by `case value = ""` subject to this PR's inclusion.
@MaxDesiatov
Copy link
Collaborator

@bwetherfield I'm ready to review this again when conflicts are resolved 🙂

@bwetherfield
Copy link
Collaborator Author

Thanks, @MaxDesiatov ! I also need to add back the missing test :) Will keep you posted

@bwetherfield
Copy link
Collaborator Author

OK - should be ready for review! Thanks, @MaxDesiatov

Copy link
Collaborator

@MaxDesiatov MaxDesiatov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR diff still has the test removed, but I'm unsure what's the reason for that?

@bwetherfield
Copy link
Collaborator Author

My apologies, @MaxDesiatov - should be in now!

Copy link
Collaborator

@MaxDesiatov MaxDesiatov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwetherfield fantastic, thank you!

@MaxDesiatov MaxDesiatov merged commit 0eba4c6 into CoreOffice:master Nov 13, 2019
@MaxDesiatov MaxDesiatov added the enhancement New feature or request label Nov 13, 2019
MaxDesiatov pushed a commit that referenced this pull request Nov 17, 2019
## Overview

Fixes #123 and extends #145 to accommodate decoding of arrays of empty strings or mixed arrays of non-empty and empty strings.

## Example

We may now decode 
```xml
<container>
    <empty/>
    <empty/>
    <empty/>
</container>
```
into the following type
```swift
struct EmptyArray: Equatable, Codable {
    enum CodingKeys: String, CodingKey { case empties = "empty" }
    let empties: [Empty]
}
```
where 
```swift
struct Empty: Equatable, Codable { }
```
We can also decode a value of the following type
```swift
struct EmptyWrapper {
    let empty: Empty
}
```
from 
```xml
<wrapper>
    <empty/>
</wrapper>
```
Further, following from #145 we can decode arrays of empty strings
```xml
<container>
    <string-type/>
    <string-type>My neighbors are empty<string-type>
    <string-type/>
</container>
```
as
```swift
struct EmptyArray: Equatable, Codable {
    enum CodingKeys: String, CodingKey { case stringType = "string-type" }
    let stringType: [String]
}
```
and variants.
## Source Compatibility

In cases where we decode an array of type `[String?]`, an empty element is now decoded as `""` rather than `nil`, the justification being that `String` can itself take empty values. We use the convention that `nil` signifies the absence of an element (if 0 or 1 of the element are allowed) as in the updated [BreakfastTest](https://github.com/bwetherfield/XMLCoder/blob/0d20614e47df98d1a10174e992c585edf629c9b9/Tests/XMLCoderTests/BreakfastTest.swift) and in the following error-throwing [test case](https://github.com/MaxDesiatov/XMLCoder/blob/2855777ff868e8a4c1d944c7da0ddb49a8b3893e/Tests/XMLCoderTests/Minimal/NullTests.swift#L56-L68).

* Add nested choice unkeyed container decoding test
* Fix nested unkeyed container implementation for nested keyed box
* Clean up unwrapping syntax
* Treat corner case of empty string value intrinsic
* Remove xcodeproj junk
* Add some logging to assess where we're at
* Add cases for empty string as null element decoding
* Swiftformat
* Transform precondition to where clause in switch statement
* Remove print statements
* Add failing test for a nested array of empty-string value intrinsic elements
* Do a little cleanup of single keyed box passing around
* Refactor XMLKeyedDecodingContainer.decodeConcrete elements massaging
* Remove xcscheme junk
* Add fix for empty arrays, wrapped empties etc
* Clean up
* Revert singleKeyed dive change
* Accommodate singleKeyed reading options
* Alter Border Test
* Add test case that returns [nil] (exists a non-optional property)
* Eliminate possibly empty Int from Breakfast test
* Fix formatting
* Fix forcecast
* Fix formatting
* Update LinuxMain
* Fix tests such that null elements read as empty strings
* Fix linux main
* Add nested array of empty strings decoding in the explicit style
* Add mixed case empty and non-empty string cases
* Reinstate missing test
* Add test for decoding a null element into an optional type
arjungupta0107 pushed a commit to salido/XMLCoder that referenced this pull request Jun 26, 2020
## Overview
Restricts the special case usage of `"value"` OR `""` as a coding key, as implemented in CoreOffice#73 to just `""`. `"value"` resumes functioning as a normal coding key. 

## Example
As noted in CoreOffice#145, `"value"` is starting to collide awkwardly in one or two cases where it may be _implicit_ per the special case usage 
```swift
<value-containing>I am implicitly labeled</value-containing>
``` 
or _explicit_, as in 
```swift
<value-containing>
    <value>I am explicitly labeled</value>
</value-containing>
``` 
With the change proposed, the latter example will be unambiguously captured by
```swift
struct ValueContaining: Codable {
    let value: String
}
```
while the former is equivalent to
```swift
struct ValueContaining: Codable {
    let value: String

    enum CodingKeys: String, CodingKey {
        case value = ""
    }
}
```
## Source Compatability

This is a **breaking change** for downstream users of the special `"value"` coding key. All such uses should be replaced by `case value = ""` subject to this PR's inclusion.
arjungupta0107 pushed a commit to salido/XMLCoder that referenced this pull request Jun 26, 2020
Prior to this fix, we found that decoding an empty string (represented as a null element), `<string/>` was confusing the decoder.

* Treat corner case of empty string value intrinsic
* Remove xcodeproj junk
* Add some logging to assess where we're at
* Add cases for empty string as null element decoding
* Swiftformat
* Transform precondition to where clause in switch statement
* Remove print statements
* Remove nested syntax test
* Fix forcecast
(cherry picked from commit e6ec42a)
* Fix formatting
* Update LinuxMain
* Remove warnings
* Add fix for new empty string issue
* Add back missing missingTest - remove "value" special casing from implementation
* Fix formatting
* Recover missing test!
* Update linuxmain
arjungupta0107 pushed a commit to salido/XMLCoder that referenced this pull request Jun 26, 2020
## Overview

Fixes CoreOffice#123 and extends CoreOffice#145 to accommodate decoding of arrays of empty strings or mixed arrays of non-empty and empty strings.

## Example

We may now decode 
```xml
<container>
    <empty/>
    <empty/>
    <empty/>
</container>
```
into the following type
```swift
struct EmptyArray: Equatable, Codable {
    enum CodingKeys: String, CodingKey { case empties = "empty" }
    let empties: [Empty]
}
```
where 
```swift
struct Empty: Equatable, Codable { }
```
We can also decode a value of the following type
```swift
struct EmptyWrapper {
    let empty: Empty
}
```
from 
```xml
<wrapper>
    <empty/>
</wrapper>
```
Further, following from CoreOffice#145 we can decode arrays of empty strings
```xml
<container>
    <string-type/>
    <string-type>My neighbors are empty<string-type>
    <string-type/>
</container>
```
as
```swift
struct EmptyArray: Equatable, Codable {
    enum CodingKeys: String, CodingKey { case stringType = "string-type" }
    let stringType: [String]
}
```
and variants.
## Source Compatibility

In cases where we decode an array of type `[String?]`, an empty element is now decoded as `""` rather than `nil`, the justification being that `String` can itself take empty values. We use the convention that `nil` signifies the absence of an element (if 0 or 1 of the element are allowed) as in the updated [BreakfastTest](https://github.com/bwetherfield/XMLCoder/blob/0d20614e47df98d1a10174e992c585edf629c9b9/Tests/XMLCoderTests/BreakfastTest.swift) and in the following error-throwing [test case](https://github.com/MaxDesiatov/XMLCoder/blob/2855777ff868e8a4c1d944c7da0ddb49a8b3893e/Tests/XMLCoderTests/Minimal/NullTests.swift#L56-L68).

* Add nested choice unkeyed container decoding test
* Fix nested unkeyed container implementation for nested keyed box
* Clean up unwrapping syntax
* Treat corner case of empty string value intrinsic
* Remove xcodeproj junk
* Add some logging to assess where we're at
* Add cases for empty string as null element decoding
* Swiftformat
* Transform precondition to where clause in switch statement
* Remove print statements
* Add failing test for a nested array of empty-string value intrinsic elements
* Do a little cleanup of single keyed box passing around
* Refactor XMLKeyedDecodingContainer.decodeConcrete elements massaging
* Remove xcscheme junk
* Add fix for empty arrays, wrapped empties etc
* Clean up
* Revert singleKeyed dive change
* Accommodate singleKeyed reading options
* Alter Border Test
* Add test case that returns [nil] (exists a non-optional property)
* Eliminate possibly empty Int from Breakfast test
* Fix formatting
* Fix forcecast
* Fix formatting
* Update LinuxMain
* Fix tests such that null elements read as empty strings
* Fix linux main
* Add nested array of empty strings decoding in the explicit style
* Add mixed case empty and non-empty string cases
* Reinstate missing test
* Add test for decoding a null element into an optional type
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants