Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array of enums with associated values #25

Closed
jsbean opened this issue Dec 18, 2018 · 6 comments · Fixed by #119
Closed

Array of enums with associated values #25

jsbean opened this issue Dec 18, 2018 · 6 comments · Fixed by #119
Assignees

Comments

@jsbean
Copy link
Contributor

jsbean commented Dec 18, 2018

Thanks for taking up the maintenance on this project!

I am working with an XML schema which requires support for heterogeneous arrays of values within a set of specified types.

Consider the following enum which defines each of the types allowable:

enum AB {
    struct A: Decodable { let value: Int }
    struct B: Decodable { let value: String }
    case a(A)
    case b(B)   
}

Currently, the Decodable implementation is as follows (based on objc.io/codable-enums):

extension AB: Decodable {
    enum CodingKeys: String, CodingKey { case a, b }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        do {
            self = .a(try container.decode(A.self, forKey: .a))
        } catch {
            self = .b(try container.decode(B.self, forKey: .b))
        }
    }
}

A minimal example of the source XML could look something like this (though there could be arbitrary amounts of either type, in any order):

let xml = """
<?xml version="1.0" encoding="UTF-8"?>
<container>
    <a value="42"></a>
    <b value="forty-two"></b>
</container>
"""

Upon decoding it with the XMLDecoder:

let decoder = XMLDecoder()
let abs: [AB] = try! decoder.decode([AB].self, from: xml.data(using: .utf8)!)

We get the following:

// => [.a(AB.A(value: 42))]

The desired result should be:

// => [.a(AB.A(value: 42)), .b(AB.B(value: "forty-two"))]

I have tried many combinations of keyed, unkeyed, singleValue, and nested containers, yet haven't found a solution that treats the data as an Array rather than a Dictionary.

Let me know if I am holding this wrong (at either the Decodable or the XMLDecoder stages), or if this could be a current limitation of the XMLCoder implementation. I'd be happy to contribute, but if anyone notices something glaring here, this would be very helpful.

@MaxDesiatov
Copy link
Collaborator

MaxDesiatov commented Jan 12, 2019

Hi @jsbean, first of all thanks a lot for the report and sorry for the delay caused by our current focus on test coverage in XMLCoder. We've finally reached ~75% coverage and are ready to look into more bugfixes and disruptive changes in the codebase.

This indeed looks like a bug caused by XMLCoder wrapping the inner unkeyed box twice: first in a keyed box and then an unkeyed box on top of it. I'm currently looking into it more closely and will keep you updated.

screen shot 2019-01-12 at 11 52 43

MaxDesiatov pushed a commit that referenced this issue Jan 13, 2019
Unifies the representation of `T` where `T: Box` and makes it explicit where shared reference semantics is expected.

With this code performance is consistently 5-30% worse, could this be caused by copies of value types? Can this be improved when Swift 5.0 is released or whatever the version is that allows us to improve performance of value types? After reviewing related issues we'd prefer XMLCoder to be slightly slower, but with #12, #17 and #25 fixed than the opposite 😞 

* Added `SharedBox` type for on-demand reference semantics
* Remove `print` that kills performance tests
* Handle all cases with SharedBox containers
* Add RJITest back to the project file, fix the test
* Remove unused internal initializer
@jsbean
Copy link
Contributor Author

jsbean commented Jan 25, 2019

@MaxDesiatov, no worries! Apologies for disappearing for a bit. Went unplugged.

I'm glad the effort was put into enhancing the test coverage. I'll keep an eye on this. Let me know if there is anything I can do to contribute.

@MaxDesiatov
Copy link
Collaborator

MaxDesiatov commented Feb 19, 2019

@flowbe
Copy link
Contributor

flowbe commented Apr 25, 2019

Hello!

I am also affected by this issue, any news?

@jsbean
Copy link
Contributor Author

jsbean commented May 2, 2019

I am currently only looking at this from the decoding side of things, but here are some thoughts.

Here is a pipeline that seems important for this feature:

XMLCoderElement.flatten()-> KeyedBox
XMLStackParser.parse(...) -> KeyedBox
XMLDecoder.decode(...)

Inside XMLDecoder.decode, XMLStackParser.parse is called, returning a KeyedBox, though it is up-cast immediately to an abstract Box. Is it possible to instead return Box from XMLStackParser.parse? The keyed-ness is an issue for the following scenario (slightly simplified from the one in the original post):

Here's an enum:

enum IntOrString {
    case int(Int)
    case string(String)
}

Here's its Decodable bits:

extension IntOrString: Decodable {
    enum CodingKeys: String, CodingKey {
        case int
        case string
    }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        do {
            self = .int(try container.decode(Int.self, forKey: .int))
        } catch {
            self = .string(try container.decode(String.self, forKey: .string))
        }
    }
}

Here's an example of it in the wild:

let xml = """
<container>
    <int>"42"</int>
    <string>"forty-two"</string>
    <int>"43"</int>
</container>
"""

If we try to decode it like this:

let values = try XMLDecoder().decode([IntOrString].self, from: xml.data(using: .utf8)!)

Within XMLDecoder.decode, the topLevel: Box we get from XMLStackParser.parse (who got it from XMLCoderElement.flatten) is a KeyedBox, with elements of:

["int": ["42", "43"], "string": "forty-two"]

In this case, it does not behoove us to immediately treat this as a KeyedBox, as we lose information: the node: XMLCoderElement returned by XMLStackParser(...) -> XMLCoderElement has everything we need.

@MaxDesiatov, am I looking in the right place for this?

@MaxDesiatov
Copy link
Collaborator

I've been working on this issue for the last couple of days in attempt to resolve #91 and you're right, most of the problems are in how flatten works. The name is unhelpful though, it doesn't flatten anything, only transforms a tree of XMLCoderElement to a tree of boxes. I'm going to rename it to transformToBoxTree or similar.

A temporary hack I've applied during my experimentation was to transform a KeyedBox to an UnkeyedBox, but your comment is super helpful @jsbean! I now realize that transformation would convert

{"int": ["42", "43"], "string": "forty-two"}

into

[{"int": "42"}, {"int": "43"}, {"string": "forty-two"}]

which still loses the original order of keys.

To clean this up I'm thinking of removing UnkeyedBox entirely and making KeyedBox store everything internally as an array with an additional mapping from keys to array indices. This way both KeyedContainer and UnkeyedContainer would decode from KeyedBox. KeyedContainer would work as it did previously, while UnkeyedContainer is going to get the access to the original array of elements. Some of these cases were previously worked around in master with double wrapping KeyedBox in an UnkeyedBox and vice versa, which is unnecessarily complex. I'll create PRs as I go so that we don't conflict if anyone is working on this too.

MaxDesiatov added a commit that referenced this issue May 22, 2019
`UnkeyedBox` is a wrapper for `[Box]`, the assumption is that adding direct `Box` conformance to `[Box]` would simplify things a bit. Also, `KeyedStorage` now always stores `[Box]` per key without any special handling for single items per key to simplify this even more. In addition, order of values is preserved for all keys as required for #91 and #25.

This should also unblock #101 providing unified handling for elements without any content and attributes.

By pure serendipity this also fixes tests introduced in #38.

* Replace UnkeyedBox with Array, refine KeyedStorage
* Fix more of the broken tests
* One unfixed test left in benchmarks
* Single out failing benchmark in a separate test
* Fix all tests 🎉
* Fix compiler warning
* Fix Xcode 10.1 compilation error
* Remove unused AnyArray protocol
* Remove unused elementType function
* Simplify code to improve test coverage
@MaxDesiatov MaxDesiatov self-assigned this Jun 1, 2019
MaxDesiatov pushed a commit that referenced this issue Jul 30, 2019
## Introduction

This PR introduces the `XMLChoiceCodingKey` protocol, which enables the encoding and decoding of union-type–like enums with associated values to and from `XML` choice elements.

Resolves #25.
Resolves #91.

## Motivation

XML schemas support [choice](https://www.w3schools.com/xml/el_choice.asp) elements, which constrain their contents to a single instance of a member of a known set of types. Choice elements exhibit the properties of [union types](https://en.wikipedia.org/wiki/Union_type) and can be represented in Swift as enums with associated values, wherein each case of the enum carries with it a single associated value that is one of the types representable.

An example of how such a type is implemented in Swift:
   
```Swift
enum IntOrString {
    case int(Int)
    case string(String)
}
```

There is currently no automatic synthesis of the `Codable` protocol requirements for enums with assocated types in today's Swift. As such, it is required to provide custom implementations of the `init(from: Decoder)` initializer and the `encode(to: Encoder)` method to conform to the `Encodable` and `Decodable` protocols, respectively.

When encoding to and decoding from `JSON`, a single-element keyed container is created that uses the enum case name as the single key.

An example of adding Codable conformance to such a type when working with JSON
   
```Swift
extension IntOrString: Codable {
    enum CodingKeys: String, CodingKey { case int, string }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        do {
            self = .int(try container.decode(Int.self, forKey: .int))
        } catch {
            self = .string(try container.decode(String.self, forKey: .string))
        }
    }
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case let .int(value):
            try container.encode(value, forKey: .int)
        case let .string(value):
            try container.encode(value, forKey: .string)
        }
    }
}
```

This may not be the most handsome approach, but it does the job without imprinting any format-specfic logic onto otherwise format-agnostic types.

This pattern works out of the box with the `JSONEncoder` and `JSONDecoder` provided by the `Foundation` framework. However, due to syntactic characteristics of the `XML` format, this pattern will **_not_** work automatically for encoding and decoding `XML`-formatted data, regardless of the tool used.

## Proposed solution

The proposed solution is to define a new `XMLChoiceCodingKey` protocol:

```Swift
/// An empty marker protocol that can be used in place of `CodingKey`.
/// It must be used when conforming a union-type–like enum with associated values to `Codable`
/// when the encoded format is `XML`.
public protocol XMLChoiceCodingKey: CodingKey {}
```

The `XMLChoiceCodingKey` protocol inherits from `CodingKey` and adds no new requirements. This conformance can be made retroactively, without additional implementation.

An example usage:
    
```Swift
extension IntOrString.CodingKeys: XMLChoiceCodingKey {}
``` 

## Detailed design

This proposal adds a single `public` `protocol` `XMLChoiceCodingKey`, as well as several `internal` types.

Under the hood, the `XMLChoiceEncodingContainer` and `XMLChoiceDecodingContainer` are used to provide `encode` and `decode` methods tuned for `XML` choice elements.

Because of the characteristics of the `XML` format, there are some ambiguities (from an encoding and decoding perpsective) between unkeyed container elements that contain choice elements and those that contain nested unkeyed container elements.

In order to untangle these ambiguities, the new container types utilize a couple of new `Box` types to redirect elements along the encoding and decoding process.

## Source compatibility

This is purely an additive change.
arjungupta0107 pushed a commit to salido/XMLCoder that referenced this issue Jun 26, 2020
`UnkeyedBox` is a wrapper for `[Box]`, the assumption is that adding direct `Box` conformance to `[Box]` would simplify things a bit. Also, `KeyedStorage` now always stores `[Box]` per key without any special handling for single items per key to simplify this even more. In addition, order of values is preserved for all keys as required for CoreOffice#91 and CoreOffice#25.

This should also unblock CoreOffice#101 providing unified handling for elements without any content and attributes.

By pure serendipity this also fixes tests introduced in CoreOffice#38.

* Replace UnkeyedBox with Array, refine KeyedStorage
* Fix more of the broken tests
* One unfixed test left in benchmarks
* Single out failing benchmark in a separate test
* Fix all tests 🎉
* Fix compiler warning
* Fix Xcode 10.1 compilation error
* Remove unused AnyArray protocol
* Remove unused elementType function
* Simplify code to improve test coverage
arjungupta0107 pushed a commit to salido/XMLCoder that referenced this issue Jun 26, 2020
## Introduction

This PR introduces the `XMLChoiceCodingKey` protocol, which enables the encoding and decoding of union-type–like enums with associated values to and from `XML` choice elements.

Resolves CoreOffice#25.
Resolves CoreOffice#91.

## Motivation

XML schemas support [choice](https://www.w3schools.com/xml/el_choice.asp) elements, which constrain their contents to a single instance of a member of a known set of types. Choice elements exhibit the properties of [union types](https://en.wikipedia.org/wiki/Union_type) and can be represented in Swift as enums with associated values, wherein each case of the enum carries with it a single associated value that is one of the types representable.

An example of how such a type is implemented in Swift:
   
```Swift
enum IntOrString {
    case int(Int)
    case string(String)
}
```

There is currently no automatic synthesis of the `Codable` protocol requirements for enums with assocated types in today's Swift. As such, it is required to provide custom implementations of the `init(from: Decoder)` initializer and the `encode(to: Encoder)` method to conform to the `Encodable` and `Decodable` protocols, respectively.

When encoding to and decoding from `JSON`, a single-element keyed container is created that uses the enum case name as the single key.

An example of adding Codable conformance to such a type when working with JSON
   
```Swift
extension IntOrString: Codable {
    enum CodingKeys: String, CodingKey { case int, string }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        do {
            self = .int(try container.decode(Int.self, forKey: .int))
        } catch {
            self = .string(try container.decode(String.self, forKey: .string))
        }
    }
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case let .int(value):
            try container.encode(value, forKey: .int)
        case let .string(value):
            try container.encode(value, forKey: .string)
        }
    }
}
```

This may not be the most handsome approach, but it does the job without imprinting any format-specfic logic onto otherwise format-agnostic types.

This pattern works out of the box with the `JSONEncoder` and `JSONDecoder` provided by the `Foundation` framework. However, due to syntactic characteristics of the `XML` format, this pattern will **_not_** work automatically for encoding and decoding `XML`-formatted data, regardless of the tool used.

## Proposed solution

The proposed solution is to define a new `XMLChoiceCodingKey` protocol:

```Swift
/// An empty marker protocol that can be used in place of `CodingKey`.
/// It must be used when conforming a union-type–like enum with associated values to `Codable`
/// when the encoded format is `XML`.
public protocol XMLChoiceCodingKey: CodingKey {}
```

The `XMLChoiceCodingKey` protocol inherits from `CodingKey` and adds no new requirements. This conformance can be made retroactively, without additional implementation.

An example usage:
    
```Swift
extension IntOrString.CodingKeys: XMLChoiceCodingKey {}
``` 

## Detailed design

This proposal adds a single `public` `protocol` `XMLChoiceCodingKey`, as well as several `internal` types.

Under the hood, the `XMLChoiceEncodingContainer` and `XMLChoiceDecodingContainer` are used to provide `encode` and `decode` methods tuned for `XML` choice elements.

Because of the characteristics of the `XML` format, there are some ambiguities (from an encoding and decoding perpsective) between unkeyed container elements that contain choice elements and those that contain nested unkeyed container elements.

In order to untangle these ambiguities, the new container types utilize a couple of new `Box` types to redirect elements along the encoding and decoding process.

## Source compatibility

This is purely an additive change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants