Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enumerated NonEmpty #51

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

Alex-Ozun
Copy link

@Alex-Ozun Alex-Ozun commented Sep 13, 2022

Problem Statement

NonEmpty provides a compile-time guarantee that a collection contains a value. This invariant is preserved even after collection is transformed with map/flatMap/shuffled functions, which is expected. One would expect that enumerating a NonEmpty collection would also preserve this invariant, but it's not the case:

let xs = NonEmptyArray(1, 2, 3)
let a = xs.map(String.init) // ✅ still NonEmpty<[String]>
let b = xs.enumerated.map { "\($0.offset) \($0.element)" } // ⚠️ back to vanilla Array<String>

In order to get an enumerated NonEmpty result one has to either introduce an outer index, which feels gross and error prone:

var offset = 0
let b = xs.map {
  defer { offset += 1 }
  return "\(offset) \($0)"
}

or temporarily leave the NonEmpty context and force their way back:

let temp = xs.enumerated.map {  "\($0.offset) \($0.element)" }
let b = NonEmpty(rawValue: temp)!

Motivation

Semantically, there seems to be no apparent reason for the nonemptiness invariant to be lost during the enumeration. Enumeration is just a case of map transformation that embellishes original collection with indexes, and map already preserves nonemptiness. Hence, enumeration should too.

Implementation

Introduce a dedicated Enumerated type nested in NonEmpty, echoing Swift's EnumeratedSequence. Implement all functions that preserve nonemptiness invariant: map, flatMap, shuffled, first, etc. Conform NonEmpty.Enumerated to Sequence to match the functionality of EnumeratedSequence for backwards compatibility:

let xs = NonEmptyArray(1, 2, 3).enumerated() // type NonEmpty<[Int]>.Enumerated

let nonEmptyArray: NonEmpty<[String]> = xs.map { "\($0.offset) \($0.element)" } // new map overload
let array: [String] = xs.map { String($0.element) } // type inference for vanilla Array map still works

NB

Similar points could be made for other derivative sequence types, such as LazySequence, Zip2Sequence and ReversedCollection. For example, Haskell preserves nonemptiness when zipping.

@ph1ps
Copy link

ph1ps commented Oct 11, 2022

Although I think this PR is a great idea and illustrates that there are some parts missing in this library to fully support all of Swift's collection and sequence operators, it might not be the best solution to introduce such ad-hoc types like NonEmpty<_>.Enumerated.

Imagine continuing with this solution and adding something like NonEmpty<_>.Zip2Sequence. How would you define the free func zip(_:_)? It could look something like this:

func zip<E1, E2>(_ nonEmpty1: NonEmpty<E1>, _ nonEmpty2: NonEmpty<E2>) -> NonEmpty<???>.Zip2Sequence<E1, E2>

The first problem would be that the generic requirement of NonEmpty cannot be defined in a meaningful way. A workaround would be to introduce something like a NonEmptyZip2Sequence instead of extending the NonEmpty type, which would result in something like this:

func zip<E1, E2>(_ nonEmpty1: NonEmpty<E1>, _ nonEmpty2: NonEmpty<E2>) -> NonEmptyZip2Sequence<E1, E2>

But now we are running into another problem. We cannot zip NonEmpty<_>.Enumerated with another NonEmpty because their type does not match. A solution to this problem would be to introduce a whole lot of overloads with every possible combination of non empty types. In my opinion this would not be very viable either.

It would make the most sense to me, to imitate Apple's way of handling this problem like they did with eg. LazySequence and introduce something like a NonEmptyCollectionProtocol and a NonEmptySequenceProtocol. Then you could introduce types like NonEmptyMapCollection/Sequence or NonEmptyEnumeratedSequence which conform to those protocols.

You could then even go further and extend those protocols with default implementations like you did with first, map, shuffled so that you won't have to repeat this every single type you introduce. This approach would also get rid of all the force unwraps in the library like here or here because eg. NonEmptyMapCollections initializer would require a type that conforms to NonEmptyCollectionProtocol.

An example of what I was thinking about (for simplicity reasons I chose sequence over collection here):

protocol NonEmptySequenceProtocol: Sequence {
  associatedtype Base: Sequence
  var base: Base { get }
}

extension NonEmptySequenceProtocol {
  
  @inlinable
  var underestimatedCount: Int { return base.underestimatedCount }
}

extension NonEmptySequenceProtocol where Element == Base.Element {
  
  @inlinable
  func makeIterator() -> _NonEmptyIterator<Base.Iterator> { return _NonEmptyIterator(_base: base.makeIterator()) }
  
  @inlinable
  @warn_unqualified_access
  func min(by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows -> Element { return try self.min(by: areInIncreasingOrder)! }
  
  @inlinable
  @warn_unqualified_access
  func max(by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows -> Element { return try self.max(by: areInIncreasingOrder)! }
}

extension NonEmptySequenceProtocol where Element == Base.Element, Element: Comparable {
  
  @inlinable
  @warn_unqualified_access
  func min() -> Element { return self.min()! }
  
  @inlinable
  @warn_unqualified_access
  func max() -> Element { return self.max()! }
}

Here the rather simple map type implementation:

struct NonEmptyMapSequence<Base, Element> where Base: NonEmptySequenceProtocol {
  
  @usableFromInline
  var _base: Base
  
  @usableFromInline
  let _transform: (Base.Element) -> Element
  
  @inlinable
  init(_base: Base, transform: @escaping (Base.Element) -> Element) {
    self._base = _base
    self._transform = transform
  }
}

extension NonEmptyMapSequence {
  
  struct Iterator {
    
    @usableFromInline
    var _base: Base.Iterator
    
    @usableFromInline
    let _transform: (Base.Element) -> Element
    
    @inlinable
    init(
      _base: Base.Iterator,
      _transform: @escaping (Base.Element) -> Element
    ) {
      self._base = _base
      self._transform = _transform
    }
  }
}

extension NonEmptyMapSequence.Iterator: IteratorProtocol, Sequence {
  
  @inlinable
  mutating func next() -> Element? { return _base.next().map(_transform) }
}

extension NonEmptyMapSequence: Sequence {
  
  @inlinable
  func makeIterator() -> Iterator { return .init(_base: _base.makeIterator(), _transform: _transform) }
}

extension NonEmptyMapSequence: NonEmptySequenceProtocol {
  
  @inlinable
  var base: Base { _base }
}

And finally the extension on the protocol itself:

extension NonEmptySequenceProtocol {
  
  @inlinable
  func map<U>(_ transform: @escaping (Element) -> U) -> NonEmptyMapSequence<Self, U> { return .init(_base: self, transform: transform) }
}

The downsides I am currently seeing is that some implementations would then be lazy instead of eager like it is currently implemented, but I don't know if this would an actual problem and there will be a lot of reimplementations of already existing Swift code...

And to round this up and come back to the example I made previously, the zipping. You would need to implement a type like NonEmptyZip2Sequence and then just have this:

func zip<Sequence1, Sequence2>(_ sequence1: Sequence1, _ sequence2: Sequence2) -> NonEmptyZip2Sequence<Sequence1, Sequence2> where Sequence1: NonEmptySequenceProtocol, Sequence2: NonEmptySequenceProtocol

If something like "protocolizing" NonEmpty would make sense and is something that Brandon and Stephen could see landing in main I would be glad to collaborate on this with @Alex-Ozun in a joint effort to cover all possible Swift's collection and sequence operators.

I hope I did not miss anything here, so please correct me if I got something wrong. Thanks.

@Alex-Ozun
Copy link
Author

Thanks for your great comment @ph1ps! I didn't have time yet to carefully read your code snippets, but I got the general idea of your proposal, and I fully agree with it. As you can see in the NB paragraph of the PR description I alluded to the idea of covering the whole suite of non-empty collections/sequences. And that definitely requires the "protocolisation" of NonEmpty, like you said. With Point-free's blessing I'd be happy to go down this path with you.

But before we go down the rabbit hole, I think it's worth exploring how this work can be broken down and linked together.

  1. We could either move from abstractions to concretions, e.g. create protocols and free functions first and then implement concrete non-empty collections based on them.
  2. Or we could first implement concrete non-empty collections and then tie them up together with protocols and free functions. Granted, this might exact some implementation refactoring for collections, but APIs should remain stable.

For this specific PR I think it'd be beneficial to go from nested NonEmpty.Enumerated to stand-alone NonEmptyEnumerated to save us from typealias in the future and set the ground for future work.

@ph1ps
Copy link

ph1ps commented Oct 13, 2022

I think I'd favor the 2nd approach. I would imagine abstracting collections as not very straight forward (that's why I did not include it in my code samples) and furthermore LazyCollectionProtocol implementations share little to no code with each other.

One thing I also just realized is, that NonEmpty.Indices should also be non empty if the collection is not empty. I feel like it would make sense to implement this with the same batch as NonEmptyZip2Sequence to have the possibility to zip a collection and its indices like with normal collections. So this could be added to your list as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants