Tuesday, July 15, 2014

nil is now a convertible literal!

In doubly-optional values, I raised a number of issues with the way Swift handles nil.

In brief, when you have a type T??, there are many cases where you can't clearly distinguish between a T?? value that happens to be a T? nil vs. a T?? nil (in Haskell terms, between a Just Nothing and a Nothing). And most reasonable attempts to deal with that were either impossible to write, or led to compiler crashes or incorrect behavior.

It looks like part of the problem was that trying to handle nil a la C++98 NULL just didn't work right. You can't let nil be a value of every type, you can't let it be a value of a special type that's comparable to and assignable to every type, and you definitely don't want it to be implicitly convertible everywhere, so… what can you do? Well, that's exactly the same problem they solved for the number 0. 0 has to have a type, not every type; int shouldn't be comparable and assignable to every other numeric type (at least not with the other design choices they've made elsewhere), you definitely don't want int to implicitly convert to Int16 or Float32, and yet you want to be able to initialize non-int variables with 0, so… what can you do?

The solution for int was to create a new IntegerLiteral type, and a protocol for defining types that can be initialized with integer literals. And the same goes for float literals, string literals, and a few other types—most notably, array and dictionary literals.

And it looks like, with beta 3, they've gone with the same solution for nil; it's a singleton of a new NilLiteral type. Although I haven't gotten beta 3 installed and running yet, it seems like this should at least make the problem solvable, if not solve it.

I also like spelling Array<T> as [T] and Dictionary<Key, View> as [Key:View], and a few other changes. But of course this means that some of my existing code that worked in beta 1 and beta 2 (product, optionalize and friends, etc.) needs to be changed. And this doesn't seem to be the only breaking change. So it looks like the promise of no significant, code-breaking languages changes after beta 1 was a bit premature. Also, the stdlib header visible in Xcode seems to be even farther from what the compiler actually uses (and the error messages refer to) than in beta 2. So, maybe it's a bit early for me to be trying to implement a complete library, and I should instead just build one or two functions as a proof of concept that I can keep tearing apart and re-doing, then worry about writing something useful once the compiler and library are actually ready?

Thursday, July 3, 2014

Generic higher-order functions and overloads

Let's say we have some type named Spam, and a function validate of type Spam->bool that tells you whether a Spam is valid, and a Spam[] array, and we want to find all the invalid values.

We could of course write a loop, but that's silly; the reason filter exists is to make code like this simpler and more readable. But our predicate does the opposite of what we want, so how can we use filter?

We could write a new inline function that just reverses the sense of the predicate. That works in any language; here it is in Python, Haskell, and Swift:

    filter(lambda x: not validate(x), spam)
    filter (\x -> !validate(x)) spam
    filter(spam) { !validate($0) }

But this can get pretty tedious to write and read, and going through an extra closure for each element could have a performance impact.

Python solves this by identifying a few special cases that come up often, like this one, and writing special functions for those cases, in this case itertools.filterfalse:

    filterfalse(validate, spam)

Haskell, on the other hand, just provides tools to transform functions into different functions. In this case, you just compose the not function with your predicate:

    filter (not . validate) spam

If you're doing this a lot, you can use operator sectioning to define a negate function:

    negate = (not .)
    filter (negate validate) spam

For that matter, if you really wanted a filterfalse function, it's just:

    filterfalse = filter . negate

If that doesn't make any sense, well, that's my fault; I haven't explained that Haskell functions are curried. This means that a function (A, B)->T is the same thing as a function A->(B->T). So, filter doesn't take a predicate and a list and return a list; it takes a predicate and returns a partially-evaluated function that then takes a list and returns a list. So composing it with negate gives you a function that takes a predicate, negates it, and returns a partially-evaluated function that etc.

Anyway, even ignoring that last part (which isn't going to apply to Swift), the Haskell solution is clearly a lot more flexible and powerful. Could we do the same thing in Swift?

Well, there is no compose operator—but, much like Haskell, you can define custom operators out of any combination of the symbols /=-+*%<>!&|^.~ that doesn't already mean something. Unfortunately, a solitary dot already means something, but a double dot, .., is available. So, let's do that:

    operator infix .. {}

    func .. <T, U, V> (left: U->V, right: T->U) -> (T->V) {
        return { left(right($0)) }
    }

So far, so good. So, now we can do this, right?

    filter(spam, ! .. validate)

Wrong.

The first problem is that, while operator functions can be named by their symbols, the parser apparently can't handle two operators in a row—that is, you can't pass the ! function to an operator, or you'll get this string of errors:

    error: unary operator cannot be separated from its operand
    error: unary operator cannot be separated from its operand
    error: '..' is not a prefix unary operator

But we can get past that by giving the parser a little help:

    filter(spam, (!) .. validate)

But that doesn't work either:

    error: could not find an overload for '..' that accepts the supplied arguments

You can check the types; { !$0 } is Bool->Bool, and validate is Spam->Bool, so T is Spam, U is Bool, V is Bool, everything should work. But something is going wrong, and it's not entirely clear what. Can the inference engine not infer this type? Or is it the parser again? Let's try throwing in some more parens:

    filter(spam, ((!) .. validate))

That compiles. So… maybe comma is actually handled as an operator, even though the docs don't say so and it doesn't appear in the stdlib operators, and the comma operator has higher precedence than default custom operators, even though that's silly? Let's change our operator definition and see:

    operator infix .. { precedence 1 }

And now it compiles without the parens. So… I guess that was the issue.

OK, so we can compose the negation operator with a function and do things the Haskell way, rather than the Python way. Cool.

The next example after composing not in many Haskell tutorials is composing reverse and sort. Although Haskell has parametric types instead of generics, the idea is roughly similar. Both functions take a list of elements of any type T and return a list of elements of type T, so their composition does the same.

In Swift, sort takes an Array of any Comparable subtype T and returns an Array of T; reverse takes any Collection whose associated IndexType is at least BidirectionalIndex and returns a custom view struct. Since an Array<T> is a Collection with IndexType Integer (which is a subtype of BidirectionalIndex), everything should work. So, let's try it:

    desort = reverse .. sort

Nope:

    error: could not find an overload for 'sort' that accepts the supplied arguments

The problem isn't that the types are incompatible:

    let a = [0, 3, 1, 2]
    let b = reverse(sort(a))

For once, the error message is actually helpful. The problem is that sort is not actually a generic function, it's an overload of two separate generic functions. One takes an Array of any Comparable subtype T, the other takes an Array of any type T plus a predicate from (T, T)->Bool.

The frustrating thing is that the types of both overloads can be composed with reverse; (T[], (T, T)->Bool)->T[] is just as compatible with with (C : Collection where C.IndexType : BidirectionalIndex)->ReverseView<C> as T[]->T[] is.

In some cases, this isn't a problem. For example, you can map sort over an array. Why? Because map actually applies its function argument (as opposed to returning a function that will apply it later), which means the type of the arguments is known at the time you call map (rather than being a generic type to be resolved at the time you call the returned function); the Element type associated with the Sequence or Collection is the only argument to sort, therefore the one-parameter sort overload is selected.

So, generic higher-order functions that don't actually apply their function arguments immediately do not play well with overloads in Swift. Which sucks, because those are two powerful features that you'd frequently want to use in the same place (see my previous post about how overloads allow functions like map to return a lazy Collection when possible, a one-shot Generator when not), but instead, you have to choose one or the other. The Swift stdlib seems to have gone with the overload feature. Which means the Haskell-like solution of providing a compose operator, a flip function, etc. isn't going to work in Swift, and we'll have to go with the weaker option of explicitly providing functions like filterfalse for each important-enough special case.

One way Swift could resolve this is to give an overload set a disjunctive type made up of the types of each overload. So, the two overloads shown below would combine into the single type:

    func sort<T : Comparable>(array: T[]) -> T[]
    func sort<T>(array: T[], pred: (T, T) -> Bool) -> T[]

    func sort<T: Comparable | U>((T[] | (U[], (U, U) -> Bool)) -> (T[] | U[])

Would that type be compatible with reverse? Yes; both T[] and U[] match C : Collection where C.IndexType : BidirectionalIndex. So, if we define desort = reverse .. sort, its type ends up as:

    func desort<T: Comparable | U>((T[] | (U[], (U, U) -> Bool)) -> 
        (ReverseView<T[]> | ReverseView<U[]>)

Then, inside desort, whether it got one parameter or two, it's just passing that off to sort, which has to handle overload resolution normally, so everything should work.


Saturday, June 28, 2014

What's already built in?

How to find the standard library

When I started this, I knew that some of the functions I wanted would already exist, but I couldn't figure out how to find them.

I've since found it: In Xcode, type the word "false", right-click it, and Jump to Definition. The global variable false is defined in the same file as the whole rest of the standard library—all the functions, structs, and classes (and typealiases, globals, etc.) that exist. And there are even some useful comments in there.

It still won't tell you where the file actually is (if you try to use any of the navigation stuff, you end up navigating from the file you typed "false" into, not the stdlib file), but you can always Duplicate it to save a copy for future reference (e.g., if you want to have it available in emacs, not just Xcode).

More on Generator and friends

As I suspected: Generator is explicitly stated to be single-pass-and-consumed, and should also be a Sequence. A Sequence may or may not be reusable. A Collection is a Sequence that's guaranteed to be reusable, and also subscriptable. Probably not a coincidence that these are exactly the relationships between Iterator, Iterable, and Sequence in Python. (I'll get to Collection later; there's a lot more there.)

However, despite that comment, it's not enforced anywhere, or documented, that a Generator must be a Sequence; it's perfectly possible to define Generators that aren't Sequences, and in fact the stdlib defines quite a few of them, including the Generators for Dictionary and String. 

On top of that, while a Generator shouldn't start over if you call generate on it, and (contrary to what I originally thought) none of the stdlib Generator-Sequences do so, there seems to be a quirk (or, hopefully, bug) of the for statement that has the same effect. For example, if you define an Array a = [1, 2, 3], and a Generator g = a.generate(), then call g.next() to consume the first value, calling g = g.generate() will never get the first value back (good). However, a for loop—which is supposed to be equivalent to calling generate and then looping over next—does not consume some kinds of Generators. So for i in g {} will not advance g at all. I have no idea why that happens with, e.g., IndexingGenerator, but not my own types, although my suspicion is that it has to do with built-in (that is, compiler-supported) Generators vs. those written in plain Swift. Anyway, as far as I can tell, it shouldn't be happening with any of them.

Hopefully both of those problems will be fixed in the final 1.0 version. If not, that means you really can't use Generators in for loops; any itertools-style programs will have to use while-let loops all over the place. (And, if the first problem isn't fixed, they may even have to do an as? cast to Sequence followed before calling generate, because you can't rely on the generate method being present.)

It's also not documented anywhere that a Generator signals that it's done by returning nil, or that it should return nil forever after doing so the first time, etc., so that's still to some extent just a guess…

Meanwhile, there are a bunch of helper functions/types for creating Generators, which I wish I'd known about earlier.

IndexingGenerator: In Python, for historical reasons, anything indexable by successive integers starting from 0 is automatically iterable even if it doesn't provide __iter__, Swift of course didn't copy this, but it did add a type, IndexingGenerator, that makes it trivial to emulate. If you've got a Collection that you want to make iterable, just add a generate method that returns an IndexingGenerator(self), and you're done.

GeneratorOf: Given a next function (presumably a closure), this gives you a Generator that just calls that function over and over. Sort of like the two-argument form of iter in Python, except without a sentinel. You can also give this a Generator, in which case it wraps that Generator in some way that seems to tee some Generators (that is, it starts with the same state as the original Generator, but they can both be advanced independently from there) while delegating directly to others (so advancing either advances the other). From what I can tell, the same Generators that support for wrong also tee here, and vice versa. (But what about a generator created by GeneratorOf from your own next closure? I need to test that…)

GeneratorSequence: Given a Generator, this gives you another Generator, which tees or delegates to the original in exactly the same circumstances in which GeneratorOf would. Although this stores the Generator itself rather than its bound next method, as far as I can tell the effect is always identical anyway. I think the intention here may be to wrap a Generator that may not be a Sequence (which shouldn't be possible, but is) in one that definitely is a Sequence, but (a) that shouldn't be necessary if the stdlib were fixed, and (b) it still isn't necessary because GeneratorOf can always handle it.

SequenceOf: Given a Generator or a Sequence, this gives you a Sequence (that's not a Generator) that wraps it in the same way as the above two functions. Not sure why this is needed, or anything but a bad idea. If you use this on a single-pass generator, you're just fooling yourself into believing you have a non-Generator Sequence; when you then call generate on it multiple times, all of the resulting Generators will share state.

GeneratorOfOne: Given a single optional value, gives you a Generator that produces just that value then quits. Sort of like [value].generate(), but without needing to create an Array first. If you give it nil, the Generator is empty instead; not sure why that's useful, or why it doesn't just take a T instead of a T? in the first place. Also not sure why the argument is called elements rather than singular element.

CollectionOfOne: Given a single (non-optional) value, gives you a Collection that has only one element (and is indexed by the Bit type, not Integer… see below on that). I suspect the reason for this type is that the compiler can optimize away all the subscripting stuff when it sees this type better than when given a more general collection. (From a theoretical point of view, anywhere you could use a trivial monad instead of lowering a monadic function, you could implement that with CollectionOfOne in Swift; practically, you're not going to have monadic functions in the first place in Swift, and if you did, you'd write a trivial monad…)

EmptyGenerator: Returns a Generator of no elements. (Not as useless as it sounds—with no elements, you need somewhere to specify the type—it obviously can't be inferred from the elements or the function that produces the elements or anything else, right?—and EmptyGenerator<T>  is about as concise a way to say that as possible.)

EmptyCollection: Returns a Collection of no elements (which is indexed by Int, with start and end indices both 0).

Anyway, GeneratorOf seems like it might be able to replace some of the boilerplate structs I've been creating. But not all. In fact, in some cases, I probably need more structs (as explained later, after I get through Collection).

Warning: GeneratorOf<T> is a single type

When I first discovered and started playing with this stuff, I realized that GeneratorOf<T> being a concrete type meant that I could now use my chain(Sequence...) function on anything created by any of my functions using GeneratorOf, instead of only on identical functions. And this seemed great.

But then I realized that this was relying on an implementation detail of those functions. And, worse, an implementation detail that isn't shared by the standard collection types, or map and filter, etc. So, it's actually pretty misleading if chain(islice(foo, 1), repeat(x)) happens to compile. Chain needs to take any number of Sequences of any Sequence type (as long as they all share the same Element type—or, even better, convertible ones, but I don't think that's possible). If that can't be done, it's useless and shouldn't exist.

Collection and friends

As I mentioned earlier, Collection is much like Python's Sequence—it guarantees that the Collection can be iterated over as many times as you want without consuming it, and it also specifies that the Collection can be indexed with the subscript operator. There's also a MutableCollection, just like Python's MutableSequence, that inherits from Collection and adds mutating indexing. And there's an ExtensibleCollection, which doesn't inherit from Collection or define anything, but I think it's intended to be used for types with reserve and capacity.

But beyond that, Collection also has a lot in common with C++ Collections, as I'll explain in a second.

Subscripting uses a special method, subscript. Just like init, there's no func on it (and, for MutableCollection, no mutating either). But unlike init, there is a normal return type. And unlike any other method, that return type can be (but doesn't have to be) followed by a property description—{ get } or { get set }. This is how Swift does the equivalent of Python's separate __getitem__ and __setitem__ methods.

It's worth looking at C++ here. In C++, reference types (T& is a reference to an lvalue of type T), are used all over the place, and subscripting is one of those places. There's generally an operator[] const that returns a const T&, and an operator[] that returns a T&, so x[2] = 3 turns into x.operator[](2) = 3, where the left-hand side is a reference to the location x[2]. This is a tricky problem for collections that don't actually have addressable locations, like std::bitvector; they traditionally work around this by defining a Reference struct with a custom operator= that mutates the parent collection appropriately (which, before C++11, was a nice instance of the standard library violating its own standard requirements). On top of that, reference types add a whole lot of complexity to the language (especially once you start to consider what a reference to a reference means—and how generic functions can easily make bad assumptions about that—which is ultimately one of the main reasons C++11 was necessary). While they add a lot of power, in almost every case, there are easier ways to do the same thing—out and inout parameters, some way to do the equivalent of Python's __setitem__ and __setattr__, and maybe a few other special cases, and you've eliminated most of the need for references. So, that's what Swift is doing here with its special subscript method.

In both C++ and Python, while indexing can take any type you want, you pretty much always use integers (anything convertible to size_t in C++, anything with an __index__ method that returns an int in Python, or a size_t in the C API for Python extension methods). In Swift, subscript can take any type, and this is actually meant to be used. And this is how Swift handles the equivalent of C++ iterators.

In C++, iterators are an abstraction over C pointers. You get a pair of iterators from a collection with its begin and end methods, and then all of your subsequent code (including all the generic algorithms in the standard library, Boost, etc.) doesn't care where they came from. You just do *it to access the value, ++it to move to the next value, it != last to check if you've reached the end. This can be cool, especially since it means you can define iterators that do things like append onto the back of a vector, or pull characters off an input stream. But it also means that many iterators need a hidden reference back to their collection—and those that don't need that (e.g., because the collection is just a hunk of contiguous array storage or a graph of linked nodes) can easily invalidate iterators that other code is still using, leading to crashes or corruption.

In Swift, indexes are… well, indexes, things you pass to the subscript operator. So, instead of *it, you do coll[it], but otherwise, things are basically the same. They're just as closely bound to their collection class as C++ iterators, but don't try to hide that fact. This means you don't get the power of std::vector::back_inserter or std::istream_iterator, but Swift gets that power in another way (by using what Python calls iterators instead of what C++ calls iterators), and it also means Swift avoids the complexity and the safety problems of C++ iterators.

So, what do you need indexes for in the first place?

Swift has a tower of Index protocols, mirroring C++'s tower of iterator concepts, meaning it can accept things less powerful than randomly-accessible indexes. This means that (unlike in Python) a doubly-linked list can be a Collection, it's just one that can only be bidirectionally indexed rather than randomly.

It also means that you can write a generic function that uses one implementation if given something that can't be randomly indexed, but another (e.g., faster) one if given something that can be. And then you can compose those into higher-level functions, and eventually you end up with big chunks of code that work just fine for any collection type and make it easy to read off the performance for each one based on what kind of index it uses. (At least that's the theory; that's turned out to be a lot less important in C++ than Stepanov and Lee originally expected when they designed the STL, and since Swift doesn't have most of the STL algorithms, and the few it does have seem to be either methods or sequence-based instead of index-based, it's likely to be even less important in Swift…)

Briefly, a ForwardIndex defines only succ, ==, and !=; a BidirectionalIndex adds pred; a RandomAccessIndex (which all of the numeric types define) adds distanceTo, advanceBy, and sometimes comparisons and arithmetic. Actually, the protocols don't require any methods at all, but all of the concrete implementations define these methods. Also, while neither the protocols nor most of the concrete classes define an associated type DistanceType anywhere, various other functions seem to depend on it existing, so presumably you have to define it if you're writing your own index types. (That all seems a little off, but there are a couple other cases like that; I think the protocol definitions may just not be complete in the beta, and there's magic code to not check stdlib/builtin types.) As in C++, if you want to advance a ForwardIndex n times, you can do so by explicitly calling advance(idx, n), and likewise for distance(idx1, idx2); having separate free functions for those signals that you're going to get O(N) time rather than O(1).

(As a side note, I think in some cases the names are aliases for symbolic operator functions; they seem to have changed how things work multiple times and not finished cleaning up yet. Possibly defining succ gets you ++, uncheckedAdd gets you &+ and, if not defined elsewhere, +, etc. But it's not that straightforward. For example, Bit.one + Bit.one gives you "fatal error: Can't unwrap Optional.None". But there's no + method or function returning Bit?. What there is, is an uncheckedAdd that returns a (T, Bool) pair, where the T wraps around and the Bool is false if you overflow, and the &+ operator seems to use that to just return the first, while + seems to try to turn that into T? and then immediately unwrap it for some reason. Maybe there are multiple levels of translation from old to newer to newest API? Who knows. Maybe these functions aren't even being called, and never were, and someone just felt like defining a bunch of verbosely-named functions for no good reason. Anyway, the point is, don't rely on the fact that these names will be succ and pred and so on in the final release; they could be operators like ++ and -- instead.)

Beyond using things that are less powerful that randomly-accessible integer indexes, this flexibility also allows using something more powerful, like a Double (which could, say, interpolate between known values).

Plus, it allows something clever for Dictionary that Python can't do, thanks to static typing and overloading. A Swift Dictionary has an associated DictionaryIndex type, which is guaranteed to be different from its Key type, which means that you can do both by-key and by-index lookup on the Dictionary with no ambiguity. This could be even more interesting with a sorted-tree-based mapping (like C++'s std::map). In Python, because a dict is Iterable (yielding its keys), and has methods that give you key, value, and key-value pair views that are themselves Iterable, you can often get away with not being able to index it, but there are cases where it would be useful to get the element before the current one, etc., and there's just no way to do that.

Unfortunately, other than DictionaryIndex, there are really no examples of any of this being used. Everything else in the stdlib is indexed with Int, except for CollectionOfOne, which is indexed with Bit, which is just another kind of RandomAccessIndex that only has two values (the enum values zero and one).

Finally: In Python, if you write your own Integer-like type and want it to be usable as an index to lists and similar classes, you implement the __index__ method. You do the same thing in Swift by implementing ArrayBound, defining an associated type ArrayBoundType, and adding a method getArrayBoundValue. (Oddly, all of the standard number types define ArrayBoundType=self, and a getArrayBoundValue that returns self, but none of them inherit from ArrayBound.) Anyway, it looks like this is also usable for defining types that can be adapted to index values for other classes, not just arrays, although there aren't any examples of such a thing, so it's hard to say how it works.

Slicing

In Python, slicing (e.g., a[1:5]) works by calling the same __getitem__ method with a special value of type slice.

In Swift, the solution is basically the same: a[1..5] calls subscript with a special value of type Range.

(Python has different types for slice and range. The reason for this is negative indexing; -4:5:1 means all the indices starting 4 from the end and going until 5 from the start, and there's no way to know how many elements that is without knowing the length of the Sequence it's being applied to. So, the slice type has an indices method that takes a length and gives you back the positive start, stop, and step for that length. Because Swift doesn't have negative indexing—which sucks, but that's another story—it can use the same type for ranges and slices, and doesn't need a method to convert between the two.)

And this is one of those cases where static typing rocks. In Python, you get a single __getitem__ method. If you want to support slicing, you have to check isinstance(idx, slice) inside that method and do the appropriate thing. In Swift, there are two overloads for subscript; one takes IndexType, the other takes Range<IndexType>.

Swift also has a separate protocol, Sliceable, that defines this overload, along with MutableSliceable. It's a little strange that Sliceable doesn't inherit from Collection (especially since this means a Collection with Element type Range<T> could accidentally be used as a Sliceable with Element type T…). Even more so given that MutableSliceable does inherit from MutableCollection. Even more so that, while ArrayType inherits from MutableSliceable, concrete types like Array<T> and ContiguousArray<T> instead inherit from MutableCollection and Sliceable directly. (Then again, they also don't implement the subscript overloads as settable properties, and the fact that the former is defined as Array<T> and then extended as T[] implies that there's some compiler magic under the hood here…)

Also, for some reason, most of the Collection types that could trivially be Sliceable, like Repeat and ReverseView, are not. Some of them can slice anyway, some can't. (Compiler magic again?) And there doesn't seem to be anything that can take any randomly-accessible Collection and give you a Sliceable wrapper, which seems like it would be simple.

Numbers

Earlier, I was looking for something like Addable, and annoyed that it looked like I'd have to define it myself, guess all the appropriate stdlib types to add it to, and then extend them all before I could write a sum method.

There are actually a whole bunch of protocols for numeric and similar behavior, they're just not documented, or very well organized. So, here's a list:
  • LogicValue: getLogicValue (allows your type to be used in boolean contexts like if)
  • Equatable: == (but not !=)
  • Hashable: Equatable plus hashValue
  • Comparable: Equatable plus <=, >, >= (but not <)
  • IntegerArithmetic: Comparable plus +, -, *, /, %, toIntMax
  • BitwiseOperations: &, |, ^, ~, allZeros
  • SignedNumber: unary -
  • AbsoluteValuable: SignedNumber plus abs
  • FloatingPointNumber: nan/inf-related methods
  • Integer: RandomAccessIndex
  • SignedInteger: Integer
  • UnsignedInteger: Integer
The standard integral types (Int8/16/32/64, UInt8/16/32/64, Int, and UInt—the latter two are separate types from all of the specific-sized types, defined as equivalent to long and unsigned long in C) all implement Hashable, either SignedInteger or UnsignedInteger, RandomAccessIndex (even though that's already taken care of by the last one), BitwiseOperations, and, for signed types, SignedNumber. None of them claim to implement Equatable, Comparable, IntegerArithmetic, AbsoluteValuable, or ArrayBound. They do all implement all of the relevant methods. And casting to those types succeeds anyway. But you get garbage when you try to print the casted value, and none of the methods actually work if you try to call them on it.

The standard floating-point types (Float, Double, Float80; Float32 and Float64 are aliases for the first two) implement Hashable, Comparable, AbsoluteValuable and RandomAccessIndex. They do not claim to implement IntegerArithmetic or FloatingPointNumber; again, they implement the methods, casting appears to succeed anyway, you get garbage when you try to print the value, and none of the methods work.

Bool is (intentionally) not an integral, or even numeric, type; there are no types that are both a LogicValue and a number. But Bit is a similar type that inherits from Int and only has the values zero and one. (And, unlike all the actual numeric types, it implements IntegerArithmetic.)

So, none of this is really usable, at least in the beta; it looks like you do have to define your own protocols and extension them on.

And as for the Convertible protocol I was hoping for, there's this intriguing function: "numericCast<T, U>(x: T) -> U". It's defined four times in a row. But… how could you possibly use that? You can't explicitly specialize functions. And I can't find any situation in which the compiler can infer the U type. I suppose it's defined 4 times in a row because April is the 4th month, and they're April-fooling us?

No DefaultConstructible, ZeroConstructible, etc. either, needless to say.

What itertools-like functionality is already there?

There are four functions with the same name, and basic functionality, as the Python builtins: enumerate, filter, map, and reduce. The first three are all documented. And there are a few more than may or may not be interesting.

Looking at how these functions are implemented (or at least what types they use) may be a good guide for how to write more such functions. (Or, in my case, maybe how to scrap them and rewrite them…) Are they all using GeneratorOf, or defining custom structs? Do they have separate overloads for Generator and Sequence?

Unfortunately, they all seem to be different.
  • enumerate: Always takes a Sequence, returns a Generator, of a custom type.
  • filter: Takes a Collection and returns a forward-only Collection, or takes a Sequence and returns a (reusable) Sequence, both custom types (with View stuck in their names), along with a third custom type for a Generator.
  • map: Identical to filter, except with a Collection it returns a similarly-indexable Collection. (Another overload also takes an Optional and returns an Optional.)
  • PermutationGenerator: A single Generator struct without a wrapper function, which takes a Collection and a Sequence, but is so far from being understandable by a human or a compiler that I have no idea what they mean.
  • Repeat: A single random-access-indexable Collection struct without a wrapper function (meaning of course that it can't handle infinite sequence, and that end is not optional, making it a lot less useful).
  • reverse: Takes a Collection, returns a lazy bidirectional-only Collection of custom type.
  • Zip2: A Sequence struct with a separate Generator struct (which isn't a Sequence), which takes two Sequences and zips them into a Sequence of pairs, ending as soon as the shorter Sequence ends. (If you happen to have a Generator that isn't a Sequence—which shouldn't be possible, but this very function shows that it can happen—you can use the Generator, named ZipGenerator2 rather than Zip2Generator, directly to zip together two Generators into another Generator.) (Oddly, the variable names here refer to "Streams" rather than "Generators". I'm guessing after a dev meeting where the Python fans and C++ fans came to blows over what "iterator" is supposed to mean, someone suggested "stream" for the Python thing and "index" for the C++ thing and that stuck for a while, and then there was a Great Renaming, possibly to punish the leader of the Pythonista faction for some transgression like forgetting to restart the coffee, and someone missed this function.) 
It's pretty obvious why filter and map do different things with Collections (you can't randomly-access a collection that's being filtered on the fly). And those are probably the most carefully-designed functions. And, sadly, there's no CollectionOf helper that will give me a collection of the same element type and index type, much less a CollectionOrSequenceOf that will give me whichever is appropriate based on input. So, it looks like I may need more boilerplate, not less. A code generator is starting to look better and better…

On the other hand, it looks like I've been overly verbose with my associated type matching. For example, Sequence<T> defines init overloads for G: Generator where T == T, not where G.Element == T.

There are some non-itertoolsy functions that at least consume Sequences or Generators worth looking at:
  • join: Takes a Sequence for the parts to be joined. And a Collection for the join-string (which, obviously, doesn't have to be a string) and returns a Collection of the same type.
  • maxElement, minElement: Take a Sequence.
  • reduce: Takes a Sequence.
There are also some misleading functions that initially look itertoolsy (and that will get in the way, too, until Swift handles namespaces properly):
  • count: Just returns the size of a Range of RandomAccessIndexes, which hardly seems worth a builtin function.
  • countElements: Sort of like ilen, in that it returns the length of anything, even if it has to count the whole thing, except for Collections, not Sequences. (And I'm not sure how this works; it returns T.IndexType.DistanceType without putting any requirements on T that would make it have such a thing. The compiler won't let me write code like that.)
  • dropFirst: Like but_first, or islice(1), except that it only works on Sliceable types. So it's really just x[1..x.count]. I suppose the fact that Swift doesn't have open-ended slices (which sucks) makes this not completely useless, but only just barely.
  • dropLast: As with dropFirst, only works on Sliceable types, meaning it's just a minor convenience.
  • sort: Only works on arrays (not even randomly-accessible Collections?). Also, both mutates and returns the array, just to confuse people coming from other languages. There's also insertionSort for bidirectional Collections and quickSort for SignedInteger-indexed (?) Collections, with different APIs.
  • split: Takes a Sliceable, returns an Array.

Optional

Optional is a lot closer to the Haskell model than it appears. It really is just an enum of Some(T) and None, with some minor just syntactic sugar that maps T? to Optional<T> and some stuff to make None values show up as and compare equal to nil.

So, does that mean that you can distinguish None from Some(None) in a T??, exactly as you would with Nothing and Just Nothing in a Haskell Maybe Maybe t, instead of trying to monkey around with let-binding to avoid the ambiguous/confusing nil check? Hopefully. (Or I suppose you could fmap the == nil into the monad, if you want to get silly…)

Other stuff

There seems to be some thing like Python named tuples, or the record types that decay to tuples as seen in many functional languages. For example, enumerate's Element is defined as a typealias for (index: Int, element: Base.Element). As the docs show, you can assign this to a 2-tuple, and as with a normal tuple you can access the values as .0 and .1, but you can also access them as .index and .element. Is that in the docs?

Printing can also be customized in many different ways. (There's an OutputStream protocol, to customize from that side, but normally you want to do it from the other side.) So, if you implement Streamable, you'll get writeTo called with the OutputStream; if not, but you implement Printable, you'll get description called, and the string you return gets written to the OutputStream; if not, some default value gets generated and written to the OutputStream. You can also implement DebugPrintable and debugDescription will be called if you're being printed to the debug stream (and this will also be used as a fallback if you didn't implement Streamable or Printable.) There's also a StringInterpolatingConvertible, with one method that takes an array of strings of type Self, and another that takes an expr of some unspecified type T, and I assume this is for customizing string interpolation, but I don't know how to use it. (Normally, interpolation just does the Printable thing.)

There's a reflection API (Mirror and Reflectable), but any attempt to use this seems to hang.

The way you allow "coercion" for literals to your type is with CharacterLiteralConvertible, IntegerLiteralConvertible, StringLiteralConvertible, ArrayLiteralConvertible, and DictionaryLiteralConvertible. The first four have FooLiteralType as an associated type; Array has Element; Dictionary has Key and Value. Presumably the literal gets coerced to that type before your convertFromFooLiteralValue gets called with the result.

Some types seem to have a protocol that's only implemented by that one type. Besides separating interface and implementation, this also seems to be a way to make most of the methods non-generic (by using associated types instead, of course). Since I already know that Sequence and Generator work this way, I shouldn't have been surprised to find ArrayType, etc.

Despite the docs going out of their way to say that overloading is a feature of methods and not functions, it works exactly the same way in functions, and the stdlib uses it all over the place, and of course it's used to solve exactly the same things I wanted it for. For example, sum1 and sum can be overloads instead of having stupid separate names (just like, e.g., the two versions of quickSort with and without less, which have different requirements on T).

The comments make it clear that the designers of the Swift stdlib were looking at all of C++, Python, and Haskell for inspiration; e.g., for HeapBufferStorage they show the C++ template equivalent, while for Optional.map, the comment is "Haskell's fmap, which was mis-named".

One last interesting note is how farsighted the Unicode support is, given that the ObjC runtime is basically UCS-2 clumsily faking itself as UTF-16. Quoting the comments, "Character represents some Unicode grapheme cluster as defined by a canonical, localized, or otherwise tailored segmentation algorithm." If you want to deal with grapheme clusters directly, you can, but if you just want to think in terms of characters, you can do that. (For example, you can define ExtendedGraphemeClusterLiteralConvertible if you want; if not, you can just define CharacterLiteralConvertible and let the stdlib do the conversion to characters and then hand them off to you.)

Friday, June 27, 2014

Doubly optional values

In Swift, it's perfectly legal to have a value of type T??, and can be very useful.

For example, if you take a sequence like a: Int?[] = [0, nil, 1, nil], and attempt to iterate over it, the generator has to give you values of type Int??. And, not only that, it has to be able to distinguish between an Int?? value that's nil (meaning iteration is done) and an Int?? value that has a real Int? value that happens to be nil (meaning it's just giving you a[1]).

This is a lot easier to think about in a language that doesn't try to hide the nilness. For example, in Haskell, if you have a Maybe Maybe Int, it could have any of these values:

  • Nothing
  • Just Nothing
  • Just Just 0
But in Swift, the equivalent Int?? values are:
  • nil
  • nil
  • 0
Even when displayed with the types, there's no help:
  • Int?? = nil
  • Int?? = nil
  • Int?? = 0
So, how can you distinguish them? Just comparing to nil won't help, as both compare equal to nil. But let binding works—bind an Int?? to an Int?, and the binding is false for the first case, but true for the second:
    if let a = foo { println("\(a)") } else { println("really nothing") }
This will print:
  • really nothing
  • nil
  • 0
So, everything's good. For example, these code fragments do the same thing, as they should:
    let a: Int?[] = [0, nil, 1, nil]

    for i in a { println("\(i)") }

    var g = a.generate()
    while let i = g.next() { println("\(i)") }
Except that once generic functions get involved, everything breaks. Try either of the following:
    func printem(s: S) {
        var g = s.generate()
        while let i = g.next() { println("\(i)") }
    }
    printem(a)

    func printem(s: S) {
        for i in s { println("\(i)") }
    }
    printem(a)
If you try to call either version with a sequence whose element type is optional, you get a compiler crash. In the first case, I think it's crashing while trying to infer types to compile the let binding, but putting an explicit i: T there doesn't change anything, so… who knows.

Oddly, if the type of a isn't an Array of optionals, but a custom struct, instead of a crash, it seems to successfully compile the wrong code, trying to bind the T?? directly to a T, and therefore treating both nil values and the end of the sequence the same.

Sunday, June 15, 2014

Troubles with zip

A part of porting itertools, I've had to figure out how various functions have to be limited because Swift just isn't as flexible as Python.

Python's zip takes N iterables, and generates N-tuples. Like this:
    >>> a = [1, 2, 3]
    >>> b = ["spam", "eggs", "spam"]
    >>> list(zip(a, b))
    [(1, "spam"), (2, "eggs"), (3, "spam")]
If you've never used a language with zip, you may not realize how useful it is. If you have, you're probably surprised that it doesn't come built in with Swift. But we can build it ourselves… sort of.

Variadic arguments

Swift—like Python, C++, ObjC, and even C—allows you to take variadic arguments. And at first glance, it seems like they've captured everything you need with a syntax that's as simple as Python's, and maybe even more readable:
    def zip(*sequences) # sequences is a list of all arguments
    func zip(sequences: Sequence...) # sequences is an array of all arguments
So, we're set, right?

Of course not. That just specifies one or more sequences of type Sequence. All of them have to be the same type. This is a direct consequence of the fact that variadic arguments are passed to your code as an Array (just as they're passed to Python code as a list… but of course Python lists are heterogeneous arrays, so that's not a problem). And there's really no other option, given the rest of Swift's design.

Why couldn't they have used a tuple instead of an array? Well, there's no way to iterate or index a tuple in Swift, so that would have been completely useless. And, even if there were, how would you get at the types of the values? Well, C++ has a solution to that: you get a tuple of values, and a type tuple of types.

In fact, even C solves this problem. In a horribly unsafe way, of course (you have to somehow pass the type information out-of-band, and then switch on that to pull out each value correctly). But still…

The code above will actually compile, because I've made the sequences all of type Sequence, but that isn't a useful type. (As explained somewhere in here, you need a generic function or type that can supply the associated types for Sequence if you want to do anything with it.)

Does zip really need to be heterogeneous?

Yes. That's the whole point of zip. You can take an Int[] and a String[] and get back an (Int, String)[].

(Transposing a T[][] is a useful thing to do, and zip happens to work for that in Python. But it's not the main function of zip, and I don't see any problem writing it as a separate function with a different name.)

Building tuples

Given that it's impossible to access a tuple dynamically, it's probably no surprise that it's also impossible to build one dynamically. So, even if you could somehow access a heterogenous tuple of sequences, you couldn't actually return anything useful from the next method.

Any way around the problem?

The way C++ used to handle this kind of thing, before parameter packs, was to use overloads: you define 10 functions with the same name, taking from 0 to 9 parameters, copying and pasting and modifying the code for each one. This is hideous, but between the preprocessor (which Swift doesn't have) and some clever compile-time metaprogramming with templates (which Swift doesn't have) you can hide the hideousness. But, even if you were willing to live with that hideousness, you don't have a choice, because in Swift, functions don't have overloads, only methods.

Another way to handle this in C++ is with default values. In C++, you had to define a special singleton type to use as a sentinel that could be detected via type inference. You could then either switch on that sentinel with if statements, or use partial specialization to get a similar effect to overloads. The good news is that Swift already provides the perfect value, nil, which you can use just by making all of your types optional. The OK news is that if statements are your only option. The bad news is that it won't work anyway. Here's an example:
    func foo(_ t0: T0?=nil, _ t1: T1?=nil, _ t2: T2?=nil) {
        if t0 {
            if t1 {
                if t2 {
                    println("\(t0) \(t1) \(t2)")
                } else {
                    println("\(t0) \(t1)")
                }
            } else {
                println("\(t0)")
            }
        } else {
            println("nothing")
        }
    }
If you try to call this with one, or two arguments, you'll get an error "cannot convert the expression's type '()' to type 'Int'" (or 'StringLiteralConvertible', or whatever the type of the last argument is). That error doesn't make a lot of sense, but it's clearly a type-inference problem, and if you think about it, it makes sense: The compiler has no problem figuring out that T1 is an Int if you pass an Int for t1, but if you leave t2 off (or if you pass an untyped nil literal, which has the same effect), how can it possibly figure out what type T2 is? It's just tried to print the error for the wrong type.

If you try to call it with zero or three arguments, you get a compiler crash instead. Which I assume is the same error handled even worse.

Falling back to dynamic (ObjC) types?

Needless to say, everything is a lot more successful if you try to define things in terms of Sequences of Any (or of AnyObject, or NSArrays, or…), except for the slight problem that you're very unlikely to actually have sequences of Any lying around that you want to zip up, so you still wouldn't be able to call zip on, say, the examples given at the top.

Fill value

What should zip do if given sequences of different lengths? There are three basic possibilities:
    • Stop as soon as any of the sequences stops.
        Keep going until all sequences stop, filling in missing values with some specified fill value.
          Keep going until all sequences stop, generating short tuples. The third one sounds like a good idea at first, but it turns out to be weird and rarely useful—which is good, because it would be very hard to implement in Swift. Python provides the other two as separate functions, zip and zip_longest. For zip_longest, the fill value is an extra keyword parameter after the sequences.

          The problem should be immediately obvious: If the two sequences are of different types, you need two different fill values, right?

          But often, there's a value that will work for both types, possibly with a cast. The obvious case here is nil, which you might even want to be the default.

          In order to specify something like that in a static language, you'd need some way to constrain the value to be convertible to both of the other types. Something like this:
              func zip_longest, U: Convertible>
                  (s0: S0, s1: S1, #fillvalue: U) { … }
          
          Except that protocols aren't generics, they're simple types with associated types that get matched up separately. So, U has to implement Convertible with associated type T0, and also implement Convertible with associated type T1, which… doesn't work. Unlike Java/C#-style generic interfaces, where being Convertible to T0 and Convertible to T1 are separate types that can be specified separately, Swift's associated-type-based protocols mean they're both the same type, Convertible, with the same associated type, U.ConvertibleTo, which has to be either T0 or T1.

          Could we specify two separate Convertible types? Sure:
              func zip_longest
          
          But this won't do any good, because fillvalue has to be either U0 or U1, it can't be both.

          I tried adding another type in, U, specifying U: U0 and U: U1, but (a) that leads to a compiler segfault, and (b) I don't think there's any way the compiler could infer the type anyway, meaning you'd have to figure out and type out all that mess every time you wanted to call the function.

          Conclusions

          Ultimately, I went with zip with exactly two sequences, zip_longest with exactly two sequences that must be optional (filling with nil), and zip_fill with exactly two sequences plus two fillvalues, not because that's a good design, but because it seems to be the only thing even remotely possible in Swift.
  • Sequences aren't first-class sequences

    Here's some simple code:
        var a2d: Int[][] = [[], []]
        for a1d in a2d { a1d.append(1) }
    
    You might think this would change a2d to [[1], [1]], just as the equivalent code in Python, C++, or almost any other language would. You'd be wrong. It does this:
        :63:18: error: immutable value of type 'Array' only has mutating members named 'append'
    
    Apparently iterating over a Sequence gives you immutable values. You can try to write your own Sequence and Generator that doesn't work that way, and it'll work that way anyway, just as the built-in ones do.

    So, what's the workaround? For arrays, of course, you can always write code like this:
        for i in 0..a2d.count { a2d[i].append(1) }
    
    But for a general-purpose Sequence, there's just nothing you can do.

    Well, I suppose you could give up on mutability and write everything in Swift immutably. In Python or Haskell, that tends to be briefer and often faster. What about Swift? Well, here's the code to build a new 2D array that's a copy of the old one but with 1 appended to every row:
        let tmp: Int[][]
        for a1d in a2d {
            var row = []
            for val in a1d {
                row = row + [val] // don't use += here!
            }
            row = row + [1]
            tmp = tmp + [row]
        }
    
    So, only 9x as much code. And from a quick test, it only takes 23x longer to run. Great!

    Crashity crash crash

    As I've been going along trying to port itertools to Swift, I've gotten a lot of compiler crashes (see radar 5057608666841088 for details), which look like this:

    teetest.swift:8:9: error: unimplemented IR generation feature non-fixed class layout
        var helper: TeeHelper<S, T>
            ^
    LLVM ERROR: unimplemented IRGen feature! non-fixed class layout
    retina:swift-itertools abarnert$ swift teetest.swift
    0  swift                    0x00000001058b6608 llvm::sys::PrintStackTrace(__sFILE*) + 40
    1  swift                    0x00000001058b6af4 SignalHandler(int) + 452
    2  libsystem_platform.dylib 0x00007fff888b35aa _sigtramp + 26
    3  swift                    0x0000000104d045e7 profileArchetypeConstraints(swift::Type, llvm::FoldingSetNodeID&, llvm::DenseMap<swift::ArchetypeType*, unsigned int, llvm::DenseMapInfo<swift::ArchetypeType*> >&, unsigned int) + 295
    4  swift                    0x00000001057b31bf llvm::StoreInst::StoreInst(llvm::Value*, llvm::Value*, bool, llvm::Instruction*) + 63
    5  swift                    0x0000000104d07e32 swift::irgen::IRBuilder::CreateStore(llvm::Value*, llvm::Value*, swift::irgen::Alignment) + 66
    6  swift                    0x0000000104cc5b4c emitNominalMetadataRef(swift::irgen::IRGenFunction&, swift::NominalTypeDecl*, swift::CanType) + 1724
    7  swift                    0x0000000104cb441c llvm::Value* swift::CanTypeVisitor<(anonymous namespace)::EmitTypeMetadataRef, llvm::Value*>::visit<>(swift::CanType) + 124
    8  swift                    0x0000000104cb4395 swift::irgen::IRGenFunction::emitTypeMetadataRef(swift::CanType) + 21
    9  swift                    0x0000000104cf76a4 swift::irgen::WitnessSizedTypeInfo<(anonymous namespace)::NonFixedStructTypeInfo>::getAlignmentMask(swift::irgen::IRGenFunction&, swift::CanType) const + 20
    10 swift                    0x0000000104cae562 swift::irgen::HeapArrayInfo::getLayout(swift::irgen::IRGenFunction&) const + 226
    11 swift                    0x0000000104cae938 swift::irgen::HeapArrayInfo::getPrivateMetadata(swift::irgen::IRGenModule&) const + 504
    12 swift                    0x0000000104caf9a9 swift::irgen::HeapArrayInfo::emitUnmanagedAlloc(swift::irgen::IRGenFunction&, llvm::Value*, swift::irgen::Address&, llvm::Twine const&) const + 73
    13 swift                    0x0000000104d23304 swift::SILVisitor<(anonymous namespace)::IRGenSILFunction, void>::visit(swift::ValueBase*) + 30884
    14 swift                    0x0000000104d1b266 swift::irgen::IRGenModule::emitSILFunction(swift::SILFunction*) + 8678
    15 swift                    0x0000000104c9c6f8 swift::irgen::IRGenModule::emitGlobalTopLevel() + 184
    16 swift                    0x0000000104d086e3 performIRGeneration(swift::IRGenOptions&, swift::Module*, swift::SILModule*, llvm::StringRef, llvm::LLVMContext&, swift::SourceFile*, unsigned int) + 1859
    17 swift                    0x0000000104d09033 swift::performIRGeneration(swift::IRGenOptions&, swift::SourceFile&, swift::SILModule*, llvm::StringRef, llvm::LLVMContext&, unsigned int) + 51
    18 swift                    0x0000000104c7b65a frontend_main(llvm::ArrayRef<char const*>, char const*, void*) + 4842
    19 swift                    0x0000000104c7a35d main + 1533
    20 libdyld.dylib            0x00007fff894c55fd start + 1
    Stack dump:
    0. Program arguments: /Applications/Xcode6-Beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swift -frontend -c -primary-file teetest.swift -enable-objc-attr-requires-objc-module -target x86_64-apple-darwin13.2.0 -module-name teetest -color-diagnostics -o /var/folders/gx/cs8knrj93rlc76rz7rg706dm0000gn/T/teetest-4b2ff9.o
    1. While emitting IR SIL function @_TF7teetest3teeUSs8Sequence__USs9Generator__FTQ_1nSi_GSaGVS_3TeeQ_Q0___ for 'tee' at teetest.swift:30:1
    <unknown>:0: error: unable to execute command: Segmentation fault: 11
    <unknown>:0: error: swift frontend command failed due to signal (use -v to see invocation)
    
    Apparently the compiler is trying to either infer or check my types, in this case in the tee function shown here:
    struct TeeHelper<S: Sequence, T where T == S.GeneratorType.Element> {
        var gen: S.GeneratorType
        var queues: T[][]
    }
    
    struct Tee<S: Sequence, T where T == S.GeneratorType.Element>
            : Sequence, Generator {
        var helper: TeeHelper<S, T>
        var i: Int
        func generate() -> Tee<S, T> { return self }
        mutating func next() -> T? {
            return nil
            /*
            if helper.queues[i].isEmpty {
                if let val = helper.gen.next() {
                    /* iterating over queues gives us immutable copies of each! */
                    for i in 0..helper.queues.count {
                        helper.queues[i].append(val)
                    }
                } else {
                    return nil
                }
            }
            return helper.queues[i].removeAtIndex(0)
            */
        }
    }
    
    func tee<S: Sequence, T where T == S.GeneratorType.Element>
            (sequence: S, n: Int = 2) -> Tee<S, T>[] {
        return []
    /*
        var gen = sequence.generate()
        var queues: T[][] = []
        var helper = TeeHelper<S, T>(gen: gen, queues: queues)
        var tees: Tee<S, T>[] = []
        for i in 0..n {
            tees.append(Tee(helper: helper, i: i))
            helper.queues.append([])
        }
        return tees
    */
    }
    
    Notice that I've commented out all the code that looks like it should matter.

    When the functions are simple enough, I can often figure out what I've done wrong, or what I've done right that apparently doesn't work right but I can change easily, or what actually isn't possible if you think it through. (Most commonly, the fix is to replace a Sequence type with S, which is a generic type parameter that's restricted to Sequence but also has a where clause on it.)

    But in this case, I'm baffled. Maybe I'm missing something stupid, but I don't see anything here that's any different from what I've done in a dozen other functions that all work, except for returning my Tee objects in an array. If you change the type of tee to Tee<S, T>? and return nil, the crash does go away, but why should it? Especially since I've stored Count, TakeWhile, etc. objects in arrays (even arrays of less exact types, like the Generator protocol) with no problem. (Anyway, that obviously wouldn't be a solution here; the whole point of tee is to return n separate generators…)

    If I change the tee function to return a Generator[] or Sequence[] instead of an array of my specific type, the compiler crash goes away, but then you can't actually use the values. The error message when you try is misleading: "could not find member 'next'" (or 'generate'). As far as I can tell, the problem is that protocols that have associated types can't be used to call any methods (or access any values) that refer to those associated types, unless you've bound them somehow in the local (generic) type or function. (Which seems to be a pretty serious problem in its own right, assuming they actually intended protocols to be as usable as ObjC protocols, Java interfaces, etc.—as in things you can store, pass around, and actually call methods on… But let's ignore that.)

    It may or may not be relevant that changing either struct to a class (for Tee of course you also need to remove the mutable keyword on next and add an explicit initializer) replaces the compiler crash with a compiler data error ("unimplemented IR generation feature non-fixed class layout"), both in this case and in most of the other crashes I've been before.

    I could deal with any one of incomplete documentation, an incomplete standard library, or a buggy compiler that crashes on me, but all three at once is starting to make this seem like a fool's errand.

    Repetition, repetition

    In my previous post, I started porting parts of itertools from Python to Swift. Everything went pretty smoothly, but not exactly fun, because there's a ton of boilerplate and no way that I can see to factor it out.

    Weak types

    Each itertools function requires creating both a struct and a function, a minimum of 17 lines, 14 of which are identical, the other 3 identical but for naming our struct. This is exactly the kind of thing you'd want to factor out. But can you?

    First, every Sequence-transforming function actually requires a struct for the new Sequence, plus a function that constructs that struct. This takes a minimum of 17 lines of code, of which 14 are always identical, and 3 are identical but for the name of the struct. And there doesn't seem to be any way to remove that by inheriting from a base struct or mixin, or by writing a function that generates the struct and function and returns the function, or in any other way. So you have to write them every time. (You could, of course, write a code generator, but Swift isn't exactly Lisp, so that is of little help.)

    Even in the simplest case, the generate method always looks like this:
        func generate() -> Cycle {
            return self
        }
    
    There's no way to specify Cycle<S, T>, the type currently being defined. Although self does mean "the current type", you can't use it here. In C++, you'd make each subclass define a typedef this_type, then use that in the base/mixin, but that doesn't seem to work with Swift.

    The lack of a way to write forwarding functions makes this even more painful. You can declare an associated type and where-bind it to the type of a function that you take as an initializer argument; you can even where-bind its argument type tuple and its result type separately. But that still doesn't help you define a new function of the same type that takes all of its parameters, forwards them to the other function, and returns its return value. ython makes this possible through duck typing (just accept *args, **kw and call the target function with *args, **kw) and dynamic attributes (copy the signature-related attributes from the target function for better introspection). C++ and Haskell make this possible through strong types and type inference. (The fact that generic forwarding wasn't perfect in C++ (because of the issue of reference vs. value parameters) is one of the major reasons C++11 was necessary.)

    The struct's initializer always takes the arguments and stores them, except for storing sequence.generate() in the case where one or more of the arguments is a sequence (which could be moved later if it helped), so you'd think you could just use a default initializer. But no, you can't, because there are almost always additional ("private") members that shouldn't be initialized that way, but would be if you tried.

    Let's take two simple functions, takewhile and filter (of course filter already exists, but pretend it didn't), whose difference is entirely in the next function, and yet which still require repeating 17 lines of code (replacing the generic name in 3 of those lines) just so you can write the 4 different lines:
        struct TakeWhile
                : Sequence, Generator {
            var gen: S.GeneratorType
            let pred: T->Bool
            init(sequence: S, predicate: T->Bool) {
                gen = sequence.generate()
                pred = predicate
            }
            func generate() -> TakeWhile {
                return self
            }
            mutating func next() -> T? {
                if let val: T = gen.next() {
                    if pred(val) { return val }
                }
                return nil
            }
        }
    
        func takewhile
                      (sequence: S, predicate: T->Bool) 
                      -> TakeWhile {
            return TakeWhile(sequence: sequence, predicate: predicate)
        }
    
        struct Filter
                : Sequence, Generator {
            var gen: S.GeneratorType
            let pred: T->Bool
            init(sequence: S, predicate: T->Bool) {
                gen = sequence.generate()
                pred = predicate
            }
            func generate() -> DropWhile {
                return self
            }
            mutating func next() -> T? {
                while true {
                    if let val: T = gen.next() {
                        if pred(val) { return val }
                    } else { 
                        return nil
                    }
                }
            }
        }
    
        func filter
                      (sequence: S, predicate: T->Bool) 
                      -> Filter {
            return Filter(sequence: sequence, predicate: predicate)
        }
    
    In Python, of course, you'd write these as generator functions, eliminating almost all of the boilerplate. And, even if you didn't, other differences (e.g., not needing types with wrapper functions, exceptions vs. checking and explicitly propagating failures) would eliminate a lot of it. But even if you wanted to write them in the same style and verbosity that Swift requires, you could still factor out the boilerplate after the fact. For example:
        class PredicateIterator:
            __slots__ = (gen, pred)
            def __init__(self, sequence, predicate):
                self.gen = iter(sequence)
                self.pred = predicate
            def __iter__(self):
                return self
    
        class TakeWhile(PredicateIterator):
            def __next__(self):
                try:
                    val = next(self.gen)
                except StopIteration:
                    raise
                if pred(val):
                    return val
                raise StopIteration
        def takewhile(sequence, predicate):
            return TakeWhile(sequence, predicate)
    
        class Filter(PredicateIterator):
            def __next__(self):
                while True:
                    try:
                        val = next(self.gen)
                    except StopIteration:
                        raise
                    if self.pred(val): return val
    
    Or:
        def make_predicate_iterator(nextfn):
            class PredicateIterator:
                __slots__ = (gen, pred)
                def __init__(self, sequence, predicate):
                    self.gen = iter(sequence)
                    self.pred = predicate
                def __iter__(self):
                    return self
                __next__ = nextfn
            def predicate_iterator(sequence, predicate):
                return PredicateIterator(sequence, predicate)
            return predicate_iterator
    
        def _takewhile_next(self):
            try:
                val = next(self.gen)
            except StopIteration:
                raise
            if pred(val):
                return val
            raise StopIteration
        takewhile = make_predicate_iterator(_takewhile_next)
    
        def _filter_next(self):
            while True:
                try:
                    val = next(self.gen)
                except StopIteration:
                    raise
                if self.pred(val): return val
        filter = make_predicate_iterator(_filter_next)
    
    And so on. Any way you can think of to refactor things—using a base class or mixin or metaclass, or a function that creates a class or closure and returns it, etc.—will work. Not so in Swift.

    Defaults and types

    Consider a standard fold/reduce/sum function. The fully general version takes a sequence, a combining function, a start value, and a transforming function, like this:
        func reduce(sequence, combine, start, transform) {
            for value in sequence {
                start = combine(start, transform(value))
            }
            return start
        }
    
    But often you don't want to specify all of those values. For example, it's very convenient to sum an array without specifying 0 as the starting value. Or to reduce a list without transforming any of its elements. And so on.

    In a language without neither overloads nor default values, your only choice is to figure out all the useful combinations, come up with a name for each, and write a whole slew of separate functions:
    • sum1(sequence)
    • sum(sequence, start)
    • sum1t(sequence, transform)
    • sumt(sequence, start, transform)
    • fold1(sequence, combine)
    • fold(sequence, combine, start)
    • fold1t(sequence, combine, transform)
    • foldt(sequence, combine, start, transform)
    But Swift has both overloads and C++-style default values, right? Well, no, that's for methods; for functions, there are no overloads, and default values work like Python instead. But that's fine. Could you write this all as one function with default values in Python? Sure:
        _sentinel = object()
        def fold(sequence, combine=lambda x, y: x+y, start=sentinel, transform=lambda x: x):
            sequence = iter(sequence)
            if start is _sentinel:
                start = transform(next(sequence))
            for value in sequence:
                start = combine(start, transform(value))
            return start
    
    You might want to split off the sum and reduce cases anyway, just so sum([]) can be sensible (and zero—by specifying a default start of 0 instead of the first element), as Python's stdlib in fact does, but the point is, you have the choice to split things up however you want. In Swift, you don't. Let's try:
        func fold
                (sequence: S, combine: (T, T)->T = { $0+$1 }, 
                 start: T? = nil, transform: T->T = { $0 }) -> T {
            var gen = sequence.generate()
            var current = (start == nil) ? transform(gen.next()!) : start!
            while let value = gen.next() {
                current = combine(start!, transform(value))
            }
            return current
        }
    
    Notice a few other changes I had to make, from the good through the bad to the worse:
    • Swift's optional values mean I didn't need to define a custom sentinel (which is necessary in Python whenever the usual sentinel, None, could be a valid value in your domain).
    • Swift can't iterate over a generator, only a sequence, if you want to pop a value off before the loop, you have to replace the for loop with a while-let-next loop. (The average Python programmer doesn't know how to map a for statement to a call to iter and a loop over next; in Swift, it's pretty much necessary information.)
    • Swift won't let you assign to an in parameter. Parameters might be reference types, in which case you'd be changing the caller's variable, or they might be value types, in which case you'd just be changing a local—and this is pretty much invisible at the point of function definition, because it's up to the parameter type. This seems like an unavoidable consequence of other design decisions in Swift.
    • We can test that start is nil, and we can test that the generator is empty… but what happens if start is nil and the generator starts out empty?
    In the code above (if it worked at all, that is), calling fold with no start and an empty generator would be a fatal error. That can't be right. Unexpected data shouldn't be a fatal error. In Haskell, Python, C++, C#, etc., it would be an exception. Clearly, then, the intention is that we're supposed to be using optional values a lot more in Swift than we'd use Maybe in Haskell, None in Python, nil in ObjC, etc. But look at the line of code. We can't use an if-let to try to bind current to start, because if that fails we have no way to bind it to gen.next(). Whatever we use, it can't be an expression (unless we cant to make current a T? just so we can check it immediately after binding it and return nil if it's nil and otherwise ! it everywhere). The most concise possibility I've been able to come up with is this hideous mess:
        var current: T
        if start == nil {
            if let firstval = gen.next() { 
                current = firstval
            } else {
                return nil
            }
        } else {
            current = start!
        }
    
    This may be a sign of my lack of understanding of the language, but this, together with the while-let-next loop, seems to point to the fact that Swift is not actually a language designed to be concise, like Python or Haskell; it's a verbose language like Java or C++, with bits of syntactic sugar that paper over that verbosity in some cases, which means you're frequently going to be forced to write—and, worse, read—the long versions anyway, and that even a novice has to understand how to map it pretty early on.

    But let's ignore all that. Does our function work? Nope: "could not find an overload for '+' that accepts the supplied arguments". The only way to fix that is to restrict it (via more where clauses for associated types) to only work on numbers. You can't write a fold function that has a default function for numbers, but not for other types. You have to write two separate functions, with separate names. In Python, of course, you'd just duck type it, and anyone who tried to fold a non-addable type without specifying a combine function would get a TypeError exception at runtime. In C++, you'd have many options—partial specialization, overloads and SFINAE, or a trait class template—so that the function would be perfectly usable with non-addable types (as long as you specify a combine function, directly or indirectly) and with addition as a default combine function (as long as the type is addable). Other strongly-typed languages like Haskell can do effectively the same as in C++, while most other modern static languages that don't have strong enough types fake it by raising an exception dynamically, just as Python would (theoretically a terrible solution, but practically, better than nothing).

    I think this is the biggest problem with Swift. The intent was obviously to build a strictly statically-typed language that doesn't rely on exceptions, but without all the complexity of C++ templates or Haskell abstract data types. But the reality is that Java-style generics, especially interacting with C++-style associated types, aren't strong enough to make that actually work. Again, maybe I just don't know the language well enough, but I'm not encouraged.

    Anyway, how does this fit into my attempt to port itertools? Well, it just means accumulate has to be split into two functions, accumulate (which requires a reducing function), and cumulative_sum (which requires addable types).

    Saturday, June 14, 2014

    The trouble with weak static types

    I got to this conclusion in a roundabout way, then I discovered there's a much simpler way to get there, so let me start there first.

    JSON

    In a dynamic language like Python, JavaScript, Smalltalk, or the non-C part of Objective C, dealing with JSON is so easy you barely even notice you're doing it. You have heterogeneous arrays and hetereogeneous dicts, so you can parse JSON to native objects, process them with the usual subscript operators, and generate JSON from them.

    In a static language with a strong type system like Haskell or OCaml, it's arguably even easier. You have a recursively-defined ADT which is any of the base types, an array of this type, or a dict mapping strings to this type. You can parse JSON to instances of that ADT, process it with the usual subscript operators, generate JSON from them.

    Even without ADTs, C++ templates are powerful enough to achieve the same idea, albeit more verbosely.

    Whichever way you get there, your collections are real collections that can be handled by the same duck-typed, monadic, or STL functions as any other types. Duck typing or type inference works exactly as you'd expect—e.g., when you call foo on every member of bar and bar[1] is a string, you get foo<string>(bar[1]). The dynamic version gives you a TypeError exception (or calls out to an application-supplied hook) if you try to do something illegal, like JSON-ifying a collection with a function or socket in it. The static versions are even better: anything that could conceivably be caught at compile time is a compiler error, so you only have to check for runtime errors for bad data, not for code errors.

    But Java's type system can't even come close here. Unlike C++ templates, Java generics are no help at all. Even besides the lack of inference, it's just not possible to describe the types you need. The best you can do is store a dictionary or array of "variant" objects, which basically means you cast everything to void* and back, and then fake dynamic types on top of that, then use checked casts to do all of the type checking Haskell or C++ would do explicitly, and at runtime. This is a gigantic pain.

    Of course you can remove some of that machinery and just have unsafe code that crashes with the wrong data and is still a big pain. Or you can try to build opaque JSON-related types with special methods to access and manipulate them, which means the objects aren't even collections at all anymore. But there's nothing you can do that doesn't suck.

    The joke is that languages like Java give you the safety of dynamic typing with the flexibility of static typing, but the reality is even worse than the joke; a dynamic language with properly duck-typed APIs is a lot safer than a static language that has to cast.

    From what I can tell, Swift is just like Java here. To deal with JSON, you effectively fall back to ObjC and its dynamic types, and then either explicitly cast values, or call special accessor methods, to get back to static-land.

    Accumulate

    Getting back to the first place where I ran into this problem: I've been trying to convert Python's itertools module to Swift, more as an exercize in learning the language than for any particular purpose, and I've run into a few stumbling blocks. The first insurmountable one was with accumulate.

    Accumulate takes a Sequence (an Iterable, in Python terms) and a function and generates a running total: x[0], func(x[0], x[1]), func(func(x[0], x[1]), x[2]), etc. In other words, it's a cumulative reduce/fold function. The function is optional, defaulting to addition. Here are examples of the four use cases, including the one that's nonsensical:
        >>> list(accumulate(range(1, 6))
        [1, 3, 6, 10, 15]
        >>> list(accumulate(range(1, 6), mul)
        [1, 2, 6, 24, 120]
        >>> list(accumulate([{1,2,3,4,5,6}, range(5), {i for i in range(100) if i%2}], set.intersection))
        [{1, 2, 3, 4, 5, 6}, {1, 2, 3, 4}, {1, 3}]
        >>> list(accumulate([{1,2,3,4,5,6}, range(5), {i for i in range(100) if i%2}]))
        TypeError: unsupported operand type(s) for +: 'set' and 'range'
    
    In Python, as you'd expect, if you try the nonsensical use case, you get a TypeError. Of course accumulate doesn't have to do anything special; adding two non-addable values always raises a TypeError, and it passes through accumulate because that's the point of exceptions.

    What about C++? Well, you could implement this with partial specialization, overloads with either + or a trait, or parameter with a default value of add; thanks to type inference (and SFINAE, in some cases), all of these will generate working code, with no runtime checks, for all three valid uses, and fail at compile time for the invalid use. That's ideal. And all without you having to write the more hideous of the types yourself, thanks to inference.

    It's a little harder to compare languages like Haskell, because most of the well-known ones have neither overloads nor default parameter values. But it should be pretty obvious that if you found a way to add some overload system to a language like Haskell (presumably giving up currying) so the four separate functions had the same name, you'd get a compile-time error trying to call the one with the default function with a non-addable type.

    In Java or C#, the type system isn't powerful enough to handle this directly, but you can at least fake it and get the Python-style behavior with runtime checking. It sucks, but not too badly.

    Swift and accumulate

    In Swift, the type system has the same problem as Java's, but you can't even fake it with runtime checking either.

    Partly this is because Swift doesn't do exceptions, which are exactly what you want here (if you can't get compile-time errors). Your Sequence can't produce nil, because generate returns GeneratorType, not GeneratorType?. Maybe your Generator could produce nil (although I haven't been able to get that to work), but that would just be wrong; that would mean that you successfully generated an empty sequence, not that you failed. So, the right thing to do is a (runtime) fatal error. In most cases, that seems like the wrong thing, but trying to accumulate a sequence of non-addable objects with the addition function is basically a fatal error in your code.

    Unfortunately, there's no way to do that either. Here's the obvious attempt:
        struct Accumulate
                : Sequence, Generator {
            var generator: S.GeneratorType
            var value: T?
            var f: ((T, T)->T)
            init(sequence: S, f: (T, T)->T) {
                self.generator = sequence.generate()
                self.value = generator.next()
                self.f = f
            }
            func generate() -> Accumulate {
                return self
            }
            mutating func next() -> T? {
                let thisvalue = value
                if let nextvalue = generator.next() {
             value = f(value!, nextvalue)
                } else { value = nil }
                return thisvalue
            }
        }
    
        func accumulate
                (sequence: S, f: (T, T)->T = { $0 + $1 }) -> Accumulate {
            return Accumulate(sequence: sequence, f: f)
        }
    
    But this will just fail with "could not find an overload for '+' that accepts the supplied arguments". Why? Well, you're creating a single generic Sequence type, and a generic function that tries to instantiate that type with a function that adds, no matter what the associated types are. That's illegal. (In C++, it isn't a problem, for two reasons. First, you're defining a schema of separate types, and a schema of separate functions, and instantiating the types that have addable elements is perfectly legal with the adding function. Second, depending on which of the mechanisms you use for defining things, you're going to end up with either a function or a member that will raise a compiler error if instantiated on the wrong type, but that will only happen if you actually try to call accumulate with the default function on a non-addable type; a function you never instantiate never raises an error.)

    So, how could you fix this? Well, you can eliminate that error just by adding more restrictions to the types. For example, define a Protocol for Addable, and require T to support Addable in Accumulate, and now everything compiles. But then you can't use Accumulate with non-addable types at all, even with a non-adding function.
    Maybe you could define a type-dependent default function, using a trait class that does different things for different types? Nope; that would work with C++ templates, but not with Swift generics. Besides, what would you want the trait-selected function to do for non-addable types? There's no way to force an error if it gets called at compile time, or at runtime.

    What about falling back to dynamic (ObjC) types here? Maybe if you took a function of (AnyObject, AnyObject)->AnyObject and converted each generated value to AnyObject and converted back in next you could get a fatal error or nil at runtime somehow. But that's not what you want. For the normal use, you want to be able to take an explicit (T, T)->T function the user has already built, or an inline closure that will be inferred as that type.

    What about having a ((T, T)->T)? function, and if it's nil, calling a function that checks the types or converts to and from AnyObject, to hide the attempt to add possibly non-addable types from the compiler? That seems like it should work, but I haven't been able to get close enough to find out, I think because of 1.0-beta issues. For example, at least one of my attempts (either if-let-binding a function constant or calling a function with ! on the function variable, I forget which) crashed the compiler even if the rest of the function was empty. If it did work, we're just back to the question of what should happen here. I guess forcing a fatal error should be doable? Is there an idiomatic way to do that? Anyway, until I can write that, it doesn't really matter.

    Solution

    The only solution I can see is to define two completely separate functions: accumulate works on any type, but requires an explicit function; cumulative_sum is exactly the same, except it works only on addable types, and has adding as a default function.

    It's also worth noting that defining the concept of "addable" isn't trivial in the first place. You'd think there would be some way to make the compiler infer it. But I haven't been able to figure it out. I can write a Protocol and explicitly register all the types I can think of, and explain to my users how to register any additional types. Or I could do a C++98-style traits class, with effectively the same results. But I can't just say "iff x+y compiles when x: T, y: T, T is addable". (Of course this has been a big problem in other languages as well—Concepts were supposed to be the coolest and most important feature in C++11, and instead they got bumped to a future version. But you'd think that designing a language from scratch would make it easier to solve. And there are languages that have solved it. And, of course, half the point of duck typing is to make this problem not come up in the first place…)

    Fold/reduce/sum

    In fact, even without trying to create a generator, you've got the same problem. Consider a standard fold/reduce/sum function. The fully general version takes a sequence, a combining function, a start value, and a transforming function, like this:
        func reduce(sequence, combine, start, transform) {
            for value in sequence {
                start = combine(start, transform(value))
            }
            return start
        }
    
    But often you don't want to specify all of those values. For example, it's very convenient to sum an array without specifying 0 as the starting value. Or to reduce a list without transforming any of its elements. And so on.

    In a language without neither overloads nor default values, your only choice is to figure out all the useful combinations, come up with a name for each, and write a whole slew of separate functions:
    • sum1(sequence)
    • sum(sequence, start)
    • sum1t(sequence, transform)
    • sumt(sequence, start, transform)
    • fold1(sequence, combine)
    • fold(sequence, combine, start)
    • fold1t(sequence, combine, transform)
    • foldt(sequence, combine, start, transform)
    But Swift has both overloads and C++-style default values, right? Well, no, that's for methods; for functions, there are no overloads, and default values work like Python instead. But that's fine too. Could you write this all as one function with default values in Python? Sure:
        _sentinel = object()
        def fold(sequence, combine=lambda x, y: x+y, start=sentinel, transform=lambda x: x):
            sequence = iter(sequence)
            if start is _sentinel:
                start = transform(next(sequence))
            for value in sequence:
                start = combine(start, transform(value))
            return start
    
    You might want to split off the sum and reduce cases anyway, just so sum([]) can be sensible (and zero—by specifying a default start of 0 instead of the first element), as Python's stdlib in fact does, but the point is, you have the choice to split things up however you want; you could make them a single function, as Python does with the cumulative iterator version, accumulate. In Swift, you don't have the choice. Let's try:
        func fold
                (sequence: S, combine: (T, T)->T = { $0+$1 }, 
                 start: T? = nil, transform: T->T = { $0 }) -> T {
            var gen = sequence.generate()
            var current = (start == nil) ? transform(gen.next()!) : start!
            while let value = gen.next() {
                current = combine(start!, transform(value))
            }
            return current
        }
    
    Notice a few other changes I had to make, from the good through the bad to the worse:
    • Swift's optional values mean I didn't need to define a custom sentinel (which is necessary in Python whenever the usual sentinel, None, could be a valid value in your domain).
    • Swift can't iterate over a generator, only a sequence, so if you want to pop a value off before the loop, you have to replace the for loop with a while-let-next loop. (The average Python programmer doesn't know how to map a for statement to a call to iter and a loop over next; in Swift, it's pretty much necessary information.)
    • Swift won't let you assign to an in parameter, even though that parameter is just a copy of the passed-in value. Parameters might be reference types, in which case you'd be changing the caller's variable, or they might be value types, in which case you'd just be changing a local—and this is pretty much invisible at the point of function definition, because it's up to the parameter type. This seems like an unavoidable consequence of other design decisions in Swift.
    • We can test that start is nil, and we can test that the generator is empty… but what happens if start is nil and the generator starts out empty?
    In the code above (if it worked at all, that is), calling fold with no start and an empty generator would be a fatal error. That can't be right. Calling with no function and a generator of non-addable values, sure, that's clearly a type error in your code, but trying to sum up a generator of addable values that happens to be empty is either perfectly reasonable, or an error in the data rather than the code. In Haskell, Python, C++, C#, etc., it would be an exception. Clearly, the intention is that we're supposed to be using optional values a lot more in Swift than we'd use Maybe in Haskell, None in Python, nil in ObjC, etc. But look at the line of code. We can't use an if-let to try to bind current to start, because if that fails we have no way to bind it to gen.next(). Whatever we use, it can't be an expression (unless we cant to make current a T? just so we can check it immediately after binding it and return nil if it's nil and otherwise ! it everywhere). The most concise possibility I've been able to come up with is this hideous mess:
        var current: T
        if start == nil {
            if let firstval = gen.next() { 
                current = firstval
            } else {
                return nil
            }
        } else {
            current = start!
        }
    
    That's 10 lines of code to do one thing: fill in a missing start from the generator, returning nil if empty. And maybe this is a sign of my lack of understanding of the language rather than of the language itself, but I don't think so. Swift is actually a very verbose language, on the level of Java, but it has some syntactic sugar to provide shortcuts in a few cases, like if-let binding or nil propagation, and in cases where they don't apply, you have to write things out the (very) long way. And the fact that these cases come up as soon as you go beyond the examples in the beginner's guide means that even novices are going to have to understand what the shortcuts are doing under the hood right from the start, and treat them as sugar rather than as abstractions.

    But let's ignore all that. Does our function work now? Nope: "could not find an overload for '+' that accepts the supplied arguments". The same problem we had with accumulate, even in this simpler case. And again, the only way to fix that is to restrict it (via more where clauses for associated types) to only work on addable values. You can't write a fold function that has a default function for numbers, but not for other types. You have to write two separate functions, with separate names.

    I think this is the biggest problem with Swift. The intent was obviously to build a strictly statically-typed language that doesn't rely on exceptions, but without all the complexity of C++ templates or Haskell abstract data types. But the reality is that Java-style generics, especially interacting with C++-style associated types, aren't strong enough to make that actually work. Falling back on dynamic ObjC types instead of always needing casts may make this better than Java (although I honestly haven't tried it enough to know that's true for sure), but it's in the same family, making the same mistakes that should have been obvious before they even started

    Anyway, how does this fit into my attempt to port itertools? I guess it just means accumulate has to be split into accumulate and cumulative_sum. And I'll see if there are similar problems later.