Authors: Chris Lattner, Joe Groff, Dave Abrahams

Summary: Unifying a fast C-style array with a Cocoa class cluster that can represent arbitrarily complex data structures is challenging. In a space where no approach satisfies all desires, we believe we've found a good compromise.
A successfully bridged array type would be both "great for Cocoa" and "great for C."
Being "great for Cocoa" means this must work and be efficient:

    var a = [cocoaObject1, cocoaObject2]
    someCocoaObject.takesAnNSArray(a)

    func processViews(_ views: [AnyObject]) { ... }
    var b = someNSWindow.views // views is an NSArray
    processViews(b)

    var c: [AnyObject] = someNSWindow.views

Being "great for C" means that an array created in Swift must have C-like performance and be representable, at zero cost, as a base pointer and length for interaction with C APIs.
Array<T>, a.k.a. [T], is notionally an enum with two cases; call them Native and Cocoa. The Native case stores a ContiguousArray, which has a known, contiguous buffer representation and O(1) access to the address of any element. The Cocoa case stores an NSArray.
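The two-case layout described above can be sketched as an ordinary Swift enum. This is only an illustrative model, not the standard library's implementation: the name TwoCaseArray, its count property, and its subscript are hypothetical, the cases are spelled in lowerCamelCase per current Swift convention, and the forced cast in the Cocoa branch is a simplification.

```swift
import Foundation

// Hypothetical model of the notional two-case Array; all names are assumptions.
enum TwoCaseArray<T> {
    case native(ContiguousArray<T>)  // contiguous buffer, O(1) element address
    case cocoa(NSArray)              // opaque Cocoa class cluster

    var count: Int {
        switch self {
        case .native(let buffer): return buffer.count
        case .cocoa(let array): return array.count
        }
    }

    // Every access branches on the representation before reaching the element.
    subscript(index: Int) -> T {
        switch self {
        case .native(let buffer):
            return buffer[index]
        case .cocoa(let array):
            // Simplification: traps if the stored object is not a T.
            return array.object(at: index) as! T
        }
    }
}

let ints = TwoCaseArray<Int>.native([10, 20, 30])
let strings = TwoCaseArray<NSString>.cocoa(
    NSArray(array: [NSString(string: "a"), NSString(string: "bc")]))
```

Both values answer count and subscript through the same interface, but each call pays for a switch over the representation.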
NSArray bridges bidirectionally in O(1) [1] to [AnyObject]. It also converts implicitly to [T], where T is any class declared to be @objc. No dynamic check of element types is ever performed for arrays of @objc elements; instead we simply let objc_msgSend fail when T's API turns out to be unsupported by the object. Any [T], where T is an @objc class, converts implicitly to NSArray.
Any type with more than one representation naturally penalizes fine-grained operations such as indexing, because the cost of repeatedly branching to handle each representation becomes significant. For example, the design above would pose significant performance problems for arrays of integers, because every subscript operation would have to check whether the representation is an NSArray, find that it is not, and only then do the constant-time index into the native representation. Beyond the cost of the extra check itself, the branch would disable optimizations, such as auto-vectorization, that can provide a significant performance win.
However, the inherent limitations of NSArray mean that we can often know at compile-time which representation is in play. So the plan is to teach the compiler to optimize for the Native case unless the element type is an @objc class or AnyObject. When T is statically known not to be an @objc class or AnyObject, it will be possible to eliminate the Cocoa case entirely. When generating code for generic algorithms, we can favor the Native case, perhaps going so far as to specialize for the case where all parameters are non-@objc classes. This will give us C-like performance for array operations on Int, Float, and other struct types [2].
To implement this, we'll need a new generic builtin, something along the lines of "Builtin.couldBeObjCType<T>()", which returns a Builtin.Int1 value. SILCombine and IRGen should eagerly fold this to "0" iff T is known to be a protocol other than AnyObject, if it is known to be a non-@objc class, or if it is known to be any struct, enum or tuple. Otherwise, the builtin is left alone, and if it reaches IRGen, IRGen should conservatively fold it to "1". In the common case where Array<Element> is inlined and specialized, this will allow us to eliminate all of the overhead in the important C cases.
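The folding rule can be modeled as a plain function over what the compiler statically knows about T. This is a minimal sketch for illustration only: the StaticTypeKnowledge enum and this couldBeObjCType function are hypothetical stand-ins, and the real builtin would operate on types inside the compiler rather than on runtime values. Here false corresponds to "0" and true to "1".

```swift
// Hypothetical summary of what is statically known about the element type T.
enum StaticTypeKnowledge {
    case structEnumOrTuple
    case nonObjCClass
    case protocolOtherThanAnyObject
    case objcClass
    case anyObject
    case unknown  // e.g. an unspecialized generic parameter
}

// Models the proposed fold: false ("0") only when T provably cannot be
// an @objc class or AnyObject; otherwise conservatively true ("1").
func couldBeObjCType(_ knowledge: StaticTypeKnowledge) -> Bool {
    switch knowledge {
    case .structEnumOrTuple, .nonObjCClass, .protocolOtherThanAnyObject:
        return false
    case .objcClass, .anyObject, .unknown:
        return true
    }
}
```

When the result is false, the branch on representation is dead code and the Cocoa path can be dropped from the specialized array operations.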
For hardcore systems programming, we can expose ContiguousArray as a user-consumable type. That will allow programmers who don't care about Cocoa interoperability to avoid ever paying the cost of branching on representation. This type would not bridge transparently to Array, but could be useful if you need an array of Objective-C type, don't care about NSArray compatibility, and care deeply about performance.
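As a sketch of the "base pointer and length" goal, the example below uses ContiguousArray as it exists in the current Swift standard library, where withUnsafeBufferPointer exposes the contiguous storage. The function sumBuffer is a hypothetical stand-in for a C routine that takes a pointer and a count; it is written in Swift here so the example is self-contained.

```swift
// Hypothetical stand-in for a C API of the form: int32_t sum(const int32_t *p, size_t n)
func sumBuffer(_ base: UnsafePointer<Int32>, _ count: Int) -> Int32 {
    var total: Int32 = 0
    for i in 0..<count {
        total += base[i]  // raw pointer indexing, no representation branch
    }
    return total
}

let samples: ContiguousArray<Int32> = [1, 2, 3, 4]

// The closure receives the array's own storage as a pointer plus a length.
let total = samples.withUnsafeBufferPointer { buffer -> Int32 in
    guard let base = buffer.baseAddress else { return 0 }  // nil for an empty buffer
    return sumBuffer(base, buffer.count)
}
```

The pointer is only valid inside the closure, so any call into C has to happen there.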
We considered an approach where conversions between NSArray and native Swift Array were entirely manual and quickly ruled it out as failing to satisfy the requirements.
We considered another promising proposal that would make [T] a (hand-rolled) existential wrapper type. Among other things, we felt this approach would expose multiple array types too prominently and would tend to "bless" an inappropriately-specific protocol as the generic collection interface (for example, a generic collection should not be indexable with Int).
We also considered several variants of the approach we've proposed here, tuning the criteria by which we'd decide to optimize for a Native representation.
[1] Value semantics dictates that when bridging an NSArray into Swift, we invoke its copy method. Calling copy on an immutable NSArray can be almost cost-free, but a mutable NSArray will be physically copied. We accept that copy as the cost of doing business.
[2] Of course, by default, array bounds checking is enabled. C does not include array bounds checks, so to get true C performance in all cases, these will have to be disabled.