Saurabh 😎

WWDC 2016: Understanding Swift Performance

3 performance dimensions when picking abstractions (e.g. classes vs structs):

Memory allocation

Allocating on call stack is much much faster than allocating on heap, since it just requires incrementing/decrementing the stack pointer

Generally, struct is allocated on stack and class is allocated on heap

Useful tip: since structs are value types, a struct (conforming to Hashable) can be used as a dictionary key
Whereas if you build a String key on every dictionary access, each access first requires a heap allocation for the string's storage (a struct key can instead live on the stack)
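A minimal sketch of the struct-as-key tip. All names here (Color, Orientation, Tail, Attributes, LabelCache) are hypothetical and chosen only for illustration; the string built inside the cache stands in for whatever expensive value is being cached.

```swift
// Hypothetical enums describing the inputs that identify a cached value.
enum Color: Hashable { case blue, green, gray }
enum Orientation: Hashable { case left, right }
enum Tail: Hashable { case plain, pointed, bubble }

// A struct made only of enums is a plain value, so building a key
// needs no heap allocation. Hashable is synthesized by the compiler
// because every stored property is Hashable.
struct Attributes: Hashable {
    var color: Color
    var orientation: Orientation
    var tail: Tail
}

struct LabelCache {
    private var storage: [Attributes: String] = [:]
    var count: Int { storage.count }

    mutating func label(for key: Attributes) -> String {
        if let hit = storage[key] { return hit }
        // Stand-in for expensive work whose result is worth caching.
        let label = "\(key.color)-\(key.orientation)-\(key.tail)"
        storage[key] = label
        return label
    }
}

var cache = LabelCache()
let key = Attributes(color: .blue, orientation: .left, tail: .bubble)
let first = cache.label(for: key)   // computed and stored
let second = cache.label(for: key)  // served from the cache
```

A struct key also rules out malformed keys at compile time, which a free-form string key cannot do.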

Reference counting

More costly than just a simple increment/decrement: also have a layer of indirection and some overhead for thread safety

Somewhat counterintuitively, a struct that contains multiple references can have more ref counting overhead than a class, since copying the struct requires a retain/release for each reference it contains (copying a class reference only needs one)
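A sketch of how reference-holding members can be swapped for plain values. The type names (AttachmentWithReferences, Attachment, MimeType) are illustrative assumptions; the point is only that UUID and a raw-value enum are stored inline, so copying them needs no retain/release, while URL and String members can carry references to heap storage.

```swift
import Foundation

// Every member here can hold a reference to heap storage, so copying
// the struct can cost one retain/release per member.
struct AttachmentWithReferences {
    let fileURL: URL
    let uuid: String
    let mimeType: String
}

// A closed set of supported types, stored inline as a small value.
enum MimeType: String {
    case jpeg = "image/jpeg"
    case png = "image/png"
    case gif = "image/gif"
}

// UUID is a 16-byte value type and MimeType is an enum, so only
// fileURL still involves reference counting when the struct is copied.
struct Attachment {
    let fileURL: URL
    let uuid: UUID
    let mimeType: MimeType

    init?(fileURL: URL, uuid: UUID, mimeType: String) {
        guard let parsed = MimeType(rawValue: mimeType) else { return nil }
        self.fileURL = fileURL
        self.uuid = uuid
        self.mimeType = parsed
    }
}

let url = URL(fileURLWithPath: "/tmp/photo.png")
let accepted = Attachment(fileURL: url, uuid: UUID(), mimeType: "image/png")
let rejected = Attachment(fileURL: url, uuid: UUID(), mimeType: "text/plain")
```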

Method dispatch

For class methods, Swift defaults to dynamic dispatch (through a V-table), and only does static dispatch if the compiler can determine at compile time which implementation will run

Static dispatch also enables further optimizations, e.g. inlining

struct methods are statically dispatched since structs don't support inheritance

Main advantage of dynamic dispatch is it enables polymorphism

Can mark an entire class as final to force static dispatch
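A small sketch contrasting the three cases above. The types (Shape, Circle, Marker, Point) are hypothetical; which dispatch mechanism the compiler actually emits depends on optimization settings, so the comments describe what is permitted rather than guaranteed.

```swift
class Shape {
    // Overridable, so a call through a Shape reference is dispatched dynamically.
    func describe() -> String { "shape" }
}

class Circle: Shape {
    override func describe() -> String { "circle" }
}

final class Marker {
    // No subclass can override this, so the call can be dispatched statically.
    func describe() -> String { "marker" }
}

struct Point {
    // Structs have no inheritance, so there is only one possible implementation.
    func describe() -> String { "point" }
}

// Polymorphism: the same call site runs different implementations.
let shapes: [Shape] = [Shape(), Circle()]
let descriptions = shapes.map { $0.describe() }

let markerDescription = Marker().describe()
let pointDescription = Point().describe()
```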

Protocol types

Even though structs don't support inheritance, can still achieve polymorphism for structs using protocol types
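A minimal sketch of struct polymorphism through a protocol type. Drawable, Point, and Line are illustrative names, and the `any` spelling assumes Swift 5.6 or later.

```swift
protocol Drawable {
    func draw() -> String
}

// Two unrelated structs: no common base class, only a shared protocol.
struct Point: Drawable {
    var x = 0.0, y = 0.0
    func draw() -> String { "point" }
}

struct Line: Drawable {
    var x1 = 0.0, y1 = 0.0, x2 = 0.0, y2 = 0.0
    func draw() -> String { "line" }
}

// The array's element type is the protocol type, so it can hold both structs.
let drawables: [any Drawable] = [Point(), Line()]

// Each call is resolved at runtime to the concrete type's implementation.
let output = drawables.map { $0.draw() }
```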

2 implementation challenges with structs that conform to protocols:

Protocol witness table (PWT) enables dynamic dispatch without a V-table
Every type that conforms to a protocol has its own PWT, and table entries link to implementations of protocol methods in the concrete type

So that values of different sizes can be stored uniformly (e.g. in an array of a protocol type), Swift boxes values of protocol types in an existential container

First 3 words of an existential container are the valueBuffer
If the entire struct can fit in 3 words, then struct is inlined entirely in valueBuffer
Otherwise, allocate the struct on the heap and store a pointer in the first word of valueBuffer

Value witness table (VWT) is created per type and has entries like allocate, copy, destruct, and deallocate that manage the lifetime of a value of that type inside an existential container

Every existential container has a reference to its type's Value witness table and Protocol witness table after valueBuffer
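The container layout described above can be observed with MemoryLayout. This is a sketch that assumes a 64-bit platform and Swift 5.6 or later; the sizes are an implementation detail of the current runtime rather than a language guarantee, and Drawable, Point, and Line are illustrative names.

```swift
protocol Drawable {
    func draw()
}

// Two Doubles: 2 words on a 64-bit platform.
struct Point: Drawable {
    var x = 0.0, y = 0.0
    func draw() {}
}

// Four Doubles: 4 words on a 64-bit platform.
struct Line: Drawable {
    var x1 = 0.0, y1 = 0.0, x2 = 0.0, y2 = 0.0
    func draw() {}
}

let word = MemoryLayout<Int>.size

// 3 words of valueBuffer plus 2 words of references to the witness tables.
let containerSize = MemoryLayout<any Drawable>.size

// A value can only be stored inline if it fits in the 3-word valueBuffer.
let pointFitsInline = MemoryLayout<Point>.size <= 3 * word
let lineFitsInline = MemoryLayout<Line>.size <= 3 * word

print("container: \(containerSize) bytes")
print("Point inline: \(pointFitsInline), Line inline: \(lineFitsInline)")
```

Under these assumptions Point fits in the valueBuffer while Line is stored on the heap, with a pointer to it held in the buffer.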

Takeaway: to get best performance with value types that conform to protocols, declare structs that are small enough to fit inline in the existential container's valueBuffer and that contain no references, which avoids heap allocation and reference counting overhead (can't however avoid the overhead of dynamic dispatch via the PWT)
Larger structs which don't fit inline in the existential container will require a heap allocation on every initialization, assignment, and copy

Generic code

So far, been using protocols to achieve polymorphism
Generic code supports a more static version of polymorphism called parametric polymorphism

When calling a generic function, Swift binds the generic type parameter to the concrete type used at the call site
This means that inside the generic function body, the type is not an abstract protocol type, but a concrete type
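A sketch contrasting a protocol-typed parameter with a generic one. The function names (drawExistential, drawGeneric, duplicate) and the Drawable, Point, and Line types are hypothetical; `any` assumes Swift 5.6 or later.

```swift
protocol Drawable {
    func draw() -> String
}

struct Point: Drawable {
    func draw() -> String { "point" }
}

struct Line: Drawable {
    func draw() -> String { "line" }
}

// Protocol type: the parameter is an existential container, and the
// concrete type is only known at runtime.
func drawExistential(_ value: any Drawable) -> String {
    value.draw()
}

// Generic: T is bound to exactly one concrete type per call site.
func drawGeneric<T: Drawable>(_ value: T) -> String {
    value.draw()
}

// Because T is concrete inside the body, it can appear in the result type.
func duplicate<T: Drawable>(_ value: T) -> [T] {
    [value, value]
}

let fromExistential = drawExistential(Point())
let fromGeneric = drawGeneric(Line())      // T is bound to Line
let points: [Point] = duplicate(Point())   // T is bound to Point
```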

To implement dispatch, generic code also uses the Protocol witness table
However instead of passing an existential container (which contains a reference to the PWT), Swift extracts and passes pointers to the PWT and VWT as additional arguments to a generic function
The object itself is passed as the valueBuffer (which either contains the struct inline or a pointer to a heap-allocated struct)

Main advantage of generics over protocol types is that they enable an aggressive compiler optimization called specialization of generics
Swift will create type-specific versions of generic methods, with one function version for each concrete type
This (usually) doesn't increase code size because the compiler can now use aggressive optimizations like inlining and dead code elimination

When stored as properties in structs, generics are more space efficient than protocol types because there is no need to store an existential container: the compiler can instead directly create a specialized struct layout for each combination of generic types used
However you then lose the ability to later change which concrete value type is stored in that property (but you often don't need that feature anyways)
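The space difference can be sketched by comparing a pair of protocol-typed properties with a generic pair. ExistentialPair, GenericPair, Drawable, and Point are illustrative names; the sizes printed depend on the platform and are an implementation detail, and `any` assumes Swift 5.6 or later.

```swift
protocol Drawable {
    func draw()
}

struct Point: Drawable {
    var x = 0.0, y = 0.0
    func draw() {}
}

// Each property is an existential container, so the concrete type
// stored in it can be changed later.
struct ExistentialPair {
    var first: any Drawable
    var second: any Drawable
}

// The layout is specialized per T: the two values are stored directly,
// but both must be the same concrete type for the lifetime of the pair.
struct GenericPair<T: Drawable> {
    var first: T
    var second: T
}

let existentialSize = MemoryLayout<ExistentialPair>.size
let genericSize = MemoryLayout<GenericPair<Point>>.size

print("existential pair: \(existentialSize) bytes")
print("generic pair of Point: \(genericSize) bytes")
```

Assigning a value of a different Drawable type to `first` compiles for ExistentialPair but not for GenericPair<Point>, which is the flexibility given up in exchange for the smaller layout.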