WWDC 2010: Core Animation in Practice, Part 2

New and Overlooked APIs

Drop shadows - shadowOpacity, shadowRadius, shadowOffset
shadowPath was necessary for bringing drop shadows to iPhone (they were originally Mac-only): it lets the compositor cache a shadow bitmap, where otherwise the shadow would have to be recomputed on every frame
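
A minimal Swift sketch of the shadow properties above, assuming a plain rectangular layer. The helper name addShadow(to:) and the specific values are illustrative, not from the session:

```swift
import QuartzCore

// Hypothetical helper: configures a drop shadow with an explicit outline.
func addShadow(to layer: CALayer) {
    layer.shadowOpacity = 0.5                      // default is 0 (no shadow)
    layer.shadowRadius = 4                         // blur radius in points
    layer.shadowOffset = CGSize(width: 0, height: 3)
    // Supplying the outline up front means the compositor does not have to
    // derive the shadow shape from the layer's contents.
    layer.shadowPath = CGPath(rect: layer.bounds, transform: nil)
}

let card = CALayer()
card.bounds = CGRect(x: 0, y: 0, width: 200, height: 120)
addShadow(to: card)
```

shadowPath is expressed in the layer's own coordinate space, so it needs to be updated whenever the bounds change.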

CAShapeLayer - more efficient than drawing into bitmap

Bitmaps can't be scaled up without re-drawing, and can't animate contents
Because drawing is deferred until composite time, when the target resolution is known, the path stays sharp at any scale
Can also interpolate/animate between two path states
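
A short sketch of animating between two path states, assuming both paths are built the same way (here two ellipses) so they have matching elements to interpolate. The sizes, duration, and the "morph" key are made up for illustration:

```swift
import QuartzCore

let shape = CAShapeLayer()

// Two path states with the same structure.
let small = CGPath(ellipseIn: CGRect(x: 0, y: 0, width: 40, height: 40), transform: nil)
let large = CGPath(ellipseIn: CGRect(x: 0, y: 0, width: 120, height: 80), transform: nil)

// Set the model value to the final state so the layer does not snap back
// when the animation is removed.
shape.path = large

// "path" is an animatable property of CAShapeLayer.
let morph = CABasicAnimation(keyPath: "path")
morph.fromValue = small
morph.toValue = large
morph.duration = 0.5
shape.add(morph, forKey: "morph")
```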

Performance tradeoffs of CAShapeLayer:

Pro: Low memory
Pro: No cost for transparent areas, since Core Animation knows exactly which pixels are opaque (unlike custom views, where you have to check carefully for layer blending)
Con: More CPU to render (drawing is deferred to composite time, so it may run on every frame)

Overall, best for a few semi-large elements - won't work with 1000s of them (CPU will be bottleneck)

Bitmap caching - motivated by larger resolution of iPad

shouldRasterize to request that layer subtree is flattened to a bitmap
bitmap version will be reused when possible - may improve performance

Caveats to bitmap caching:

Can lead to memory pressure on device
Caching without reuse is much worse than not caching at all
Locks layer image to particular size
Rasterization occurs before mask is applied
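
A sketch of turning bitmap caching on and off, assuming a 2x screen. The helper setBitmapCaching(_:on:scale:) is hypothetical. Switching caching off before a resize is one way to respect the "locks layer image to particular size" caveat:

```swift
import QuartzCore

// Hypothetical helper: toggles bitmap caching for a layer subtree.
func setBitmapCaching(_ enabled: Bool, on layer: CALayer, scale: CGFloat) {
    layer.shouldRasterize = enabled
    if enabled {
        // Rasterize at the screen's scale; the default of 1 looks blurry at 2x.
        layer.rasterizationScale = scale
    }
}

let badge = CALayer()
badge.bounds = CGRect(x: 0, y: 0, width: 64, height: 64)

// Static content that is composited repeatedly: a candidate for caching.
setBitmapCaching(true, on: badge, scale: 2)

// About to resize or change the contents every frame: the cached bitmap
// would be rebuilt each time, so caching is switched off first.
setBitmapCaching(false, on: badge, scale: 2)
badge.bounds = CGRect(x: 0, y: 0, width: 96, height: 96)
```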

Cubic keyframes - kCAAnimationCubic
Uses Catmull-Rom spline
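
A sketch of a cubic keyframe animation, plus a standalone uniform Catmull-Rom function to show the kind of curve involved. The catmullRom helper is the textbook formula and only an assumption about how Core Animation evaluates the spline internally. The keyframe values are made up:

```swift
import QuartzCore

// Keyframe animation whose in-between values follow a smooth spline
// through the keyframes instead of straight line segments.
let move = CAKeyframeAnimation(keyPath: "position")
move.values = [CGPoint(x: 0, y: 0), CGPoint(x: 100, y: 50),
               CGPoint(x: 200, y: 0), CGPoint(x: 300, y: 50)]
move.calculationMode = .cubic   // kCAAnimationCubic
move.duration = 1.0

// Hypothetical helper: uniform Catmull-Rom interpolation between p1 and p2
// for t in 0...1, with p0 and p3 as the neighbouring control values.
func catmullRom(_ p0: Double, _ p1: Double, _ p2: Double, _ p3: Double,
                t: Double) -> Double {
    let t2 = t * t
    let t3 = t2 * t
    return 0.5 * ((2 * p1)
        + (-p0 + p2) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2
        + (-p0 + 3 * p1 - 3 * p2 + p3) * t3)
}
```

The curve passes through every keyframe: at t = 0 the function returns p1 and at t = 1 it returns p2.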

Key takeaways:

Use shadowPath for high-performance shadows (very important)
Use CAShapeLayer for scalable, animatable vector content (performance can handle a few shape layers)
Set shouldRasterize for cached layers (ideally last resort)

Mental model of performance

At high-level, GPU converts triangles to pixels

Each triangle has color fill or texture image
Triangles can "blend" over the background (vs opaque triangles, whose pixels can be written directly to the final screen buffer)
Destination can also be an image for another triangle

How does Core Animation use GPU?

Each layer is translated to triangles
Each rectangle is 2 triangles, e.g. backgroundColor is two colored triangles and contents is two triangles with an image
More complex compositing (e.g. caching or masking) uses offscreen rendering
Core Animation won't even send content under opaque regions to GPU, e.g. your app sits on top of SpringBoard, but SpringBoard layers aren't sent to GPU

GPU performance model:

Write bandwidth - how many destination pixels?
Goal is to minimize alpha-blended pixels
- For images: Ensure opaque CGImageRefs have no alpha channel
- For drawing: Set layer.opaque = YES when drawing opaque content
- Cut layers with opaque regions into multiple sublayers, e.g. if you have a mostly opaque image with some non-opaque parts, then instead of making the entire image non-opaque, stitch together the opaque piece with non-opaque pieces
- Use "Color Blended Layers" Instruments option, or CA_COLOR_OPAQUE=1 environment variable
Read bandwidth - how many source pixels?
- Use images that match layer size (both when drawing and loading images)
- "Color Misaligned Images" in Instruments, or CA_COLOR_SUBPIXEL=1
Rendering passes - how many buffers? (e.g. switching buffers for offscreen rendering)
- Ideally one rendering pass per frame, but sometimes requires some clever tricks
- Complex composition features (masking, group opacity, filters) often require multiple passes
- Use "Color Offscreen" in Instruments, or CA_COLOR_OFFSCREEN=1
- Layer bitmap caching can hide extra passes (but remember it requires one extra render pass itself)
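
A sketch of the two write-bandwidth checks above: whether an image carries an alpha channel, and marking a drawn layer as opaque. The helper hasAlphaChannel(_:) and the 4x4 test bitmap are illustrative assumptions:

```swift
import CoreGraphics
import QuartzCore

// Hypothetical helper: true if the image carries an alpha channel,
// which forces the compositor to blend it.
func hasAlphaChannel(_ image: CGImage) -> Bool {
    switch image.alphaInfo {
    case .none, .noneSkipFirst, .noneSkipLast:
        return false
    default:
        return true
    }
}

// Small RGB bitmap whose fourth byte is skipped, i.e. no alpha channel.
let context = CGContext(data: nil, width: 4, height: 4,
                        bitsPerComponent: 8, bytesPerRow: 0,
                        space: CGColorSpaceCreateDeviceRGB(),
                        bitmapInfo: CGImageAlphaInfo.noneSkipLast.rawValue)
let opaqueImage = context?.makeImage()

// Image-backed layer: the image's own format decides whether it blends.
let photo = CALayer()
photo.contents = opaqueImage

// Layer that draws its own fully opaque content: mark it opaque so its
// backing store is created without an alpha channel.
let drawn = CALayer()
drawn.isOpaque = true
```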

Too much non-opaque content => limited by write bandwidth
Too many large images => limited by read bandwidth
Too many masked layers => limited by rendering passes

General performance methodology:
Measure fps
If fps < 60, then eliminate render passes, reduce read bandwidth, reduce write bandwidth (in that order)
Repeat until 60 fps

High-DPI

2x scale factor applied to root of UIWindow layer tree

When rasterizing, use layer.rasterizationScale = 2

contentsScale layer property - scaling factor between layer geometry and screen geometry

For precise native pixel layout, add 0.5x scale layer
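
A sketch of the high-DPI properties above, assuming a 2x screen. The helper backingPixelSize(of:) is hypothetical, and the container with a 0.5x sublayerTransform is only one reading of the "0.5x scale layer" note:

```swift
import QuartzCore

let screenScale: CGFloat = 2   // assumed 2x display

// contentsScale maps layer geometry (points) to backing-store pixels.
let label = CALayer()
label.bounds = CGRect(x: 0, y: 0, width: 100, height: 20)
label.contentsScale = screenScale

// Hypothetical helper: size in pixels of a backing store that covers
// the layer's bounds at its contentsScale.
func backingPixelSize(of layer: CALayer) -> CGSize {
    return CGSize(width: layer.bounds.width * layer.contentsScale,
                  height: layer.bounds.height * layer.contentsScale)
}

// A container that scales its sublayers by 1/scale, so one unit inside it
// corresponds to one device pixel.
let pixelSpace = CALayer()
pixelSpace.sublayerTransform =
    CATransform3DMakeScale(1 / screenScale, 1 / screenScale, 1)

let hairline = CALayer()
hairline.frame = CGRect(x: 10, y: 10, width: 300, height: 1)  // device pixels
pixelSpace.addSublayer(hairline)
```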