GCD and Blocking Kernel Calls 2009/09/09

One of the great features of Grand Central Dispatch is that it can make thread usage decisions on a system-wide basis. Without GCD, running several heavily threaded applications can lead to lower performance due to extra context switching. On the other end of the spectrum, pessimistically threaded applications might underutilize your available cores. Achieving a perfect balance in the presence of a complex workload spread across multiple processes was impossible.

NSOperationQueue, which now uses GCD as its workhorse, recommends that you leave its maximum concurrent operation count on the default setting, but you can limit the width of the queue manually if you want. NSOperations can have dependencies and priorities, and between these you might think that the operations will simply get scheduled to threads in priority order once all their dependencies are satisfied. And you’d be correct as far as that goes, but GCD is more clever than that!

Consider a silly case like:

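    // Assumes a global concurrent queue; any concurrent queue would do.
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_apply(100, queue, ^(size_t i){
        sleep(1);  // blocking call: the thread just waits in the kernel
    });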

If GCD just followed dependencies and priorities and strictly limited the number of threads to the core count, this could fill up all the threads and effectively block GCD from completing any work. System-wide services shouldn’t be this easily b0rked. Luckily, GCD has magical powers that seem to let it know when a thread performs a blocking kernel call, which leaves a core idle. GCD will then spin up another worker thread, leaving you with more blocks in flight than you have cores!

On the whole, this seems great, but in specific cases it may bite developers who don’t expect it. In my real code, I’m processing a couple thousand compressed files. At the top of each block, I was opening one compressed file and reading its contents. These I/O calls block, so each one let GCD start yet another block, which opened another file and allocated more memory. Having 1000 threads all fire up and allocate all the memory in the system wedged my machine for a while, and confused me, since I thought GCD was supposed to handle this.

It just turns out that GCD has more magic than I expected.

I’m not yet sure how best to address this issue. GCD has support for monitoring the readability of non-blocking file descriptors via dispatch_source_create, which might serve, or I might create a separate queue, with a fixed concurrent operation count, for the reading portion of the operation. Suggestions welcome!

In retrospect, this tie between GCD and the kernel is obviously the right thing, and it was hinted at during WWDC, but I’ve not yet seen it specifically documented (pointers to such docs also welcome).