@synchronized swimming (part 2)

Tuesday, November 07, 2006 at 10:52 AM

Posted by: Dave MacLachlan, Member of Technical Staff, Mac Team

Previously we addressed the problem of optimizing around a shared resource. We came up with one solution, but it was kind of messy, and we wondered if there might be a better way. And now: the conclusion.

There is at least one more elegant solution, but it is slightly less safe. So far I've assumed we're using Objective-C, but what happens if we use Objective-C++, specifically Objective-C++ with gcc 4? According to the GCC4 porting notes:

GCC 4.0 automatically adds locks around any code that initializes local static variables in C++. If you do not need this protection and want to reduce your code size slightly, you can disable the locking behavior by passing the -fno-threadsafe-statics option to the compiler.

This appears to be backed up by the gcc 4.0 release notes and the C++ ABI. So this implies that if we just change our compiler from standard Obj-C to Obj-C++ we should be able to do the following:

+(id)fooFerBar:(id)bar {
  static NSDictionary *foo = [NSDictionary dictionaryWithObjects:...];
  return [foo objectForKey:bar];
}

which is certainly nice and clean. Let's take a quick look at the disassembly:

__cxa_guard_acquire

my actual code

__cxa_guard_release
__cxa_guard_abort
_Unwind_Resume
...

and by scanning the code for cxa_* (ADC registration required) we can see that it's doing almost exactly what we want. There are only two pitfalls here. First, we might somehow compile with a gcc version older than 4.0 (we can put in guards against this happening). Second, we might use -fno-threadsafe-statics in a threaded environment, in which case we're asking for trouble, and trouble will certainly follow (we won't do that).
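To make that a little more concrete, here is a minimal sketch in plain C++ rather than Objective-C++. It shows one possible compile-time guard against an old gcc, followed by a local static whose initializer runs once even when several threads race to call the function first. Everything named here (foo_for_bar, build_table, the counter) is hypothetical and only for illustration, std::map stands in for NSDictionary, and std::thread comes from C++11, which postdates gcc 4.0. C++11 also made thread-safe initialization of local statics part of the language standard.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Guard against compilers that predate thread-safe local statics.
// This checks only the gcc version; it does not detect -fno-threadsafe-statics.
#if defined(__GNUC__) && (__GNUC__ < 4)
#error "Thread-safe initialization of local statics needs gcc 4.0 or later"
#endif

namespace {

// Counts how many times the initializer below actually runs.
std::atomic<int> g_build_count{0};

// Hypothetical stand-in for [NSDictionary dictionaryWithObjects:...].
std::map<std::string, int> build_table() {
    ++g_build_count;
    return {{"a", 1}, {"b", 2}};
}

}  // namespace

// Hypothetical C++ analogue of +fooFerBar:. The compiler wraps the
// initialization of `foo` in guard calls, so it happens exactly once.
int foo_for_bar(const std::string& bar) {
    static const std::map<std::string, int> foo = build_table();
    std::map<std::string, int>::const_iterator it = foo.find(bar);
    return it == foo.end() ? -1 : it->second;
}

// Calls foo_for_bar from several threads at once and reports how many
// times the initializer has run in total.
int initializations_after_concurrent_calls(int thread_count) {
    assert(thread_count > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([] { foo_for_bar("a"); });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return g_build_count.load();
}
```

However many threads call in, the initializer count should stay at one. Note that the guard above only catches the first pitfall; keeping -fno-threadsafe-statics out of the build flags is still up to us.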

So, we've got a thread-safe shared resource that does what we want with a minimal amount of code. One last tiny issue remains. What happens if we accidentally mix our Objective-C @synchronized with C++ dynamic initialization of local statics?

+(id)fooFerBar:(id)bar {
  @synchronized(self) {
    static NSDictionary *foo = [NSDictionary dictionaryWithObjects:...];
    return [foo objectForKey:bar];
  }
}

and the disassembly shows:

...
objc_sync_enter
objc_exception_try_enter
setjmp
objc_exception_extract
__cxa_guard_acquire

my actual code

__cxa_guard_release
__cxa_guard_abort
objc_exception_try_exit
objc_sync_exit
objc_exception_throw
...

Yes, ladies and gentlemen, you get to pay for four lock/unlocks and two exception stacks to protect your wee shared resource, so you may want to watch for this pattern in your performance-sensitive code when porting to Objective-C++.

Feedback from part 1

Thanks to reader Bill Bumgarner, who came up with an interesting solution to my problem that has a distinctive Obj-C feel to it:

static NSDictionary *foo = nil;

+(id)fooFerBar:(id)bar {
  @synchronized(self) {
    if (!foo) foo = [NSDictionary dictionaryWithObjects:...];
    ReplaceMethodImplementationWithSelector
      ([self class], @selector(fooFerBar:), @selector(fooFerBar2:));
  }
  return [foo objectForKey:bar];
}

+(id)fooFerBar2:(id)bar {
  return [foo objectForKey:bar];
}

where ReplaceMethodImplementationWithSelector swizzles fooFerBar: with fooFerBar2:. So, the first time fooFerBar: is called we get our slow case, and any calls made after the first thread has completed fooFerBar: get the fast case. A thread that enters fooFerBar: before the swap still goes through the @synchronized block, so it stays correct. This solution provides great performance, and you only have to pay for the synchronize once. Very nice!
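For readers who think in C++, the same "slow once, fast forever" idea can be sketched with an atomic function pointer that starts at a locked slow path and is swapped to an unlocked fast path after the first call. This is only an analogy to the swizzle above, not the Objective-C runtime mechanism, and every name in it (lookup, slow_lookup, fast_lookup, the table contents) is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <string>

namespace {

typedef std::map<std::string, int> Table;
typedef int (*LookupFn)(const std::string&);

int slow_lookup(const std::string& key);
int fast_lookup(const std::string& key);

const Table* g_table = nullptr;   // shared resource, built on first use
std::mutex g_init_mutex;          // plays the role of @synchronized(self)
std::atomic<int> g_slow_calls{0}; // how often the slow path was taken

// Every call goes through this pointer; it starts on the slow path.
std::atomic<LookupFn> g_lookup{&slow_lookup};

// Fast path: no locking, valid only once the table exists.
int fast_lookup(const std::string& key) {
    assert(g_table != nullptr);
    Table::const_iterator it = g_table->find(key);
    return it == g_table->end() ? -1 : it->second;
}

// Slow path: build the table under the lock, then redirect later
// callers to the fast path (the analogue of the swizzle).
int slow_lookup(const std::string& key) {
    ++g_slow_calls;
    {
        std::lock_guard<std::mutex> lock(g_init_mutex);
        if (!g_table) {
            g_table = new Table{{"a", 1}, {"b", 2}};
        }
        // Release ordering publishes g_table before the new pointer.
        g_lookup.store(&fast_lookup, std::memory_order_release);
    }
    return fast_lookup(key);
}

}  // namespace

// Public entry point: dispatches to whichever path is currently installed.
int lookup(const std::string& key) {
    return g_lookup.load(std::memory_order_acquire)(key);
}

// Reports how many calls went through the slow path.
int slow_path_calls() {
    return g_slow_calls.load();
}
```

In a single-threaded run only the very first lookup takes the lock. With several threads racing, a few early callers may still go through the slow path, just as in the swizzled version, but they all see the same table.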