Tuesday, November 07, 2006 at 10:52 AM
Posted by: Dave MacLachlan, Member of Technical Staff, Mac Team
Previously we addressed the problem of optimizing around a shared resource. We came up with one solution, but it was kind of messy, and we wondered if there might be a better way. And now: the conclusion.
There is at least one solution that is more elegant, but it is slightly less safe. So far I've assumed we're using Objective-C, but what happens if we use Objective-C++, specifically Objective-C++ with gcc 4? According to the GCC4 porting notes:
GCC 4.0 automatically adds locks around any code that initializes local static variables in C++. If you do not need this protection and want to reduce your code size slightly, you can disable the locking behavior by passing the -fno-threadsafe-statics option to the compiler.
This appears to be backed up by the gcc 4.0 release notes and the C++ ABI. So this implies that if we just change our compiler from standard Obj-C to Obj-C++ we should be able to do the following:
+(id)fooFerBar:(id)bar {
static NSDictionary *foo = [NSDictionary dictionaryWithObjects:...];
return [foo objectForKey:bar];
}
which is certainly nice and clean. Let's take a quick look at the disassembly:
cxa_guard_acquire
my actual code
cxa_guard_release
cxa_guard_abort
Unwind_Resume
...
and by scanning the code for cxa_* (ADC registration required) we can see that it's doing almost exactly what we want. The only pitfalls here are if we somehow attempt to compile with a gcc version earlier than 4.0 (we can put in guards against this happening) or we use -fno-threadsafe-statics in a threaded environment, in which case we're asking for trouble, and trouble will certainly follow (we won't do that).
So, we've got a thread-safe shared resource that does what we want with a minimal amount of code. One last tiny issue remains. What happens if we accidentally mix our Objective-C @synchronized with C++ dynamic initialization of local statics?
+(id)fooFerBar:(id)bar {
@synchronized(self) {
static NSDictionary *foo = [NSDictionary dictionaryWithObjects:...];
return [foo objectForKey:bar];
}
}
and the disassembly shows:
...
objc_sync_enter
objc_exception_try_enter
setjmp
objc_exception_extract
cxa_guard_acquire
my actual code
cxa_guard_release
cxa_guard_abort
objc_exception_try_exit
objc_sync_exit
objc_exception_throw
...
Yes, ladies and gentlemen, you get to pay for 4 lock/unlocks and two exception stacks to protect your wee shared resource, so you may want to watch for this pattern in your performance-sensitive code when porting to Objective-C++.
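For anyone who wants to see the guarded-static behavior without reading disassembly, here is a minimal sketch in plain C++ (which also compiles as Objective-C++). It is not the code from this post: MakeTable, FooForBar, RunConcurrentLookups, and g_init_count are hypothetical names, a std::map stands in for the NSDictionary, and std::thread assumes a C++11 compiler, which is far newer than gcc 4.0. The version check at the top is one possible shape for the "guards" mentioned above.

```cpp
// Sketch only: assumes a C++11 compiler and a build WITHOUT
// -fno-threadsafe-statics. All names here are hypothetical.
#if defined(__GNUC__) && (__GNUC__ < 4)
#error "Thread-safe local statics need GCC 4.0 or later"
#endif

#include <atomic>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Counts how many times the initializer actually runs.
static std::atomic<int> g_init_count(0);

// Stand-in for building the shared dictionary.
static std::map<std::string, int> MakeTable() {
  ++g_init_count;
  std::map<std::string, int> table;
  table["one"] = 1;
  table["two"] = 2;
  return table;
}

// Analogue of +fooFerBar:. The compiler guards the initialization of
// `table` so that only one thread runs MakeTable().
int FooForBar(const std::string& key) {
  static const std::map<std::string, int> table = MakeTable();
  std::map<std::string, int>::const_iterator it = table.find(key);
  return it == table.end() ? -1 : it->second;
}

// Calls FooForBar from several threads at once and reports how many
// times the initializer ran.
int RunConcurrentLookups(int thread_count) {
  std::vector<std::thread> threads;
  for (int i = 0; i < thread_count; ++i) {
    threads.emplace_back([] { FooForBar("two"); });
  }
  for (std::thread& t : threads) t.join();
  return g_init_count.load();
}
```

If the guard is doing its job, RunConcurrentLookups should report a single initialization no matter how many threads race into FooForBar. Building the same file with -fno-threadsafe-statics removes that guarantee.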
Feedback from part 1
Thanks to reader Bill Bumgarner, who came up with an interesting solution to my problem that has a distinctive Obj-C feel to it:
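static NSDictionary *foo = nil;
+(id)fooFerBar:(id)bar {
// Slow path: taken until the implementation has been swapped.
@synchronized(self) {
if (!foo) foo = [NSDictionary dictionaryWithObjects:...];
// Selectors for one-argument methods need the trailing colon.
ReplaceMethodImplementationWithSelector
([self class], @selector(fooFerBar:), @selector(fooFerBar2:));
}
return [foo objectForKey:bar];
}
// Fast path: no locking, foo is already initialized.
+(id)fooFerBar2:(id)bar {
return [foo objectForKey:bar];
}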
where ReplaceMethodImplementationWithSelector swizzles fooFerBar with fooFerBar2. So, the first time fooFerBar is called we get our slow case, and any later calls get the fast case, assuming the first thread has completed fooFerBar. This solution provides great performance, and you only have to pay for the synchronize once. Very nice!
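The same "slow once, fast forever" idea can be sketched in plain C++ as an analogy. This is not Bill's code and it does not touch the Objective-C runtime: an atomic function pointer plays the role of the method implementation, a std::mutex plays the role of @synchronized, and every name below (Lookup, SlowLookup, FastLookup, UsingFastPath, g_table) is hypothetical.

```cpp
// Sketch only: a C++11 analogy for swapping in a fast implementation
// after the first call. All names are hypothetical.
#include <atomic>
#include <map>
#include <mutex>
#include <string>

typedef int (*LookupFn)(const std::string&);

static std::map<std::string, int>* g_table = nullptr;  // role of `foo`
static std::mutex g_table_mutex;                       // role of @synchronized

static int SlowLookup(const std::string& key);
static int FastLookup(const std::string& key);

// The currently installed implementation; starts on the slow path.
static std::atomic<LookupFn> g_lookup(&SlowLookup);

// Fast path: no locking, assumes g_table is already built.
static int FastLookup(const std::string& key) {
  std::map<std::string, int>::const_iterator it = g_table->find(key);
  return it == g_table->end() ? -1 : it->second;
}

// Slow path: build the table under the lock, then install the fast path.
static int SlowLookup(const std::string& key) {
  {
    std::lock_guard<std::mutex> lock(g_table_mutex);
    if (!g_table) {
      g_table = new std::map<std::string, int>();
      (*g_table)["one"] = 1;
      (*g_table)["two"] = 2;
    }
    // The release store publishes the finished table before any other
    // thread can observe the fast path.
    g_lookup.store(&FastLookup, std::memory_order_release);
  }
  return FastLookup(key);
}

// Public entry point: dispatches to whichever implementation is installed.
int Lookup(const std::string& key) {
  return g_lookup.load(std::memory_order_acquire)(key);
}

// Reports whether the fast path has been installed yet.
bool UsingFastPath() {
  return g_lookup.load(std::memory_order_acquire) == &FastLookup;
}
```

A thread that enters the slow path while another is still initializing simply waits on the mutex, finds the table already built, and carries on, so the lock is only ever paid for by calls that start before the swap becomes visible.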