Diacritic-insensitive comparison, string folding

One common problem when dealing with languages that use diacritical marks is matching a word written without its diacritics against the correct spelling.

This is especially true for Romanian, as most Romanians (myself included) don't use diacritical marks when writing in Romanian.

I'm not here to pass judgment on writing with or without diacritics. There are others much better informed on the subject.

The reality is that improper use of diacritics will not disappear in the near future. The good news is that Cocoa has built-in support for dealing with this:


NSString *stringDia = @"Mămăligă";
NSString *string = @"Mamaliga";

// NSDiacriticInsensitiveSearch ignores diacritical marks, so "ă" matches "a".
NSComparisonResult res = [stringDia compare:string options:NSDiacriticInsensitiveSearch];
NSLog(@"The strings are %@equal", res == NSOrderedSame ? @"" : @"not ");



There's even an implementation for string folding which may be worthwhile if you're building an index of place names:

CFStringFold

Discussion
Character foldings are operations that convert any of a set of characters sharing similar semantics into a single representative from that set.

You can use this function to preprocess strings that are to be compared, searched, or indexed.[..]
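
As a rough illustration of the indexing case, here is a minimal sketch of how CFStringFold might be used to build a lookup key. The helper name FoldedKey is made up for this example, and the code assumes ARC (hence the __bridge cast). Folding case as well as diacritics is my own choice here, not something the documentation requires.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Cocoa): returns a case- and
// diacritic-folded copy of `input`, meant to be used as an index key.
static NSString *FoldedKey(NSString *input) {
    // CFStringFold modifies the string in place, so work on a mutable copy.
    NSMutableString *folded = [input mutableCopy];
    CFStringFold((__bridge CFMutableStringRef)folded,
                 kCFCompareCaseInsensitive | kCFCompareDiacriticInsensitive,
                 NULL); // NULL locale: no locale-specific folding rules
    return folded;
}
```

With something like this, you would store FoldedKey(placeName) in the index and run the user's query through FoldedKey as well before looking it up, so that both spellings of a name end up under the same key.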


PS In case you were wondering: Mămăligă.