Determining if a Unicode character is visible

cocoa, hidden-characters, unicode

I am writing a text editor which has an option to display a bullet in place of any invisible Unicode character. Unfortunately there appears to be no easy way to determine whether a Unicode character is invisible.

I need to find a text file containing every Unicode character so that I can look through it for invisible characters. Does anyone know where I can find such a file?

EDIT: I am writing this app in Cocoa for Mac OS X.

Best Answer

Oh, I see... actual invisible characters ;) This FAQ will probably be useful:

http://www.unicode.org/faq/unsup_char.html

It lists the current invisible codepoints and has other information that you might find helpful.

EDIT: Added some Cocoa-specific information

Since you're using Cocoa, you can get the character set of Unicode control characters and test each character against it:

NSCharacterSet* controlChars = [NSCharacterSet controlCharacterSet];

You might also want to take a look at the FAQ link I posted above and add any characters that you think you may need based on the information there to the character set returned by controlCharacterSet.
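As a rough sketch of that idea, the snippet below extends the control character set and swaps every member for a bullet. The helper names InvisibleCharacterSet and StringByMarkingInvisibles are made up for this example, and the extra code points are only an illustrative pick, not a complete list; choose your own from the FAQ. It also works one UTF-16 unit at a time, so it only handles characters in the Basic Multilingual Plane.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: the control set (Unicode categories Cc and Cf)
// plus a few extra code points. The additions are an assumption made for
// illustration; pick your own from the FAQ.
static NSCharacterSet *InvisibleCharacterSet(void) {
    NSMutableCharacterSet *set = [NSMutableCharacterSet controlCharacterSet];
    [set addCharactersInRange:NSMakeRange(0x034F, 1)]; // COMBINING GRAPHEME JOINER
    [set addCharactersInRange:NSMakeRange(0x2028, 2)]; // LINE and PARAGRAPH SEPARATOR
    [set addCharactersInRange:NSMakeRange(0x3164, 1)]; // HANGUL FILLER
    return set;
}

// Hypothetical helper: returns a copy of `text` in which every UTF-16 unit
// found in the set above is replaced by U+2022 BULLET.
static NSString *StringByMarkingInvisibles(NSString *text) {
    NSCharacterSet *invisibles = InvisibleCharacterSet();
    unichar bulletChar = 0x2022;
    NSString *bullet = [NSString stringWithCharacters:&bulletChar length:1];
    NSMutableString *result = [NSMutableString stringWithString:text];
    for (NSUInteger i = 0; i < [result length]; i++) {
        // One unit is replaced by one unit, so the indices stay valid.
        if ([invisibles characterIsMember:[result characterAtIndex:i]]) {
            [result replaceCharactersInRange:NSMakeRange(i, 1) withString:bullet];
        }
    }
    return result;
}
```

Note that tab, carriage return and line feed are control characters too, so a real editor would probably remove them from the set (removeCharactersInRange:) before using it.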

EDIT: Added an example of creating a Unicode string from a Unicode character

unichar theChar = 0x000D; // U+000D CARRIAGE RETURN
NSString* thestring = [NSString stringWithCharacters:&theChar length:1];