Oh, Unicode combining characters and normalization. I classify that as "somebody else's problem." Specifically, if you're writing a font rendering engine, it's your problem. If you're writing an Arc compiler, it's not your problem. If you want complete Unicode library support in your language (like MzScheme's normalization functions string-normalize-nfd, etc.), then you just use an existing library such as ICU, and it's not your problem. ICU: http://www-306.ibm.com/software/globalization/icu/index.jsp