Python – How to reverse Unicode decomposition using Python


Using Python 2.5, I have some text stored in a unicode object:

Dinis e Isabel, uma difı´cil relac¸a˜o
conjugal e polı´tica

This appears to be decomposed Unicode. Is there a generic way in Python to reverse the decomposition, so I end up with:

Dinis e Isabel, uma difícil relação
conjugal e política

Best Solution

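I think you are looking for unicodedata.normalize with the "NFC" form, which composes a base character and a following combining mark into a single precomposed character: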

>>> import unicodedata
>>> print unicodedata.normalize("NFC", u"c\u0327")
ç
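One caveat worth checking on your own data: NFC only joins a base letter with a true combining mark such as U+0327. If the text from the question really contains spacing accents (U+00B4, U+00B8, U+02DC) and a dotless i (U+0131), which is how the sample appears to be encoded, NFC alone will leave it unchanged. Below is a rough sketch for that case, not part of the original answer. The names `SPACING_TO_COMBINING` and `recompose` are made up for illustration, and the mapping covers only the characters seen in the sample.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Hypothetical mapping (an assumption, not from the original answer):
# spacing accents seen in the sample -> their combining counterparts.
SPACING_TO_COMBINING = {
    0x00B4: u"\u0301",  # ACUTE ACCENT -> COMBINING ACUTE ACCENT
    0x00B8: u"\u0327",  # CEDILLA      -> COMBINING CEDILLA
    0x02DC: u"\u0303",  # SMALL TILDE  -> COMBINING TILDE
    # Dotless i -> plain i. This is wrong for Turkish or Azerbaijani text,
    # where dotless i is a real letter, so drop this entry for such input.
    0x0131: u"i",
}


def recompose(text):
    """Swap spacing accents for combining marks, then compose with NFC.

    Assumes each accent directly follows the letter it belongs to.
    """
    return unicodedata.normalize("NFC", text.translate(SPACING_TO_COMBINING))


# The sample from the question, written with escapes so the exact
# code points are unambiguous.
sample = u"dif\u0131\u00b4cil relac\u00b8a\u02dco"
fixed = recompose(sample)
```

Here `fixed` should equal u"dif\u00edcil rela\u00e7\u00e3o", that is "difícil relação". If your text already uses combining marks, the plain normalize("NFC", ...) call above is enough and the mapping step changes nothing.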