By Peter Finch

Archive for April 13th, 2007

Unescape a Python “escaped” string

Posted by pcfinch on April 13, 2007

Python’s re module has a very useful function, re.escape(), that escapes the special characters in a string. Oddly, there is no reverse function. Note that the Python interpreter itself doubles each backslash when it echoes the string back, e.g.:

>>> import re
>>> a = re.escape('Special \\#`1\\')
>>> a
'Special\\ \\\\\\#\\`1\\\\'

A simple way to “unescape” the string is to use a regular expression again. The following RE searches the string for a backslash followed by any character and replaces the pair with that character alone. The RE captures the character in a group, (.), and then refers to that group in the substitution string as \1.

>>> z = re.sub(r'\\(.)', r'\1', a)
>>> z
'Special \\#`1\\'
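The same substitution can be wrapped in a small helper. This is only a sketch: the name unescape is made up for illustration, and the re.DOTALL flag is an addition that is not in the one-liner above. It is there on the assumption that the escaped text may contain a backslash followed by a newline, which a plain “.” would not match.

```python
import re

def unescape(escaped):
    """Reverse re.escape() by dropping the backslash in front of each escaped character.

    Hypothetical helper, not part of the re module.
    """
    # re.DOTALL lets "." match a newline as well, so a backslash-newline
    # pair is also reduced to the bare newline.
    return re.sub(r'\\(.)', r'\1', escaped, flags=re.DOTALL)

original = 'Special \\#`1\\'
# Round trip: escaping and then unescaping should give back the input.
assert unescape(re.escape(original)) == original
```

The exact set of characters that re.escape() puts a backslash in front of has changed between Python versions, but the helper does not depend on that set: it simply removes one backslash from every backslash-plus-character pair.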

The trick here is the back reference to the character following the escaping “\”. You may think that all you need to do is replace all the “\” characters in the escaped string with nothing, “”. Unfortunately, that doesn’t work correctly when you have an escaped “\”: a literal backslash, written “\\” in Python source, has the escaped version “\\\\”. Confused? A simple replace on “\” strips both of those backslashes and leaves an empty string where a single “\” should be.
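To make the difference concrete, here is a small comparison of the two approaches on a string that contains one real backslash. The sample path is made up for illustration.

```python
import re

original = 'C:\\temp'            # the text C:\temp, with one real backslash
escaped = re.escape(original)    # the real backslash is now doubled

# Naive approach: delete every backslash, including the one that was data.
naive = escaped.replace('\\', '')

# Back-reference approach: drop only the escaping backslash of each pair.
paired = re.sub(r'\\(.)', r'\1', escaped)

assert naive == 'C:temp'         # the real backslash has been lost
assert paired == original        # the real backslash survives
```

The regular expression consumes the string two characters at a time wherever it finds a backslash, so the second backslash of an escaped pair is treated as data rather than as another escape.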


