bistring
The bistring library provides non-destructive versions of common string processing operations like normalization, case folding, and find/replace. Each bistring remembers the original string, and how its substrings map to substrings of the modified version.
For example:
>>> from bistring import bistr
>>> s = bistr('πΏππ πππππ, πππππ π¦ πππππ ππππ πππ ππππ πΆ')
>>> s = s.normalize('NFKD') # Unicode normalization
>>> s = s.casefold() # Case-insensitivity
>>> s = s.replace('π¦', 'fox') # Replace emoji with text
>>> s = s.replace('πΆ', 'dog')
>>> s = s.sub(r'[^\w\s]+', '') # Strip everything but letters and spaces
>>> s = s[:19] # Extract a substring
>>> s.modified # The modified substring, after changes
'the quick brown fox'
>>> s.original # The original substring, before changes
'πΏππ πππππ, πππππ π¦'
Demo
Try selecting some of the βoriginalβ or βmodifiedβ text below, or editing the code block!
import BiString, * as bistring from "bistring"; let s = (function() { })();
s.original == ""
s.modified == ""