Last active
February 15, 2023 10:39
Normalize a character or string to remove accents (useful for accent-insensitive comparison).
/*
Two things will happen here:
1) normalize()ing to NFD Unicode normal form decomposes combined graphemes into the combination of simple ones. The è of Crème ends up expressed as e + ̀.
2) Using a regex character class to match the U+0300 → U+036F range, it is now trivial to globally get rid of the diacritics, which the Unicode standard conveniently groups as the Combining Diacritical Marks Unicode block.
*/
export const normalizeChar = char => char.normalize('NFD').replace(/[\u0300-\u036f]/g, '')
export const normalizeStr = str => str.split('').map(normalizeChar).join('')
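A minimal usage sketch for the comparison case mentioned in the description. The names `stripDiacritics` and `equalsIgnoringAccents` are hypothetical helpers, not part of the gist; `stripDiacritics` applies the same NFD-then-strip approach as above, but to the whole string in one pass, which is an assumption about how you might prefer to call it.

```javascript
// Hypothetical helper (not from the gist): decompose to NFD, then drop every
// code point in the Combining Diacritical Marks block (U+0300 to U+036F).
const stripDiacritics = str => str.normalize('NFD').replace(/[\u0300-\u036f]/g, '')

// Hypothetical helper: accent-insensitive but still case-sensitive equality.
const equalsIgnoringAccents = (a, b) => stripDiacritics(a) === stripDiacritics(b)

console.log(stripDiacritics('Crème Brûlée'))            // Creme Brulee
console.log(equalsIgnoringAccents('résumé', 'resume'))  // true
console.log(equalsIgnoringAccents('résumé', 'Resume'))  // false, case still matters

// Letters with no canonical decomposition are left untouched.
console.log(stripDiacritics('ø'))                       // ø
```

Note that this only removes marks that NFD actually splits off. Characters such as ø or ł have no canonical decomposition, so they pass through unchanged and would need an explicit mapping if they should compare equal to o or l.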