laithshadeed changed the title from "Replacing lodash string functions with native one requires special care for Unicode strings" to "Replacing lodash string functions with native one requires special care for Unicode strings with non-BMP symbols" on Aug 19, 2021
For example, the native split function
'😀-hi-🐅'.split('')
will break your string, unlike lodash's
_.split('😀-hi-🐅', '')
because the native version does not recognize an emoji as a single symbol and instead splits its surrogate pair into two pieces. For the same reason, calling length on a single emoji returns two instead of one:
'😀'.length
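To make the surrogate-pair point concrete, here is a small sketch that prints the UTF-16 code units behind a single emoji. The variable names are only illustrative, and the hex values are what I would expect for U+1F600.

```javascript
// Inspect the UTF-16 code units that make up one non-BMP symbol.
const emoji = '😀'; // U+1F600, outside the Basic Multilingual Plane

// split('') cuts between code units, so the emoji falls apart into
// its high and low surrogate halves.
const units = emoji.split('');
const hex = units.map(u => u.charCodeAt(0).toString(16));

console.log(emoji.length); // 2
console.log(hex);          // [ 'd83d', 'de00' ]

// The full example string has 6 visible symbols but 8 code units.
const pieces = '😀-hi-🐅'.split('');
console.log(pieces.length); // 8
```

Neither half is a valid character on its own, which is why the split result renders as broken glyphs.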
Lodash takes special care when your string contains non-BMP symbols such as emojis. To split '😀-hi-🐅' correctly with native code, you can use the spread operator:
[...'😀-hi-🐅']
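The spread works because strings iterate by code point rather than by code unit. As a rough sketch, anything else built on the string iterator should behave the same way (variable names here are just for illustration):

```javascript
const text = '😀-hi-🐅';

// Three ways to walk the string by code point via its iterator.
const viaSpread = [...text];
const viaArrayFrom = Array.from(text);
const viaForOf = [];
for (const symbol of text) {
  viaForOf.push(symbol);
}

console.log(viaSpread);        // [ '😀', '-', 'h', 'i', '-', '🐅' ]
console.log(viaSpread.length); // 6 code points
console.log(text.length);      // 8 code units
```

So if you need a "real" symbol count for strings like this one, [...text].length is a closer match than text.length.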
But even the spread operator does not handle grapheme clusters, where several code points combine into one visible symbol. For that, you need the Unicode Text Segmentation algorithm. Chrome has shipped this algorithm as Intl.Segmenter since version 87. You can use it like this:
[...(new Intl.Segmenter).segment('😀-hi-🐅')].map(x => x.segment)
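The example string above happens to contain no multi-code-point clusters, so here is a sketch with two inputs where the spread operator and Intl.Segmenter actually disagree. The graphemes helper is a hypothetical name of mine, not part of lodash or the platform, and it assumes a runtime that ships Intl.Segmenter.

```javascript
// Hypothetical helper: split a string into user-perceived characters.
// Assumes Intl.Segmenter is available in the runtime.
function graphemes(str, locale = 'en') {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'grapheme' });
  return Array.from(segmenter.segment(str), s => s.segment);
}

// "e" followed by a combining acute accent: two code points, one symbol.
const accented = 'e\u0301';

// Man + ZWJ + woman + ZWJ + girl: five code points, one family emoji.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}';

console.log([...accented].length);       // 2
console.log(graphemes(accented).length); // 1

console.log([...family].length);         // 5
console.log(graphemes(family).length);   // 1
```

The granularity option defaults to 'grapheme', so passing it explicitly is only for readability.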
More about Unicode issues in JavaScript: https://mathiasbynens.be/notes/javascript-unicode
Happy passing emojis around 😀