Unicode has the following zero-width characters:
- U+200B zero width space
- U+200C zero width non-joiner
- U+200D zero width joiner
- U+FEFF zero width no-break space
To remove them from a string in JavaScript, you can use a simple regular expression:
var userInput = 'a\u200Bb\u200Cc\u200Dd\uFEFFe';
console.log(userInput.length); // 9
var result = userInput.replace(/[\u200B-\u200D\uFEFF]/g, '');
console.log(result.length); // 5
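If the same cleanup is needed in several places, the replacement can be wrapped in a small function. This is only a sketch: the names `ZERO_WIDTH` and `stripZeroWidth` are made up for illustration, and the character class covers just the four code points listed above.

```javascript
// Hypothetical names; the character class matches only the four
// zero-width code points listed in this answer.
var ZERO_WIDTH = /[\u200B-\u200D\uFEFF]/g;

function stripZeroWidth(text) {
  // String(text) lets the helper accept non-string input without throwing.
  return String(text).replace(ZERO_WIDTH, '');
}

console.log(stripZeroWidth('a\u200Bb\u200Cc\u200Dd\uFEFFe')); // abcde
```

Keeping the pattern in one variable means the list of code points only has to be changed in one place if more characters need to be covered later.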
Note that many other characters may not be visible either, for example some of ASCII's control characters.
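As a rough illustration of that point, the character class can be widened to also drop ASCII control characters. The function name `stripInvisible` is hypothetical, and the choice to keep tab, line feed and carriage return is an assumption about what most input should preserve, not a rule.

```javascript
// Hypothetical helper. Removes the four zero-width code points plus the
// ASCII control characters U+0000-U+001F and U+007F, except tab (U+0009),
// line feed (U+000A) and carriage return (U+000D), which are kept here
// by assumption because they usually carry meaning in text input.
var INVISIBLE = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F\u200B-\u200D\uFEFF]/g;

function stripInvisible(text) {
  return String(text).replace(INVISIBLE, '');
}

var cleaned = stripInvisible('a\u0000b\u200Bc\td\u007Fe');
console.log(cleaned.length); // 6
```

This still does not cover every invisible character in Unicode, so the class may need extending depending on where the input comes from.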