unicode - Is this a case of a weird UTF-8 encoding conversion?

I am working with a remote application that seems to do some magic with the encoding. The application renders clear responses (which I'll refer to as True and False) depending on user input. I know of two valid values that render 'True'; all others should render 'False'.

What I found (accidentally) interesting is that submitting a corrupted value can also lead to 'True'.

Example input:

USER10 //gives True
USER11 //gives True
USER12 //gives False
USER.. //gives False
OTHERTHING //gives False

So basically only the first two values render a True response.

What I noticed, surprisingly, is that USERà±0 (hex: 55 53 45 52 C0 B1 30) is also accepted as True. I checked other hex bytes with no such success. This leads me to the conclusion that C0 B1 is somehow being translated into 0x31 ('1').
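
For reference, the bit arithmetic behind this suspicion can be checked with a short Python sketch (purely illustrative, independent of whatever the remote application actually does); note that Python's own strict decoder rejects the same bytes:

# Payload bits of a two-byte UTF-8 sequence 110xxxxx 10yyyyyy
lead, cont = 0xC0, 0xB1
code_point = ((lead & 0x1F) << 6) | (cont & 0x3F)
print(hex(code_point))   # 0x31, i.e. '1'

# A conforming decoder refuses the sequence outright:
try:
    b"\x55\x53\x45\x52\xC0\xB1\x30".decode("utf-8")
except UnicodeDecodeError as err:
    print(err)           # can't decode byte 0xc0: invalid start byte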

My question is: how could that happen? Is the application performing some weird conversion from UTF-16 (or something else) to UTF-8?

I'd appreciate any comments/ideas/hints.

1 Answer

C0 is an invalid start byte in UTF-8 (it could only begin an overlong two-byte sequence), but if a buggy UTF-8 decoder accepts it anyway, C0 B1 decodes to ASCII 31h, i.e. the character '1'.

Quoting Wikipedia:

...(C0 and C1) could only be used for an invalid "overlong encoding" of ASCII characters (i.e., trying to encode a 7-bit ASCII value between 0 and 127 using two bytes instead of one)...
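
To make that concrete, here is a minimal sketch of such a non-conforming decoder (a hypothetical Python illustration, not the application's actual code) that simply skips the overlong check. It turns the submitted bytes back into "USER10":

def lenient_utf8_decode(data: bytes) -> str:
    # Sketch of a decoder that does NOT reject overlong two-byte sequences.
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                        # 1-byte (ASCII)
            out.append(chr(b)); i += 1
        elif 0xC0 <= b <= 0xDF:             # 2-byte lead; overlong check missing
            cp = ((b & 0x1F) << 6) | (data[i + 1] & 0x3F)
            out.append(chr(cp)); i += 2
        else:
            # 3- and 4-byte sequences omitted for brevity
            raise ValueError("byte not handled in this sketch")
    return "".join(out)

print(lenient_utf8_decode(b"\x55\x53\x45\x52\xC0\xB1\x30"))   # prints USER10

If the remote application (or a library underneath it) decodes this way, the corrupted input and USER10 become the same string before the comparison happens.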

