ASCII or UTF-8?

Author: Internet

Question

A long time ago, before the world's scripts were supported, text files were all ASCII.
Nowadays, text files can contain any of the world's scripts.
If I open a text file in a hex editor, is there a way to tell whether it is encoded in ASCII or UTF-8?

 

Answer 1

UTF-8 is backwards compatible with ASCII: an ASCII text file is also a UTF-8 text file.

If a file contains any byte whose first hex digit is 8 through F (that is, any byte from 0x80 to 0xFF), it's not ASCII.
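
As a rough illustration of the ASCII test, here is a minimal Python sketch. The function name `is_ascii` is made up for this example; it only checks that no byte has its high bit set.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is below 0x80 (first hex digit 0 through 7)."""
    return all(b < 0x80 for b in data)


if __name__ == "__main__":
    print(is_ascii(b"plain text"))            # only bytes 0x00-0x7F
    print(is_ascii("café".encode("utf-8")))   # contains 0xC3 0xA9
```

To try it on a real file, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to the function.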

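If a file is not ASCII, it may be UTF-8 if every byte that starts with C or D is followed by exactly one byte that starts with 8, 9, A, or B, every byte that starts with E is followed by exactly two such bytes, and every byte that starts with F is followed by exactly three. If a byte starting with 8 through F appears in any other context, the file is not UTF-8.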
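
The lead-byte and continuation-byte pattern can be sketched in Python as follows. This is only the structural test you could do by eye in a hex editor, not a full validator; the name `looks_like_utf8` is hypothetical.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Structural check only: lead bytes followed by the right number of
    continuation bytes (first hex digit 8, 9, A, or B)."""
    i = 0
    n = len(data)
    while i < n:
        first_digit = data[i] >> 4          # first hex digit of the byte
        if first_digit <= 0x7:
            need = 0                        # ASCII byte, stands alone
        elif first_digit in (0xC, 0xD):
            need = 1                        # two-byte sequence
        elif first_digit == 0xE:
            need = 2                        # three-byte sequence
        elif first_digit == 0xF:
            need = 3                        # four-byte sequence
        else:
            return False                    # continuation byte with no lead byte
        tail = data[i + 1 : i + 1 + need]
        if len(tail) != need or any((c >> 4) not in (0x8, 0x9, 0xA, 0xB) for c in tail):
            return False                    # truncated or malformed sequence
        i += 1 + need
    return True


if __name__ == "__main__":
    print(looks_like_utf8("héllo €".encode("utf-8")))
    print(looks_like_utf8("héllo".encode("latin-1")))  # 0xE9 followed by ASCII
```

Because it tests the pattern only, this sketch still accepts some byte sequences that strict UTF-8 forbids, such as the overlong pair C0 80.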

There are a few more requirements for valid UTF-8, but they are harder to glean with a hex editor. See https://en.m.wikipedia.org/wiki/UTF-8
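
Outside a hex editor, those extra requirements (for example, no overlong encodings, no surrogate code points, nothing above U+10FFFF) can be checked by letting a strict decoder do the work. A minimal sketch using Python's built-in `bytes.decode`; the wrapper name `is_valid_utf8` is made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 under Python's strict decoder."""
    try:
        data.decode("utf-8")    # the default error handler is "strict"
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_utf8("héllo".encode("utf-8")))
    print(is_valid_utf8(b"\xc0\x80"))   # overlong encoding, rejected by the decoder
```

Note that a file which passes this check is not guaranteed to have been written as UTF-8; it only means the bytes are consistent with UTF-8.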

Tags: UTF, bytes, hex, file, text, ASCII
Source: https://www.cnblogs.com/chucklu/p/16394370.html