Character set

A character set is an encoding system that tells computers how to recognize {{Glossary("Character", "characters")}}, including letters, numbers, punctuation marks, and whitespace.
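As a rough illustration of what "encoding system" means in practice, the sketch below encodes one character under three different character sets using Python's built-in codecs. The sample character and the helper name `encode_with` are chosen only for this example.

```python
# A minimal sketch: the same character becomes different bytes
# depending on which character set (encoding) is used.

def encode_with(text: str, charsets: list[str]) -> dict[str, bytes]:
    """Return the encoded bytes of `text` for each named character set."""
    return {name: text.encode(name) for name in charsets}

# Hypothetical sample: the hiragana letter "a".
sample = "あ"
encoded = encode_with(sample, ["shift_jis", "euc_jp", "utf-8"])

for name, raw in encoded.items():
    # Show the bytes in hexadecimal for each character set.
    print(f"{name}: {raw.hex(' ')}")
```

Each character set decodes its own bytes back to the original character, but the byte sequences themselves are not interchangeable.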

In earlier times, countries developed their own character sets to support their own languages, such as Kanji JIS codes (e.g., Shift-JIS and EUC-JP) for Japanese, Big5 for traditional Chinese, and KOI8-R for Russian. However, {{Glossary("Unicode")}} gradually became the most widely accepted character set because of its universal language support.

If a character set is used incorrectly (for example, Unicode for an article encoded in Big5), you may see nothing but broken characters, which are called mojibake.
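The mismatch described above can be reproduced with a short sketch. It assumes Python's built-in `big5` and `utf-8` codecs; the sample text and the function name `decode_as` are illustrative only.

```python
# A minimal sketch of mojibake: bytes written in Big5 are read back
# with the wrong character set (UTF-8) and with the right one (Big5).

def decode_as(raw: bytes, charset: str) -> str:
    """Decode bytes, substituting U+FFFD for sequences that are invalid."""
    return raw.decode(charset, errors="replace")

original = "中文"                    # hypothetical article text
big5_bytes = original.encode("big5")  # how the article is stored

wrong = decode_as(big5_bytes, "utf-8")  # wrong character set: broken text
right = decode_as(big5_bytes, "big5")   # matching character set: intact

print("decoded as UTF-8:", wrong)
print("decoded as Big5: ", right)
```

Decoding with the character set that was used for encoding restores the text, which is why web pages are expected to declare their encoding.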
