Character set
{{GlossarySidebar}}
A character set is an encoding system that tells computers how to recognize a {{Glossary("Character")}}, including letters, numbers, punctuation marks, and whitespace.
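As a minimal sketch of that mapping, the Python snippet below shows one character, its numeric code point, and the bytes it becomes under two different encodings. The variable names are illustrative only.

```python
# One character, one code point, but different bytes per encoding.
ch = "é"

code_point = ord(ch)                 # numeric identity of the character (233, U+00E9)
utf8_bytes = ch.encode("utf-8")      # two bytes in UTF-8: b'\xc3\xa9'
latin1_bytes = ch.encode("latin-1")  # one byte in ISO-8859-1: b'\xe9'

print(code_point, utf8_bytes, latin1_bytes)
```

The same character is stored as different byte sequences depending on the encoding, which is why a reader needs to know which one was used.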
In earlier times, countries developed their own character sets to support their own languages, such as Kanji JIS codes (for example, Shift-JIS and EUC-JP) for Japanese, Big5 for traditional Chinese, and KOI8-R for Russian. However, {{Glossary("Unicode")}} gradually became the most widely accepted character set because of its universal language support.
If text is read with the wrong character set (for example, an article encoded in Big5 interpreted as Unicode), you may see nothing but broken characters, which are called mojibake.
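The effect can be reproduced with a short Python sketch. It assumes the standard `big5` and `utf-8` codecs that ship with CPython; the sample text and variable names are illustrative.

```python
# Encode traditional Chinese text as Big5, then decode it two ways.
text = "中文"
big5_bytes = text.encode("big5")

# Wrong character set: the Big5 bytes are not valid UTF-8, so each
# undecodable byte is replaced with U+FFFD (the replacement character).
garbled = big5_bytes.decode("utf-8", errors="replace")

# Right character set: the original text comes back intact.
restored = big5_bytes.decode("big5")

print(garbled)
print(restored)
```

Without `errors="replace"`, the wrong decode would raise `UnicodeDecodeError` instead of producing broken characters.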
See also
- Character encoding (Wikipedia)
- Mojibake (Wikipedia)
- Related glossary terms:
  - {{Glossary("Character")}}
  - {{Glossary("Unicode")}}