Character reference
An {{glossary("HTML")}} character reference is an {{glossary("escape character", "escape sequence")}} of {{glossary("character", "characters")}} that is used to represent another character in the rendered web page.
Character references are used as replacements for characters that are reserved in HTML, such as the less-than (`<`) and greater-than (`>`) symbols used by the HTML parser to identify element {{Glossary('tag','tags')}}, or `"` or `'` within attributes, which may be enclosed by those characters.
They can also be used for invisible characters that would otherwise be impossible to type, including non-breaking spaces, control characters like left-to-right and right-to-left marks, and for characters that are hard to type on a standard keyboard.
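As a minimal sketch of replacing reserved characters, the Python standard library's `html.escape()` can be used. The helper name `escape_for_html` and the sample string are illustrative assumptions, not part of any specification.

```python
import html

def escape_for_html(text: str) -> str:
    """Replace HTML-reserved characters with character references."""
    # quote=True also escapes " and ', which matters inside attribute values.
    return html.escape(text, quote=True)

raw = 'if (a < b && b > c) say "hi"'
escaped = escape_for_html(raw)
print(escaped)  # if (a &lt; b &amp;&amp; b &gt; c) say &quot;hi&quot;

# html.unescape() reverses the substitution.
assert html.unescape(escaped) == raw
```

Note that `html.escape()` emits a hexadecimal numeric reference (`&#x27;`) for the single quote rather than a named one.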
There are three types of character references:
- Named character references
  - : These use a name string between an ampersand (`&`) and a semicolon (`;`) to refer to the corresponding character. For example, `&lt;` is used for the less-than (`<`) symbol, and `&copy;` for the copyright symbol (`©`). The string used for the reference is often a {{glossary("Camel case","camel-cased")}} initialism or contraction of the character name.
- Decimal numeric character references
  - : These references start with `&#`, followed by one or more ASCII digits representing the base-ten integer that corresponds to the character’s {{glossary("Unicode")}} {{glossary("code point")}}, and ending with `;`. For example, the decimal character reference for `<` is `&#60;`, because the Unicode code point for the symbol is `U+0003C`, and `3C` hexadecimal is 60 in decimal.
- Hexadecimal numeric character references
  - : These references start with `&#x` or `&#X`, followed by one or more ASCII hex digits, representing the hexadecimal integer that corresponds to the character’s Unicode code point, and ending with `;`. For example, the hexadecimal character reference for `<` is `&#x3C;` or `&#X3C;`, because the Unicode code point for the symbol is `U+0003C`.
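The relationship between a character's code point and its two numeric references can be sketched in Python. The helper names `decimal_reference` and `hex_reference` are hypothetical, chosen only for illustration; `html.unescape()` is the standard-library decoder.

```python
import html

def decimal_reference(ch: str) -> str:
    """Build a decimal numeric character reference from a single character."""
    return f"&#{ord(ch)};"

def hex_reference(ch: str) -> str:
    """Build a hexadecimal numeric character reference from a single character."""
    return f"&#x{ord(ch):X};"

print(decimal_reference("<"))  # &#60;
print(hex_reference("<"))      # &#x3C;

# All three reference types decode to the same character.
print(html.unescape("&lt; &#60; &#x3C; &#X3C;"))  # < < < <
```

Both the `&#x` and `&#X` prefixes are accepted when decoding, matching the description above.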
A small subset of useful named character references, along with their Unicode code points, is listed below.
| Character | Named reference | Unicode code point |
|---|---|---|
| & | `&amp;` | U+00026 |
| < | `&lt;` | U+0003C |
| > | `&gt;` | U+0003E |
| " | `&quot;` | U+00022 |
| ' | `&apos;` | U+00027 |
| (non-breaking space) | `&nbsp;` | U+000A0 |
| – | `&ndash;` | U+02013 |
| — | `&mdash;` | U+02014 |
| © | `&copy;` | U+000A9 |
| ® | `&reg;` | U+000AE |
| ™ | `&trade;` | U+02122 |
| ≈ | `&asymp;` | U+02248 |
| ≠ | `&ne;` | U+02260 |
| £ | `&pound;` | U+000A3 |
| € | `&euro;` | U+020AC |
| ° | `&deg;` | U+000B0 |
The full list of HTML named character references can be found in the HTML specification.
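The code points in the table above can be checked programmatically. The following sketch uses the `html5` mapping from Python's `html.entities` module, which maps reference names (with the trailing semicolon) to their characters; the `code_point` helper and the `NAMES` list are assumptions made for this example.

```python
from html.entities import html5

# Names from the table above, without the leading "&" and trailing ";".
NAMES = ["amp", "lt", "gt", "quot", "apos", "nbsp", "ndash", "mdash",
         "copy", "reg", "trade", "asymp", "ne", "pound", "euro", "deg"]

def code_point(name: str) -> str:
    """Return the code point of a named reference, formatted as U+XXXXX."""
    ch = html5[name + ";"]  # html5 keys include the trailing semicolon
    return f"U+{ord(ch):05X}"

for name in NAMES:
    print(f"&{name};", code_point(name))
```

This only works for names that map to a single character, which is true of every entry in `NAMES`.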
See also
- Related glossary terms:
  - {{glossary("Character")}}
  - {{glossary("Escape character")}}
  - {{glossary("Code point")}}
  - {{glossary("Unicode")}}