See "LINKS/ OTHER RESOURCES" below for complete character lists.
Selected character reference:
ISO 8859-1
Symbol / Rendering | Description | In HTML, type this: | ...or this:
(no-break space) | no-break space / non-breaking space | &nbsp; |
¢ | Cent sign | &cent; | &#162;
£ | Pound sterling | &pound; |
¤ | General currency sign | &curren; |
¥ | Yen sign | &yen; | &#165;
¦ | Broken vertical bar | |
Note that all printable ISO-8859-1 characters are included in ANSI / Windows-1252. Use the header tag listed under "Character Encoding Options" below to specify the encoding in which the document was saved.
UNICODE
Symbol / Rendering | Description | In HTML, type this: | ...or this: (HTML Entity, decimal) | ...or this: (HTML Entity, hex) | Official name
• | bullet | &bull; | &#8226; | |
≠ | (value) ... does not equal ... (value) | &ne; | &#8800; | &#x2260; | NOT EQUAL TO (U+2260)
≈ | ... equals approximately ... / ... is almost equal to ... | &asymp; (from "asymptote") | | | see also: ≅
"approximately equal to": alternatives to the "wiggly equals" include: { ~ } or { = (approx) }
Q: Why 'asymptote'? A: Presumably because the ~ symbol was often used to indicate that one function is asymptotic to another, and the ~ and ≈ symbols were often used interchangeably. One might, for example, write f(x) ~ g(x) if the ratio of f(x) to g(x) approaches 1 as x -> infinity. In other words, two values which ALMOST meet or are ALMOST the same...
∴ | (something-something) ... therefore ... (something) | | | |
⁄ | solidus ("Fraction Slash") | &frasl; | &#8260; | &#x2044; | FRACTION SLASH (U+2044); how to type in Microsoft Windows: Alt +2044; see also: "division slash"; "solidus"
⧸ | | | | &#x29F8; | BIG SOLIDUS (U+29F8)
(see also this 1998 proposal and the then-current reference tables)
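"therefore" symbol | &there4; | none | &#8756; | ∴
"varies with / similar to" | &sim; | none | &#8764; | ∼
"almost equal to" | &asymp; | none | &#8776; | ≈
"not equal to" | &ne; | none | &#8800; | ≠
"equivalent to" | &equiv; | none | &#8801; | ≡
"less-than or equal to" | &le; | none | &#8804; | ≤
"greater-than/equal to" | &ge; | none | &#8805; | ≥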
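The named, decimal, and hex forms listed in these tables are three spellings of the same character. As a rough check, here is a minimal Python sketch using the standard `html` module; the helper name `numeric_references` is made up for this illustration and is not part of any library.

```python
import html

# Three spellings of the same character, U+2260 NOT EQUAL TO.
references = ["&ne;", "&#8800;", "&#x2260;"]
decoded = [html.unescape(ref) for ref in references]


def numeric_references(ch):
    """Return the (decimal, hex) character references for one character.

    Hypothetical helper for illustration only.
    """
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:04X};"


print(decoded)
print(numeric_references("∴"))
```

All three references decode to the same single character, so which one to type is mostly a matter of readability and of what older browsers understand.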
Character Encoding Options
Common Name | Equivalent to | To use for HTML page authoring
ISO-8859-1 | Western (Latin 1) (Western Alphabet, latin1, us-ascii, windows-1252, x-ascii) | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
UTF-8 | Unicode | <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
ISO 10646 | Unicode |
Japanese (Shift JIS) | | <meta http-equiv="Content-Type" content="text/html; charset=shift_jis">
Windows-1252 | Western European (Windows); ANSI* (windows-1252) |
 | | <meta http-equiv="Content-Type" content="text/html; charset=gb2312">
Note that you must actually save the file (.txt, .html) using the proper encoding; the HTML "Content-Type" tag only declares an encoding. It will not convert a .html file saved as ANSI* (for example) into a document with Unicode encoding.
But you can use decimal numeric character references (e.g. &#9733;) in an ASCII document and still specify the UTF-8 encoding (see STAR example, below).
*ANSI character set / ANSI encoding: The "ANSI" (Windows-1252) character set defines 256 code positions. The first 128 are ASCII, and the second 128 contain math and foreign-language symbols, which are different from those in the DOS character set on the PC.
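To see why the declared charset must match how the file was actually saved, the following Python sketch encodes one short string two ways and then deliberately reads the UTF-8 bytes as Windows-1252. The sample word is an arbitrary choice for illustration, and Python's `cp1252` codec is assumed to stand in for "ANSI".

```python
text = "café"

# The same text produces different bytes depending on the encoding used on save.
utf8_bytes = text.encode("utf-8")      # é becomes two bytes
cp1252_bytes = text.encode("cp1252")   # é becomes one byte

# Reading UTF-8 bytes as if they were Windows-1252 garbles the accented letter.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes, cp1252_bytes, misread)
```

The garbled result is what a browser would show if a page saved as UTF-8 declared itself as windows-1252; changing the meta tag alone never rewrites the bytes on disk.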
Dumb tricks with Unicode:
Putting a star in your web page title:
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>★ I'm a star baby!★</title>
Result (if page uses UTF-8/Unicode)
★I'm a star baby!★
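If the page has to stay plain ASCII, the star can be written as a numeric character reference instead. This is a small Python sketch, assuming the title text from the example above with spaces added around the stars; it relies on the standard `xmlcharrefreplace` error handler.

```python
import html

title = "★ I'm a star baby! ★"

# Replace every non-ASCII character with its decimal character reference.
ascii_title = title.encode("ascii", "xmlcharrefreplace").decode("ascii")

# A browser (or html.unescape) turns the references back into the characters.
round_trip = html.unescape(ascii_title)

print(ascii_title)
```

The resulting string contains only ASCII characters, so it survives being saved in almost any encoding.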
Codes in URLs: (URIs)
%E5%A4%A7%E9%98%AA%E5%B8%82 %3F %3D %3C %3F
symbol | for HTML | for CGI | for URLs | |
& | &amp; | %2526 | | |
' | ' | %2527 | %27 | decimal 39 | example: (Opera url) http://www.example.com/Elune's_Grace is equivalent to (IE url) http://www.example.com/Elune%27s_Grace
= | = | %253D | | decimal 61 | hex (byte) 3D; hex indicator for CGI: %3D
Codes to be passed to CGI scripts in URIs
sample:
To send '&' or '=' as data to the CGI you must encode them as %2526 and %253D, respectively (that is, %26 and %3D with the percent sign itself encoded as %25). Note that when = is used between a field name and its value it does not need to be encoded, and when & is used as the first character of a character entity (such as &lt;) it does not need to be encoded. Also note that field/value pairs are delimited by &amp; (the character entity for &). For example: <form method="post" action="ROFM.acgi?_action=Add&amp;...&amp;Subject=OneAmpersandInQuotes&quot;%2526&quot;">
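The percent codes in this section can be reproduced with Python's standard `urllib.parse` module. The sketch below is only an illustration: it shows that %2526 is what you get when an already-encoded %26 is encoded a second time, and it decodes the sample string from the start of this section.

```python
from urllib.parse import quote, unquote

# Encoding "&" once gives %26; encoding that result again gives %2526.
once = quote("&", safe="")
twice = quote(once, safe="")

# The apostrophe example: quote() escapes ' as %27.
escaped_path = quote("Elune's_Grace")

# Percent-encoded UTF-8 bytes decode back to the original characters.
city = unquote("%E5%A4%A7%E9%98%AA%E5%B8%82")
punctuation = unquote("%3F %3D %3C")

print(once, twice, escaped_path, city, punctuation)
```

Whether a given CGI expects the single or the double encoding depends on how many times the value is decoded on its way to the script, so treat the %25xx forms as specific to the setup described above.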
REFERENCE | SYMBOL ("GLYPH") | DESCRIPTION | windows-1252 | ENTITY | ASCII Dec | ASCII Hex * | UNICODE (in hex as 0xXXXX)
&#13; | | (carriage return) | | | | |
&#32; | | (space) | | | | |
&#33; | ! | Exclamation Mark | ! | | 33 | 21 |
&#34; | " | Quotation Mark | " | &quot; | 34 | 22 [0x22] | 0x0022
&#35; | # | Number Sign, Octothorp, "pound" | # | | 35 | 23 |
&#36; | $ | Dollar Sign, USD | $ | | 36 | 24 |
&#37; | % | | % | | | |
&#38; | & | Ampersand | & | &amp; | | |
&#39; | ' | apostrophe | ' | | | |
&#40; | ( | | ( | | | |
&#41; | ) | | ) | | | |
&#42; | * | | * | | | |
&#43; | + | | + | | | |
&#44; | , | | , | | | |
&#45; | - | | | | | |
&#46; | . | | | | | |
&#92; | \ | back-slash or backslash | \ | | | |
&#47; | / | forward-slash or slash | | | | |
Note: the ordinary slash (U+002F, which Unicode itself names SOLIDUS) is a different character from the fraction slash [⁄] (U+2044) and the division slash [∕] (U+2215), i.e. [ / ≠ ⁄ ] & [ / ≠ ∕ ]
&#161; | ¡ | Inverted Exclamation Mark | ¡ | | | |
&#172; | ¬ | Not sign | ¬ | | | |
&#231; | ç | Small c, cedilla | ç | | | |
&#233; | é | Small e, acute accent | é | | | |
a |
REFERENCE DESCRIPTION | |||||||
_ | _ | ||||||
 | | &shy; | &shy;, Soft hyphen | Ú | Ú, Capital U, acute accent | ||
. | . | ® | ®, Registered trademark | Û | Û, Capital U, circumflex accent | ||
/ | / | ¯ | ¯, Macron accent | Ü | Ü, Capital U, dieresis or umlaut mark | ||
: | : | ° | °, Degree sign | Ý | Ý, Capital Y, acute accent | ||
; | ; | ± | ±, Plus or minus | Þ | Þ, Capital THORN, Icelandic | ||
< | < | ² | ², Superscript two | ß | ß, Small sharp s, German (sz ligature) | ||
= | = | ³ | ³, Superscript three | à | à, Small a, grave accent | ||
> | > | ´ | ´, Acute accent | á | á, Small a, acute accent | ||
? | ? | µ | µ, Micro sign | â | â, Small a, circumflex accent | ||
@ | @ | ¶ | ¶, Paragraph sign | ã | ã, Small a, tilde | ||
[ | [ | · | ·, Middle dot | ä | ä, Small a, dieresis or umlaut mark | ||
\ | \ | ¸ | ¸, Cedilla | å | å, Small a, ring | ||
] | ] | ¹ | ¹, Superscript one | æ | æ, Small ae diphthong (ligature) | ||
^ | ^ | º | º, Masculine ordinal | ç | ç, Small c, cedilla | ||
_ | _ | » | », Right angle quote, guillemotright | è | è, Small e, grave accent | ||
` | ` | ¼ | ¼, Fraction one-fourth | é | é, Small e, acute accent | ||
{ | { | ½ | ½, Fraction one-half | ê | ê, Small e, circumflex accent | ||
| | | | ¾ | ¾, Fraction three-fourths | ë | ë, Small e, dieresis or umlaut mark | ||
} | } | ¿ | ¿, Inverted question mark | ì | ì, Small i, grave accent | ||
~ | ~ | À | À, Capital A, grave accent | í | í, Small i, acute accent | ||
  | (non-breaking space) | Á | Á, Capital A, acute accent | î | î, Small i, circumflex accent | ||
¡ | ¡, Inverted exclamation | Â | Â, Capital A, circumflex accent | ï | ï, Small i, dieresis or umlaut mark | ||
¢ | ¢, Cent sign | Ã | Ã, Capital A, tilde | ð | ð, Small eth, Icelandic | ||
£ | £, Pound sterling | Ä | Ä, Capital A, dieresis or umlaut mark | ñ | ñ, Small n, tilde | ||
¤ | ¤, General currency sign | Å | Å, Capital A, ring | ò | ò, Small o, grave accent | ||
¥ | ¥, Yen sign | Æ | Æ, Capital AE diphthong (ligature) | ó | ó, Small o, acute accent | ||
¦ | ¦, Broken vertical bar | Ç | Ç, Capital C, cedilla | ô | ô, Small o, circumflex accent | ||
§ | §, Section sign | È | È, Capital E, grave accent | õ | õ, Small o, tilde | ||
¨ | ¨, Umlaut (dieresis) | É | É, Capital E, acute accent | ö | ö, Small o, dieresis or umlaut mark | ||
© | ©, Copyright | Ê | Ê, Capital E, circumflex accent | ÷ | ÷, Division sign | ||
© | ©, Copyright | Ë | Ë, Capital E, dieresis or umlaut mark | ø | ø, Small o, slash | ||
Ì | Ì, Capital I, grave accent | ù | ù, Small u, grave accent | ||||
Í | Í, Capital I, acute accent | ú | ú, Small u, acute accent | ||||
Î | Î, Capital I, circumflex accent | û | û, Small u, circumflex accent | ||||
Ï | Ï, Capital I, dieresis or umlaut mark | ü | ü, Small u, dieresis or umlaut mark | ||||
Ð | Ð, Capital Eth, Icelandic | ý | ý, Small y, acute accent | ||||
Ñ | Ñ, Capital N, tilde | þ | þ, Small thorn, Icelandic | ||||
Ò | Ò, Capital O, grave accent | ÿ | ÿ, Small y, dieresis or umlaut mark | ||||
Ó | Ó, Capital O, acute accent | Ā | A | ||||
Ô | Ô, Capital O, circumflex accent | ā | a | ||||
Õ | Õ, Capital O, tilde | Ă | A | ||||
Ö | Ö, Capital O, dieresis or umlaut mark | ă | a | ||||
× | ×, Multiply sign | Ą | A | ||||
× | ×, Multiply sign | ą | a | ||||
* hexadecimal numbers: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F {conversion}
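Converting between the decimal and hexadecimal columns is a one-liner in most languages. The Python sketch below is only an illustration; the function names `to_hex` and `to_decimal` are invented for this example.

```python
def to_hex(decimal_code):
    """Decimal code -> uppercase hexadecimal string (no prefix)."""
    return format(decimal_code, "X")


def to_decimal(hex_code):
    """Hexadecimal string (with or without leading zeros) -> decimal code."""
    return int(hex_code, 16)


# Values taken from the tables on this page.
print(to_hex(34))          # quotation mark
print(to_hex(8800))        # not equal to
print(to_decimal("06F4"))  # Extended Arabic-Indic digit four
print(to_decimal("5B"))    # left square bracket
```

The same decimal value goes into a `&#...;` reference and the hex value into a `&#x...;` reference.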
REFERENCE | SYMBOL | DESCRIPTION | |
&emsp; | e m e m | em space (at left between e and m) | HTML4 | missing in 8859-1
&#8196; | e m | three-per-em space | HTML4 | missing in 8859-1
&ensp; | e n | en space | HTML3 | missing in 8859-1
&#1780; | ۴ | Extended Arabic-Indic Digit Four | UTF-8 | (1780 in hexadecimal: 06F4)
#%028D; | &invw; ʍ | #028D; ʍ | |||
&#91; | [ | 91 | 0x5B (U+005B) | left square bracket | Basic Latin
Browser / Encoding Render Check:
The following table is used to check whether the character code indicated in the first column renders correctly in the second column. For example, you should see the em dash character in the center column of the first two rows.
(Note: The character set specified for this HTML page is <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">.)
— | | em dash |
— | — | em dash — |
– | – | en dash |
– | – | en dash |
— | — | em dash |
– | |
LINKS/ OTHER RESOURCES:
ASCII - ISO 8859-1 (Latin-1) Table with HTML Entity Names
• Martin Ramsch's "ISO 8859-1 Table" [http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html]
• http://www.unicode.org/unicode.css
• Character Encoding Options
ASCII-EBCDIC Character Set
The HTML Document Character Set
About the use of the em dash and the en dash (and similar characters)
click here for full Data on languages
[new!] HTML Syntax (with browser support list)
On Windows systems, you can (usually - some application programs may override this) produce any character in the Windows character set (naturally, in its Windows encoding) as follows: Press down the (left) Alt key and keep it down. Then type, using the separate numeric keypad (not the numbers above the letter keys!), the four-digit code of the character in decimal. Finally release the Alt key. Notice that the first digit is always 0, since the code values are in the range 32 - 255 (decimal). For instance, to produce the letter "Ä" (which has code 196 in decimal), you would press Alt down, type 0196 and then release Alt. Upon releasing Alt, the character should appear on the screen. In MS Word, the method works only if Num Lock is set. This method is often referred to as Alt-0nnn. (If you omit the leading zero, i.e. use Alt-nnn, the effect is different, since that way you insert the character in code position nnn in the DOS character code! For example, Alt-196 would probably insert a graphic character which looks somewhat like a hyphen. There are variations in the behavior of various Windows programs in this area, and using those DOS codes is best avoided.) [s]
Example, Windows XP.
Using the Code to type the letter.
Open MS word or Notepad.
Click to position your cursor in the text
area of the application (e.g. white box area of Notepad).
Press and hold the ALT KEY,
continue to hold the ALT key as you type "0","2","2","5",
then release the ALT key.
The following character should appear on the screen: á
As shown in the table {http://www.obkb.com/dcljr/chars.html}, 'small a, acute' uses the code '&#225;' in HTML. So the HTML decimal code is the same as the decimal code used for entering special characters in Windows. The only differences are that the Windows code is delimited by the ALT key (instead of "&", "#", and ";") and that the Windows code uses the leading zero (i.e. "0225" instead of just "225").
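The relationship between the Alt code and the HTML decimal code can be checked with a short Python sketch. Two assumptions are made here: the Windows character set is taken to be Windows-1252 (`cp1252`), and the "DOS character code" is taken to be code page 437 (`cp437`); other Windows or DOS configurations may behave differently.

```python
import html

ch = "á"
code = ord(ch)                 # decimal code of the character

alt_code = f"Alt+{code:04d}"   # Windows form, with the leading zero
html_ref = f"&#{code};"        # HTML form, delimited by "&#" and ";"
decoded = html.unescape(html_ref)

# Why the leading zero matters: byte 196 means different things
# in the Windows code page and in the (assumed) DOS code page 437.
windows_char = bytes([196]).decode("cp1252")
dos_char = bytes([196]).decode("cp437")

print(alt_code, html_ref, decoded, windows_char, dos_char)
```

The match between the two decimal codes holds for ASCII and for codes 160 to 255; in the range 128 to 159 Windows-1252 assigns characters whose Unicode (and therefore HTML) numbers are different.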
Reference:
http://www.obkb.com/dcljr/chars.html
http://www.unicode.org/unicode/uni2book/ch06.pdf
The official registry of "charset" (i.e., character encoding) names, with references to documents defining their meanings, is kept by IANA at
http://www.iana.org/assignments/character-sets
(According to the documentation of the registration procedure, RFC 2978, it should be elsewhere, but it has been moved.) I have composed a tabular presentation of the registry, ordered alphabetically by "charset" name and accompanied with some hypertext references.
keywords: Exclimation Mark
Document published: Jan. 09, 2000
Last modified or updated: Nov. 14 2005
By lyberty