Unicode
- A coded character set covering most of the characters used in the major languages of the world; its code points range from U+0000 to U+10FFFF, so it is not limited to 16 bits
- Unicode Standard 5.0.0
- Unicode Standard Annex #9: The Bidirectional Algorithm
- Unicode Technical Report #20: Unicode in XML and Other Markup Languages
UTF-8 (Unicode Transformation Format 8-bit Encoding Form)
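- A variable-width character encoding of Unicode that uses one to four 8-bit code units (bytes) per code point
- Encodes the 7-bit ASCII characters as single bytes with their original ASCII values, so ASCII text is already valid UTF-8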
- The most widely used encoding of Unicode
- The default encoding in XML: the one assumed when a document has no encoding declaration or byte order mark
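The relationship between UTF-8 and ASCII can be checked with a short Python sketch. The sample characters and the helper name utf8_width are illustrative choices made here, not taken from any standard; the sketch only relies on Python's built-in str.encode.

```python
# Sample characters picked for illustration (an assumption of this sketch):
# Latin capital A, e with acute accent, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_width(ch: str) -> int:
    """Return how many bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


# Map each sample character to its encoded length in bytes.
widths = {ch: utf8_width(ch) for ch in samples}

# Pure ASCII text produces identical bytes under ASCII and UTF-8.
ascii_text = "Unicode"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({widths[ch]} bytes)")
print("ASCII text unchanged by UTF-8:", same_bytes)
```

The characters above take 1, 2, 3, and 4 bytes respectively, and the last one (U+1F600) lies beyond the range that a single 16-bit value can represent.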
Recommended Offline Reading
- Haralambous, Y. 2007. Fonts & Encodings. Sebastopol, CA: O'Reilly.
- Korpela, J. K. 2006. Unicode Explained. Sebastopol, CA: O'Reilly.