Handling primary sources in TEI XML
3. Letter forms
- Unicode (ISO/IEC 10646) defines codepoints for most, though
not all, of the abstract characters recognized by modern scholars when
reading ancient sources.
- Different fonts realise those codepoints in different
styles; however, the underlying character remains the same.
- Data entry of Unicode characters can be:
  - direct: some key combination or menu selection generates the character
    ć for us
  - indirect, using a numeric character entity reference
  - indirect, using a mnemonic character entity reference such as
    &aelig; (this requires every document to carry a DTD that declares the entity)
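
The three entry routes listed above can be compared with a short sketch. It is an illustration only, not part of the original slides. It uses Python's standard xml.etree.ElementTree parser, and the internal DTD subset declaring aelig is an assumption made for the example; a real TEI document would normally take such declarations from its own DTD. The helper names text_of and parses_without_dtd are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: a minimal internal DTD subset that declares
# the mnemonic entity "aelig" as U+00E6.
DOCTYPE = '<!DOCTYPE p [ <!ENTITY aelig "&#xE6;"> ]>'


def text_of(xml_source: str) -> str:
    """Parse a small XML document and return the text of its root element."""
    return ET.fromstring(xml_source).text


# Direct entry: the character itself appears in the document.
direct = text_of("<p>\u00e6</p>")

# Indirect entry with a numeric reference, hexadecimal and decimal forms.
numeric_hex = text_of("<p>&#xE6;</p>")
numeric_dec = text_of("<p>&#230;</p>")

# Indirect entry with a mnemonic reference; the declaration must be present.
mnemonic = text_of(DOCTYPE + "<p>&aelig;</p>")


def parses_without_dtd() -> bool:
    """Return True if the mnemonic reference parses with no declaration."""
    try:
        text_of("<p>&aelig;</p>")
        return True
    except ET.ParseError:
        return False


# All four routes should yield the same single character.
print(direct == numeric_hex == numeric_dec == mnemonic)
print(f"U+{ord(direct):04X}")
print(parses_without_dtd())
```

Once parsed, the application sees the same codepoint whichever route was used. Only the mnemonic form depends on a declaration being available to the parser.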