Handling primary sources in TEI XML

4. Non-Unicode characters

Nevertheless, sometimes Unicode is not enough…
  • … if your character doesn't exist in Unicode
  • … if you want to distinguish letter forms that Unicode regards as identical, e.g. for statistical analysis

The <g> (gaiji) element stands for any non-Unicode character or glyph. Its content can be a local approximation to the desired letter, or it can be empty; its @ref attribute points to a definition of the required character or glyph.

<!-- in text -->
<g ref="#x123"/>
<g ref="#x123">x</g>
<!-- in header -->
<char xml:id="x123"> <!-- character definition here --> </char>
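To show where such a definition could live, here is a minimal sketch of a header fragment, assuming the definition sits in a <charDecl> inside <encodingDesc>. The description text and the choice of "x" as the standard mapping are illustrative assumptions, not taken from a real project.

```xml
<teiHeader>
  <!-- fileDesc omitted for brevity; a complete header needs one -->
  <encodingDesc>
    <charDecl>
      <!-- xml:id is the target of ref="#x123" on <g> in the text -->
      <char xml:id="x123">
        <!-- hypothetical prose description of the character -->
        <desc>Variant form of the letter x, recorded separately for statistical analysis</desc>
        <!-- assumed closest standard Unicode equivalent, usable as a fallback -->
        <mapping type="standard">x</mapping>
      </char>
    </charDecl>
  </encodingDesc>
</teiHeader>
```

With a definition like this in place, both <g ref="#x123"/> and <g ref="#x123">x</g> point to the same character; the second form simply carries its fallback letter inline.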

