capitalization rendition case
case smallcaps allcaps mixed

Use of the case keyword to capture case, and approaches to transcription

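Because the case (upper case, lower case) of text is so deeply tied to meaning in the modern western tradition, it may seem odd to treat it as a renditional fact: that is, as something that may be treated variably. In practice it is useful to distinguish between two different aspects of case: first, capitalization that carries meaning in its own right, such as the initial capital of a proper noun or of the first word in a sentence; and second, case used as a feature of typographic presentation, such as a word or phrase set entirely in capitals.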

We recommend treating the first kind of capitalization as part of the fundamental information of the text and transcribing it as originally presented. For the second category we recommend treating case as a renditional feature of the text’s presentation, to be encoded using the rend attribute, and transcribing the text with the capitalization that is appropriate to the context. Usually this means lower case, with capitals used for proper nouns and for the first word in a sentence; for headings and titles, it means standard titling case (all significant words capitalized). The goal is to represent the text in a comparatively neutral manner that allows for a meaningful presentation of the text even when the original case is not reproduced.

Thus if the word Britain appears in all capitals, it would be encoded as <placeName rend="case(allcaps)">Britain</placeName>, but if it has only an initial capital, it would be considered to have no special rendition: <placeName>Britain</placeName>. This approach has the important benefit that the text can be presented unmodified in contexts such as a list of search results without appearing out of place. The renditional encoding allows the text to be presented as in the source in contexts where that is desirable, but does not rule out other presentational options. If words set in all capitals are instead transcribed in all capitals (e.g. <placeName>BRITAIN</placeName>), the display options are much more limited.

Words set in large and small capitals present an additional challenge, since there is no way to transcribe small capitals directly. We recommend transcribing words and phrases in large and small capitals using mixed case, with capital letters representing the full-size capitals and lower-case letters representing the small capitals; the word or phrase should be encoded with an appropriate element (or hi if no other element is appropriate) and the rend attribute should capture the case: <hi rend="case(lscaps)">.

This approach preserves the distinction between large and small capitals, without requiring markup of individual letters. It also creates a transcription that can be used flexibly as described above. It should only be used in cases where the text uses large and small caps in the manner of mixed case. It should not be used in cases where different lines of text are set in all capitals, with the size differing from line to line; in these situations the text should be treated as if it were all in upper-case letters.
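The mixed-case convention described above can be read back mechanically. The following is a minimal sketch in Python, not part of the encoding guidelines: the function name caps_runs and the sample word are illustrative assumptions, and the word is assumed to have been encoded with rend="case(lscaps)".

```python
from itertools import groupby


def caps_runs(text):
    """Split an lscaps transcription into (size, letters) runs.

    Lower-case letters stand for small capitals; everything else
    (capitals, and also spaces, digits, and punctuation) is grouped
    with the full-size capitals.
    """
    runs = []
    for is_small, chars in groupby(text, key=str.islower):
        size = "small" if is_small else "large"
        # Upper-case the letters so each run shows the glyphs as printed.
        runs.append((size, "".join(chars).upper()))
    return runs


# Hypothetical word set in large and small capitals in the source.
print(caps_runs("Britain"))
```

Because the large/small distinction lives entirely in the transcribed case, no markup of individual letters is needed to recover it.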

This approach ignores the fact that small capitals often exist as a distinct font, separate from the ordinary full-size capital letters at a given type size. It treats the distinction between small caps and ordinary capitals contextually rather than as an intrinsic fact about a given letter. If a more fine-grained approach is needed (that is, if it is important to you to distinguish a small caps font from large capitals in all cases), then an additional value, smallcaps, could be used for true small capitals wherever they appear. For most purposes, this level of detail may be unnecessary.

In the rendition ladder, case is described using the case keyword, with values allcaps, lscaps, smallcaps, lower, and mixed.
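As a rough illustration, a checking script might extract the case value from a rend attribute and confirm that it is one of the values listed above. This is a sketch under assumptions: it assumes rend values take the keyword(value) form seen in the examples in this entry, and the function name case_value is hypothetical.

```python
import re

# Values of the case keyword as listed in this entry.
CASE_VALUES = {"allcaps", "lscaps", "smallcaps", "lower", "mixed"}

# Assumed syntax: case(value), possibly alongside other keywords.
CASE_RE = re.compile(r"case\(([^)]*)\)")


def case_value(rend):
    """Return the case value found in a rend string, or None if absent.

    Raises ValueError when a case keyword is present but its value is
    not one of the listed values.
    """
    match = CASE_RE.search(rend)
    if match is None:
        return None
    value = match.group(1).strip()
    if value not in CASE_VALUES:
        raise ValueError(f"unknown case value: {value!r}")
    return value


print(case_value("case(lscaps)"))
```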


Example 1.

By the <emph rend="case(allcaps)">grace</emph> of our <emph rend="case(allcaps)">Lord</emph> <persName rend="case(mixed)">Jesus <hi rend="case(allcaps)">Christ</hi></persName>
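To illustrate the display flexibility this encoding allows, the sketch below uses Python's standard xml.etree.ElementTree module to produce both a neutral reading and an as-in-source reading of Example 1. It is not a definitive implementation: the wrapping p element, the function name as_in_source, and the rule that a nested element's case overrides its parent's are assumptions made for the sketch, and only the allcaps value is acted on.

```python
import re
import xml.etree.ElementTree as ET

CASE_RE = re.compile(r"case\((\w+)\)")


def as_in_source(elem, inherited=None):
    """Rebuild the text of elem, upper-casing spans marked case(allcaps)."""
    match = CASE_RE.search(elem.get("rend", ""))
    # Assumption: an element without its own case keyword inherits
    # the case of its parent.
    case = match.group(1) if match else inherited
    style = str.upper if case == "allcaps" else str
    parts = [style(elem.text or "")]
    for child in elem:
        parts.append(as_in_source(child, case))
        # A child's tail text belongs to this element, not to the child.
        parts.append(style(child.tail or ""))
    return "".join(parts)


# Example 1, wrapped in a p element so that it parses as one tree.
fragment = (
    '<p>By the <emph rend="case(allcaps)">grace</emph> of our '
    '<emph rend="case(allcaps)">Lord</emph> '
    '<persName rend="case(mixed)">Jesus '
    '<hi rend="case(allcaps)">Christ</hi></persName></p>'
)
root = ET.fromstring(fragment)

print("".join(root.itertext()))  # neutral: By the grace of our Lord Jesus Christ
print(as_in_source(root))        # source:  By the GRACE of our LORD Jesus CHRIST
```

The neutral reading is suitable for contexts such as search results, while the second reading reproduces the capitals of the source from the same transcription.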