Names: general notes

name, personification, rendition, proper name, regularization, phrase-level encoding
placeName, orgName, rs, persName, roleName, surname, forename

Overview of the encoding of names, including personal names, place names, organizational names, and the names of objects

Encoding of names in TEI can be done very simply, using the name element, or in detail using the suite of specific elements described in TEI Chapter 20.

In an important sense, these two approaches are equivalent: the various specific name elements such as persName, roleName, placeName are just syntactic sugar for name type="person", etc. However, using more specific elements allows you to constrain their usage more precisely: for instance, by requiring that roleName and surname be nested within persName rather than being used on their own. It also simplifies data entry, if you have a text editor that will auto-complete element names, but this is a minor point. If you like the constraint of specialized elements for some purposes, but need a more general encoding for other purposes, it is trivial to convert from one to the other. Our recommendation is to use at least persName and placeName. If your project has a more specific interest in names, then roleName, genName, forename, surname, and addName may also be useful to capture the internal structure of names more precisely.
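As a minimal sketch of the equivalence described above, the two fragments below encode the same name, first with the generic name element and then with the specific elements. The sample name is chosen only for illustration, and the type value "person" follows the wording above; a project may settle on different type values.

```xml
<!-- Generic encoding: one element, with the category carried by the type attribute -->
<name type="person">Queen Elizabeth</name>

<!-- Specific encoding: the element name carries the category,
     and nested elements expose the internal structure of the name -->
<persName><roleName>Queen</roleName> <forename>Elizabeth</forename></persName>
```

The second form lets a schema require that roleName and forename appear only inside persName, which the generic form cannot express as easily.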

Encoding names is useful even if all it accomplishes is the identification of a given phrase as a name; for instance, this allows you to exclude those phrases from word frequency counts or spell-checking, or to handle them appropriately when parsing a sentence. However, encoding names can serve further goals that are worth considering, such as linking each name to a unique key or to external reference information.

The four most common categories of name which we have encountered in early modern texts are personal names of humans, place names, names of non-human creatures and things, and names of organizations and institutions. These are conceptual categories which make a certain amount of sense on their own, but which also correspond to functional differences in how names might be treated in the encoding. Proper names of humans are more likely than others to be given a unique key (for instance, to link them to a name authority file or biographical information); place names might be linked to geographical information or maps; organizational names might be linked to a different kind of explanatory resource.

Because of these differences, we find it useful to use a different element for each, as follows:
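One way to map the four categories onto elements is sketched below. The use of the generic name element for non-human creatures and things is an assumption, as are the sample names, the key values, and the type value; none of them is prescribed by the text above.

```xml
<!-- Personal name of a human; the key (hypothetical) could point to a name authority file -->
<persName key="p001">William Shakespeare</persName>

<!-- Place name; the key (hypothetical) could point to geographical information -->
<placeName key="pl001">Norfolk</placeName>

<!-- Name of an organization or institution -->
<orgName>Royal Society</orgName>

<!-- Name of a non-human creature or thing (assumed here to use the generic element) -->
<name type="animal">Bucephalus</name>
```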

These name elements are found in the TEI special tagset for names.

We have found it useful to limit the domain of names to proper nouns: terms which refer to a specific individual, thing, or group, and which are intended to designate that referent uniquely. Terms which refer to a person’s role or title, without a proper noun, are not names. Thus “the President” is not a name, but “President Lincoln” is a name; “the Earl” is not a name, but “John, Earl of Norfolk” is a name. Your project may find it more useful to define names more broadly or more loosely than this, for reasons having to do with your particular materials or audience.
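Under this policy, the distinction might be encoded as in the following sketch. The surrounding sentences are invented for illustration, and the internal structure shown (roleName, forename, surname) is optional detail rather than a requirement.

```xml
<!-- A role or title on its own is not a name under this policy, so it is left untagged -->
<p>The letter was addressed to the President.</p>

<!-- A role combined with a proper noun is a name -->
<p>The letter was addressed to
  <persName><roleName>President</roleName> <surname>Lincoln</surname></persName>.</p>

<!-- The title forms part of the name when it accompanies a proper noun -->
<p>The land was held by
  <persName><forename>John</forename>, <roleName>Earl of Norfolk</roleName></persName>.</p>
```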

If you want to apply the key attribute not only to proper names but also to a wider range of references to individuals (e.g. by title or by pronoun), the rs (referring string) element is appropriate. Referring to William Shakespeare as “the Bard” is an example of a referring string, as is referring to Queen Elizabeth as “the Queen” or “her Highness”. The rs element is a member of the same class of elements as name, persName, etc., and it can carry a key attribute, but it has a much broader range of application.
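A brief sketch of how rs and persName can share a key so that all references to one individual resolve to the same record. The key values and the type value "person" are hypothetical placeholders, not values defined by TEI.

```xml
<!-- Proper name: encoded as a name -->
<persName key="shakespeare_w">William Shakespeare</persName>

<!-- Referring strings: not proper names, but they carry the same kind of key -->
<rs type="person" key="shakespeare_w">the Bard</rs>
<rs type="person" key="elizabeth_1">her Highness</rs>
```

Because both elements carry the key, a search for "shakespeare_w" would retrieve the proper name and the referring string alike.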

References to personified qualities (such as Love, Virtue, Temptation) should usually not be treated as names, since they are not strictly speaking proper names and since in general they do not refer to people. However, personification of this sort sometimes does shade into the realm of names when, for instance, a character in a play or a novel is given a name such as Despair (as in Bunyan’s Pilgrim’s Progress). The decision whether to treat a given case as a name or not should take into account whether it refers to a character with a persistent existence within the work (rather than being a passing poetic reference), and also whether the term itself is really a proper name or not. The rs element would be a good way to encode these kinds of references.
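The judgment call described above might play out as in this sketch. The verse line is invented, the type value "personification" is a hypothetical project convention, and treating the Bunyan character as a persName is one possible decision rather than a rule.

```xml
<!-- Passing poetic reference to a personified quality, marked as a referring string -->
<l>And <rs type="personification">Love</rs> shall lead us gently home</l>

<!-- A character with a persistent existence within the work, treated as a name -->
<p>They were taken prisoner by <persName>Giant Despair</persName>.</p>
```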