This is more of a brief reference sheet than an exhaustive list of TEI elements: it is
intended to provide you with a way to look up the most commonly used elements, grouped
together for the exercises in which we’ll be encountering them. For detailed
information about the contents and semantics of these elements (and for other more
arcane elements), have a look at the TEI Guidelines.
Element groups
- structure
- TEI, back, body, front, group,
teiHeader, and text
- general purpose block-level
- ab, argument, div, head, item,
label, list, p, quote, and said
- general purpose phrase-level
- bibl, date, distinct, emph, foreign,
hi, mentioned, name, q, quote, rs,
said, seg, soCalled, and term
- poetry
- l, lg, and rhyme
- drama
- castGroup, castItem, castList, role,
roleDesc, sp, speaker, and stage
- diary entries, letters, etc.
- closer, dateline, opener, postscript,
salute, signed, and trailer
- alternative transcriptions
- abbr, choice, corr, expan, orig,
reg, and sic
- manuscripts and physicality of documents
- add, addSpan, cb, del, delSpan,
handShift, lb, milestone, and pb
- editorial annotation
- app, damage, gap, lem, rdg,
restore, subst, supplied, and unclear
- hypertextual
- anchor, note, ptr, and ref
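As a rough orientation for the structure group above, the following sketch shows how those elements might nest in a very small document. The titles and paragraph text are invented placeholders. The contents of teiHeader (fileDesc and its three children) are not covered in this sheet and are included here only as the minimum a valid header is assumed to need.

```xml
<!-- Hypothetical skeleton; all titles and text are placeholders. -->
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:lang="en">
  <teiHeader>
    <!-- Header contents are an assumption, not taken from this sheet. -->
    <fileDesc>
      <titleStmt>
        <title>Hypothetical sample document</title>
      </titleStmt>
      <publicationStmt>
        <p>Illustration only; not published.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Invented for this sketch; no source document exists.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <!-- front and back are optional; body is mandatory. -->
    <front>
      <div type="preface">
        <p>Front matter goes here.</p>
      </div>
    </front>
    <body>
      <div type="chapter">
        <head>Chapter 1</head>
        <p>Main text goes here.</p>
      </div>
    </body>
    <back>
      <div type="appendix">
        <p>Back matter goes here.</p>
      </div>
    </back>
  </text>
</TEI>
```

For a document made up of several independent texts, group would take the place of body and hold multiple text elements.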
Elements (in alphabetical order)
- TEI
- The outermost (or root) element
for any TEI P5 conformant document. It groups together the
TEI header and the document text. It must have the TEI
namespace specified, and should have an xml:lang
attribute, e.g. <TEI xmlns="http://www.tei-c.org/ns/1.0"
xml:lang="en">.
- ab
- An anonymous block, that is, a paragraph-like chunk that does
not carry the semantic weight of a paragraph. Use type and maybe
subtype to categorize.
- abbr
- An abbreviation; may be used alone or, when inside choice, in combination
with expan which holds an expanded reading.
- add
- A handwritten addition. The hand attribute indicates the handwriting in
which the addition is made. This attribute contains an identifier which points to a
hand element in the profileDesc of the TEI header; this
hand element contains an extended description of the handwriting, ink, and
other details.
- addSpan
- An empty element which marks the starting point for a handwritten addition that
either is too long to be encoded with add, or overlaps an element boundary. Its
spanTo attribute points to an anchor element which marks the
endpoint of the added material. The hand attribute indicates the handwriting
in which the addition is made (see above for details).
- anchor
- An anchor point, usually used as a place for some other element (such as a note) to
point to, using the anchor’s xml:id attribute.
- app
- Contains one entry in a critical apparatus, with an optional lemma and at least one
reading.
- argument
- A short summary or description of the contents of the following section. Contains
one or more p or lg elements.
- back
- Contains the back matter of the document, if any: indices, appendices, epilogues,
colophons, errata lists, etc. May be subdivided into div elements if
necessary.
- bibl
- Used to encode bibliographical references, either in a list (using
listBibl) or in running prose.
- body
- Contains the main body of the document, not including front matter and back matter.
The body element typically includes one or more div elements. It may
start with a head. (Think about where the head belongs—is it the
heading for the body, or the heading for the first division?)
- castGroup
- A grouping of related items in a cast list, containing one or more castItem
elements and an optional head and trailer.
- castItem
- An item in a cast list, containing a role and an optional
roleDesc.
- castList
- A cast list in a dramatic text, listing the roles in the drama. It consists of one
or more castItem or castGroup elements.
- cb
- An empty element which marks the break between one column and the next. Equivalent
to <milestone unit="column"/>.
- choice
- Groups together two or more alternate encodings of a phrase-level passage, using
paired elements such as abbr and expan, orig and reg, or sic and corr.
- closer
- Very similar to opener, but located at the end of the div instead
of at the beginning.
- corr
- A corrected reading of a typographical error or oddity in the original; may be used
alone or, when inside choice, in combination with sic, which holds the
original reading.
- damage
- A damaged portion of the original text; the type attribute allows you to
classify the damage, and the extent attribute allows you to indicate the
extent of the damage.
- date
- Used to encode dates. The when attribute can be used to encode a
regularized form of the date (e.g. <date when="2001">The first year
of the new century</date> or <date
when="2005-05-29">Sun, 29 May 05</date>).
- dateline
- Used within opener and closer to encode the date and place of
writing. Contains words and phrase-level encoding.
- del
- A deletion. The hand attribute indicates the handwriting in which the
deletion is made (see above for details).
- delSpan
- An empty element which marks the starting point for a deletion that is either too
long to be encoded with del or that overlaps an element boundary. Its
spanTo attribute points to an anchor element which marks the
endpoint of the deleted material. The hand attribute indicates the
handwriting in which the deletion is made (see above for details).
- distinct
- Used for linguistically distinct words (e.g. dialect words, regionally accented
words).
- div
- A division of a text: for instance, an act, a chapter, a section, a poem, a letter…
Use the type attribute to indicate what kind of division.
- emph
- Used to encode linguistically (as opposed to just
typographically) emphasized words or phrases.
- expan
- The expanded reading of an abbreviation; typically used inside choice, in
combination with abbr which holds the corresponding abbreviated reading. Rarely
used alone.
- foreign
- Used for foreign-language words when no other element (e.g. quote) is
already present.
- front
- Contains the front matter of the document, if any: title pages, tables of contents,
introductory essays, and so forth. The front element contains an optional
titlePage and may be subdivided into div elements.
- gap
- A gap in the original text (either from damage, deletion, excerption, or some other
cause). The desc child element provides a description of what is missing, and
the reason attribute provides the reason for the omission.
- group
- This element is used to represent documents which contain more than one independent
text. It appears instead of body in the overall TEI document structure, and
groups together multiple text elements, with an optional front and
back.
- handShift
- An empty element which marks the boundary point at which a change of handwriting
takes place. Its new attribute indicates the handwriting that begins at the
point being marked. The new attribute functions just like the hand
attribute, in pointing to a hand element in the TEI header, which provides
detailed information on the handwriting in question.
- head
- The heading of a division: contains words and phrase-level encoding. head
may appear at the start of div, but also at the start of body,
front, back, list, and lg.
- hi
- Used to encode words or phrases which are highlighted
for reasons which the encoder either does not know or
chooses not to analyze.
- item
- An item in a list: contains an optional label followed by words and
phrase-level encoding, or a series of paragraphs.
- l
- A single verse line: contains words and phrase-level elements. May have a
met attribute to formally specify the metrical pattern.
- label
- The label of an item (e.g. a letter, number, or word indicating its order or other
facts about it): contains words and phrase-level encoding. Note that label can
also be the first element inside a paragraph.
- lb
- An empty element which marks a typographical line break. Equivalent to
<milestone unit="line"/>.
- lem
- A lemma; e.g., the reading from the base text.
- lg
- A group of verse lines: contains one or more l elements. May have a
rhyme attribute to formally specify the rhyme scheme, e.g. <lg
rhyme="ABAB">.
- list
- A list: contains a series of item elements.
- mentioned
- Used for words which are mentioned but not used (for instance, for spelling or
definition purposes).
- milestone
- An empty element which marks a boundary point in the text according to some standard
reference system, such as signatures, scrolls, leaves. Use the unit attribute
to indicate the reference system whose units are being marked at this point.
- name
- Used to encode all kinds of names, i.e. proper nouns
and noun-phrases. If you want to distinguish between
different kinds of names, you can use the type
attribute (e.g. <name type="person">). TEI also
includes specific elements for different kinds of names
(e.g. persName) for projects that need more
detailed encoding. The rs element is a more generic
version of name, which may be used to encode common
nouns and noun phrases.
- note
- A note (a footnote, endnote, marginal note, or inline note). Link the note to the
point where it’s anchored using xml:id and target. note
contains most anything, including words and phrase-level encoding, or one or more
p elements.
- opener
- This element may appear at the start of a div, text,
front, or back, and it groups together the elements that appear at
the start of a letter or similar document: the date and place of writing (using
dateline), and the salutation to the person being addressed (using
salute).
- orig
- An unmodernized reading in the original; may be used alone or, when inside
choice, in combination with reg, which holds a regularized
reading.
- p
- A prose paragraph: contains words and phrase-level encoding.
- pb
- An empty element which marks the break between one page and another. By convention,
information stored in the attributes of pb refers to the page that
follows the break. Equivalent to <milestone unit="page"/>.
- ptr
- Indicates a reference to some other XML element (either in the current document or
some other accessible document) by pointing to it with a URI on the target
attribute. Must not have content. E.g.,
<ptr target="#art08_sec08"/>.
- postscript
- Used to encode a postscript, e.g. of a letter.
- q
- Used to encode passages surrounded by quotation marks, when you don’t want to bother
with a more precise element like said. Roughly the same as <hi
rend="surrounded-with-quotation-marks">.
- quote
- Used to encode quotations from other sources; contains words and phrase-level
encoding.
- rdg
- A single reading, e.g. from a particular witness.
- ref
- Indicates a reference to some other XML element (either in the current document or
some other accessible document) by pointing to it with a URI on the target
attribute. May (and probably should) have content. E.g.,
<ref target="#art08_sec08">the <soCalled>IP</soCalled> clause</ref>.
- reg
- A modernization of a reading in the original; may be used alone or, when inside
choice, in combination with orig, which holds the corresponding
unmodernized reading.
- restore
- Indicates restoration of text to an earlier state by cancellation of a marking or
instruction; in particular, useful to indicate that a deletion was restored, e.g.
by the notation stet.
- rhyme
- May optionally be used to mark the portion of a verse line that rhymes; its
label attribute indicates which part of the rhyme scheme that portion belongs to.
- role
- The name of a role in a cast list.
- roleDesc
- The description of a role in a cast list.
- rs
- Used to encode all kinds of references to people,
places, and things; i.e., nouns and noun phrases. If you
want to distinguish between different categories of entity
being referred to, you can use the type attribute
(e.g. <rs type="person">). The name
element is a more specialized version of rs,
reserved for proper nouns and noun-phrases.
- said
- Passages spoken aloud or thought, e.g. by a character in a novel.
- seg
- General-purpose phrase-level segment: use
type and maybe subtype to
categorize.
- salute
- Used within opener and closer to encode the salutation to the
person being addressed (e.g. Dear Sir, or I remain faithfully yours…).
Contains words and phrase-level encoding.
- sic
- A typographical error or oddity in the original; may be used alone or, when inside
choice, in combination with corr, which holds a corrected
reading.
- signed
- Used within closer to encode the signature or name of the person writing.
Contains words and phrase-level encoding.
- soCalled
- Used to encode (or express) authorial distance; e.g., phrases that were or should
be in scare quotes.
- sp
- A dramatic speech; usually begins with a speaker element, followed by a
p or lg.
- speaker
- A speaker identification printed in the text.
- stage
- A stage direction. The type attribute may be used to identify the kind of
stage direction; suggested values include:
- business
- costume
- delivery
- entrance
- exit
- location
- narrative
- novelistic
- subst
- Groups together an add and a del so that the addition is
understood as being a substitution for the deletion.
- supplied
- Indicates that a given word or passage cannot be read in the original and is being
supplied (either through editorial judgment or from some other textual source).
- teiHeader
- The wrapper for all of the document’s metadata. The elements that go inside the TEI
header are too numerous to list usefully here; see the templates for details.
- term
- Used to encode specialized terminology; often associated with a
gloss.
- text
- The wrapper element which contains all of the document’s content. The text
element is most often used for a single work (i.e. a single published document, or
a single aesthetic unit such as a play or a work of fiction). Terms like single
work and aesthetic unit need to be defined by the individual
project. A text element contains an optional front, a mandatory
body, and an optional back.
- trailer
- This element is used for things that come at the very end of the document or
section, such as The End.
- unclear
- Indicates that a given word or passage is unclear, but not entirely illegible
(expresses uncertainty rather than absolute lack of information); multiple alternative
readings may be grouped in a choice element.
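To show how several of the elements above might combine in practice, here is a hedged sketch of a short letter encoded as a div. The letter, its date, and its writer are all invented for illustration. The fragment is meant to sit inside a body and is not a complete TEI document on its own.

```xml
<!-- Hypothetical letter; every name, date, and word is invented. -->
<div type="letter">
  <opener>
    <!-- dateline holds the place and date of writing. -->
    <dateline>Boston, <date when="1848-03-02">2 March 1848</date></dateline>
    <salute>Dear Sir,</salute>
  </opener>
  <p>
    <!-- choice pairs the original error (sic) with a correction (corr). -->
    I <choice><sic>recieved</sic><corr>received</corr></choice> your
    <!-- subst marks the addition as a replacement for the deletion. -->
    <subst><del>note</del><add>letter</add></subst> on
    <!-- choice can also pair an abbreviation with its expansion. -->
    <choice><abbr>Tues.</abbr><expan>Tuesday</expan></choice>
    and found one word <unclear>smudged</unclear>.
  </p>
  <closer>
    <salute>I remain faithfully yours,</salute>
    <signed>A. Writer</signed>
  </closer>
</div>
```

Whether to record both readings with choice, or only one with a bare sic or corr, is a decision for each project's editorial policy.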
Attributes (in alphabetical order)
- met
- May be used to specify the metrical pattern of a verse line (or line group).
- n
- Provides a label or identifier for this particular element, not necessarily
unique.
- next and prev
- Allow what is logically a single text object (e.g. a quotation) to be encoded as a
series of two or more discrete XML elements, as a work-around for overlap problems.
These attributes represent the connections between these fragmentary elements, by
pointing to a prior or subsequent element in the chain of fragments. They do so by
referring to that element’s xml:id value. That is, if next is
specified on a said element, then its value should be a hash mark
(#) followed by the value of the xml:id of another
said element, the one that is the next part of the spoken passage. For
example, <said xml:id="s01" next="#s02">Hey</said>, he
said, <said xml:id="s02" prev="#s01">What’s
up?</said>.
- rend or style
- May be used to specify how the element looked in the
source. style uses CSS, whereas rend
is open. E.g., <head rend="align(centre)"> or
<head style="text-align: center;">.
- rhyme
- May be used to specify the rhyme scheme of a line group.
- target
- Provides a URI (e.g.
https://bauman.zapto.org/~syd/temp/2012-12-02T18:29:17.jpg
or #sect08) that points to either another document or an element within an
XML document (including the current one).
- xml:id
- Provides a unique identifier for this particular element, thus allowing other
elements to point to it (using their target, next,
prev, etc.).
- xml:lang
- Used to indicate the language of an element’s content. Its value conforms to BCP 47
(a standard system for defining language codes). For information on how BCP 47 codes
are constructed, see the note in the data.language documentation. Some sample values
for the xml:lang attribute are:
Language | Code
English | en
French | fr
German | de
Italian | it
Latin | la
Arabic as spoken in Iraq | ar-IQ
Chinese | zh
simplified Chinese | zh-Hans
Chinese as used in Taiwan | zh-TW
If further explanation is required, a language element with an
ident attribute of the same BCP 47 code can be specified in the TEI header.
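As an illustration of that last point, the two fragments below sketch how a language declaration in the header might line up with xml:lang values in the text. The langUsage wrapper inside profileDesc is an assumption drawn from general TEI practice and is not described in this sheet. The sentence and the Latin phrase are invented.

```xml
<!-- Fragment 1: goes inside teiHeader. The langUsage wrapper is an assumption. -->
<profileDesc>
  <langUsage>
    <language ident="en">English</language>
    <language ident="la">Latin, as used in short legal phrases</language>
  </langUsage>
</profileDesc>

<!-- Fragment 2: goes inside the text. Each xml:lang value matches an ident above. -->
<p xml:lang="en">She signed the letter
  <foreign xml:lang="la">manu propria</foreign>.</p>
```

Because xml:lang is inherited by descendant elements, setting it once on TEI or text and overriding it only where the language changes is usually enough.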
Copyleft 2008 Syd Bauman and Julia
Flanders; source available at
http://www.wwp.neu.edu/outreach/seminars/_current/handouts/elementList.tei.