Page breaks and page numbering

tipped-in pages page number milestone front matter reference system
pb n

Encoding of page breaks and page numbering using the pb element and its n attribute, including guidelines for creating idealized page number sequences

Page breaks are encoded using the TEI pb element, which is an empty element (it has no content). By convention, the pb element is understood to mark the start of a new page, so every page of text, including the first, should be preceded by a pb element. The pb element goes before any other information about the page, including collation, forme work, etc.
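As a minimal sketch of this ordering, the fragment below places pb ahead of the forme work and text of the page it opens. The running title, the paragraph text, and the fw type value "header" are assumptions made for illustration; a project may use different type values.

```xml
<!-- pb opens the page; forme work and page text follow it -->
<pb n="1"/>
<fw type="header">Hypothetical running title</fw>
<p>First paragraph on the page ...</p>
```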

The page number is encoded in two ways. The actual printed page number is part of the forme work and is encoded using fw type="pageNum". An idealized page number is also captured on the n attribute of the pb element.
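For illustration, a page on which the number 14 is printed might be encoded roughly as follows. The page number itself is a hypothetical value.

```xml
<!-- n carries the idealized number; fw carries what is printed on the page -->
<pb n="14"/>
<fw type="pageNum">14</fw>
```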

Idealization of the page number means correcting errors in sequencing, omitting casual variations in the way page numbers are printed (brackets, etc.), and supplying page numbers which are not printed on the page. The idealized page numbers will usually have the same form (arabic numerals, roman numerals, etc.) as the actual page numbers, unless there are overriding reasons to do otherwise. If the document has a separate numbering system for the front or back matter, the idealized numbers should follow that system as well.
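The three kinds of idealization can be sketched on a hypothetical run of pages. The printed numbers below are invented for illustration, and the text of each page is omitted.

```xml
<pb n="22"/>
<fw type="pageNum">22</fw>

<!-- Printed "32" is a misprint for 23: n records the corrected number -->
<pb n="23"/>
<fw type="pageNum">32</fw>

<!-- Brackets around the printed number are a casual variation, omitted in n -->
<pb n="24"/>
<fw type="pageNum">[24]</fw>

<!-- No number is printed on this page: n is supplied and there is no fw -->
<pb n="25"/>
```

In each case the fw element keeps what actually appears on the page, while n gives the regular sequence 22, 23, 24, 25.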

Every page in the document, including the title page, should have an idealized page number, recorded on the pb for that page. Page numbering of this sort should start with the first page of the text, which will usually be the title page but might be a frontispiece or some other page before the title page. It is up to the individual project to decide how to treat preliminary blank pages. If you do plan to include all blank pages, be alert when transcribing from microfilm or photocopies for blank pages that were omitted from the copy; they will need to be included in the idealized pagination so that the book itself is accurately represented.

Special cases:


Example 1.

A work which has frontmatter numbered in lowercase roman numerals, followed by a body numbered in arabic numerals: the frontmatter should be numbered as follows:

<pb n="i"/>, <pb n="ii"/>, etc.,

followed by the body numbered

<pb n="1"/>, <pb n="2"/>, etc.

Example 2.

A work in which the frontmatter does not have numbers printed on the page, followed by a body numbered in arabic numerals: the frontmatter should be numbered

<pb n="i"/>, <pb n="ii"/>, etc.,

followed by the body numbered

<pb n="1"/>, <pb n="2"/>, etc.

Similarly, if the body does not have numbers printed on the page, the page numbering should still be recorded in arabic numerals on the n attribute of pb:

n="1", n="2", n="3", etc.

Example 3.

A work in which no page numbers appear at all, which contains a title page, frontmatter, and a body: number the title page and frontmatter continuously with lowercase roman numerals, and number the body 1, 2, 3, etc. If there is a frontispiece preceding the title page, the page numbering for the frontmatter should start with the frontispiece (or, if that is on a verso, with its hypothetical recto; see above).
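A sketch of this case, assuming a hypothetical book with a title page, a two-page preface, and no frontispiece (page contents are omitted):

```xml
<!-- Nothing is printed, so every n value is supplied and no fw type="pageNum" appears -->
<pb n="i"/>   <!-- title page -->
<pb n="ii"/>  <!-- verso of the title page -->
<pb n="iii"/> <!-- first page of the hypothetical preface -->
<pb n="iv"/>
<pb n="1"/>   <!-- first page of the body -->
<pb n="2"/>
```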

Example 4.

A document which has separately numbered subsections such as plays (e.g., 1-30 for the first play, 1-25 for the second, 1-34 for the third...): the numbering for each section should be encoded as n="1", n="2", etc. It is not necessary that each n value be unique in the document.
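As an illustrative sketch, two separately paginated plays might look like this. The div type value "play" and the placement of pb inside each div are assumptions, not conventions stated in this section.

```xml
<div type="play">
  <pb n="1"/>
  <!-- ... pages 2 through 29 of the first play ... -->
  <pb n="30"/>
</div>
<div type="play">
  <!-- numbering restarts; n="1" now occurs twice in the document -->
  <pb n="1"/>
  <!-- ... pages 2 through 24 of the second play ... -->
  <pb n="25"/>
</div>
```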