Entry | Abstract |
---|---|
Unitary and composite texts | Using <text> to encode both individual texts and groups of texts |
Groups of texts | Criteria for deciding whether to use <group> |
Embedding <text> | Using <text> for encoding embedded narratives, letters, and other documents |
Letters embedded within other works | More details on encoding embedded letters, a special case of embedded texts. |
Unique identification of the <text> element | Details on using the id= attribute on <text>, both for independent documents and for embedded texts |
Front matter | General notes on encoding front matter, including the various types of prefatory material |
Back matter | General notes on encoding back matter, including the various types of concluding material |
Divisions of the text: general | General notes on the function and use of <div>, including its basic internal structure |
Divisions of the text: types of <div> | Specific discussion of possible types of <div> |
Numbering: general | Use of the n= attribute and handling of line numbering |
Numbering of divisions | Use of numbered versus unnumbered <div> elements |
Frontispieces | Use of <div type="frontispiece"> |
Advertisements | Encoding of two different kinds of advertisements: advertisements for other books, and preliminary addresses to the reader |
Tables of contents | Encoding of tables of contents with <list> inside <div type="contents">, with internal encoding to capture the functional parts of the table of contents information, such as page numbers and titles. |
Comparison of indexes and tables of contents | Differences between tables of contents (ordered by location in the book) and indexes (ordered by topic). |
Headings | Use of <head> to encode headings, and permissible values for type= attribute |
Heads and labels | Comparison of headings and labels, and the use of <head> and <label> |
Argument | Definition of argument, and use of <argument> element |
Epigraphs | Encoding of epigraphs with either <epigraph> or <div type="epigraph"> depending on their structural location. |
Dedications | Encoding dedications using <div type="dedication">; distinguishing between dedications and other prefatory material |
Openers | Using <opener> to group together information at the top of a division (especially in letters); the usual contents of <opener> |
Closers and trailers | Using <closer> to group together information at the bottom of a division (especially in letters); the usual contents of <closer>; difference between <closer> and <trailer> |
Salutes and signatures | Encoding of salutations and signatures in letters using <salute> and <signed> |
Lists: subscriber lists | Encoding subscriber lists using <list type="subscriber">, including handling of “ditto” used for repeated items |
Lists: errata | Encoding of errata lists with <div type="corrigenda"> and a nested <list type="errata">. Within each list item, further encoding captures the functional components such as the error, the page number, etc. |
Overlapping and fragmented elements | Strategies for handling overlapping textual features, particularly quotations and poetry |
Letters: general notes | General information on encoding letters in three different contexts: within collections, embedded in other works, and as prefatory material |
Letters as prefatory material | Encoding letters which serve as prefatory material |
Collections of letters | Encoding collections of letters using <div type="letter">, and description of internal structure |
Postscripts, <ps> | Encoding of postscripts using the WWP element <ps> |
Essays | Identification and encoding of essays |
Journal and diary entries | Encoding of diary and journal entries using <div type="entry"> |
Lists: general notes | Encoding lists, including discussion of criteria for identifying lists |
Quotations | Encoding of quotations, distinction between use of <q> and <quote>, treatment of quotation marks |
Figures | Encoding of figures and illustrations using <figure>; handling of text within figures; discussion of the WWP’s changes to the content model of <figure> |
Encoding verse: general notes | General discussion of encoding poetry, including the use of <text>, <div>, and <lg> to encode basic poetic structures |
Verse lines | Encoding of verse lines and line breaks within verse lines |
Types of poem | Encoding basic verse types, including possible values for type= on <div> for poetry |
Fixed poetic forms | Encoding of clearly defined poetic forms such as sonnets, including possible values for type= on <div>, and instructions for marking internal subdivisions |
Stanzas and generic forms | Encoding of stanz |
Specific line groups | Encoding of specific types of line group, such as couplets, quatrains, Spenserian stanzas, etc. |
Excerpted and quoted poems | Handling of excerpted and quoted poems, including cases where the full extent of the original poem is unknown |
Poetry and drama | Discussion of the intersection between poetry and drama, including verse drama, poetry and songs in drama, and dramatic verse |
Line breaks in verse | Line breaks within individual verse lines should be encoded with <lb>. |
Acts and scenes | Encoding of dramatic acts and scenes, using <div type="act"> and <div type="scene"> |
Cast lists | Encoding of cast lists using <castList>, situations where the original cast list is missing or incomplete, discussion of WWP changes to the TEI DTD |
Speeches and speakers | Encoding of dramatic speeches and speakers using <sp> and <speaker>, use of the who= attribute. |
Stage directions | Encoding of stage directions in drama and verse dialogues, position of stage directions, identification of speakers within stage directions |
<stage> type= attribute | Categorization of stage directions using the type= attribute, with a list of permissible values |
Simultaneous action in drama | Handling of simultaneous action in drama, in particular the encoding of cases where simultaneous action is marked with a printed brace |
Principles of transcription: general | General principles of transcription, including details of what is and is not captured, and the order in which it is represented |
Regularization: silent | Features which the WWP silently regularizes, including details of spacing, delimiters, type size, and typography |
Regularization: <orig> | Explicit regularization using <orig> |
Punctuation: general | Transcription of punctuation, including treatment of hard and soft hyphens |
Punctuation and elements | Position of punctuation relative to element boundaries |
Transcription of primary sources | Use of elements from the TEI tagset on transcription of primary sources |
Features omitted from transcription | Use of <gap/> to encode explicit omissions from the transcription, and cases where silent omission is allowed |
Typography: I, J, U and V, general | Transcription and encoding of early typography using <orig> |
Tagging the letter, tagging the word | Application of <sic>, <orig>, and <abbr> at the word and letter level |
Typography: recognizing difficult letter forms | Discussion of specific letterforms in the WWP collection, including long s, disambiguation of I and J, U and V |
Special characters: entity references | Use of entity references for special characters, boilerplate, and decorative features of the text |
Special characters: ordinary characters requiring special treatment | Further detail on ordinary characters which must be encoded with entity references in particular contexts or because they serve special functions |
Special characters: brevigraphs and diacritical marks | Using entity references to transcribe brevigraphs and characters with diacritical marks |
Special characters: miscellaneous | Details of various kinds of special characters not covered elsewhere |
Ellipsis | Encoding of ellipsis using the entity reference … |
Roman numerals | Transcription of roman numerals, and regularization of roman numeral dates |
Errors in the original | Encoding of errors in the document source using <sic>; situations where corr= is and is not used; distinguishing between error and old spelling |
Sequencing errors | Encoding of errors in sequencing, such as scene or page numbering |
Reading order | Discussion of the principle of “reading order” to guide the order of transcription in cases where the text flow contains parallel or non-sequential segments |
Handwriting: the hand= attribute and the <hand> element | Identification of handwriting using <hand> and the hand= attribute |
Handwriting: additions and deletions | Encoding handwritten additions and deletions using <add>, <addSpan>, <del>, and <gap/> |
Unclear text | Handling damaged, unclear, or illegible text, including missing or deleted letters, damage to the original, or unclarity in the reproduction, using <sic>, <del>, <unclear>, <supplied>, and <gap/> |
Gap: general | General notes on the use of <gap/> to encode material omitted from transcription |
Gap: use of the extent attribute | Detailed notes on the use of the extent= attribute on <gap/> to indicate the extent of text being omitted from transcription |
Gap: use of the extent attribute, advanced | Excruciatingly detailed information on the use of the extent= attribute on <gap/> to encode the signature sequences of pages omitted from transcription |
Encoding document appearance: renditional information | General notes on what kinds of renditional information we do and do not capture |
Renditional distinction: overview | The WWP uses a decision tree to help determine how to encode different kinds of renditionally distinct phrase-level text features. |
Special typography | General notes on what aspects of typography the WWP does and does not capture |
Small caps | Encoding of small capital letters, including notes on how the WWP defines and recognizes small capitals, and how they should be transcribed |
Spacing and sizing | Regularization of sizing and spacing, including regularization of vertical and horizontal space and of type size |
Decorative capitalization | Decorative capitalization should be encoded with <hi>, with an optional type= attribute if categorization is useful. |
Dashes | Encoding of dashes, including em-dashes and en-dashes, using entity references |
Special characters: inverted characters | Treatment of characters which are printed upside down in the source |
Rules and ornaments: definitions | Transcription of rules and ornaments using an entity reference |
Quotation marks | Quotation marks should be captured where possible as renditional information modifying the element that motivates their appearance. |
Punctuation and quotes | Transcription of punctuation in relation to quotation marks and the <q> and <quote> elements |
Punctuation and font | Treatment of the font of punctuation, particularly in cases where the font is not accurately captured by the element context |
Font of numbers | Treatment of the font of numbers |
Leaders | Transcription of leaders (separators within column-formatted lists such as tables of contents) |
Columns | Encoding of multi-column layouts using the columns keyword in the renditional ladder |
Special characters: superscription | Treatment of superscripted characters using the rend= attribute or an entity reference |
Indentation | Treatment of indentation using the indent keyword in the rendition ladder; handling of indentation resulting from an enlarged initial capital letter |
White space | Treatment of vertical and horizontal white space |
Type size and face | Treatment of type size and type face, information the WWP does and does not record |
Rules and ornaments: use as delimiters | Encoding of rules and ornaments as delimiters on elements, using the rend= attribute |
Renditional defaults | Methods of setting renditional defaults, using the <tagsDecl> in the TEI header |
Encoding document appearance: rendition ladders overview | General notes on the use of the rendition ladder in the rend= attribute, overview of keyword/value structure |
Rendition ladders: common keywords and values | Overview of keywords and values used in the rendition ladder |
Renditional keywords: break, and line break defaults for WWP elements | Use of the break keyword to capture line breaks between elements, including defaults assumed in WWP practice |
Renditional keywords: slant and weight | Use of the slant and weight keywords to capture italicization and bold type |
Renditional keywords: pre, post | Use of the pre and post keywords to capture characters printed before or after an element (used as delimiters) |
Renditional keywords: case | Use of the case keyword to capture case, and approaches to transcription |
Renditional keywords: general points on indentation | Overview of the encoding of indentation, including absolute and relative indentation, first-line indentation, and negative indentation |
Renditional keywords: indent | Specifics on the use of the indent keyword to encode indentation (absolute and relative) |
Renditional keywords: first-indent and right-indent | Specifics on the use of the first-indent keyword and right-indent keyword to encode first-line indentation and right indentation |
Renditional keywords: alignment | Use of the align keyword to encode horizontal alignment of elements whose position on the page is vertically constrained |
Renditional keywords: place | Use of the place keyword to encode the vertical and horizontal position of elements whose position on the page is unconstrained |
Renditional keywords: sub and sup | Use of the sub and sup keywords to encode subscription and superscription of letters |
Renditional keywords: columns | Use of the columns keyword to indicate the number of columns in a page layout |
Renditional keywords: pos | Use of the pos keyword to indicate the position of the remainder of a verse line which is printed on the line above or below |
Renditional keywords: braced | Use of the braced keyword to encode bracing used to group together multiple lines (e.g. lines of poetry) |
Renditional keywords: fill | (Non)use of the fill keyword |
Rendition ladders: border | Use of the border keyword to encode borders around elements |
Renditional keywords: bestow and bequeath | Use of the bestow and bequeath keywords to propagate renditional information from an element to its children or descendants |
Renditional keywords: get | (Non)use of the get keyword to duplicate the renditional features of a given element on other elements |
Phrase-level encoding: general notes | The WWP does not include phrase-level encoding in textual apparatus that duplicates content elsewhere in the text. |
Names: general notes | Overview of the WWP’s encoding of names, including personal names, place names, organizational names, and the names of objects |
Names of humans | Discussion of the encoding of human names using <persName>, including criteria for identifying creatures as human, and guidelines for nesting name elements |
Names of places | Discussion of encoding the names of places using <placeName>, including definition of “place” and relationship between place names and personal names |
Names of non-humans and things | Discussion of the encoding of the names of non-human creatures, things, and events using <name> |
Names of collectivities and organizations | Discussion of encoding the names of collectivities and organizations using <name> and <orgName>, including distinctions between collectivities and organizations |
Names: difficult cases | Discussion of some difficult cases in the encoding of names, including lists of boundary cases |
Names: problems of multiple reference | Discussion of encoding personal names that refer to more than one person |
Names: abbreviations | Encoding of abbreviated versions of names |
Name keys | Use of the key= attribute on <persName> to uniquely identify individuals |
Special terminology, irony, and other forms of textual highlighting | Encoding of specialized language, including technical terminology, ironic usage, and words which are being discussed as words rather than used |
Emphasis | The <emph> element should be used for linguistic emphasis, where that can be distinguished from casual or decorative highlighting and from other motivating factors such as titles, foreign words, and so forth. |
Abbreviations | Encoding of abbreviations using <abbr>, including a list of common abbreviations which are not tagged, and treatment of punctuation |
Abbreviations and <orig> | Use of the <abbr> element in connection with old-style typography |
Authors in the main text | Encoding of authors in bibliographic entries, using <author> and <persName> |
Titles in the main text | Encoding of titles in bibliographic entries and in running prose, using <title>, including criteria for identifying titles |
Foreign words and phrases | Encoding foreign-language words and phrases using the lang= attribute on existing elements, and the <foreign> element when necessary |
<mcr> | The WWP uses <mcr> to encode phrase-level renditionally distinct words and phrases that cannot be assigned to any more specific category. |
Simple highlighting | Encoding of simple renditional highlighting using <hi> |
Proper adjectives | Encoding of proper adjectives using <mcr> |
Referencing strings (the <rs> element) | Use of the <rs> element |
Measures and numbers | Encoding of numbers and measurements using <measure> |
Dates: general | Encoding dates using <date> and the value= attribute, including detailed instructions on the ISO8601 standard for date values |
Dates: date ranges | Encoding date ranges using the <date> element rather than <dateRange> |
Dates, errors in | Encoding errors in dates |
Dates: BC dates | Encoding of BC dates |
Dates: Julian calendar and old-style dates | Encoding of old-style dates and dates expressed in the Julian calendar |
Time | Encoding of time using <time> and the value= attribute; our usage limited to cases which are used to structure a set of entries in a journal or log |
<unknown> | Use of the <unknown> element as a placeholder to flag textual features for which the correct encoding is uncertain |
Title pages | Encoding of title pages using <titleBlock>, including a description of possible values for type=, and the various parts of the title page and how to encode them |
Document titles | Encoding of document titles on the title page using <docTitle> and <titlePart>, including possible values for the type= attribute of <titlePart> |
Authorship of the document | Encoding attributions of responsibility using <respLine> |
Colophons | Encoding colophons using <titleBlock type="colophon"> |
Forme work (metawork): general | Encoding various types of forme work (including page numbers, line numbers, catchwords, press figures, signatures, and a few other features) using the <mw> element |
Forme work, encoding within | Discussion of the types of encoding which may appear within the <mw> element |
Forme work: renditional issues | Encoding renditional distinctions within <mw> |
Page breaks and page numbering | Encoding of page breaks and page numbering using the <pb/> element and its n= attribute, including guidelines for creating idealized page number sequences |
Signatures | Encoding of the collation of the document, recording both printed signatures as they appear on the page using <mw type="signature"> and also an idealized signature sequence using <milestone unit="sig"/> |
Line breaks: general | Line breaks in general are encoded with <lb>, with the exception of verse lines. |
Line numbers | Encoding line numbers that are printed in the original text using <mw type="lineNum"> |
Milestones, <mw>, and <div> boundaries | Discussion of the order and location of elements associated with page breaks (catchwords, milestones, page numbers, etc.) |
Catchwords | Encoding of catchwords using <mw type="catch">, including how to handle discrepancies between the catchword and the matching word in the text |
Running headers | (Non)encoding of running headers |
Hyphens, soft and hard | Encoding of hard and soft hyphens, including guidelines for determining when a line-end hyphen is soft |
Notes, endnotes, and footnotes: overview | General overview of encoding notes, including footnotes, endnotes, marginal notes, and inline notes, and giving a summary of how notes are linked to the main text |
Endnotes | Encoding of endnotes using <note>, within a separate <div type="endnotes"> |
Inline notes | Encoding of inline notes, using <note rend="place(inline)"> |
Notes in the TEI header | Encoding of textual notes in the <notesStmt> of the TEI header |
Notes: encoding the note itself | Encoding the text of a note, including details of the WWP’s changes to the content model of note, and discussion of the various things that appear in notes |
Notes: linking the note and the text | Notes (including footnotes, endnotes, and marginal notes) should be linked to their anchor point using a bi-directional link which explicitly identifies both the anchor point and the note. |
Notes: resp= and type= | Use of the resp= and type= attributes on <note> to capture the authorship of the note |
Notes, page breaks within | Encoding page breaks within footnotes, using a second <pb/> element which points to the main <pb/> element |
Notes: revised content model | Description of the WWP’s revised content model for notes |
<hyperDiv> | Use of the WWP <hyperDiv> element as a container for notes and other hypertextual components of the text, such as acrostics and supplemental cast lists |
Bibliographic references | Encoding of bibliographic references using <bibl>, including guidelines for identifying bibliographic references and when not to encode them. |
Links and cross-references | Encoding of links and cross-references using <ref> and <xref> |
<xref>, <xptr> | Details of the WWP’s use of <xref> and <xptr/> |
Repetitions in lists | Encoding repetitions signalled with “ditto”, “ibid”, and similar markers, using the sameAs= attribute |
Acrostics | Encoding acrostics, and in particular capturing the encrypted word or words from the acrostic in a searchable form, using the WWP <acrostic> element |
TEI Header | General notes on the TEI header and its use at the WWP |
TEI Header, ID attribute of | Use of the id= attribute on <teiHeader>, including the format for its value |
Titles in TEI header | Encoding two different forms of the document title in the TEI header, in <titleStmt> and in <sourceDesc> |
Key= in TEI Header | Encoding key= values on names in the TEI header |
ID values: general notes | General information on use of id= in WWP texts, including guidelines for choosing values |
ID values of texts and sub-texts | Guidelines for encoding id= values on <text> |