WWP Presentations for Teaching and Learning Text Encoding and Digital Humanities

Interested in teaching or learning TEI? The slides, lecture notes, and other materials here were developed by the WWP for the workshops we teach, but they can also be a starting point for self-guided study. They are made available here for public reuse under a Creative Commons license. For versions of these materials that were used in a specific seminar or workshop, please visit the site for the event in question.

These presentations are authored in a customized version of TEI. You can view the slides and lecture notes, and also download or view the source TEI. More information about the schema and stylesheets used for authoring and using these materials is available at the WWP presentation system page.

Types of Resources

A presentation may contain the following links:

slides
Slides used for presentation during workshops.
notes
Lecture notes used by workshop presenters.
tutorial
The tutorial format combines the presentation slides with lecture notes refined for asynchronous use. When possible, a slide will appear on the left, and accompanying notes on the right. On smaller screens or if a slide needs to take up more width, the notes will appear below instead.
source
The source XML from which the other formats are derived.

Core Topics in Text Encoding

These presentation materials form a rough sequence that covers basic concepts of XML, introduction to TEI, and specialized topics in TEI (such as encoding manuscript materials). If you start at the beginning of the list, no knowledge of XML or TEI is assumed. The later topics assume familiarity with XML and basic TEI. Working through all of these materials will provide a fairly solid grounding in TEI encoding. We also offer accompanying resources including crib sheets and a package of schemas and templates to get you started.

  1. Thinking about Digital Research Materials
    Description

    This unit is intended to serve as a framework for discussion of what we mean by “digital research”. It situates text encoding within the broader landscape of digital scholarly research, raises issues about data modeling and representation, and asks questions about what we need from our digital resources to support high-quality digital research.

  2. Modeling Humanities Data
    Description

    This unit is intended to serve as a framework for discussion of what we mean by “digital research.” It situates text encoding within the broader landscape of digital scholarly research, raises issues about data modeling and representation, and asks questions about what we need from our digital resources to support high-quality digital research.

  3. Overview of the TEI
    Description

    This unit is an introduction to the TEI. It covers our motivation for text encoding, what the TEI is, the TEI Guidelines, and the international scope and use of the TEI. It also addresses the different areas of usage of the TEI, the basics of customizability, and where to find out more.

  4. Motives for Text Encoding
    Description

    This introductory unit, typically used at the beginning of workshops and seminars, covers motives for text encoding and basic concepts of descriptive markup. It also discusses why text encoding is never simple and introduces some essentials of the TEI.

  5. Brief Introduction to XML
    Description

    This unit is intended as a “lightning” introduction to XML, directed at non-technical audiences in cases where time is tight and only a conceptual introduction to XML is needed.

  6. More Details on XML
    Description

    This unit is also an introduction to XML, but unlike the “Brief Introduction to XML” above, it is aimed at an audience that is either more technical or will actually have to use XML. It provides a fundamental introduction to XML, including an overview of the rules and structure of XML, which is necessary information for those looking to learn TEI. It covers the characters and syntax needed to create a well-formed XML (and, by extension, TEI) document, validity against a given XML schema, and the namespaces that distinguish the elements of one XML language from those of another.

  7. Basic TEI Encoding
    Description

    This unit describes the basic elements used to encode a TEI document, focusing on the fundamental structural elements for marking up your text (in particular, for basic prose, poetry, and drama). Building from these foundational elements, the tutorial discusses phrase-level elements, such as names, references, and linguistic features. These slides also cover: how to correct, regularize, or modernize the text while still acknowledging the original; how to encode authorial or editorial deletions and revisions of the text; and how to show uncertainty about your reading of the text.

  8. Basic TEI Encoding (manuscript emphasis)
    Description

    This unit covers the basic elements used to encode a TEI document, focusing on the fundamental structural elements for marking up your text (in particular, for basic prose, poetry, and drama). Building on these foundational elements, it discusses: encoding phrase-level elements, like names, references, and linguistic features; correcting, regularizing, or modernizing the text, while still acknowledging the original; encoding authorial or editorial deletions and revisions of the text; and showing uncertainty about your reading of the text.

  9. Next Steps: More Advanced Markup
    Description

    This unit builds upon the basic encoding unit to cover topics such as how to demonstrate connections between various parts of the text—such as a note and the material it is annotating—through linking. This unit also covers: displaying page images (facsimiles) with your markup; linking between fragmented textual features (i.e. features where your markup and the textual divisions don’t line up perfectly); and encoding the appearance of the text (through marking changes in rendition and handwritten additions).

  10. Basic Contextual Encoding
    Description

    This unit defines the TEI’s mechanism for contextual encoding, providing information on how to create structured data about certain things contained within your texts—persons, places, organizations, etc.—using TEI elements. Also covered is the encoding of various interpretations of a text through the creation of thematic or interpretive keywords. Similarly, this section will cover how to encode more structured, taxonomic information, such as genre. Finally, this unit provides recommendations for where to store the contextual and interpretive information you create.

  11. Basic Manuscript Encoding
    Description

    This unit describes the basic elements used to encode a TEI document, focusing on the fundamental structural elements for marking up your text (in particular, for basic prose, poetry, and drama). Building on these foundational elements, this unit discusses: encoding phrase-level elements, like names, references, and linguistic features; correcting, regularizing, or modernizing the text, while still acknowledging the original; encoding authorial or editorial deletions and revisions of the text; and showing uncertainty about your reading of the text.

  12. Advanced Contextual Encoding
    Description

    This unit tackles the more nuanced aspects of contextual encoding. For example: How do you record changes in a person’s identity or status over time? How do you record relationships between people, places, and things? How might one handle indirect references or references to groups? This section builds upon the material from the “Basic Contextual Encoding” unit.

  13. Advanced Manuscript Encoding
    Description

    This unit is particularly important to those who want to encode manuscripts. The unit includes: how to mark different hands and to show where a given person’s handwriting starts and ends; and how to encode revisions, additions, and deletions. This unit also describes some of the particular challenges of encoding manuscripts, such as irregular and hard-to-organize structures.

  14. Metadata and the TEI Header
    Description

    This unit outlines the types of contextual information (or metadata) that one might want to provide for an encoded document. Metadata is important for many audiences of encoded documents because it can provide information that may not be explicit in the text itself. For example, one might include metadata about the birth and death dates of people in a historical novel, or provide contextual information about the publishers of a given book. This unit discusses the basic mechanisms the TEI provides for encoding such information; metadata and the encoding of other contextual information are covered more extensively in the Contextual Encoding Primer.

  15. Encoding Renditional Information
    Description

    This unit contains a discussion of how to describe rendition in your TEI document. While rendition is often simply expressed as “italic” or “boldface,” complexities often arise. Especially if you have a document where multiple renditional descriptors are required (e.g., headings are all-caps, boldface, and aligned center), you will need more robust ways of describing renditional features. This section provides methods for capturing this information using the @rend attribute.

  16. Figures and Graphics
    Description

    This unit covers the TEI encoding approaches to figures, graphics, and related encoding.

  17. Linking and Pointers
    Description

    This unit briefly covers the TEI's elements for linking and pointing within TEI documents.

  18. Overlapping Hierarchies
    Description

    This unit describes the various mechanisms available in TEI to represent multiple hierarchies that do not nest neatly together: for instance, paragraph structures and physical document structures such as pagination.

  19. Representing Non-Unicode Characters
    Description

    This short unit describes how to represent characters that are not included in the Unicode standard within a TEI file.
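
The “More Details on XML” unit above distinguishes well-formedness from validity and mentions namespaces. As a rough illustration only (not taken from the WWP materials), the sketch below uses Python's standard `xml.etree.ElementTree` parser to test well-formedness and to show how a namespace becomes part of an element's name. The sample snippets and the helper names `is_well_formed` and `root_name` are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# The TEI namespace URI; elements in this namespace belong to the TEI vocabulary.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text obeys XML's syntax rules (well-formedness only)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def root_name(xml_text: str) -> str:
    """Return the root element's name in ElementTree's '{namespace}local' form."""
    return ET.fromstring(xml_text).tag


# Hypothetical sample snippets, invented for illustration.
nested_ok = "<p>Some <hi>highlighted</hi> text</p>"    # tags nest properly
overlapping = "<p>Some <hi>highlighted</p></hi>"       # tags overlap: not well-formed
tei_doc = f'<TEI xmlns="{TEI_NS}"><text/></TEI>'       # root element in the TEI namespace

print(is_well_formed(nested_ok))
print(is_well_formed(overlapping))
print(root_name(tei_doc))
```

This check says nothing about validity: deciding whether a document conforms to a particular schema requires a schema-aware validator, which the Python standard library does not provide.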

Specialized TEI and XML Topics

These presentation materials focus on TEI customization and XSLT.

  1. Introducing TEI Customization
    Description

    This unit provides a basic overview of TEI customization, including the role schemas play in data management and modeling, and the fundamental role customization plays in managing and constraining TEI data. It introduces the ways the TEI organizes its XML constructs through modules and classes, and it also walks through the actual process of customization and what it entails.

  2. Basic ODD-Writing
    Description

    This unit covers basic concepts and practices in creating TEI customizations directly in XML (rather than through a web-based tool like Roma).

  3. Advanced ODD-Writing
    Description

    This unit covers some more advanced topics in TEI customization, including making changes to classes and class membership, creating new elements, and version management.

  4. Basic RelaxNG for ODD-Writing
    Description

    This unit provides a basic orientation in the use of the RelaxNG schema language, for the specific purpose of writing content models in a TEI customization (ODD) file. RelaxNG can also be used on its own to write schemas for XML data, and this unit would be helpful but not a complete reference for that purpose. Now that the TEI has moved its customization language to largely eliminate the use of RelaxNG (in favor of a native TEI language for expressing content models), this unit is no longer as central as it used to be, but there are still some constructs that can only be expressed using RelaxNG.

  5. XPath and Schematron for TEI Customization
    Description

    This unit provides an overview of Schematron, an open schema language that tests conditions that you set, and provides you with error messages when a condition fails. This schema language, which can conveniently fit right into an ODD file, is useful for catching errors that a closed schema cannot. This section relies heavily on XPath, so we recommend completing the XPath tutorial before starting this section.

  6. Navigating the XML Tree with XPath
    Description

    This unit covers XPath, which is a way of navigating an XML tree: an essential operation in XSLT and other XML technologies such as Schematron. XPath allows the selection of specific elements depending on their context. So, for example, perhaps you want to render <quote> in a particular way when it comes up in <epigraph> and in another way when it occurs in <p>. XPath allows you to specify context for a given element, which makes transformations more nuanced.

  7. Introduction to XML Publishing
    Description

    This unit provides an overview of XML publication platforms and outlines the basic framework for the rest of the tutorials in Transformation and Publication. We also cover why you might want to publish TEI data, and some different approaches to publishing it.

  8. Introduction to XSLT
    Description

    This unit provides an overview of XML publication platforms and outlines the basic framework for the rest of the tutorials in Transformation and Publication. We also cover why you might want to publish TEI data, and some different approaches to publishing it.

  9. Exploring the XSLT Processing Model
    Description

    This unit continues the discussion of how XSLT stylesheets process their input information to create a new output. It also describes namespaces and languages, which are important to keep in mind when transforming from one XML language into another.

  10. Conditionals in XSLT
    Description

    This unit, continuing the set of tutorials focused on XSLT, outlines how to set up conditional statements for your transformations. Through examples, this tutorial explains the two conditional branching structures available in XSLT: <xsl:if> and <xsl:choose>. For an additional branching structure using XPath, see XPath if-then-else.
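
The “Navigating the XML Tree with XPath” unit above gives the example of treating <quote> differently depending on whether it occurs in <epigraph> or in <p>. The following is a minimal sketch of that idea, assuming Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath. The sample document and the helper `quotes_in` are hypothetical, and the TEI namespace is omitted to keep the paths short.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, invented for illustration; a real TEI file would
# declare the TEI namespace on its root element.
SAMPLE = """
<body>
  <epigraph><quote>Words set apart at the head of a text</quote></epigraph>
  <p>Running prose with <quote>an inline quotation</quote> inside it.</p>
</body>
""".strip()


def quotes_in(root, parent_name):
    """Return the text of every <quote> that is a direct child of parent_name."""
    # './/parent/quote' selects quote elements by their context (their parent),
    # not just by their own name.
    path = f".//{parent_name}/quote"
    return [quote.text for quote in root.findall(path)]


root = ET.fromstring(SAMPLE)
epigraph_quotes = quotes_in(root, "epigraph")
paragraph_quotes = quotes_in(root, "p")

print(epigraph_quotes)
print(paragraph_quotes)
```

The two calls select different elements even though both target <quote>, which is the kind of context sensitivity that lets a stylesheet render the same element in different ways.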

Text Analysis and Using Data

These presentation materials focus on using word embedding models, with a particular emphasis on TEI data. For a full set of materials including commented code walkthroughs and other resources, please visit the word vectors primer.

  1. Word Vectors: Introductions and Overview
    Description

    This unit provides an orientation in the fundamental concepts of word-embedding models, including the concept of "vectors" and the multidimensional space of the word-embedding model itself.

  2. An Introduction to Word Vectors
    Description

    This unit takes a closer look at the core concepts of word-embedding models, including the parameters we control when training a model, cosine similarity, clustering, and the process of training a model.

  3. Data Preparation and Model Training
    Description

    This unit covers the practical details of preparing your data and training a model, including some approaches to validation.
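
Cosine similarity, mentioned in “An Introduction to Word Vectors” above, measures how closely two vectors point in the same direction. The sketch below computes it in plain Python. The two-dimensional vectors are invented toy values, not output from any trained model; real word-embedding models typically use far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors

print(same_direction)
print(orthogonal)
```

Parallel vectors score at or very near 1 and perpendicular vectors score 0, regardless of vector length, which is why the measure is useful for comparing word vectors of differing magnitude.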

Availability and Reuse

These materials are developed by Syd Bauman, Julia Flanders, Sarah Connell, and the Women Writers Project. We welcome reuse of these materials.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.