Getting Started

What to do...

The WWP training process involves learning several different kinds of things, all at the same time. We recognize that this is difficult! Here is a quick summary of what you’ll be focusing on as you get started:

  1. Learning how to use the Oxygen XML editor. This is the editing environment we use to do the encoding; it’s fairly intuitive at a basic level, but it has some advanced features that are worth learning as you progress. We have a crib sheet of basic Oxygen commands.
  2. Learning some basic information about XML and the TEI. We’ll cover some of this in person, and a list of further reading is below.
  3. Working through some introductory exercises.
  4. Encoding your first text. First, choose a text in consultation with Sarah. Find the record on the text tracking board and update the record to indicate that you are claiming it. Take some time to skim through the text and understand its basic structure and features. Next, find the tadpole file for your text in the under_construction directory and begin transcription. We also have a sample to look at. As you encode, look up unfamiliar elements both in the TEI guidelines (see below) and in the WWP’s own documentation, and ask questions often.
  5. Learning to validate your XML and fix errors. As you work on your file, you should check frequently to make sure it is valid. (A good rule of thumb is to validate every time you save, and to save often!) Validate your file by typing Command-Shift-V, or by clicking the red check mark at the top of the window. The instructions here will tell you how to interpret any error messages you receive.
  6. Learning to use Subversion for version control. We have a crib sheet that covers the basics and we will go over this in training.
  7. Reviewing your finished file. When your file is as good as you can make it, Sarah will go over it with you. Update the text tracking board; the text will then be printed for proofreading.
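
To make the idea of "valid" XML a little more concrete, here is a minimal sketch in Python of the most basic check: whether a file is well-formed, meaning every tag that opens is also closed and properly nested. This is only an illustration and not part of the WWP workflow. Validation in Oxygen goes further and also checks the file against a schema, which this sketch does not attempt. The function name and the sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the problem.
        line, column = err.position
        return False, f"line {line}, column {column}: {err}"
    return True, None


# Hypothetical fragments for illustration only; these are not WWP files.
good = "<text><body><p>Hello</p></body></text>"
bad = "<text><body><p>Hello</body></text>"  # <p> is never closed

print(check_well_formed(good))
print(check_well_formed(bad))
```

The error message for the second fragment reports a line and column, much as the messages in Oxygen do. Reading the location first and then looking for the nearest unclosed or misspelled tag is usually the quickest way to find the problem.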

Pointers and samples


Some people learn best by reading first, then practicing; others prefer to practice first and read only when they need to find something specific. The readings below are the main sources of information on text encoding at the WWP, and you’ll probably need to cover most of them during the course of your training.

Where things are

The texts we transcribe from are called “OTs” (“Office Texts”) and are stored in the file cabinets in the DSG office. Each one has a unique “OT number” (OT00001, etc.) that serves as a catalogue number.

All encoding progress is tracked on the WWP Text Tracking Trello board.

Meeting notes and other useful information can be found in the WWP section of the Digital Scholarship Group Wiki.