The simplest way to publish a TEI-encoded document (besides emailing it to a colleague) is to make it visible on the web directly. Because the TEI is expressed in XML, any XML-aware web browser can read the TEI markup. If given a stylesheet that contains information about how to display this markup, the browser can format and display a TEI document just as it can an HTML document. At the time of writing, most major browsers are capable of displaying TEI/XML documents in this way.
There are some limitations to this approach. At the moment, there is no way of creating working hyperlinks; browsers can only process links in HTML documents. There is also no way of altering the order in which the document’s information is presented, although you can adjust the positioning of chunks of text and also suppress them altogether.
What you need to do:
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="wwpguide.css"?>
<TEI>
...
</TEI>

The href attribute contains a path or pointer to the CSS stylesheet, wherever it lives.
XSLT (the Extensible Stylesheet Language for Transformations) makes it possible to transform XML data (such as TEI-encoded files) into any other XML format (such as XHTML) and also into other useful formats such as PDF, RTF, tab-delimited data, and many others. These transformations may result in a simple mapping of TEI elements onto elements in the target language: for instance, to enable a TEI-encoded text to be displayed as XHTML on the web, with TEI ref elements transformed into working links using XHTML a elements. However, XSLT can also be used to make more profound changes to the document structure, by selecting, omitting, reordering, or restructuring parts of the original XML document. For example, an index of first lines could be generated from a TEI collection of poems, using an XSLT stylesheet that selected only the first l element from each poem, and then sorted them into alphabetical order. Or a TEI-encoded magazine containing serial fiction could be transformed so as to extract and present each individual narrative in its entirety. In a TEI-encoded file representing a document containing hand-written revisions, XSLT could be used to generate two views of the document: the unrevised version and the final revised version. XSLT is an extraordinarily powerful tool and forms the basis of (or contributes to) most XML publication systems.
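The index of first lines described above can be sketched as a short XSLT 1.0 stylesheet. This is an illustration under stated assumptions rather than a recommended implementation: it assumes TEI P5 files whose elements are in the namespace http://www.tei-c.org/ns/1.0, and it assumes that each poem is encoded as a div element with type="poem" (the value "poem" is hypothetical; adjust the match to your own encoding practice).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Write the result as HTML -->
  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>Index of first lines</title>
      </head>
      <body>
        <ul>
          <!-- Assumption: each poem is a div with type="poem".
               Poems that contain no l element are skipped. -->
          <xsl:for-each select="//tei:div[@type='poem'][.//tei:l]">
            <!-- Sort the poems alphabetically by the text of their first l -->
            <xsl:sort select="normalize-space((.//tei:l)[1])"/>
            <li>
              <!-- Output only the first l of this poem -->
              <xsl:value-of select="normalize-space((.//tei:l)[1])"/>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to run a stylesheet of this kind. With xsltproc, for example, the command would be along the lines of `xsltproc firstlines.xsl poems.xml > index.html`, where all three file names are placeholders. The sort here uses the processor's default text ordering; a real project would probably want to handle initial punctuation and articles, and letter case, more carefully.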
Using only XSLT, a small-scale project can develop a serviceable interface through which readers can browse and search a TEI-encoded collection. There are two main ways in which XSLT is used:
The limitations of XSLT for publishing TEI are partly those of scale: because XSLT operates by processing XML directly, it is comparatively slow as a means of working with large quantities of data. For very large projects with thousands of long documents, XSLT on its own is probably not sufficiently fast or powerful to build a working publication. Also, although XSLT can be used to search a text, it is not really designed for this purpose, and it becomes unworkably slow as a search tool once the amount of data is substantial.
Whereas a comparative novice can pick up CSS informally, by experimentation, XSLT is a more complex tool and requires both more time and (for most people) more instruction to learn. For most TEI projects of any scale, the development of the necessary XSLT stylesheets will be a substantial development task and will probably be the responsibility of a programmer or technical consultant. However, it is by no means beyond the capacity of humanists to learn, and workshops and tools are increasingly available to assist those who want to work with XSLT on their own. For example, the NINES project offers an annual summer workshop which includes a strand on XSLT, and they are also developing tools to assist scholars in building and using simple XSLT.
XML publication systems are more complex tools (or aggregations of tools) that improve on the publication methods discussed above by increasing the speed and efficiency of the processing being done (and hence of the resulting interface), and also by expanding the functions that can be provided. XML publication systems typically include some kind of search engine coupled with a system for indexing the XML data: essentially, putting the data into a form that (like an index) lends itself to speedy searching. These publication systems also usually incorporate a framework for managing the interface of the publication (through XSLT and CSS) so that the texts being published can be transformed into XHTML and displayed with the appropriate appearance. In addition, they may include modules to permit specialized functions such as text analysis, visualization, and linking to other projects’ data or to centralized information resources such as Google Maps.
Systems like these are typically intended for use by projects with substantial quantities of data and a need for highly functional interfaces that include complex searching and analysis. As a result they are designed to be installed and managed by programmers and those familiar with system administration, and do not usually lend themselves to use by individual humanities scholars or those who lack access to technical support. In addition, they may require significant configuration in order to function within the context of a specific project; if they operate out of the box at all, it may only be in a very plain and generic way that does not do justice to the details of the project’s own data and user needs. These configurations may include the development of project-specific stylesheets, configuration of indexing modules to deal with the specific elements that must be indexed for a specific data set, and basic setup of permissions and paths to allow the system to operate within the local server environment.
A number of open-source XML publication systems exist, as well as commercial products at various levels of cost and functionality. Examples that the WWP has experimented with to some degree include:
In addition to these systems in which the various components are already bundled together, one can also build an XML publishing system that combines various open-source components as needed to provide the functionality required. The available components and possible combinations are too many to detail here, and such a system would require the work of a programmer or someone with similar expertise.