Intensive Introduction to TEI
XML exercise
This brief exercise is intended to give you a quick initial exposure to the XML editor we’ll
be using in the class (named “<oXygen/>” … hereafter “oXygen” in this document). If
you’re already familiar with XML, or even if you’ve ever edited HTML files before, it will be
very simple.
To download and install oXygen, go to the SyncRO Soft web
site and download a free trial copy of the software. Follow the instructions to install
it on your computer. (If you have access to the lab where we’ll be holding the class, you may
also be able to practice there.)
A note on oXygen. This editor has a number of advantages: it’s fairly cheap, easy to learn,
and powerful; it can help you edit not only XML files, but also XSLT stylesheets, schemas and
DTDs, and other types of files. It comes bundled with the current version of TEI and also the
previous major release. Its one real disadvantage is that it is slow and
memory-hungry. If you type quickly, you may outrun it. If it behaves oddly—for instance, if
you try inserting an element and it inserts the wrong one—the most likely explanation is that
you are moving too quickly.
If you plan to purchase oXygen (which is not necessary), note that anyone at a TEI member
institution, including all students, staff, and faculty, gets a 20% discount. Ask the TEI for
the discount code if you would like to take advantage of it.
These instructions use vocabulary that is covered in the TEI’s Gentle Introduction to
XML (e.g. “element”, “tag”, “attribute”, “valid”). If your XML is rusty, it will help to
read at least section v.iii XML
structures before you start this exercise. Again, if you find this unfamiliar you should
attend the session on the 22nd. But to start you off, here is a quick basic glossary:
- Element: A single textual unit within an XML document, a conceptually
meaningful piece of the document. You might think of elements as the “nouns” of the text:
the blocks the text is made of. Elements are delimited by tags, a start-tag and an end-tag.
The element includes these two tags plus the content they enclose. Elements can also be
empty: that is, they may contain no content.
Example: <name>Sara</name>.
A sample empty element: <pb/> (this notation is exactly equivalent to <pb></pb>).
- Attribute: A descriptive modifier to an element. You might think of
attributes as the “adjectives” of the encoding: they add detail and nuance to what the
elements say. Attributes can be used to describe the appearance, number, type, and many
other characteristics of the information being encoded. Attributes always appear within the
element start-tag. You can have multiple attributes for a single element, but each attribute
can only appear once per element.
Example: <name type="person">Sara</name>
- Tag: A piece of markup indicating the start or end of an element. Tags
are delimited by angle brackets.
A sample start-tag: <name type="person">.
A sample end-tag: </name>.
- Valid: An XML document is said to be valid when it conforms to the rules
defined in a schema. A schema is a way of expressing the constraints on document structure
that define a particular markup language; examples you may be familiar with include EAD,
TEI, XHTML, or DocBook. Note that a “DTD” (document type definition) is a type of
schema.
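To see the glossary terms working together, here is a minimal sketch of a complete XML document. The <letter> and <p> element names are made up for illustration and belong to no particular schema, so this document is well-formed but cannot be called valid until a schema is associated with it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample: <letter> and <p> are invented names, not taken from a real schema. -->
<letter>
  <!-- An element: start-tag, content, end-tag. The start-tag carries one attribute. -->
  <p>Dear <name type="person">Sara</name>,</p>
  <!-- An empty element: it marks a point in the text rather than enclosing content. -->
  <pb/>
  <p>I hope this finds you well.</p>
</letter>
```

Notice that every start-tag has a matching end-tag, and that the attribute value sits in quotation marks inside the start-tag.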
Now for the exercise:
- Launch oXygen. Be patient, it may take a little while. If your computer is low on RAM, try
to keep other applications to a minimum.
- In the File menu, choose “New…”; in the resulting dialog box choose the “From templates”
tab; and from the list choose “XHTML 1.0 Strict”. (For an extra challenge choose “TEI P5
Bare”.) You’ll be given a template which contains a skeleton XML document, using the
XHTML (or TEI Guidelines) markup language. This skeleton is already valid (that is, it
contains all the minimally essential elements required by the markup system you’ve chosen).
You can check this by clicking on the little document icon with a red check mark in the top
center of the window (mouseover says either “Reset Cache and Validate” or “Validate
Document”).
- Take a look at the markup that’s already there. Some of the element names are fairly
self-explanatory (you can readily understand what the <title> and
<body> elements are for…). How much of the markup can you figure out without
reading the documentation?
- Prove to yourself that this file is valid: click on the red check mark in the top center
of the window. oXygen will think for a moment and then should give a “Document is valid”
message and a green square icon in the bottom of the window.
- Now try adding an XML element to your skeleton. In the XHTML example, put your cursor in
between the start-tag and end-tag for the <body> element (i.e. right
<body>here</body>). (In the TEI example, there is a
comment between the <body> start-tag and end-tag: delete it.) Type a left angle
bracket (<) and see what happens. You should be given a menu listing all of the
elements that are permitted at the location of your cursor. Some of these have obvious
functions, others may be fairly obscure. Note: in XHTML it is often necessary to scroll to
the bottom of the yellow pop-up box to find the description of the selected element.
- Use your arrow keys to move around in the list, and double-click on your choice or type
RETURN to select it. oXygen will insert your chosen element for you.
- Validate your document again (click on the red check mark). Is it still valid?
- Type some text into the element you’ve just inserted. Now select a word (e.g. by
double-clicking on it) and from the “Document” menu choose “XML Refactoring” (a bizarre
term!). You’ll see a list of choices, ways to insert or alter the markup in your document.
Choose “Surround with tag…” (and note the keystroke sequence—much handier to use). You’ll
get the same menu as before. Select an element from the list and type RETURN to insert the
markup. Validate your document again.
- Try inserting an attribute. Put your cursor inside the start-tag for any element, just
before the closing > character. (That is, right <namehere>.)
Now type a space between the element name and the closing >. oXygen should display a
list of attribute names. Which ones you see will depend on the element you’re in; at a
minimum, you should see “id” and “lang” (in TEI that will be “xml:id” and “xml:lang”).
Again, choose the one you want and type RETURN to insert it. Now you have an attribute, but
no attribute value. Type a value in between the quotation marks. Try validating again. (If
the attribute you chose requires specific values, your document may fail validation. Can you
figure out what’s wrong?)
- If your document is still valid, try creating an error. Choose any element and delete one
of its tags (start-tag or end-tag). Now validate your document. Does the error message make
sense? (Probably not… reading validation error messages is a black art. You can pick it up
by deliberately producing errors and seeing what the error messages look like.)
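For reference, here is a sketch of what the XHTML version of your document might look like after the steps above. It assumes you inserted a <p> element, surrounded one word with <em>, and added an id attribute. The title text and the attribute value are invented, and the skeleton that oXygen generates may differ in its details.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <!-- Assumed title text; the template may supply its own or leave it empty -->
    <title>My first XML exercise</title>
  </head>
  <body>
    <!-- Inserted element with an attribute; one word is surrounded with <em> -->
    <p id="intro">Some text with one <em>emphasized</em> word.</p>
  </body>
</html>
```

If you delete the </em> end-tag from a document like this one, the tags no longer match up, and validation should report an error at or near that point.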
Java Stack Overflow Error
If you get a “stack overflow” error message, you may be able to fix this by giving Java a
larger stack. Here are detailed instructions for doing so in Mac OS X. If you are using
GNU/Linux or Windows you may wish to see SyncRO Soft’s documentation.
Mac OS X, for double-clicking the oXygen icon
- Find the oXygen application itself (not an alias or the dock version)
- Hold the control key down and click on it
- Select “Show Package Contents” from the pop-up menu
- From the new window (of oXygen), open “Contents”
- optional: make a backup copy of Info.plist
- Edit Info.plist in your favorite editor. This may not be as easy as it sounds, as
this is not a normally visible file and is generally not associated with an application.
Note that it is an XML file, and you can edit it in oXygen, but because it does not end in
“.xml” you cannot drag-and-drop it onto the oXygen icon. Personally I usually drag the icon
onto the main oXygen window. You may need to select “XML document” from the list of
possibilities oXygen presents to you on opening it.
- Find the string “-Xss”. It should look something like:
<key>VMOptions</key>
<string>-Xss650K -Xms32M -Xmx190M</string>
- Increase the value of the “-Xss” switch, e.g. to “-Xss2M”. You may wish to increase the
others, too, especially if your machine has a lot of RAM. See below for a more detailed
explanation and values.
- Save the Info.plist file
- If it is running, quit oXygen
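As a sketch, after the edit the two lines shown above might read as follows. Only the -Xss value has changed; the -Xms and -Xmx values are simply carried over from the sample above, and the values in your own Info.plist may differ.

```xml
<!-- Excerpt from Info.plist (not a complete file); only -Xss was changed, from 650K to 2M -->
<key>VMOptions</key>
<string>-Xss2M -Xms32M -Xmx190M</string>
```

The new setting takes effect the next time oXygen is launched.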
Mac OS X, for issuing /Applications/oxygen/oxygenMac.sh on the command line
- Edit the file /Applications/oxygen/oxygenMac.sh (can use oXygen as the editor, if
you like)
- Near the end, find the “-Xss” switch. It should look something
like:
java "-Xdock:name=Oxygen" \
  -Dcom.oxygenxml.editor.plugins.dir="$OXYGEN_HOME/plugins" \
  -Xss650k \
  -Xmx256m \
- Increase the value of the “-Xss” switch, e.g. to “-Xss2M”. You may wish to increase the
others, too, especially if your machine has a lot of RAM. See below for a more detailed
explanation and values.
switch | meaning | comments | what I use
-Xss | the stack size for each thread | Each thread in the VM gets a stack. The stack size will limit the number of threads that you can have. If stack size is too large you will run out of memory as each thread is allocated more memory than it needs. If stack size is too small, eventually you will get a stack overflow error. | 2M or 4M
-Xms | initial Java heap size | Set to a multiple of 1M that is greater than 1M. Some suggest that, as a general rule, it should be set equal to the maximum heap size (-Xmx). | 16M
-Xmx | maximum Java heap size | If set very high, oXygen will only need to “clean out” its memory rarely, but each clean-out will take longer. | 512M
I don’t know, but my guess is that it is a bad idea to set any value to more memory than your
machine has. To find out how much memory your machine has, select “About This Mac” from the
Apple menu on the left end of your menu bar. I recommend, without any knowledge or basis, that
maximum heap size (-Xmx) be well under 3/4 of your total real memory. (If anyone
knows better, please let me know!)
Some info from http://www.caucho.com/resin-3.0/performance/jvm-tuning.xtp and some from
http://edocs.beasys.com/wls/docs81/perform/JVMTuning.html both on 2008-03-19.