<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="../../../_utils/schema/yaps.rnc" type="application/relax-ng-compact-syntax"?>
<?xml-model href="../../../_utils/schema/yaps.isosch" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<?xml-stylesheet type="text/xsl" href="yaps2slidy1.xslt"?>
<!-- $Id: xslt_processing.xml 28651 2016-05-13 09:27:29Z syd $ -->
<TEI xmlns="http://www.wwp.northeastern.edu/ns/yaps" version="5.0">
	<teiHeader>
		<fileDesc>
			<titleStmt>
				<title>The XSLT Processing Model</title>
				<author>Julia Flanders and Syd Bauman</author>
			</titleStmt>
			<editionStmt>
				<edition>DHSI 2013, University of Victoria</edition>
			</editionStmt>
			<publicationStmt>
				<distributor>Women Writers Project (via website)</distributor>
				<address>
					<addrLine>url:mailto:wwp@neu.edu</addrLine>
				</address>
				<date when="2013-11-20"/>
				<availability status="restricted">
					<p>Copyright 2012 Syd Bauman, Julia Flanders, and the Women Writers Project</p>
					<p>This TEI-encoded XML file is available under the terms of the <ref
							target="http://creativecommons.org/licenses/by-sa/3.0/">Creative Commons
							Attribution-ShareAlike 3.0 (Unported)</ref> license.</p>
				</availability>
				<pubPlace>Providence, RI USA</pubPlace>
			</publicationStmt>
			<sourceDesc>
				<p>A very gentle introduction to basic concepts of XSLT</p>
			</sourceDesc>
		</fileDesc>
		<revisionDesc>
			<change who="#jflanders.lfw" when="2012-11-29">Created file from scratch</change>
		</revisionDesc>
	</teiHeader>
	<text>
		<presentation>
			<abstract>
				<p>This tutorial continues the discussion of how XSLT stylesheets process their
					input information to create a new output. It also covers namespaces and
					languages, which are important to keep in mind when transforming from one XML
					language into another.</p>
			</abstract>


			<section>
				<head>The XSLT processing model: overview</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_1.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>So let's come back now and consider in more detail how an XSLT stylesheet
						runs and what it does. What we're talking about here is something called the
							<term>XSLT processing model</term> and it is essentially the set of
						rules that direct how the stylesheet will run and what it will do, in what
						order. These rules are actually fairly simple once you're familiar with
						them, although they're not perfectly intuitive at the outset--so we're going
						to step through them in detail with an actual example and follow how they
						work.</p>
					<p>Essentially we are starting with the root of the input document, and working
						our way through that tree, based on the templates we find in the
						stylesheet.</p>
				</lectureNote>
				<tutorial>
					<p>This tutorial will cover in more detail how an XSLT stylesheet runs. What
						we're talking about here is something called the XSLT processing model:
						essentially the set of rules that direct how the stylesheet will run and
						what it will do, in what order. You can see the set of rules in the top-left
						corner of the slide. These rules are actually fairly simple once you're
						familiar with them, although they're not perfectly intuitive at the
						outset—so we're going to step through them in detail with an actual example
						and follow how they work.</p>
					<p>Basically, we are starting with the root of the input document, and working
						our way through that tree, based on the templates we find in the
						stylesheet.</p>
				</tutorial>
			</section>


			<section>
				<head>The XSLT processing model: matching the root</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_2.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>So what's the first rule? We remember this from our earlier, more informal
						look: <list>
							<item>start with the root of the input document (what is it?)</item>
							<item>and then ask: is there a template that matches this element I'm
								considering? (is there?)</item>
						</list>
					</p>
				</lectureNote>
				<tutorial>
					<p>The first thing we want to do when we start an XSLT transformation is look at
						the root element of the input document. In the case of TEI documents, this
						will be the <gi>TEI</gi> element. Next we want to see if there is a template
						that matches the root element. In the case of this stylesheet, there
							<emph>is</emph> a template that matches <gi>TEI</gi>. The next slide
						will cover how this template is applied to produce the output document.</p>
				</tutorial>
			</section>


			<section>
				<head>The XSLT processing model: applying a template</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_3.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>If I do find a template that matches the element I'm considering, I apply
						that template: <list>
							<item>in this case, what is it doing? (writing out the first few layers
								of the output tree, and including a little bit of literal
								text)</item>
						</list>
					</p>
				</lectureNote>
				<tutorial>
					<p>If there is a template that matches the element you're considering, apply it.
						In the case above, you can see that the template writes out three output
						elements: <gi>html</gi>, <gi>head</gi>, and <gi>title</gi>. Within
							<gi>title</gi> there is literal text, which is written into the output
						document.</p>
					<p>For HTML output, it is very common to have a template like this matched to
						the <gi>TEI</gi> root element, setting up the structure of the HTML
						document.</p>
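					<p>As a rough sketch (not a copy of the stylesheet on the slide), a
						root-matching template of this kind might look like the following. The
						namespace URIs are assumptions (the standard XSLT, XHTML, and TEI ones), and
							<att>version</att> is assumed to be 2.0 because
							<att>xpath-default-namespace</att> is not available in XSLT 1.0.</p>
					<p><eg><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Matches the TEI root element of the input document -->
  <xsl:template match="TEI">
    <!-- Output elements: written to the result tree as they appear here -->
    <html>
      <head>
        <!-- Literal text: copied straight into the output -->
        <title>Test Document</title>
      </head>
      <!-- Process the children of TEI; whatever they produce lands here -->
      <xsl:apply-templates/>
    </html>
  </xsl:template>

</xsl:stylesheet>]]></eg></p>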
				</tutorial>
			</section>


			<section>
				<head>The XSLT processing model: processing children</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_4.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>So what did we just do? We applied a template, which entails: <list>
							<item>putting out the output elements</item>
							<item>writing out any literal text</item>
							<item>and one thing further: if there are instructions to apply
								templates, then process the children of the matched element</item>
						</list>
					</p>
					<p>So what is the element we just matched? (<gi>TEI</gi>)</p>
					<p>And it does include instructions to apply templates, which means that we...
						(process the children of the matched element, i.e. the children of
							<gi>TEI</gi>)</p>
					<p>So what are these children? <gi>text</gi>: what do we do with this? What rule
						applies in this case? (no template matches it: so we apply built-in
						processing rules, which say that we... (spit out any text, and process any
						children))</p>
					<p>So what are the children of <gi>text</gi>?...<gi>front</gi>: what do we do
						with this? What rule applies in this case?</p>
					<p>There's a template that matches <gi>front</gi>, but what does it tell us to
						do? What rule applies in this case?... there are no instructions to apply
						templates, so this part of the process stops there. There's no output from
							<gi>front</gi>.</p>
					<p>Does it stop altogether? Or are there other loose ends that will keep the
						process going?</p>
				</lectureNote>
				<tutorial>
					<p>So what did we just do? We applied a template, which entails putting out the
						output elements and writing out any literal text (in this case
						<![CDATA[<head><title>Test Document</title></head>]]>). The next step is to
						process the children of the matched element, but only if an
							<gi>xsl:apply-templates</gi> element is present!</p>
					<p>Since there is an <gi>xsl:apply-templates</gi> element here, take a minute to think about: <list>
							<item>what element is matched in this scenario?</item>
							<item>what are the children?</item>
							<item>how would we process those children?</item>
						</list> We will discuss the answers to these questions below.</p>
					<p>In this case, the <gi>TEI</gi> element is matched. And since the
							<gi>xsl:apply-templates</gi> element is present we do process the
						children.</p>
					<p>The only child of <gi>TEI</gi> is <gi>text</gi> in this example. Note that
						there is no template that matches <gi>text</gi> in our stylesheet. However,
						the process doesn't stop there. When no template matches an element, the
						processor falls back on built-in rules: for an element, process its
						children; for text, write the text to the output.</p>
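					<p>The built-in rules behave roughly as if the stylesheet contained the two
						templates sketched below. You never need to write these yourself; they are
						spelled out here only to make the default behavior concrete. The wrapper
						element and its <att>version</att> value are assumptions added so the
						fragment is a complete stylesheet.</p>
					<p><eg><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Built-in rule for the document node and for any element:
       write nothing for the node itself, but process its children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Built-in rule for text nodes (and attributes): copy the text to the output -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>]]></eg></p>
					<p>Any template you write for an element takes the place of the first rule for
						that element, which is why an unmatched <gi>text</gi> is passed over silently
						while its children are still visited.</p>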
					<p>The first child of <gi>text</gi> in this case is <gi>front</gi>. So what
						happens here? You will notice that the <gi>xsl:template</gi> that matches
							<q>front</q> has no content (and therefore no
							<gi>xsl:apply-templates</gi>). This tells the processor that the
							<gi>front</gi> element should be ignored, and that none of its children
						should be processed. Therefore, nothing corresponding to <gi>front</gi> appears in the
						output document.</p>
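					<p>An empty template of this kind can be written as a single self-closing
						element. The sketch below assumes the standard TEI namespace URI for
							<att>xpath-default-namespace</att>; the stylesheet on the slide may
						declare things differently.</p>
					<p><eg><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- An empty template: front is matched, but nothing is written out
       and, with no xsl:apply-templates, its children are never visited -->
  <xsl:template match="front"/>

</xsl:stylesheet>]]></eg></p>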
					<p>The processor will then move to the next child of <gi>text</gi>, which will
						be discussed in the next slide.</p>
				</tutorial>
			</section>


			<section>
				<head>The XSLT processing model: chugging along</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_5.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>There's another child of <gi>text</gi>, namely <gi>body</gi>, so our built-in
						stylesheet rule of <q>process the children</q> applies here and allows us to
						proceed to the <gi>body</gi> element.</p>
					<p>So what happens here? (another output element is generated, and inside it,
						additional templates will be applied)</p>
				</lectureNote>
				<tutorial>
					<p>The next child of <gi>text</gi> is <gi>body</gi>. Take a minute to get a
						sense of what happens with the template matching <gi>body</gi>.</p>
					<p>This template instructs the processor to transform the TEI element
							<gi>body</gi> into the HTML element <gi>body</gi>. The
							<gi>xsl:apply-templates</gi> instruction indicates that the processor
						should process the children of <gi>body</gi> as we saw earlier with
							<gi>TEI</gi>.</p>
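					<p>A template along these lines might look like the sketch below. The namespace
						URIs are assumed (standard XSLT, XHTML, and TEI), and the comments point out
						which <gi>body</gi> belongs to which language.</p>
					<p><eg><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- match="body" is an XPath pattern, so it refers to the TEI body (input) -->
  <xsl:template match="body">
    <!-- This unprefixed body is an output element: the HTML body -->
    <body>
      <!-- Process the children of the TEI body inside the HTML body -->
      <xsl:apply-templates/>
    </body>
  </xsl:template>

</xsl:stylesheet>]]></eg></p>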
				</tutorial>
			</section>


			<section>
				<head>The XSLT processing model: processing more children</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_6.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>Next we start processing the children of <gi>body</gi>, and we have two
						templates here that do somewhat similar things; what are we matching
						here?</p>
					<p>What if we had wanted to just match any <gi>head</gi> in the input
						document?</p>
					<p>Why do it this way? Why distinguish between two different locations for
							<gi>head</gi>?</p>
				</lectureNote>
				<tutorial>
					<p>Next we start processing the children of <gi>body</gi>. Note that there are
						two templates here that look somewhat similar. This example is a little bit
						more complicated than the ones we've dealt with before, but the same
						principle we've been discussing applies here. In both cases, the first part
						of the value for <att>match</att> (before the slash) indicates a context for
						the element. In the first case the context is <gi>body</gi>. So the matched
						element is the <gi>head</gi> that is the child of <gi>body</gi>. In the
						second example, the <att>match</att> value indicates the <gi>head</gi> that
						is the child of <gi>div</gi>. The syntax here is XPath (a way of navigating
						the document tree which will be introduced in the next tutorial in this
						primer).</p>
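					<p>The two context-sensitive templates could be sketched as follows. Which
						heading level goes with which context is inferred from the description of
						the output (document heading versus chapter headings), and the namespace
						URIs are assumed, so treat this as an illustration rather than the slide's
						exact code.</p>
					<p><eg><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- A head that is a child of body: the document heading -->
  <xsl:template match="body/head">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- A head that is a child of div: a chapter heading -->
  <xsl:template match="div/head">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

</xsl:stylesheet>]]></eg></p>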
					<p>What if we had wanted to just match any <gi>head</gi> in the input document?
						In that case, we would have simply used a template that matched
							<gi>head</gi> on its own. Instead of providing context (i.e. the
						specifications for <q>body</q> and <q>div</q>), we would have simply set
						the value of <att>match</att> on <gi>xsl:template</gi> to
							<q>head</q>.</p>
					<p>However, it may be useful to think about why we would distinguish between two
						different locations for <gi>head</gi>. As you can see from the output, the
						two different TEI <gi>head</gi> elements were converted into the HTML
						elements of <gi>h1</gi> and <gi>h2</gi>. If you are familiar with HTML, you
						probably know that <gi>h1</gi> and <gi>h2</gi> usually display differently.
						The numbered <q>h</q> elements in HTML typically indicate a hierarchy of
						headings, i.e. <gi>h2</gi> is a sub-heading of <gi>h1</gi>. In this case, we
						are allowing our document heading (the TEI <gi>head</gi> that is the child of
							<gi>body</gi>) to be displayed differently than our chapter headings.
						There are many instances in which we would want elements to function
						differently depending on their context. For example, we may want the
							<gi>persName</gi>s in our structured personography to display
						differently than the <gi>persName</gi>s in the body of our text. Naming the
						parent in the <att>match</att> pattern lets us write a separate template for
						each case.</p>
					<p>Since the <gi>xsl:apply-templates</gi> element is present in both
							<gi>head</gi> templates and neither <gi>head</gi> has any child
						elements, only text, the processor simply writes the text of those elements
						into the output document.</p>
				</tutorial>
			</section>

			<section>
				<head>The XSLT processing model: a final round</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_7.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>Finally we're getting to the last of the children in the input document...
						What is happening here? What rules apply when we get to <gi>emph</gi>?
						(There's no template that matches, so we apply the built-in rules, which
						say...if what we're processing is text, spit out the text)</p>
				</lectureNote>
				<tutorial>
					<p>Finally we're getting to the last of the children in the input document...
						What is happening here? Just as with <gi>TEI</gi> and <gi>body</gi>, the TEI
							<gi>p</gi> element is matched and translated to the HTML <gi>p</gi>
						element in the output document. Since the <gi>xsl:apply-templates</gi>
						instruction is present, the children are processed.</p>
					<p>What rules apply when we get to the <gi>emph</gi> element in our input
						document? As you can see, there's no template that matches, so we apply the
						built-in rules, which say <said>process the children.</said> However, there
						are no child elements: just text. As we saw earlier, if what we're processing is
						text, we simply spit out the text.</p>
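					<p>To make this concrete, here is a minimal sketch with a template for
							<gi>p</gi> and, deliberately, none for <gi>emph</gi>. The namespace URIs
						and the sample sentence in the note that follows are assumptions for
						illustration only.</p>
					<p><eg><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- TEI p becomes HTML p; its children are processed inside it -->
  <xsl:template match="p">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- No template for emph: the built-in rules take over, so the
       emph element itself is dropped and only its text is written out -->

</xsl:stylesheet>]]></eg></p>
					<p>Given a hypothetical input paragraph such as
						<eg><![CDATA[<p>Some <emph>very</emph> important text.</p>]]></eg>, the
						output paragraph would contain the words <q>Some very important text.</q>
						with no element around <q>very</q>.</p>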
				</tutorial>
			</section>

			<section>
				<head>The XSLT processing model: a last look</head>
				<slide>
					<figure>
						<graphic height="600px"
							url="../../../_utils/gfx/xslt_processing_model_8.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>Now we can back up again and study the whole thing as a finished product: the
						input, the stylesheet, and the output. Any questions?</p>
				</lectureNote>
				<tutorial>
					<p>Now here we see the finished product. If you find it useful, take a few
						minutes to review the preceding slides to get a sense of the process as a
						whole.</p>
				</tutorial>
			</section>
			<section>
				<head>Namespaces and languages</head>
				<slide>
					<figure>
						<graphic height="600px" url="../../../_utils/gfx/xslt_namespaces.png"/>
					</figure>
				</slide>
				<lectureNote>
					<p>You've probably already noticed that we're dealing here with three languages: <list>
							<item>The language of the XSLT stylesheet itself (which is a language
								containing elements like <gi>template</gi> and
									<gi>apply-templates</gi>)</item>
							<item>The language of the input document (in this case, TEI)</item>
							<item>The language of the output document (in this case, HTML)</item>
						</list>
					</p>
					<p>Within the stylesheet itself, we need to keep these three different languages
						distinct from one another, so that the processor always knows what piece of
						what tree it is dealing with. We do this with something called namespaces.
						(Does everyone understand namespaces? Quick review on the next slide if
						necessary...)</p>
					<p>Each of these languages plays a specific role in the stylesheet ecology and
						gets referenced in a distinctive way: <list>
							<item>Let's take the simplest first: the output tree, which is being
								treated transparently in our examples: it doesn't use a namespace
								prefix, and this is because we have declared that the entire
								stylesheet is in the HTML namespace (we did this with the namespace
								declaration attribute-like thingy: xmlns)</item>
							<item>The next fairly simple case is the input tree, which also looks as
								if it's not getting a namespace. How are we keeping this separate
								from the output tree? The trick here is that the input tree is
								always accessed via these <att>match</att> and <att>select</att>
								(and similar) attributes. These attributes all access the input tree
								via XPath, and up at the top, we provided a default namespace for
								all XPaths (via xpath-default-namespace)</item>
							<item>And finally, the stylesheet document has its namespace specified
								as the XSL namespace (the default namespace for the stylesheet is
								already set to HTML) so all of the stylesheet elements have a
								namespace prefix.</item>

						</list>
					</p>
				</lectureNote>
				<tutorial>
					<p>As you've probably noticed, we are dealing with three different languages
						during this process: the input language (in this case, TEI), the language of
						the output document (in this case HTML), and the language of the XSLT
						stylesheet itself (which contains elements like <gi>template</gi> and
							<gi>apply-templates</gi>).</p>
					<p>Within the stylesheet itself, it is important that we keep these languages
						distinct from each other, so that the processor knows what piece of the tree
						it is dealing with. We differentiate between the languages using
						namespaces.</p>
					<p>If you do not know about namespaces (or if you feel you could use a
						refresher), take this time to continue to the next slide for an overview.
						You can come back to this slide for any additional information that you
						need.</p>
					<p>Each of the languages used (input, output, and XSL) plays a specific role
						in the stylesheet ecology and gets referenced in a distinctive way.</p>
					<p>The output tree, as you can see, doesn't use a namespace prefix on each
						element. This is because we have already specified the namespace using the
							<att>xmlns</att> attribute on <gi>xsl:stylesheet</gi> (See the blue
						section on the stylesheet).</p>
					<p>The next fairly simple case is the input tree, which also looks as if it's
						not getting a namespace in the stylesheet. How are we keeping this separate
						from the output tree? The trick here is that the input tree is always
						accessed via the <att>match</att> and <att>select</att> (and similar)
						attributes. These attributes all access the input tree via XPath, and up at
						the top, we provided a default namespace for all XPaths (via
							<att>xpath-default-namespace</att>).</p>
					<p>And finally, the stylesheet document has its namespace specified as the XSL
						namespace (see the bit that says <q>xmlns:xsl=</q>). Since the default
						namespace of the document is HTML, we must use the XSL prefix for all
						the elements we want to use that are in the XSL namespace.</p>
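					<p>The three declarations can be seen side by side in the sketch below. The URIs
						shown are the standard ones for XSLT, XHTML, and TEI, assumed here rather
						than read off the slide, and <att>version</att> is assumed to be 2.0 because
							<att>xpath-default-namespace</att> requires it.</p>
					<p><eg><![CDATA[<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">
  <!-- xmlns:xsl binds the xsl: prefix to the XSLT language itself -->
  <!-- xmlns sets the default namespace, so unprefixed elements are HTML output -->
  <!-- xpath-default-namespace makes unprefixed names in match and select TEI input -->

  <xsl:template match="div">    <!-- "div" in the pattern is a TEI div (input tree) -->
    <div>                       <!-- this div is an HTML div (output tree) -->
      <xsl:apply-templates/>    <!-- the xsl: prefix marks a stylesheet instruction -->
    </div>
  </xsl:template>

</xsl:stylesheet>]]></eg></p>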
				</tutorial>
			</section>
			<section>
				<head>Namespaces review</head>
				<slide>
					<p>Without the genus, we don't know what organism these species names refer to: <list>
							<item><emph>glauca</emph>: a spruce tree (Picea glauca) or a small yellow
								flower (Agoseris glauca)?</item>
							<item><emph>leucocephalus</emph>: a cactus (Pilosocereus leucocephalus)
								or a bald eagle (Haliaeetus leucocephalus)?</item>
						</list>
					</p>
					<p>Without knowing the language, we don't know what these words mean: <list>
							<item><emph>the</emph> (English definite article or a French hot
								drink?)</item>
							<item><emph>bad</emph> (English adjective or German noun for
								<q>bath</q>?)</item>
						</list>
					</p>
					<p>Without a namespace designation, we don't know what these elements mean: <list>
							<item><gi>p</gi> (TEI paragraph or HTML block element?)</item>
							<item><gi>div</gi> (TEI textual division or HTML grouping
								element?)</item>
							<item><gi>fileDesc</gi> (TEI or EAD?)</item>
						</list>
					</p>
					<p>With the namespace, all is clear: <list>
							<item><eg><![CDATA[<tei:p>]]></eg></item>
							<item><eg><![CDATA[<html:div>]]></eg></item>
							<item><eg><![CDATA[<ead:fileDesc>]]></eg></item>
						</list>
					</p>
					<p>The namespace prefix is somewhat like a genus or language name: it tells us
						more precisely what language we are speaking (and hence what the semantics
						of the element are)</p>
				</slide>
				<tutorial>
					<p>Namespaces function like genus names in taxonomy or like languages. They
						allow us to understand the context for given words. So for example,
							<q>glauca</q> can specify different species, depending upon the genus
						name, and <q>the</q> can be either an English article, or the French
						word for <q>tea.</q> Similarly, <gi>p</gi> means something different in HTML
						than it does in the TEI. Namespaces allow us to clear up this confusion, by
						providing the language that a given element is being used in.</p>
					<p>The namespace prefix is somewhat like a genus or language name: it tells us
						more precisely what language we are speaking (and hence what the semantics
						of the element are).</p>
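					<p>In an actual XML document, a prefix only means something once it has been
						bound to a namespace with an <att>xmlns:</att> declaration. The fragment
						below is a small illustration; the <gi>example</gi> wrapper element is
						hypothetical, and the URIs are the standard TEI and XHTML ones.</p>
					<p><eg><![CDATA[<example xmlns:tei="http://www.tei-c.org/ns/1.0"
         xmlns:html="http://www.w3.org/1999/xhtml">
  <!-- Same local name, different namespaces, so different elements -->
  <tei:p>A TEI paragraph.</tei:p>
  <html:p>An HTML paragraph.</html:p>
</example>]]></eg></p>
					<p>The prefix itself is just a local shorthand: it is the namespace URI it is
						bound to that identifies the language.</p>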
					<list>
						<head>This tutorial is complete; please see the links below to continue:</head>
						<item><ref target="./xpath_intro_tutorial_00.xhtml">Proceed to next tutorial
								in Transformation and Publication Primer</ref></item>
						<item><ref target="../../../../resources/transformation.html">Return to
								Transformation and Publication Primer</ref></item>
						<item><ref target="../../../../resources/tutorial_main.html">Return to main
								tutorial page</ref></item>
					</list>
				</tutorial>
			</section>



		</presentation>
	</text>
</TEI>
