The Queen’s Two Corpora: Finding Elizabeth and Creating Corpora using the WWO Database

This post is part of a series authored by our collaborators on the Intertextual Networks project. For more information, see here. 

By Kristen Abbott Bennett, Stonehill College

At Tilbury, Elizabeth I gave a rousing speech to motivate her subjects, exclaiming: “I know I have the bodie, but of a weak and feeble woman, but I have the heart and Stomach of a King, and of a King of England” (Cabala). Elizabeth’s recognition of her female princely bodies as simultaneously separate and the same reflects awareness of her politically constructed dual corpora. Historically, the “King’s two bodies” theory was adapted from ideas surrounding the divine right of kings. During Elizabeth’s reign, it was legislated to preserve her interests in lands acquired by Edward VI in his minority:

For the King has in him two Bodies, viz., a Body natural, and a Body politic…. [The latter] is a Body that cannot be seen or handled, consisting of Policy and Government, and constituted for the Direction of the people, and the Management of the public weal, and this Body is utterly void of Infancy, and old Age, and other natural Defects and Imbecilities, which the Body natural is subject to, and for this Cause, what the King does in his Body politic, cannot be invalidated or frustrated by any Disability in his natural Body. (Kantorowicz 7)

The “King’s Two Bodies” construction offers an apt metaphor for thinking about approaches to corpus-based linguistic analyses. These approaches allow one to consider a single body of work in and of itself, as well as realize its rhetorical relationship to a larger corpus.1 In the context of sub-corpora created from the Women Writers Online database, the “intertexts” corpus I discuss here analogizes Elizabeth’s “body politic,” which both embodies and yet remains distinct from “the body natural,” here another sub-corpus containing Elizabeth’s speeches.

What follows is a brief account of the methods I have used to create corpora from the WWO database, ranging from basic keyword searches to more complex computationally assisted searches, along with a short discussion about the choices I made along the way. With an eye toward next steps, I close with an overview of how one may convert XML documents into different kinds of file types that lend themselves well to computational and visual analysis.

Finding Elizabeth

Initially, I used keyword searches to find the works that mention Elizabeth; works authored by her are listed, with WWO links, here. I quickly learned that searching for “Elizabeth I” in a database featuring works produced between 1526 and 1850 was not the best move; Elizabeth II was yet to exist. This initial foray revealed 120 works of the 390 in the WWO corpus (as of spring 2017) that mention an Elizabeth, plus 276 discrete references to women named “Elizabeth.” I persevered, using Ctrl + F and skimming, ultimately locating suitable intertexts (that is, intertextual references to Elizabeth I) dating between the early seventeenth and early nineteenth centuries that discuss her in both historical and fictional contexts.

For example, both Esther Sowernam’s 1617 pamphlet Esther Hath Hang’d Haman and Bathusa Makin’s 1673 Essay to revive the ancient education of gentlewomen laud the historical Elizabeth’s virtues and learning. Yet in Mary Deverell’s 1792 play, Mary Queen of Scots, the fictionalized Scottish queen suggests that Elizabeth’s learnedness is undesirable and unfeminine: “my sister’s mind is masculine” (O2v).

Although Deverell’s work ultimately presents an even-handed assessment of two Queens surrounded by male advisors and doing the best they can, American writer Judith Sargent Murray’s 1798 fictional account of Elizabeth and Mary’s history portrays the English queen as manipulative, dissembling, and self-serving. I had high hopes for Margaret Cavendish saying something excessive, but she mentions Elizabeth’s reign only to mark time in Nature’s Pictures. This early research generated enough information and questions for me to propose, and commit to creating, a multimedia intertextual exhibit that networks transcontinental representations of Elizabeth by six other WWO authors in the context of common discourses associated with the queen: her dual gender, her “cult of love,” her renowned learning, her relationship with Mary, Queen of Scots, and her refusal to marry.

At this point in the process, I was introduced to Ashley Clark’s (Northeastern) brilliant Counting Robot (an XQuery for performing basic counts on WWP files) and saw an opportunity to test human-brain approaches to “finding” related texts in a large database against basic computational methods.

Creating the Corpora

A <persName> search for “Elizabeth” produced 103 files, including Elizabeth’s speeches, but it still threw out false positives. Eventually, I adapted Ashley’s code to create multiple search strings using early modern spellings and alternate names (Eliz, Elizabeth, Princess, Bess, etc.) and then checked contexts manually; this method resulted in finding 33 files, including Elizabeth’s works.
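The multi-variant search with manual context checking can be sketched roughly as follows. This is not the Counting Robot (which is an XQuery) and not the code actually used for the project; it is a minimal Python illustration, and the function names, the sample sentence, and the context width are all assumptions made for the example. The variant list repeats only the names given above.

```python
import re

# Variants named in the post; a real search would need a longer list.
NAME_VARIANTS = ["Elizabeth", "Eliz", "Princess", "Bess"]


def build_pattern(variants):
    """Compile one case-insensitive, whole-word pattern for all variants."""
    # Longest first, so a longer spelling is preferred over a shorter one.
    ordered = sorted(variants, key=len, reverse=True)
    alternation = "|".join(re.escape(v) for v in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def keyword_in_context(text, pattern, width=30):
    """Return (matched form, surrounding snippet) pairs for manual review."""
    hits = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        hits.append((match.group(0), text[start:end]))
    return hits


if __name__ == "__main__":
    # Hypothetical sample sentence, not taken from any WWO text.
    sample = "Queen Eliz. and good Bess were praised in Elizabethan verse."
    for form, snippet in keyword_in_context(sample, build_pattern(NAME_VARIANTS)):
        print(form, "|", snippet)
```

The whole-word boundaries keep a form such as “Elizabethan” out of the results, but every remaining hit still has to be read in context, since a pattern cannot tell Elizabeth I from her namesakes.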

The results were similar when my colleague Mary Erica Zimmer suggested the labor-saving method of searching for cases where @ref on <persName> pointed to the unique identifier established for Elizabeth in the WWP’s personography; this method helped us extract Elizabeth I from her many (likely) namesakes and locate 21 valid intertexts.
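A search on @ref rather than on the name string might look something like the sketch below. It is an illustration in Python rather than the XQuery or XPath we used, and the identifier value, the namespace, and the sample markup are hypothetical placeholders; the actual personography key for Elizabeth I is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder; substitute the real personography identifier.
ELIZABETH_REF = "#elizabeth-i"

# Hypothetical sample markup with a made-up namespace.
SAMPLE = """<text xmlns="http://example.org/ns">
  <p><persName ref="#elizabeth-i">Elizabeth</persName> and
     <persName ref="#elizabeth-cary">Elizabeth</persName> and
     <persName ref="#elizabeth-i">the Queen</persName>.</p>
</text>"""


def local_name(tag):
    """Strip a '{namespace}' prefix so the check works in any namespace."""
    return tag.rsplit("}", 1)[-1]


def count_person_refs(xml_text, person_ref):
    """Count <persName> elements whose @ref equals the given identifier."""
    root = ET.fromstring(xml_text)
    return sum(
        1
        for element in root.iter()
        if local_name(element.tag) == "persName"
        and element.get("ref") == person_ref
    )


if __name__ == "__main__":
    print(count_person_refs(SAMPLE, ELIZABETH_REF))
```

Because the match is on the identifier and not the spelling, references such as “the Queen” are caught and other Elizabeths are excluded, provided the encoding supplies @ref consistently.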

During the first pass, it made sense to create one corpus containing Elizabeth’s works, another of her intertexts, and a third including all the files. Although this seems relatively straightforward, the concept of “Elizabeth’s works” is problematic. The WWO database includes her speeches, one translation, and one “true copie of a letter.” Although Elizabeth’s speeches were transcribed and printed by men, they offer a record of the way she presented herself to her subjects. It made sense to limit the “Elizabeth” corpus to her speeches, and excise the “true copie of a letter” and the translation to focus on a single genre. Once “Elizabeth” was defined, the intertexts were easy to manage; the sole criterion for inclusion was at least one clear mention of Elizabeth I. In the context of the “two bodies” metaphor, these corpora situate Elizabeth’s “natural” body in the context of her “body politic.”

Now What?

The first corpora were encoded in XML and lent themselves well to computational inquiry using the Counting Robot, XPath searches in oXygen, and AntConc. For example, these initial forays revealed that elements with @rend (indicating typographic changes) often point to a given work’s proper nouns and linguistic shifts, in addition to elements such as <persName> and <emph> that mark such features more explicitly. For the purposes of this project, I put that query aside for the time being and thought about the possibilities for these specific corpora.
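A query of this kind, tallying which elements carry @rend, could be approximated as in the following sketch. It is written in Python for illustration only; the sample markup, the namespace, and the rend value “italic” are invented placeholders and do not reflect how the WWP actually records rendition.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical sample; element names follow the post, values are made up.
SAMPLE = """<text xmlns="http://example.org/ns">
  <p>The <persName rend="italic">Queen</persName> spoke at
     <placeName rend="italic">Tilbury</placeName>, and
     <persName>Leicester</persName> heard <emph rend="italic">all</emph>.</p>
</text>"""


def rend_by_element(xml_text):
    """Tally, by element name, how many elements carry a rend attribute."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        if element.get("rend") is not None:
            # Drop the '{namespace}' prefix to get the bare element name.
            counts[element.tag.rsplit("}", 1)[-1]] += 1
    return counts


if __name__ == "__main__":
    for name, total in rend_by_element(SAMPLE).most_common():
        print(name, total)
```

Comparing such a tally with the total count of each element gives a rough sense of how often typographic shifts coincide with names and emphasis in a given work.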

It quickly became apparent that any computational analysis of these works called for creating additional corpora. Any text mining, visualization, or mapping approaches required removing the tags from the texts. Following Sarah Connell’s suggestion of a quick, if relatively low-tech, method for transforming the XML files, we opened the texts in oXygen, switched to “Author” mode, and then copied and pasted each text into a Word document. The last step was to make a plain text corpus.2
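For readers who would rather script the tag removal than copy and paste, a minimal sketch in Python follows. It is an alternative to, not a record of, the oXygen-and-Word route described above, and it makes a simplifying assumption: every text node is kept, so encoded alternatives such as paired original and regularized readings would both appear in the output. The sample markup is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample markup built around the Tilbury quotation.
SAMPLE = """<text><p>I have the <emph>heart</emph> and
   Stomach of a <persName>King</persName>.</p></text>"""


def xml_to_plain_text(xml_text):
    """Drop all tags and collapse runs of whitespace into single spaces."""
    root = ET.fromstring(xml_text)
    raw = "".join(root.itertext())  # every text node, in document order
    return re.sub(r"\s+", " ", raw).strip()


if __name__ == "__main__":
    print(xml_to_plain_text(SAMPLE))
```

The resulting string can be written to a .txt file for each work to build a plain text corpus.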

Why so many corpora? The first set, in XML, lends itself well to computational queries about tagged elements. Reformatting the corpora into Word docs made the works more easily searchable, plus these documents lend themselves well to visualization using tools like Voyant. Similarly, conversion into text files permits users to work with visualization and analytical tools such as AntConc and Recogito. Although this post clearly exceeds the “two corpora” promised by its title, I hope to have offered people who may be new to working with literary databases helpful approaches toward getting up and running.


Anon. Cabala: sive Scrinia Sacra. London, Printed for G. Bedel, and T. Collins, and are to be ſold at their Shop at the Middle-Temple-gate in Fleetſtreet, 1654. Women Writers Online, Accessed 5 May 2017.

Cavendish, Margaret (Lucas), Duchess of Newcastle. Natures Pictures Drawn by Fancies Pencil to the Live, J. Martin and J. Allstrye, 1656. Women Writers Online, Accessed 5 May 2017.

Deverell, Mary. Mary Queen of Scots; an Historical Tragedy, or, Dramatic Poem. Deverell, 1792. Women Writers Online, Accessed 5 May 2017.

Kantorowicz, Ernst A. The King’s Two Bodies: A Study in Mediaeval Political Theology. Princeton UP, 1957.

Makin, Bathusa. An Essay to Revive the Antient Education of Gentlewomen. J.D., 1673. Women Writers Online, Accessed 5 May 2017.

Murray, Judith (Sargent). The Gleaner, I. Thomas and E.T. Andrews, 1798. Women Writers Online, Accessed 5 May 2017.

Sowernam, Esther. Esther Hath Hanged Haman. Nicholas Bourne, 1617. Women Writers Online, Accessed 5 May 2017.



  1. I am indebted throughout to Mary Erica Zimmer’s contributions during an earlier collaboration when she joined me to present part of this work in progress at the Women and Culture in the Early Modern World seminar at the Mahindra Humanities Center at Harvard University in February 2017.
  2. The WWP has since published more robust mechanisms for generating full-text versions of its corpora here.
