Annotated Sources

This annotated bibliography organizes helpful sources into four main categories for easier navigation: introductory readings, advanced readings, background readings, and digital pedagogy readings.

The introductory readings are more approachable pieces about both word embeddings and similar corpus-based techniques. The advanced readings go into further detail about methodology, looking more specifically at the code and mathematical concepts at stake. The background readings provide a theoretical foundation for our own project’s methodology. Finally, the digital pedagogy readings are authored by practitioners who provide either a case study of or a theoretical model for incorporating digital technologies in the classroom.

Cherny, Lynn. “Visualizing Word Embeddings in Pride and Prejudice.” Ghostweather Research and Design Blog, 22 Nov. 2014.

This reading is an accessible, process-oriented reflection on training a word embedding model and presenting that model using JavaScript; Cherny demonstrates the values of experimentation and play in her reflection on creating this model. Cherny trained a word2vec model on the full corpus of Austen novels and then replaced all of the nouns in Pride and Prejudice with the most similar word for each; an associated visualization fills in the noun pairs and the path between them in two-dimensional space when they are moused over, resulting in a cumulative word cloud developed by the viewer’s exploration of terms. Cherny also discusses her results for gendered words in the Austen novel (the word most closely related to “husband” is “nerves”).
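For readers who want to try something similar, the sketch below (not Cherny’s actual code) trains a word2vec model with the Python gensim library on a plain-text file and asks for the word most similar to a given noun. The file name austen.txt and the parameter choices are illustrative assumptions.

```python
# A minimal sketch of this general approach: train word2vec on plain-text
# novels, then query the nearest neighbor of a noun. Assumes gensim 4.x and
# a hypothetical file austen.txt of pre-assembled Austen text.
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess

with open("austen.txt", encoding="utf-8") as f:
    # One tokenized "sentence" per line; real preprocessing would be more careful.
    sentences = [simple_preprocess(line) for line in f if line.strip()]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, seed=42)

# The single nearest neighbor, analogous to the noun substitutions described above.
print(model.wv.most_similar("husband", topn=1))
```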

Gavin, Michael. “The Arithmetic of Concepts: A Response to Peter de Bolla.” Modeling Literary History, 18 Sept. 2015.

Gavin was inspired by Peter de Bolla’s Architecture of Concepts (Fordham University Press, 2013), which showed how the conceptual meaning “of the term ‘rights’ shifted over the eighteenth century.” De Bolla used manual keyword searches in his research, so Gavin created this response to outline computational approaches to the same research question. He uses the EEBO-TCP corpus as a point of comparison for the 18 uses of “rights” in John Locke’s Two Treatises of Government (1690). The first few sections provide a useful overview of the basics of semantic analysis using word vectors (although it is worth noting that in a follow-up Gavin acknowledges a lack of technical understanding about the difference between “syntagmatic” and “paradigmatic” similarity in semantic models). From this overview, the essay moves into analysis using Gavin’s own R package, “Empson.” The result of his quantitative semantic analysis ultimately confirms the ambiguity of the term “rights” (a claim shared by de Bolla and earlier scholars). The excitement over these computational methods is therefore not always about new discoveries, but rather about the parallels between “concepts” that appear through mathematical manipulation of the vector space and the traditional linguistic theory of concepts as patterns of word usage across human networks.
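As a toy illustration of the kind of vector arithmetic Gavin describes (plain Python with NumPy, not his Empson package), the sketch below computes cosine similarity between invented word vectors and between one word and the sum of two others; the three-dimensional vectors are hypothetical stand-ins for corpus-derived ones.

```python
# Toy "arithmetic of concepts": cosine similarity between word vectors,
# and the composite vector produced by adding two vectors together.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-dimensional vectors standing in for corpus-derived ones.
rights = np.array([0.8, 0.1, 0.3])
liberty = np.array([0.7, 0.2, 0.4])
property_ = np.array([0.2, 0.9, 0.1])

print(cosine(rights, liberty))              # higher: more similar contexts
print(cosine(rights, property_))            # lower: less similar contexts
print(cosine(rights, liberty + property_))  # similarity to a composite "concept"
```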

Heuser, Ryan. “Word Vectors in the Eighteenth Century, Episode 1: Concepts.” Virtue and the Virtual, 14 Apr. 2016.

Heuser introduces the subject of word vectors in this first blog post and includes a handful of detailed analyses of 18th-century prose to exemplify how word vector analysis can be useful for scholarship on early texts. With an accessible rhetorical framing that emphasizes his own developing understanding of the methodologies at stake, Heuser’s piece clarifies both pragmatics and core concepts of working with these models.

Heuser, Ryan. “Word Vectors in the Eighteenth Century, Episode 2: Methods.” Virtue and the Virtual, 1 June 2016.

Heuser’s second post explains the conceptual logic behind word vectors. Within a model, the similarity of two words is a result of how and where they occur together across the input data, and thus word embedding models take into account contextual information and semantic similarities or differences. The process by which the words become represented as vectors and then embedded into the model differs across algorithms and software, and Heuser acknowledges that the opacity of the underlying math is one of the downfalls of word embedding models. To illustrate how word vector models function, he outlines the semantic relationship between Queen/Woman and King/Man, then their vector relationship, to reveal how and why the mathematical process performed by the vector algorithms produces the same result as human logic.
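The King/Man/Woman/Queen example Heuser uses can be written out directly as vector arithmetic. The sketch below assumes a trained gensim Word2Vec model named model (for instance, one trained as in the earlier sketch) whose vocabulary contains these words; a small literary corpus will not necessarily reproduce the textbook result.

```python
# Analogy by vector arithmetic: king - man + woman ≈ queen
# (assumes `model` is a trained gensim Word2Vec model containing these words).
result = model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # ideally something close to [("queen", ...)]

# The same operation spelled out with the raw vectors.
analogy_vector = model.wv["king"] - model.wv["man"] + model.wv["woman"]
print(model.wv.similar_by_vector(analogy_vector, topn=3))
```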

Klein, Lauren F. “The Image of Absence: Archival Silence, Data Visualization, and James Hemings.” American Literature, vol. 85, no. 4, Jan. 2013, pp. 661–88.

Although Klein does not use word vector models specifically, this piece provides a valuable framework for making discursive reading practices explicit when creating data models and visualizations. Klein addresses archival silence, or the erasure of particular narratives and people in archival work, to make visible the presence of James Hemings, Sally Hemings’s brother and a slave owned by Thomas Jefferson, in Jefferson’s digitized letters. Klein presents the steps she took to create a social network visualization that captures a more nuanced representation of the relationships between the people mentioned in this correspondence.

Recchia, Gabriel. “‘Numberless Degrees of Similitude’: A Response to Ryan Heuser’s ‘Word Vectors in the Eighteenth Century, Part 1.’” Gabriel Recchia, 11 June 2016.

Recchia begins with an explanation of the predecessors of algorithmic vector space models and an articulation of the caution that digital humanists should know how to interpret findings from word vector models, what information is conveyed by transforming language into statistics, and when and where that statistical representation would not be valid. Recchia then defines four different types of models and training methods: count-based models, random vector models, the “continuous bag of words” algorithm, and the “skip-gram” algorithm. The bulk of Recchia’s post examines the same relationships between “genius,” “learning,” “virtue,” and “riches” that Ryan Heuser explores in “Word Vectors in the Eighteenth Century” (see above) in order to elucidate the complexities and limits of word vector analysis. Ultimately, Recchia concludes that computational methods can help focus research, but that findings should be confirmed by close-reading methods.
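To make the last two of Recchia’s categories concrete, the sketch below shows how the continuous-bag-of-words and skip-gram objectives are selected in the gensim implementation of word2vec (the sg flag); sentences is assumed to be tokenized text prepared as in the earlier sketches.

```python
# CBOW vs. skip-gram in gensim: sg=0 trains continuous bag of words
# (predict a word from its context), sg=1 trains skip-gram (predict the
# context from a word). `sentences` is assumed to be a list of token lists.
from gensim.models import Word2Vec

cbow_model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, sg=0)
skipgram_model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, sg=1)

# Because the training objectives differ, the neighbor lists can diverge.
print(cbow_model.wv.most_similar("virtue", topn=5))
print(skipgram_model.wv.most_similar("virtue", topn=5))
```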

Schmidt, Ben. “Vector Space Models for the Digital Humanities.” Ben Schmidt, 25 Oct. 2015.

In this post, Schmidt provides a few major reasons why word embedding models are so useful for a deeper analysis of word relationships. This piece clearly defines what word embedding models are and walks through several examples that help to clarify what is involved in both training and querying these models.

Schmidt, Ben. “Rejecting the Gender Binary: A Vector-Space Operation.” Ben Schmidt, 30 Oct. 2015.

In this post, Schmidt uses word embedding models to isolate the vectors associated with gendered words in teaching reviews from the website Rate My Professors. He also uses a process called “vector subtraction” or “vector rejection,” which he defines as “building a new vector space from the old by transforming each element to no longer have any directionality along the vector that separates male from female.” This reading is useful because it demonstrates how textual analysis can make social biases in data explicit, provides a step-by-step walkthrough (with code snippets) of how Schmidt created this model, and explains the results.
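The core operation Schmidt describes, removing a vector’s component along a gendered direction, can be sketched in a few lines of NumPy. This is an illustration of vector rejection in general, not Schmidt’s own code; the example vectors are invented, and in practice the gender direction would come from a trained model (for example, the difference between the vectors for a gendered word pair).

```python
# Vector rejection: remove from a word vector its component along a
# "gender" direction, leaving only the part orthogonal to that direction.
import numpy as np

def reject(v, direction):
    """Return v with its projection onto `direction` removed."""
    unit = direction / np.linalg.norm(direction)
    return v - np.dot(v, unit) * unit

# Hypothetical vectors; in practice these would come from a trained model,
# e.g. gender_direction = model.wv["he"] - model.wv["she"].
gender_direction = np.array([1.0, 0.0, 0.5])
word_vector = np.array([0.4, 0.9, 0.2])

degendered = reject(word_vector, gender_direction)
print(np.dot(degendered, gender_direction))  # ~0: no remaining gender component
```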

“How Does Word2vec Work? Can Someone Walk through a Specific Example?” Quora, 20 Oct. 2014.

This Quora thread provides a multiplicity of answers and a well-rounded explanation of the word2vec algorithm. Omer Levy gives a brief overview, Ajit Rajaskharan provides detailed commentary on the source code, Borislav Agapiev elaborates on the probability statistics behind the code, Stephan Gouws explains the geometrical theories of vector space, and Abhishek Patnia clarifies the underlying assumptions from a Natural Language Processing (NLP) perspective. Overall, this is an especially useful, publicly available resource for non-computer scientists to gain an understanding of the algorithm itself.

Allison, Sarah, et al. “Quantitative Formalism: An Experiment.” Stanford Literary Lab Pamphlet, vol. 1, Jan. 2011.

This study provides a brief explanation of how algorithms understand language, the limits to that computational understanding, and the possibilities for digital humanists to put it to work. The authors are interested in whether formal features of literary works (like genre) can be determined via quantitative methods. The two tools used here are Docuscope, which performs an unsupervised factor analysis using Language Action Types (LATs), a smart dictionary that can determine a word’s function, and a complementary analysis of Most Frequent Words. The researchers concluded that language and style were not enough to delimit one genre from another, and, though the systems tracked features that make one genre different from another, this tells us little about a form’s inner structure.

Gagliano, Andrea, et al. “Intersecting Word Vectors to Take Figurative Language to New Heights.” Presented at the Fifth Workshop on Computational Linguistics for Literature, 2016.

Gagliano et al. use word2vec to create a model of metaphors to better develop systems for understanding how figurative language is used in poetry. Because metaphors are created by connecting words or concepts to represent other words and concepts, a statistical model of metaphors should create a set of connector words that articulates the “figurative relationship” between two word pairs. The authors explain how they were able to model connector words using word pairs and basic word vector functions like addition, intersection, and subtraction. They also use a case study to show how they might qualitatively analyze the results of this method. This article demonstrates how an intimate understanding of word embedding models can lead to different approaches to exploring relationships among words.
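A very rough approximation of the paper’s idea, assuming a trained gensim Word2Vec model named model: ask for vocabulary items whose vectors sit near both words of a hypothetical tenor/vehicle pair. This collapses the authors’ intersection method into gensim’s built-in most_similar query, so it is a simplification rather than a reimplementation.

```python
# Candidate "connector" words for a hypothetical metaphor pairing "heart" and
# "stone": words whose vectors are close to the combination of the two
# (assumes `model` is a trained gensim Word2Vec model containing both words).
candidates = model.wv.most_similar(positive=["heart", "stone"], topn=10)
for word, score in candidates:
    print(f"{word}\t{score:.3f}")
```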

Jockers, Matthew, and Julia Flanders. “A Matter of Scale.” Keynote for the Boston Area Days of Digital Humanities Conference, 2013.

This piece is a transcription of a debate during the Boston Area Days of Digital Humanities Conference at Northeastern University in 2013. The discussion centers on the idea of scale and how it is integral to understanding digital humanities research. The presentation slides and dialogue between the two speakers underline the interconnection and interdependence of macro- and microscopic views of DH projects.

Ramsay, Stephen. “The Hermeneutics of Screwing Around; or What You Do with a Million Books.” Pastplay: Teaching and Learning History with Technology, edited by Kevin Kee, University of Michigan Press, 2014.

The least technical of these essays, Ramsay’s piece focuses on the notion that there have always been too many books for humans to read in a lifetime, and thus there have always been lists suggesting which books are worth reading. Ramsay sees these lists as paths through culture, or, in other words, a way for people to place a book within a network of their known associations before they read it. Searching (say, via Google) will also reify this known network of associations, whereas the act of browsing (as in a library) allows a person to uncover unknown associations. Ramsay hopes for an algorithmic cataloguing of digitized books that allows for browsing, or “screwing around.” However, Ramsay acknowledges that humanists are concerned with shared culture, especially in the public sphere, and hopes that digitization and algorithmic efforts can balance these different paths through culture.

Rawson, Katie, and Trevor Muñoz. “Against Cleaning.” Curating Menus, July 2016, curatingmenus.org.

Rawson and Muñoz argue that the term “data cleaning” makes opaque the actual process of data transformation: cleaning implies there is an underlying standard order that needs to be discovered, and that all a practitioner has to do is “clean” the “messy” data. They also argue that practitioners who approach data transformation through this framing may not be thinking critically about how they are transforming and standardizing the data, including what valuable data they might be removing. Rawson and Muñoz advocate for centering diversity in data, making the data transformation process transparent through index-making, sharing “messy” data, and bringing communities who are directly affected by or have an interest in the data into conversations about data transformation.

Schöch, Christof. “Big? Smart? Clean? Messy? Data in the Humanities.” Journal of Digital Humanities, vol. 2, no. 3, 2013.

This reading provides an introduction to defining and understanding “data” in the humanities. Schöch proposes two types of data in the humanities: “smart data” and “big data.” Smart data is data that has been transformed from the original form in which it was collected; Schöch proposes TEI-encoded documents as an example. He also argues that “big data” represents more of a paradigm shift in humanistic inquiry, in which, instead of looking at just a few texts, “macroanalysis” (also called distant reading, corpus analysis, etc.) can be performed on an entire corpus. Schöch concludes by advocating for what he calls “smart big data”: better quantitative processes for performing humanistic inquiry.

Witmore, Michael. “Text: A Massively Addressable Object.” Debates in the Digital Humanities, 2012 ed., 2012.

Witmore argues that the feature distinguishing digital texts from their physical counterparts is their ability to be examined, or “addressed,” at a wide variety of scales and levels of abstraction. These levels are numerous: to look at a text or texts through a single word, a folio-style book, or a genre is to look according to different, flexible scales of abstraction. Although this addressability is not unique to the digital text, the ease and speed with which digital texts can be queried at these scales is.

Christian-Lamb, Caitlin, and Anelise Hanson Shrout. “‘Starting From Scratch’? Workshopping New Directions in Undergraduate Digital Humanities.” Digital Humanities Quarterly, vol. 11, no. 3, 2017.

This article reports on a workshop about developing undergraduate DH curricula and courses held at the Alliance of Digital Humanities Organizations’ conference. The workshop discussed and produced the patterns of a successful undergraduate DH curriculum that centers student agency: these curricula typically a) emphasize collaboration, b) are housed in traditional spaces that integrate liberal arts pedagogy, and c) are highly flexible. The authors push against terms like “digital native” and “apprentice-researcher” because these models do not accurately capture students’ relationship to technology and can foster misconceptions about DH classrooms on an institutional level. This article demonstrates what digital humanists value in undergraduate education.

Davis, Rebecca Frost, et al., editors. Digital Pedagogy in the Humanities: Concepts, Models, and Experiments. Modern Language Association Commons, 2016.

This piece is a collection of concepts and applied examples from well-known practitioners about incorporating the digital into their pedagogy. These concepts range from artifact-specific terms such as “Visualizations” and “ePortfolios”; to pedagogy-inspired terms like “Collaboration” and “Assessment”; to justice-centered concepts such as “Race” and “Queer.” Terms such as “Visualization,” “Code,” and “Text Analysis” are particularly relevant for this project.

Harris, Katherine D. “Play, Collaborate, Break, Build, Share: ‘Screwing Around’ in Digital Pedagogy.” Polymath: An Interdisciplinary Arts & Sciences Journal, vol. 3, no. 3, 2013.

This article is a useful reflection and case study that provides an accessible set of guidelines and questions for instructors who want to teach digital humanities and integrate digital pedagogy into their classrooms. Harris explores the intersections between values in digital pedagogy and digital humanities: collaboration, playfulness, “students” vs. “learners,” and more. Harris then demonstrates how she applies these concepts in her undergraduate and graduate classes as well as how she assesses learners.

Sayers, Jentery. “Tinker-Centric Pedagogy in Literature and Language Classrooms.” Collaborative Approaches to the Digital in English Studies, edited by Laura McGrath, Utah State University Press: Computers and Composition Digital Press, 2011.

Sayers advocates for a tinker-centric pedagogy, a learning style that centers around playing, collaborating, and venturing into unfamiliar territories of knowledge. In order to show the effectiveness and power of this pedagogy, Sayers provides examples and rationales from his own classrooms. Sayers stresses the importance of assigning “change logs,” or reflections that student writers construct about how their “big ideas” change “from experiment to experiment” (285). The rationales behind each assignment/lesson and the encouragement for researchers and teachers to continue studying the implementation of these assignments offer a compelling justification for welcoming play in pedagogy.

Singer, Kate. “Digital Close Reading: TEI for Teaching Poetic Vocabularies.” The Journal of Interactive Technology and Pedagogy, vol. 3, May 2013.

This article provides one approach to using TEI in the classroom, focused on individual TEI documents and encoding practices. Singer uses TEI as a pedagogical tool to encourage slow, deliberate, and close reading of poetry. Using TEI invites students to understand poetic terminology and to apply it while reading. Singer provides a narrative for how she has incorporated TEI in the classroom, how she plans to revise and play with the structure of her teaching, and what her assignments and daily lessons during this particular class looked like. Classroom discussions also incorporate visualizations of the student writers’ encoded documents, which allows student writers/encoders to engage with their classmates’ readings and textual interpretations.