Navigating the Women Writers Vector Toolkit
We imagine this space as one designed to foster curiosity and exploration. Because this project is built on accessibility and transparency for users with varying levels of text analysis experience, the Women Writers Vector Toolkit:
- provides the Word Vector Interface for researchers, instructors, and learners to explore;
- shows our processes for transforming and regularizing TEI data by publishing a detailed methodology;
- demonstrates teaching implementations by providing sample assignments and suggested searches;
- shares resources that include an annotated readings list and a glossary; and
- publishes case studies that demonstrate some research and exploration possibilities for word embedding models.
See our site map for an outline of the webpages within the WWVT.
Word Vector Interface
This interface allows you to query terms in word2vec models that were trained on texts from Women Writers Online, the Victorian Women Writers Project, and Early English Books Online. On the main page, you can query a word in any of these models to see which words are “closest” to it in vector space, that is, which words are likeliest to be used in similar contexts. The “Clusters” tab lets you explore groups of neighboring words in vector space: words that are used in similar contexts are clustered together. The “Operations” tab allows you to do “vector math,” such as adding or subtracting the contexts of words; this tab also lets you construct analogies. The “Visualizations” tab allows you to create a word cloud or a scatterplot for a query term. These visualizations display the query term together with the words that surround it in vector space.
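As a rough illustration of what “closest” and “vector math” mean in the paragraph above, here is a minimal Python sketch. It is not the interface's own code. The five three-dimensional vectors and the helper names `cosine`, `closest`, and `analogy` are invented for this example, and it assumes cosine similarity as the measure of closeness, which is the usual choice for word2vec models.

```python
import math

# Toy vectors invented for illustration only. Real word2vec models
# learn vectors with hundreds of dimensions from the corpus.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "crown": [0.8, 0.5, 0.5],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest(vec, n=3, exclude=()):
    """Return the n words whose vectors are most similar to vec."""
    scored = [
        (word, cosine(vec, other))
        for word, other in VECTORS.items()
        if word not in exclude
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by computing b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])
    ]
    # The input words are excluded so the answer is a new word.
    return closest(target, n=1, exclude=(a, b, c))[0][0]


# Nearest neighbor of "king", leaving out the query word itself.
print(closest(VECTORS["king"], n=1, exclude=("king",))[0][0])  # crown

# "man is to king as woman is to ?"
print(analogy("man", "king", "woman"))  # queen
```

With these made-up vectors, the nearest word to “king” is “crown,” and subtracting “man” from “king” and adding “woman” lands nearest to “queen.” Results from a trained model depend entirely on the corpus it was trained on.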
Resources
Under the Resources tab, you can find an introduction page that discusses core concepts, provides a brief background on the development of word embeddings, and explains how word embedding models are implemented in the Women Writers Vector Toolkit (WWVT). This introduction is not exhaustive; for more information about any of the topics it covers, please visit our annotated readings list. The same tab also holds a glossary of key terminology. To see examples of research with word embedding models, you can read several case studies that use the Word Vector Interface. If you would like to train a model of your own, there are code walkthrough notebooks that provide code and instructions for training and querying models with the wordVectors R package. Some of the notebooks are designed for RStudio Server, others for RStudio Desktop, and some can be used with either.
Teaching Implementations
One of the goals of the Women Writers Vector Toolkit is to make vector analysis of early women’s texts more accessible in the classroom. We offer tools and materials to help instructors use machine learning methods more effectively in their teaching. In the teaching exploration sections, you’ll find a database of assignments created for the WWVT. The assignments are written for different levels of familiarity with the corpus, the exploration interface, and word vector modeling more generally, and they include a variety of classroom-based activities and take-home assignments.
Methodology
To learn more about how the team created the models used in the Word Vector Interface, visit the Methodologies section on the Toolkit site. This section also describes the processes needed to prepare text corpora for model training.
Explore the Women Writers Online corpus using the Word Vector Interface!
Learn about text analysis through a lens crafted for users with a humanities background.
Read some of the case studies written about researchers’ experiences with the project’s corpora and tools.
Discover assignments and learning activities for teaching these models in a classroom setting.