Word Vectors for the Thoughtful Humanist

Northeastern University
May 16–20, 2022

Sarah Connell, Julia Flanders, Syd Bauman, Juniper Johnson, Ash Clark
Digital Scholarship Group, Northeastern University Libraries


― Monday May 16 ―

Session 1 12:30–1:50 (Eastern) Welcome, introduction and scoping (slides: HTML, TEI, notes)
1:50–2:00 break
Session 2 2:00–3:00 Conceptual orientation and walkthrough of the basics of R (slides: HTML, TEI, notes)
3:00–3:10 break
Session 3 3:10–4:10 A deeper look at core concepts and terms, part 1 (slides: HTML, TEI, notes)
Session 4 4:10–4:50 Pedagogical showcase (slides: HTML, TEI, notes)
Wrap-up 4:50–5:00

Homework: Hands-on experimentation with the Women Writers Vector Toolkit and/or our Sandbox

― Tuesday May 17 ―

Session 5 12:30–1:45 A deeper look at core concepts and terms, part 2 (slides: HTML, TEI, notes)
1:45–2:00 break
Session 6 2:00–3:00 Walkthrough of commented code for querying an existing model (slides: HTML, TEI, notes)
Session 7 3:15–4:30 Group hands-on practice (slides: HTML)
Troubleshooting 4:30–5:00 Optional session

Homework: Choose a term that you’re interested in, query that term in at least two different models, and make notes on your results in the Day 2 Homework document in the Group Activities folder in our shared Google Drive. Also, please read through the sample curricular materials and make notes on one or two learning goals for word embeddings in your classroom. On Wednesday, we’ll be training our own models, so aim to have a corpus of no more than about 4 million words to work with, either from your own project or as a subset of the test corpora.

― Wednesday May 18 ―

Pre-session 11:00–12:00 Downloading R and RStudio (optional but recommended!)
Session 8 12:30–1:00 Group discussion of sample assignments and learning goals (slide: HTML)
Session 9 1:00–2:15 Process, part 1: Corpus and data preparation (slides: HTML, TEI, notes)
2:15–2:30 break
Session 10 2:30–3:30 Group walkthrough of model training in RStudio (slides: HTML, TEI, notes)
3:30–3:45 break
Session 11 3:45–5:00 Hands-on practice: getting ready to train a model, loading your own data (slides: HTML, TEI, notes), exporting results (slide: HTML)

Homework: Train a model on your own data, varying one parameter from the defaults

― Thursday May 19 ―

Pre-session 11:00–12:00 Downloading R and RStudio (optional but recommended!)
Session 12 12:30–1:45 Group annotation and discussion of pedagogical artifacts (slide: HTML)
1:45–2:00 break
Session 13 2:00–3:00 Process, part 2: Parameters and validation (slides: HTML, TEI, notes)
3:00–3:15 break
Session 14 3:15–4:00 Treasure hunt (slide: HTML)
Session 15 4:00–5:00 Group hands-on practice: setting up to train and validate another model (walkthrough slides: HTML, TEI, notes; hands-on slide: HTML)

Homework: Train another model with different parameters, develop some word pairs, and run validation code

― Friday May 20 ―

Pre-session 11:00–12:00 Tools and tactics for full-text corpus exploration (optional but recommended!)
Session 16 12:30–1:45 Small group discussion of syllabi and course design (slide: HTML)
1:45–2:00 break
Session 17 2:00–3:15 Walkthrough demo: Exploration and analysis (slides: HTML, TEI, notes)
3:15–3:30 break
Session 18 3:30–4:30 Full-group discussion and wrap-up (slides: HTML, TEI)


Women Writers Vector Toolkit

Model training walkthroughs

Model training walkthroughs, web-friendly versions

Materials for Download

The resource page has links to WWP tutorials and slides, interesting websites we may have shown, and useful TEI links.