Word Vectors for the Thoughtful Humanist

Northeastern University
July 12–16, 2021

Sarah Connell, Julia Flanders, Syd Bauman, Ash Clark
Digital Scholarship Group, Northeastern University Libraries

Schedule

― Monday, 12 July ―

Session 1 12:30–1:50 Welcome, introduction and scoping (slides: HTML, TEI)
1:50–2:00 break
Session 2 2:00–3:15 Conceptual orientation and walkthrough of the basics of R
Session 3 3:15–3:45 Hands-on experimentation with the Women Writers Vector Toolkit
Session 4 3:45–4:30 A deeper look at core concepts and terms, part 1 (slides: HTML, TEI, notes)

Homework: Experiment further with the Women Writers Vector Toolkit (WWVT)!

― Tuesday, 13 July ―

Pre-session on downloading R and RStudio 11:00–12:00 (optional but recommended!)
Session 5 12:30–1:45 A deeper look at core concepts and terms, part 2 (slides: HTML, TEI, notes)
1:45–2:00 break
Session 6 2:00–4:30 Walkthrough of commented code for querying an existing model, and group hands-on practice
Session 7 4:30–5:00 Troubleshooting (if needed)

― Wednesday, 14 July ―

Pre-session on downloading R and RStudio 11:00–12:00 (optional but recommended!)
Session 8 12:30–1:45 Process, part 1: Corpus and data preparation (slides: HTML, TEI, notes)
Session 9 1:45–2:15 Treasure hunt!
2:15–2:30 break
Session 10 2:30–3:30 Group walkthrough of model training in R
Session 11 3:30–4:30 Hands-on practice: getting ready to train a model, loading your own data

Homework: Train a model on your own data, varying one parameter from the defaults

― Thursday, 15 July ―

Session 12 12:30–1:00 Quick questions and review of our trained models
Session 13 1:00–2:30 Process, part 2: Parameters and validation (slides: HTML, TEI, notes)
2:30–2:45 break
Session 14 2:45–3:45 Working with our trained models
Session 15 3:45–4:30 Group hands-on practice: setting up to train and validate another model

Homework: Train another model, and run validation code

― Friday, 16 July ―

Session 16 12:30–2:00 Hands-on practice and discussion in small groups
2:00–2:15 break
Session 17 2:15–3:15 Group walkthrough: Visualizing semantic spaces
Session 18 3:15–4:30 Full-group discussion and wrap-up (slides: HTML, TEI)

This workshop was originally scheduled for 2020, but was postponed to 2021 due to the pandemic.

Resources

Women Writers Vector Toolkit

Model training walkthroughs

Materials for Download

The resource page has links to all the slide sets (whether used in this seminar or not), interesting websites we may have shown, and useful TEI links.