Word Vectors for the Thoughtful Humanist

Northeastern University
July 12–16, 2021

Sarah Connell, Julia Flanders, Syd Bauman, Juniper Johnson, Ash Clark
Digital Scholarship Group, Northeastern University Libraries

Schedule

― Monday, 12 July ―

Session 1 12:30–1:50 (Eastern) Welcome, introduction and scoping (slides: HTML, TEI, notes)
1:50–2:00 break
Session 2 2:00–3:15 Conceptual orientation and walkthrough of the basics of R (slides: HTML, TEI, notes)
Session 3 3:15–4:20 A deeper look at core concepts and terms, part 1 (slides: HTML, TEI, notes)
Session 4 4:20–4:30 Wrap-up

Homework: Hands-on experimentation with the Women Writers Vector Toolkit and/or our Sandbox

― Tuesday, 13 July ―

Session 5 12:30–1:45 A deeper look at core concepts and terms, part 2 (slides: HTML, TEI, notes)
1:45–2:00 break
Session 6 2:00–4:30 Walkthrough of commented code for querying an existing model, and group hands-on practice (slides: HTML, TEI, notes)
Session 7 4:30–5:00 Troubleshooting (if needed)

Homework: Choose a term that you’re interested in, query that term in at least two different models, and make notes on your results in the Day 2 Homework document in the Group Activities folder in our shared Google Drive

― Wednesday, 14 July ―

Pre-session 11:00–12:00 Downloading R and RStudio (optional but recommended!)
Session 8 12:30–1:45 Process, part 1: Corpus and data preparation (slides: HTML, TEI, notes)
Session 9 1:45–2:15 Treasure hunt in the WWVT Sandbox! (slide: HTML)
2:15–2:30 break
Session 10 2:30–3:30 Group walkthrough of model training in RStudio (slide: HTML, TEI, notes)
Session 11 3:30–4:30 Hands-on practice: getting ready to train a model, loading your own data (slides: HTML, TEI, notes)

Homework: Train a model on your own data, varying one parameter from the defaults

― Thursday, 15 July ―

Pre-session 11:00–12:00 Downloading R and RStudio (optional but recommended!)
Session 12 12:30–1:00 Questions and review of our trained models
Session 13 1:00–2:30 Process, part 2: Parameters and validation (slides: HTML, TEI, notes)
2:30–2:45 break
Session 14 2:45–3:45 Treasure hunt, part 2 (slide: HTML)
Session 15 3:45–4:30 Group hands-on practice: setting up to train and validate another model (slides: HTML, TEI, notes)

Homework: Train another model with different parameters, develop some word pairs, and run validation code

― Friday, 16 July ―

Session 16 12:30–1:45 Hands-on practice and discussion in small groups (slides: HTML)
1:45–2:00 break
Session 17 2:00–3:15 Walkthrough demo: Exploration and analysis (slides: HTML, TEI, notes)
Session 18 3:15–4:30 Full-group discussion and wrap-up (slides: HTML, TEI)

This workshop was originally scheduled for 2020, but was postponed to 2021 due to the pandemic.

Resources

Women Writers Vector Toolkit

Our RStudio Server instance

Model training walkthroughs

Materials for Download

The resource page has links to all the slide sets (whether used in this seminar or not), interesting websites we may have shown, and useful TEI links.