Session 14: Treasure hunt

Your goal is to examine the “experimental” models in the Sandbox or RStudio Server and build a forensic case for how each one was prepared: which specific properties or limitations can you trace back to its data preparation or training? Feel free to chat as a group and share notes and ideas. Try looking at clusters, running a few queries (especially comparing results between these models and the “WWO Full Corpus” model), and testing some of the operations, such as additions, subtractions, and analogies. You can even run some of our validation scenarios in RStudio Server, if you’re feeling adventurous.
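
If it helps to see what the query comparisons and vector operations are doing underneath, here is a minimal sketch in plain Python. It is not the workshop's code (the hands-on work happens in the Sandbox or in R on RStudio Server), and everything in it is an assumption made for illustration: the two tiny "models", their three-dimensional vectors, and the helper names `cosine`, `combine`, and `closest_to` are all hypothetical. It ranks words by cosine similarity, builds an analogy by adding and subtracting vectors, and runs the same query against two models so the neighbor lists can be compared.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def combine(model, plus=(), minus=()):
    """Add the vectors for `plus` words and subtract those for `minus` words."""
    dims = len(next(iter(model.values())))
    out = [0.0] * dims
    for word in plus:
        out = [o + x for o, x in zip(out, model[word])]
    for word in minus:
        out = [o - x for o, x in zip(out, model[word])]
    return out


def closest_to(model, target, n=3, exclude=()):
    """Return the n words whose vectors are most similar to `target`."""
    scored = [
        (word, cosine(vec, target))
        for word, vec in model.items()
        if word not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:n]]


# Hypothetical toy vectors: invented numbers, not taken from any real model.
full_model = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "crown": [0.8, 0.4, 0.4],
}

# A second toy model in which only the vector for "king" has shifted,
# standing in for a model prepared or trained differently.
experimental_model = dict(full_model)
experimental_model["king"] = [0.2, 0.9, 0.1]

# Analogy: king - man + woman, leaving the query words out of the results.
analogy_vector = combine(full_model, plus=("king", "woman"), minus=("man",))
analogy_result = closest_to(
    full_model, analogy_vector, n=1, exclude={"king", "man", "woman"}
)

# The same nearest-neighbor query against both models.
full_neighbors = closest_to(full_model, full_model["king"], n=2, exclude={"king"})
experimental_neighbors = closest_to(
    experimental_model, experimental_model["king"], n=2, exclude={"king"}
)

print("analogy:", analogy_result)
print("full model:", full_neighbors)
print("experimental model:", experimental_neighbors)
```

A change in the order or membership of the neighbor lists between two models is the kind of evidence worth writing down, along with a guess about which preparation or training choice could explain it.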

You have two options for your exploration:

For either option, make a list of the evidence you can find about data preparation or model training, and develop some notes about how these choices seem to be impacting the models.

Word Vectors: Hands-on Practice and Group Work, Intensive Pedagogy-focused, slide 9 of 11