… Session 10: Small groups, hands-on treasure hunt

Your goal is to examine the “experimental corpus” models in the sandbox and build a forensic case about how each corpus was prepared, identifying specific properties or limitations of that preparation. If you have a corpus of your own in the sandbox, you can include it in your investigation too. Feel free to chat as a group and share notes and ideas as you explore.

Look at two or three of the models: what can you determine about the choices that were made in corpus preparation? What are the specific clues you can find by exploring the models? Look for evidence related to one or two of the points below:

Make a list of the evidence you can find about data preparation, and develop some notes about how these data preparation choices seem to be impacting the models.
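
One possible way to start gathering evidence is to scan a model's vocabulary for tokens that a given cleaning step would normally have removed. The sketch below is only an illustration, not part of the workshop materials: the function name `preparation_clues` and the `sample_vocab` list are hypothetical, and it assumes you have already exported the model's vocabulary as a plain list of strings.

```python
import string

def preparation_clues(vocab):
    """Count vocabulary tokens that hint at how a corpus was cleaned.

    `vocab` is assumed to be a list of word strings exported from a model.
    """
    clues = {
        "has_uppercase": 0,    # suggests case was not folded to lowercase
        "has_digit": 0,        # suggests numbers were kept
        "has_punctuation": 0,  # suggests punctuation was not fully stripped
        "has_underscore": 0,   # may suggest phrases were joined into one token
    }
    for token in vocab:
        if any(ch.isupper() for ch in token):
            clues["has_uppercase"] += 1
        if any(ch.isdigit() for ch in token):
            clues["has_digit"] += 1
        if any(ch in string.punctuation for ch in token):
            clues["has_punctuation"] += 1
        if "_" in token:
            clues["has_underscore"] += 1
    return clues

# Hypothetical vocabulary, made up for illustration only.
sample_vocab = ["the", "The", "london", "1665", "well-being", "new_york", "plague,"]
report = preparation_clues(sample_vocab)
print(report)
```

Counts near zero in a category suggest that the corresponding cleaning step was applied; high counts suggest it was skipped. Either way, treat the numbers as a prompt for closer inspection of individual words rather than as proof.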

Some hints and things to think about:

Word Vectors: Hands-on Practice and Group Work, slide 5 of 16