Browsed by
Month: April 2021

Experiments in Tokenization for Word Embedding Models

By Laura Johnson

During my time as the 2019–2020 NULab Coordinator, I drew on my previous research experience with the Women Writers Project to build an XSLT script for tokenizing element content for the Women Writers Vector Toolkit (WWVT). The WWVT is an online laboratory for learning about and experimenting with word embedding models. It features over 20 models created using the Women Writers Online (WWO) corpus and parallel corpora created from the Text Creation Partnership collections. The WWVT models are…
