“Day of DH” Snapshots of Our Daily Lives

The Women Writers Project is proud to host our local Digital Scholarship Group “Day of DH” post this year. “Day of DH” provides an opportunity for members of the DH community to share “day in the life” vignettes with each other. For more information about “Day of DH,” please visit the official site or follow the Twitter hashtag #DayofDH. I hope these snapshots offer a fun array of some of the people, activities, and work that make up the DH community at Northeastern.

Julia Flanders, Director of the Digital Scholarship Group and the Women Writers Project

This year for “Day of DH” I had an unusually substantive day–in the past I’ve sometimes found myself trying to create an inspiring narrative about the relevance of administrative work, but today I did some genuinely digital-humanities things. My first activity was a meeting of the research group for a seedling grant that is focused on using the Women Writers Project corpus with Word2Vec. In the coming year we’ll be expanding some tools Ashley Clark developed that produce a modified version of the WWP’s TEI/XML markup from which we can then extract plain-text data to feed into the word vector analysis. The modifications handle things like hyphenated words broken across a line break (representing these as a single word for analysis purposes), or selecting the regularized-spelling option for words which the WWP has marked for regularization. The resulting output produces more meaningful results in the word vector analysis (since it doesn’t include word fragments and typographical variants). We sat down together as a group and installed the current version of Ashley’s XSLT and XQuery routines, so that as the grant work gets going we can all experiment together.
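To make the two modifications above concrete, here is a minimal sketch in Python. It is not Ashley's XSLT or XQuery, just an illustration of the same idea using the standard library. The sample sentence is invented, the elements are left un-namespaced for brevity, and the sketch assumes that end-of-line hyphens are transcribed as the soft hyphen character (U+00AD) before an `<lb/>`; the actual WWP encoding may differ.

```python
import re
import xml.etree.ElementTree as ET

SOFT_HYPHEN = "\u00ad"  # assumption: end-of-line hyphens are transcribed as U+00AD


def plain_text(elem):
    """Collect an element's text, keeping only the regularized reading of <choice>."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            # Prefer <reg> (regularized), then <corr> (corrected), if either exists.
            preferred = child.find("reg")
            if preferred is None:
                preferred = child.find("corr")
            parts.append(plain_text(preferred if preferred is not None else child))
        elif child.tag == "lb":
            parts.append("\n")  # a line break in the source becomes a newline
        else:
            parts.append(plain_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


def normalize(text):
    """Rejoin words split at a soft hyphen, then collapse runs of whitespace."""
    text = re.sub(SOFT_HYPHEN + r"\s*", "", text)
    return " ".join(text.split())


# Invented sample: one regularized spelling and one word broken across a line.
sample = (
    "<p>The <choice><orig>Empre\u017fs</orig><reg>Empress</reg></choice> "
    "was much cele\u00ad<lb/>brated.</p>"
)

print(normalize(plain_text(ET.fromstring(sample))))
# prints: The Empress was much celebrated.
```

Without these two steps, a naive strip of the markup would yield both spellings of "Empress" side by side and the fragments "cele" and "brated" as separate tokens, which is exactly the noise the word vector analysis is better off without.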

After that, the Digital Scholarship Group had its weekly staff meeting at which we discussed the recently announced NHPRC/Mellon “Digital Edition Publishing Cooperatives” funding program, and the potential it might hold for DSG. Then in the afternoon, Syd Bauman and I taught the second session of a short and intensive workshop on schema-writing with RelaxNG, for graduate students in Northeastern’s Digital Humanities Certificate program.

A good and enjoyable day with wonderful colleagues–I feel really lucky for these moments of routine productivity, amid more uncertain and threatening circumstances.

Sarah Connell, Assistant Director of the Women Writers Project and the NULab for Texts, Maps, and Networks

You can get a reasonable picture of my day by looking at “before” and “after” versions of my to-do list, combined with my calendar. Today was a fairly standard Thursday in that it was mostly meetings, with other work happening in the gaps between. On my train ride in and for the first half-hour of the day, I was able to prepare for a training session I have tomorrow and send out a scheduling notice for an upcoming meeting that the NULab faculty will be having to plan for our programming next year, which will focus on the theme of fake news and disinformation. I also checked one of our WWO texts to see if my suspicions that a semicolon really needed to be a period were correct (they were). I replied to a few emails as well (there are always emails) and I got some incremental work done in reviewing the newest set of Women Writers in Context exhibits for publication.

Then, Ashley Clark and I met with the team who will be working on a new WWP project, funded by one of Northeastern’s TIER 1 grants, to set up a prototype vector space analysis web platform for Women Writers Online. This was a fun meeting because we were getting the whole team up and running with the XSLT and XQuery transformations necessary to take encoded texts and prepare them for analysis using Ben Schmidt’s word2vec package in R. It was a good chance for me to practice walking people through these processes and, as always, there were some new wrinkles that came up, which Ashley and I will now be able to anticipate the next time we teach this. That meeting ran late, so I ended up going right into the Digital Scholarship Group team meeting (which actually just meant moving to a different seat on the couch in our media lounge).

After the DSG meeting I grabbed a bit of lunch and sent a few more emails, including a scheduling message for a meeting on using the CERES Toolkit in a class on Literature and Digital Diversity that Elizabeth Dillon and I will be teaching in the fall. I was also able to take care of a few WWP admin tasks before the next meeting—in this case, actually a workshop on RELAX NG and schema planning, the second of two sessions led by Julia Flanders and Syd Bauman. After that workshop, Julia and I had our weekly meeting, which enabled me to check off a few items on my to-do list, particularly around our planning for the DH Certificate and for the work that the WWP and other DSG & NULab projects will be doing over the summer. As often happens, I added a few new items to my to-do list as well.

Finally, it was time for a Barrs Lecture, “Senecan Inwardness and the Staging of Race in Titus Andronicus and Othello” by Curtis Perry, followed by dinner with the speaker and then a train ride home (during which I’ll probably write more emails). I’m sending this for posting prior to the lecture and I’m really looking forward to it.

And now it’s time to check off one last item on my to-do list: “Write Day of DH post.”

Sarah’s “to-do” list at the beginning of the day.
Sarah’s “to-do” list at the end of the day. At the WWP we are all amazed at everything Sarah manages every single day.

Ashley Clark, XML Applications Programmer

This morning I assisted Sarah Connell in introducing the process we use to generate full-text versions of Women Writers Project TEI. The process consists of an XSL transformation I wrote to regularize things like <choice> elements and soft hyphens—phenomena that the WWP encoders have dutifully transcribed, but the implications of which can be lost when one strips out the markup, retaining only the text content. For example, a typo transcribed as:

will, when the encoding is stripped out, appear like this:

The XSLT creates a normalized version of the WWP TEI, moving non-useful text into an attribute I’ve called ‘read’ (as in, “for this element, read ‘This’”):

which translates into this plain text version:

But! Since the original text content is preserved in `@read`, you can reconstitute it and use XPath to find the matching phrase in its original context:

`//text//p[matches(normalize-space(.), 'the Emrppre[sſ]s')]`

(Note that I haven’t yet made explicit the normalization of long-S to regular S. Ideally, the XSLT would use @read for the long-S as well, so you wouldn’t have to resort to regular expressions.)
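For readers without an XPath 2.0 processor to hand, the same search can be sketched in Python with the standard library. This is only an illustration, not part of the WWP routines: the sample document is invented (only the misspelling comes from the query above), the elements are un-namespaced, and `normalize_space` and `fold_long_s` are hypothetical helper names. The second search shows why folding long-S ahead of time would make the regular expression unnecessary.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample standing in for a WWP text.
doc = ET.fromstring(
    "<text><body>"
    "<p>Then came\n   the Emrppre\u017fs with her train.</p>"
    "<p>Nothing of interest here.</p>"
    "</body></text>"
)


def normalize_space(s):
    """Approximate XPath normalize-space(): trim and collapse XML whitespace."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip()


def fold_long_s(s):
    """Replace long s (U+017F) with a regular s."""
    return s.replace("\u017f", "s")


def paragraph_text(p):
    """Full text content of a paragraph, with whitespace normalized."""
    return normalize_space("".join(p.itertext()))


# Search 1: a character class covers both forms of s, as in the XPath above.
pattern = re.compile("the Emrppre[s\u017f]s")
regex_hits = [p for p in doc.iter("p") if pattern.search(paragraph_text(p))]

# Search 2: with long s folded first, a plain substring test is enough.
folded_hits = [
    p for p in doc.iter("p") if "the Emrppress" in fold_long_s(paragraph_text(p))
]

print(len(regex_hits), len(folded_hits))
```

Both searches find the same single paragraph in this sample.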

Lara Roberts, PhD Candidate in English

Lara’s Day of Digital *Human*ities

0930-1100 I was part of a group that transformed the WWP corpus with XSLT and XQuery to use later with the word2vec R package.
1130-1300 I went to our weekly meeting for the Early Caribbean Digital Archive. We were so excited working on prepping the website for launch that I forgot to take a picture. Instead, here’s a slide with pictures of the team members (past and present)!
1300-1600 I joined my cohort in our teeny office for our weekly work time, trying to understand data analysis through RStudio.
1600 Usually, at some point, we have to go get snacks to keep our brains fueled, before…
1630-1900 I ended the day in the always challenging and entertaining Humanities Data Analysis class.

Joanne DeCaro Afornalli, Outreach Coordinator for the Women Writers Project

After a brisk morning walk with my exceedingly energetic little puppy Brooke, I settled in to some tea and emails. I was very excited to see a congratulatory email from David Lazer, Co-Director of NULab, on a recent presentation I gave for the NULab faculty on my Digital Humanities Certificate project. Afterwards, I spent some time looking over a new contribution for our Intertextual Networks series. I’m really looking forward to sharing Cassie Childs’ upcoming post on Delarivier Manley’s Letters Written by Mrs Manley and food history. It includes some fascinating analysis of archival images from eighteenth-century recipe books and botanical guides, and the post’s images immediately struck me with their beauty and nostalgia.

My big event of the day was attending Northeastern’s Academic Honors Convocation to receive the Outstanding Graduate Student Award for Experiential Learning. The award recognizes a graduate student who has “shown an extraordinary capacity to integrate academics and professional work, and establish themselves as an emerging leader in their field.” I was highly honored to receive it, and very glad I could share the experience with my advisor Elizabeth Maddock Dillon, my Co-op coordinator Lisa Cantwell Doherty, and Marina Leslie (who so kindly nominated me for the award).

Now that I’m home for the night, I plan on making the final minor formatting touches on my master’s thesis, and then submitting it to ProQuest! My thesis, “Angelenos Incarcerated: The LA County Jail Oral History Project,” is a DH project that features the oral histories of ex-inmates told through videography, mapping, exhibits, and encoded texts (with a customized TEI schema). You can view the project’s website here.

Overall, it was a pretty big day, though not necessarily the heaviest DH day for me. I was so honored to have the multimedia and digital humanities work I do recognized in a big way today, and I was beyond grateful to have such an amazing group of women cheering me on.

Liz Polcha, PhD Candidate in English

Cara Messina, PhD Candidate in English

This morning I woke up feeling the familiar finals anxiety. Even so, I pushed myself to attend the RelaxNG workshop run by Julia Flanders. Thanks to learning the different approaches to schema building (and Julia’s excellent scaffolding and metaphors), I have begun creating a flexible XML schema that I plan to use as a pedagogical tool next semester. Learning new DH tools is the perfect form of productive procrastination!

After the workshop, I attended Ryan Cordell’s Humanities Data Analysis final class. Throughout the semester, we’ve used R to analyze our corpora; my corpus contains the metadata and actual texts of 3,000 Korra x Asami (Korrasami) fanfictions from Archive of Our Own. We went over topic modeling and classification again; Ryan encouraged us to embrace topic modeling’s lack of stability. Although most of the class revolved around discussing challenges and asking/answering questions about our struggles with R, we had a few laughs reading Day of DH Tweets and reflecting on the semester.

Bill Quinn, PhD Candidate in English

Today for DH, I worked on writing my prospectus. I wrote about how computational text analysis will help me explore intertextuality in modernist magazines. It feels really weird writing about what computers do between inputting the data and rendering the visualizations, and I am trying to figure out how some people do it so well. Fortunately, Stanley the dog was there to help out.

