Shakespeare's plays reveal his psychological signature

I think this was quite interesting, and I don't doubt the science behind it is well established, but one or two things did raise some questions for me. While they gesture towards a number of studies backing their methodologies, I didn't see much discussion of the differences between early modern English and modern English - which might be trivial in some senses, but might also require a bit of calibration to make sure there aren't erroneous assumptions about their data set.

That seems particularly true of something like their "content word" measure. The standard dictionary they use to assess these words (Pennebaker, Booth, & Francis, 2007) describes itself as being based on "Roget's Thesaurus, and standard English dictionaries". I'm unclear as to whether these include historical-contextual sources in order to give a more accurate assessment of seventeenth-century English - if not, this could be a serious problem in their work, though it is obviously only one of the measures they looked at. The fact that the authors chose to render each word "through software designed specifically to convert idiosyncratic and outmoded spellings to their U.S. equivalents" makes me extremely alarmed - anyone looking for anything in an early modern text needs to be extremely conscious of the multiple puns that come to bear on a word through its spelling - in Shakespeare perhaps more than most writers.
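To make the spelling worry concrete, here's a toy sketch in Python. Everything in it is my own assumption for illustration: the `NORMALISE` table and the `CATEGORIES` dictionary are made up, and are not the study's software or the actual Pennebaker et al. dictionary. It only shows how picking a single modern spelling before a dictionary lookup decides which sense gets counted.

```python
from collections import Counter

# Hypothetical spelling-normalisation table (NOT the study's software).
# Each early modern spelling is forced onto exactly one modern word.
NORMALISE = {"sonne": "sun", "travaile": "travel", "hart": "heart"}

# Hypothetical category dictionary (NOT the real dictionary the study used).
CATEGORIES = {
    "sun": {"nature"},
    "son": {"family"},
    "travel": {"motion"},
    "travail": {"work"},
    "heart": {"body", "emotion"},
    "hart": {"animal"},
}

def count_categories(tokens, normalise=True):
    """Count dictionary categories, optionally normalising spellings first."""
    counts = Counter()
    for tok in tokens:
        word = tok.lower()
        if normalise:
            word = NORMALISE.get(word, word)
        for cat in CATEGORIES.get(word, ()):
            counts[cat] += 1
    return counts

tokens = ["sonne", "travaile", "hart"]
with_norm = count_categories(tokens, normalise=True)
without_norm = count_categories(tokens, normalise=False)
```

In this toy case the normalised pass counts "sonne" only as sun, "travaile" only as travel and "hart" only as heart, so the son, travail and deer senses never register. The unnormalised pass recognises just "hart", and misses the other two words entirely. Neither result captures a pun that leans on both senses at once, which is the calibration problem I mean.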

There's also a slight issue with the actual texts that they used. They note that they've excluded collaborations, yet they include Measure for Measure and Macbeth (which show at least some evidence of Middleton's hand, perhaps as a reviser). Henry VI Part 1 is another one, though I'm less up to date on exactly what the scholarship is on that one.

Talking about texts, my major, major issue with their use of Shakespeare is that they haven't given a source for the texts of any author's works. In fact, the only text by Shakespeare they cite is a Complete Works from 1890 - which makes me slightly worried about the quality of the texts they're working with. Which versions of the plays did they use? Q1 and F Hamlet look very different - and though it's only a single play, there are a lot of these minor quibbles that might add up to something. The lack of discussion on this point makes it difficult to know either way.

All in all, it's good stuff. That said, there are some questions about some of the assumptions in their approach to EM English - though I only took a brief read through and may have missed something. There would also need to be a much bigger sample size to say anything concrete - there may well be another 17th/18th century writer who looks like a similarly good, or better, candidate. Still, an interesting (potential) tool in the arsenal of stylometric work.

/r/shakespeare Thread Link - phys.org