Category Archives: Elec. Textual Analysis

Review: Dave Lordan’s First Book of Frags

Experimental texts pose something of a quandary for electronic textual analysis, in that they tend to abandon the statistical regularities required to form an authorial signature. Computational stylistics, for all its analytic diversity, is utterly dependent on the integrity of that signature if it is to be used as an approach to analysis. If you want to see why computational stylistics is the realm of digital humanists and not purebred statisticians or computer scientists, run Finnegans Wake through R. Of course, the nature of experimental texts, while problematic for any analysis based on computational linguistics, also presents the opportunity for textual explorations of a refreshingly unpredictable kind – it is in experimental works that the digital humanist can hope to produce results that are truly unexpected, even if the unexpected is precisely that which is expected.

Enter Dave Lordan, and the wonderfully crafted First Book of Frags, his recent collection of experimental short stories. At first I had intended to offer a traditional review of the text, but these will undoubtedly be in plentiful supply, and with time against me and my curiosity piqued at the prospect of running a brand new experimental text through the digital gauntlet, I couldn’t resist taking a computational approach. This decision was of course influenced by the fact that this is a collection of experimental short stories – 16 unique segments – mouth-watering to a cluster fiend such as myself.

Amongst his many other accomplishments, Ian Fellows will long be remembered as the scholar who gave us empirical word clouds. Using his innovative R package, I generated such a visualization of the 50 most frequently used words in Lordan’s collection, excluding those that would be considered common. Common words only have significance in the development of an authorial signature, and thus would have served little purpose in this particular aspect of the analysis.
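
The counting step behind such a cloud can be sketched in a few lines. This is not the R package used above, only a rough Python illustration of "most frequent words, minus the common ones". The function name `top_uncommon_words` and the tiny `COMMON_WORDS` list are assumptions made for the example; a real analysis would use a much fuller stop list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list, for illustration only.
COMMON_WORDS = frozenset({
    "the", "a", "an", "and", "of", "to", "in", "is", "it", "that",
    "was", "on", "for", "with", "as", "at", "by", "he", "she", "i", "you",
})


def top_uncommon_words(text, n=50, common=COMMON_WORDS):
    """Return the n most frequent words in text, skipping common words."""
    # Lowercase, then tokenise on runs of letters (internal apostrophes allowed).
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    # Count only the tokens that are not in the stop list.
    counts = Counter(t for t in tokens if t not in common)
    return counts.most_common(n)
```

Calling `top_uncommon_words(text, n=50)` on the full text of a collection would give the word and frequency pairs that a word cloud then scales by size.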


50 most frequent uncommon words in Dave Lordan’s First Book of Frags


Electronic Textual Analysis: What and Why?

This short reflection was extracted from the Apple iBook, Digital Arts & Humanities: Scholarly Reflections, freely available at: http://itunes.apple.com/us/book/digital-arts-humanities-scholarly/id529097990?ls=1

Offering a short reflection on a topic that you are studying in great depth is a challenging task. Without a tangible research question, and the promise of an answer to that question – both of which act as something of a compass – you find yourself in what can only be described as a wilderness. Perhaps the purpose of such a process is not to emerge from that wilderness, but rather to document what went on during your travels.


The trouble with electronic textual analysis

The trouble with electronic textual analysis, which like all interpretive practices is not without its flaws, is that it requires specialist expertise. In addition, it requires reliable sources from which literary and textual critics can extract data: data that can be used to form meaning, and to shape and justify interpretations.

Automating TEI encoding (using the Overtoom/Jockers Python script)

We all know that appropriate standards are required if electronic textual scholarship is to become precisely what it claims to be – scholarly. Enter the TEI, the various debates on its use, and the rest is history – we now have a standard for electronic textual encoding. What next? Well, encoding, what else? Textual encoding is a tedious process, particularly if you are working with a large corpus. Thankfully, Michiel Overtoom set about writing a Python script to automate the conversion of Project Gutenberg plain text files to a format more suited to his own purposes (this included removal of the Gutenberg boilerplate). Stanford’s Matt Jockers (@mljockers) took this a step further in terms of textual scholarship, adapting Overtoom’s script so that it converts the Gutenberg text to a TEI-compliant XML file.
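
To give a sense of what such automation involves, here is a minimal sketch of the two steps described: stripping the Gutenberg boilerplate and wrapping what remains in a bare TEI skeleton. It is not the Overtoom/Jockers script. The function names, the `*** START OF` / `*** END OF` marker patterns, and the placeholder header text are all assumptions for illustration. Marker wording varies between Gutenberg files, and the output is not validated against any TEI schema.

```python
import re
from xml.sax.saxutils import escape

# Assumed marker lines; real Gutenberg files vary in their exact wording.
START_RE = re.compile(r"^\*\*\* ?START OF.*$", re.MULTILINE)
END_RE = re.compile(r"^\*\*\* ?END OF.*$", re.MULTILINE)


def strip_boilerplate(raw):
    """Keep only the text between the assumed START and END marker lines."""
    start = START_RE.search(raw)
    begin = start.end() if start else 0
    end = END_RE.search(raw, begin)
    finish = end.start() if end else len(raw)
    return raw[begin:finish].strip()


def to_tei(raw, title):
    """Wrap each blank-line-separated paragraph in <p> inside a TEI skeleton."""
    body = strip_boilerplate(raw)
    # Split on blank lines, then collapse hard-wrapped lines within a paragraph.
    paragraphs = [" ".join(p.split())
                  for p in re.split(r"\n\s*\n", body) if p.strip()]
    # Escape &, < and > so the paragraph text is safe inside XML.
    p_xml = "\n".join(f"      <p>{escape(p)}</p>" for p in paragraphs)
    return (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0">\n'
        "  <teiHeader>\n"
        "    <fileDesc>\n"
        f"      <titleStmt><title>{escape(title)}</title></titleStmt>\n"
        "      <publicationStmt><p>Unpublished sketch.</p></publicationStmt>\n"
        "      <sourceDesc><p>Derived from a Project Gutenberg plain text file.</p></sourceDesc>\n"
        "    </fileDesc>\n"
        "  </teiHeader>\n"
        "  <text>\n"
        "    <body>\n"
        f"{p_xml}\n"
        "    </body>\n"
        "  </text>\n"
        "</TEI>\n"
    )
```

A real conversion would also need to recognise chapters, headings and front matter, which is where most of the tedium in encoding a large corpus lies.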
