I was invited to attend this international workshop hosted by Rice University’s Humanities Research Center; the goal was to envision a general textual analysis tool for exploring and interrogating digitised historical corpora.
Although a variety of digital textual analysis tools are already available, they can be complicated and difficult to use and are often tailored to specific, narrow research interests. The intention here was to design a tool that lets scholars from different disciplines ask a wide range of questions of a collection of texts, and that is intuitive enough to be used without training or programming knowledge.
The workshop participants included specialists in corpus linguistic analysis, humanities scholars using computational methods, and those leading efforts to digitise historical texts. There were also participants from systems biology, where the design of software tools for analysing genomic data throws up similar challenges. Demonstrating the potential of cross-disciplinary collaboration, Erez Lieberman Aiden of Baylor College of Medicine talked about his work developing the Google Books Ngram Viewer. I was invited to offer perspectives on data visualisation, particularly because a simple, intuitive user interface and effective ways of visualising results were important components of the workshop goal.
It was very instructive to learn more about corpus linguistics approaches to text analysis. After a great deal of brainstorming and discussion, I look forward to seeing the next phase of this project.