Corpus Presenter

MEMT and EMEMT are distributed with a specially designed version of Corpus Presenter, a corpus tool developed by Raymond Hickey (University of Essen). Although the corpora can be analyzed with most other corpus tools as well, the inclusion of a proprietary tool on the CD-ROM makes it easier for users to access the texts. The software comes in both a Windows version and a Java version; the latter makes it possible to examine the texts on any Java-compatible platform, such as Mac and Linux.

Cooperation between the Scientific thought-styles project and Prof. Hickey began in the late 1990s. The final distribution version of MEMT Presenter was developed in close cooperation during 2004 and 2005. On the Helsinki side, Martti Mäkinen did most of the work, with Turo Hiltunen and Jukka Tyrkkö helping out as beta testers.

The cooperation continued with EMEMT Presenter, for which Jukka Tyrkkö stepped in as the main collaborator in Helsinki. The new version is based on Corpus Presenter 12 and is vastly expanded in functionality, offering several features never before seen in historical corpora of this kind. These include a gallery of facsimile images, extensive background information on each text, and hyperlinks to online sources, among others.

Special features of EMEMT Presenter

The main difference between EMEMT Presenter and most other corpus tools is the amount of direct access it gives to the texts. Rather than being a simple concordancer, EMEMT Presenter allows the user to browse through the texts and to open both the manual and the text catalogue of the corpus directly from within the tool. The search features of EMEMT Presenter, in particular concordancing with a wordlist, help the historical linguist find variant spellings.
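To illustrate the general idea of concordancing with a wordlist, here is a minimal sketch in Python. It is not how Corpus Presenter works internally; the function name `kwic`, the sample sentence, and the list of variant forms are all invented for the illustration. The sketch simply collects every occurrence of any listed form together with a little context on each side.

```python
import re

def kwic(text, wordlist, width=30):
    """Return (left context, keyword, right context) for every listed form.

    Hypothetical helper for illustration only; not part of any corpus tool.
    """
    # One case-insensitive pattern that matches any listed form as a whole word.
    # Longer forms are tried first so that no form shadows a longer one.
    forms = sorted(wordlist, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(f) for f in forms) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    return hits

# Invented sample text and variant list, not taken from the corpus.
sample = "Take the medicyne at night. This medicine is good; the medycyne heleth."
variants = ["medicine", "medicyne", "medycyne"]

hits = kwic(sample, variants, width=15)
for left, keyword, right in hits:
    print(f"{left:>15} [{keyword}] {right}")
```

Searching with the whole wordlist at once places the variant spellings side by side in a single concordance, which makes it easier to compare the contexts in which each form occurs.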

EMEMT Presenter organizes the corpus texts hierarchically by default. This allows the corpus compilers to represent their understanding of textual history. Naturally, the texts can also be examined in any alternative combination, either with EMEMT Presenter or with another corpus tool.

The corpus tool supports several kinds of analysis, ranging from concordance searches and wordlist generation to more advanced features such as collocation and keyness analysis. The example image shows the search statistics window.

Text catalogue and image gallery

Another important feature of the customized software is that it integrates biographical and bibliographical information within the corpus tool itself. This makes it easy to learn about the authors, translators and other important figures involved in the production of a text, and to access detailed bibliographical information about the book. Each catalogue description also provides hyperlinks to relevant pages on the Early English Books Online and Oxford Dictionary of National Biography websites.

All the images from the information cards are also accessible as a separate image gallery. The corpus user can browse through the hundreds of examples of title pages, illustrations, and layout features to gain a feel for the early printed page. Each image has a caption giving a bibliographical reference and a brief explanation.

Standardized spelling

EMEMT is distributed with a parallel version of the corpus featuring standardized spelling. This not only makes EMEMT easier to use for non-linguists, but also supports corpus linguistic methods such as collocation and keyness analysis, which rely on uniform word forms.
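A small sketch in Python shows why frequency-based methods benefit from uniform word forms. The sample sentence, the `STANDARD` mapping, and the `frequencies` helper are invented for the illustration and do not reflect the actual standardization procedure used for the corpus. With original spelling, the occurrences of one word are split across several forms; after mapping the variants to a single form, the counts are pooled.

```python
from collections import Counter

# Hypothetical variant-to-standard mapping, invented for this illustration.
STANDARD = {
    "medicyne": "medicine",
    "medycyne": "medicine",
    "bloud": "blood",
}

def frequencies(tokens, mapping=None):
    """Count lower-cased tokens, optionally replacing variants via `mapping`."""
    mapping = mapping or {}
    return Counter(mapping.get(t.lower(), t.lower()) for t in tokens)

# Invented sample, not taken from the corpus.
tokens = "the medicyne clenseth the bloud and the medycyne is a good medicine".split()

raw = frequencies(tokens)            # original spelling
std = frequencies(tokens, STANDARD)  # standardized spelling

print("original spelling:    ", raw["medicine"], raw["medicyne"], raw["medycyne"])
print("standardized spelling:", std["medicine"])
```

In the original-spelling counts each of the three forms occurs once, whereas the standardized counts record three occurrences of a single word. Collocation and keyness measures are computed from counts of this kind, so splitting a word across variant spellings can understate its frequency.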

The standardized-spelling version of EMEMT was produced in cooperation with Alistair Baron, Paul Rayson, and Dawn Archer at Lancaster. The standardization was performed automatically using VARD, or Variant Detector, developed by Baron, and the resulting files were analyzed and adjusted at the Helsinki end by Anu Lehto.