Digital Editions for Corpus Linguistics (DECL)

The purpose of this project is to develop a new type of online edition that combines the accurate description of historical documents with the flexibility of search tools developed for linguistic computing.

Our premise is that by offering tools that are useful for both historians and linguists we will help to bridge the gap between the two disciplines, increasing interdisciplinary co-operation and making electronic editions useful for a wider public.

Project members
  • Alpo Honkapohja (University of Zurich)
  • Samuli Kaislaniemi (University of Helsinki)
  • Ville Marttila (University of Helsinki)
  • Martti Mäkinen (Hanken School of Economics)
  • Mike Olson (University of Wisconsin-Madison)

DECL was formed by three postgraduate students at Varieng in late 2007. We shared a dissatisfaction with extant tools and resources, believing that digitized historical texts and manuscripts generally failed to live up to expectations. What was needed, we thought, was a more versatile and user-friendly model for these resources.

At the same time, we recognized that digitization was time-consuming and complicated, and that compromises had been made in digital editions and corpora for this reason. A user-friendly framework for creating such editions, built from extant standards, tools and solutions, would therefore be desirable.

Aims of the project

  1. To create editions that function as both editions and corpora, supporting both the comparison of manuscript images with diplomatic transcripts and intricate searches of the transcripts and their linguistic tags.
  2. To create a framework which makes the creation of such editions easy and is readily adaptable to different types of historical texts.

The DECL project encoding standard is based on and compliant with the latest TEI XML Guidelines (P5, published 1.11.2007), and is compatible with a wide range of software platforms. We use open source models, and DECL editions will allow access to the XML code. We advocate open access publication, and welcome collaboration with scholars working on historical manuscripts.
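As a rough illustration of how one TEI-encoded transcript can serve as both edition and corpus, the sketch below reads a small fragment in which each abbreviation is paired with its expansion using the standard TEI P5 elements choice, abbr and expan. The sample text, the function names and the choice of Python are assumptions made for this example; the fragment is not a complete TEI document (it has no header), and none of it is taken from the DECL schema itself.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical sample fragment, invented for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>Take <choice><abbr>þt</abbr><expan>that</expan></choice> potage
    <choice><abbr>&amp;</abbr><expan>and</expan></choice> boyle it</p>
  </body></text>
</TEI>"""


def render(elem, prefer):
    """Return the text of elem, keeping only the preferred child
    ('abbr' or 'expan') of every <choice> element."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == f"{{{TEI_NS}}}choice":
            chosen = child.find(f"tei:{prefer}", NS)
            if chosen is not None:
                parts.append(render(chosen, prefer))
        else:
            parts.append(render(child, prefer))
        # Text following the child still belongs to the parent element.
        parts.append(child.tail or "")
    return "".join(parts)


def readings(xml_text, prefer):
    """Return one whitespace-normalized string per <p> element."""
    root = ET.fromstring(xml_text)
    return [" ".join(render(p, prefer).split())
            for p in root.iterfind(".//tei:p", NS)]


if __name__ == "__main__":
    print(readings(SAMPLE, "abbr"))   # abbreviated, closer to the manuscript
    print(readings(SAMPLE, "expan"))  # expanded, easier to search
```

Keeping both forms in the markup means the abbreviated reading can be shown next to a manuscript image while the expanded reading is fed to a search tool, without maintaining two separate files.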

Current state

DECL was envisioned from the beginning to be a long-term project, and the first release of the DECL Guidelines is expected to take place only after the original members have received their PhDs. DECL is not a funded project, and work on the project proceeds alongside other work of the core members of the DECL team. For the first completed DECL-standard TEI XML schema, see Ville Marttila's thesis (link below).



In fall, Ville Marttila successfully defended his doctoral thesis, Creating Digital Editions for Corpus Linguistics: The case of Potage Dyvers, a family of six Middle English recipe collections. His thesis, which includes a DECL-standard TEI XML schema, is available in HELDA.


DECL co-organized a TEI XML workshop in Helsinki that took place on 24-25 May, concurrently with the annual VARIANTTI seminar. More information on the event can be found in the workshop flyer (pdf).


DECL was part of the project team for creating the TEI XML version of the Helsinki Corpus.


Ville Marttila. "Kinds of annotation: a wider perspective". Paper presented at the Varieng Annotation Day. Helsinki, 7-8 October 2010.

Ville Marttila. Co-organization of the TEI Workshop in Helsinki and Varieng Annotation Day.

Alpo Honkapohja & Samuli Kaislaniemi. Participation in the TEI @ Oxford Summer School 2010, Oxford, 12-14 July 2010.

DECL. Participation in the Digital Humanities 2010 conference at King's College, London, 7-10 July 2010.

Samuli Kaislaniemi. Participation in THATCamp London, King's College, London, 6-7 July 2010.

Alpo Honkapohja. Participation in the DHO Summer School, Dublin, Ireland. 28 June-2 July 2010.


DECL. Participation in the Textual scholarship workshop led by Prof. David Greetham (City University of New York), organized by VARIANTTI. Helsinki, 3-4 December 2009.

Alpo Honkapohja, Samuli Kaislaniemi & Ville Marttila. "Introduction to manuscript studies". BA/MA level course at the Department of English, University of Helsinki. Fall term 2009.

Alpo Honkapohja, Samuli Kaislaniemi & Ville Marttila. "Digital Editions for Corpus Linguistics: Encoding Abbreviations in TEI XML Markup". Paper presented at the International Medieval Congress in Leeds, UK, 13-16 July 2009.

Alpo Honkapohja. "Digital Editions for Corpus Linguistics: Encoding Abbreviations in TEI XML Markup". Poster presented at the Digital Humanities 2009 conference in Maryland, USA, 22-25 June 2009. Abstract (pdf). Poster (pdf).


Alpo Honkapohja, Samuli Kaislaniemi & Ville Marttila. "Digital Editions for Corpus Linguistics: A new approach to creating electronic editions of historical manuscripts". Paper presented at the Digital Humanities 2008 conference in Oulu, Finland, 25-29 June 2008. Abstract (html). Presentation (pdf). Handout (pdf).

Alpo Honkapohja, Samuli Kaislaniemi & Ville Marttila. "Digital Editions for Corpus Linguistics: Representing manuscript reality in electronic corpora". Poster presented at the ICAME 29 conference in Ascona, Switzerland, 14-18 May 2008. Abstract (html). Poster (pdf). Handout (pdf).

External links