The Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME)
The Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) is a syntactically annotated corpus of prose text samples. Its syntactic annotation (parsing) permits searching, not only for words and word sequences, but also for syntactic structure. The corpus is designed for the use of students and scholars of the history of English, especially the historical syntax of the language, and it is part of an ongoing larger project at the University of Pennsylvania and the University of York to produce syntactically annotated corpora for all stages of the history of English.
Project leader: Anthony Kroch
Time of compilation: 1999–2004
Size: c. 1.8 million words (1,794,010)
Language: Early Modern English
Number of texts/samples: 229
Funding: National Endowment for the Humanities and the National Science Foundation
Project home page: http://www.ling.upenn.edu/hist-corpora/
Reference lines and copyright
Kroch, Anthony, Beatrice Santorini, and Lauren Delfs. 2004. Penn-Helsinki Parsed Corpus of Early Modern English. http://www.ling.upenn.edu/hist-corpora/PPCEME-RELEASE-2/index.html.
Santorini, Beatrice. 2005. Annotation manual for the PPCME2, PPCEME, and PCEEC. http://www.ling.upenn.edu/hist-corpora/annotation/index.htm
Professor Anthony Kroch and Dr Beatrice Santorini (University of Pennsylvania)
Each text in the corpus comes in three different formats: text (.txt), part-of-speech (POS) tagged (.pos) and parsed (.psd).
The Penn Corpora are distributed with the search program CorpusSearch 2, written by Beth Randall and released as open-source software.
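To illustrate what searching for syntactic structure rather than word strings involves, the following is a minimal Python sketch. It is not CorpusSearch and does not reproduce its query language. It assumes that the parsed (.psd) files use Penn Treebank-style labelled bracketing. The sample sentence and the labels in it (IP-MAT, NP-SBJ, NP-OB1) are invented for illustration and are not taken from the corpus. Real files may carry additional wrapper nodes and metadata that this sketch does not handle.

```python
from typing import Iterator, List, Union

# A tree is either a string (label or word) or a list of subtrees.
Tree = Union[str, List["Tree"]]

# Invented example in labelled-bracket notation (not a corpus sentence).
SAMPLE = "(IP-MAT (NP-SBJ (PRO I)) (VBD saw) (NP-OB1 (D the) (N king)) (. .))"


def tokenize(text: str) -> Iterator[str]:
    """Split bracketed text into '(', ')' and label/word tokens."""
    yield from text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text: str) -> List[Tree]:
    """Parse all top-level bracketed trees in text into nested lists."""
    stack: List[List[Tree]] = [[]]
    for tok in tokenize(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced closing bracket")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced opening bracket")
    return stack[0]


def find(tree: Tree, label: str) -> Iterator[List[Tree]]:
    """Yield every node whose first element equals label."""
    if isinstance(tree, list):
        if tree and tree[0] == label:
            yield tree
        for child in tree:
            yield from find(child, label)


def words(tree: Tree) -> List[str]:
    """Collect the words under a node, left to right."""
    if not isinstance(tree, list):
        return []
    # A leaf node has the shape [TAG, word].
    if len(tree) == 2 and all(isinstance(x, str) for x in tree):
        return [tree[1]]
    out: List[str] = []
    for child in tree:
        out.extend(words(child))
    return out


if __name__ == "__main__":
    for node in find(parse(SAMPLE), "NP-OB1"):
        print(" ".join(words(node)))
```

The query here is "find every node labelled NP-OB1 and return its words", which a search over the plain-text (.txt) version could not express, because the label exists only in the parsed format.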
A CD-ROM may be ordered with the corpus order form.
Penn-Helsinki Parsed Corpus of Middle English, 2nd edition (PPCME2)
York-Helsinki Parsed Corpus of Old English Poetry
York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE)
Brooklyn-Geneva-Amsterdam-Helsinki Parsed Corpus of Old English
Parsed Corpus of Early English Correspondence (PCEEC)
Penn Parsed Corpus of Modern British English (1700–1914)
Information for the entry was edited by Prof. Anthony Kroch.