The Coruña Corpus of English Scientific Writing (CC)

The Coruña Corpus of English Scientific Writing (CC) is a specialised corpus divided into subcorpora by domain or discipline. From the start of the project in 2004, it was designed to contain 10,000-word samples of scientific works published between 1700 and 1900 and written directly in English.

Compilers: MuStE Research Group
Project director: Isabel Moskowich
Period: 1700–1900
Size: extracts amounting to ca. 400,000 words per subcorpus
Language: EModE, LModE (scientific)
Project home page:

Reference line and copyright

Parapar López, Javier & Moskowich, Isabel. 2007. The Coruña Corpus Tool. Revista del Procesamiento de Lenguaje Natural, 39: 289–290.

Moskowich, Isabel & Crespo García, Begoña. 2007. Presenting the Coruña Corpus: A Collection of Samples for the Historical Study of English Scientific Writing. In Pérez Guerra, Javier et al. (eds.) ‘Of Varying Language and Opposing Creed’: New Insights into Late Modern English. Bern: Peter Lang (341–357).

Moskowich-Spiegel Fandiño, Isabel & Parapar López, Javier. 2008. Writing Science, Compiling Science. The Coruña Corpus of English Scientific Writing. In Lorenzo Modia, María Jesús (ed.) Proceedings from the 31st AEDEAN Conference. A Coruña: Universidade da Coruña (531–544).

Crespo García, Begoña & Moskowich, Isabel. 2010. CETA in the Context of the Coruña Corpus. Literary and Linguistic Computing, 25(2): 153–164. doi:10.1093/llc/fqp038


Each subcorpus is published individually on CD.


Research Group for Multidimensional Corpus-Based Studies in English (MuStE)


Each CD also includes a handbook.

Other corpora under compilation

CEPhiT (Corpus of English Philosophy Texts)
CELiST (Corpus of English Life Sciences Texts)
CHET (Corpus of History English Texts)
CECheT (Corpus of English Chemistry Texts)