The Coruña Corpus of English Scientific Writing (CC)

The Coruña Corpus of English Scientific Writing (CC) is a specialised corpus divided into subcorpora by domain or discipline. From the start of the project in 2004, it was designed to contain 10,000-word samples of scientific works published between 1700 and 1900 and originally written in English. Also from the start, all samples and metadata files have been encoded in XML following the TEI guidelines.
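As a rough illustration of what working with TEI-encoded samples can look like, the sketch below reads a title, a publication year and a word count from a small XML document. It is a minimal sketch under stated assumptions, not the project's own tooling: the sample text, the function name describe_sample and the choice of elements are hypothetical, and the TEI P5 namespace is assumed rather than confirmed for the corpus files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TEI-style sample; NOT taken from the corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>An Illustrative Treatise</title></titleStmt>
      <publicationStmt><p>Placeholder publication statement</p></publicationStmt>
      <sourceDesc><bibl><date when="1750">1750</date></bibl></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>The motion of the planets is regular.</p>
      <p>It may be <hi rend="italic">observed</hi> nightly.</p>
    </body>
  </text>
</TEI>"""

# Assumption: the files use the TEI P5 namespace.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def describe_sample(xml_text):
    """Return title, year, word count and a period check for one sample."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//tei:titleStmt/tei:title", namespaces=NS)
    year = int(root.find(".//tei:sourceDesc//tei:date", NS).get("when"))
    body = root.find(".//tei:text/tei:body", NS)
    # Concatenate all text nodes without adding spaces, so that inline
    # markup such as <hi> does not split words, then split on whitespace.
    words = "".join(body.itertext()).split()
    return {
        "title": title,
        "year": year,
        "words": len(words),
        # The corpus covers works published between 1700 and 1900.
        "in_period": 1700 <= year <= 1900,
    }


if __name__ == "__main__":
    print(describe_sample(SAMPLE))
```

Whitespace tokenisation is used here only for simplicity; word counts produced by a dedicated corpus tool may follow different rules.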

Compilers: MuStE Research Group
Project director: Isabel Moskowich
Period: 1700–1900
Size: ca. 400,000 words of extracts per subcorpus
Language: LModE (scientific)
Project home page:

Reference line and copyright

Parapar López, Javier & Moskowich, Isabel. 2007. The Coruña Corpus Tool. Revista del Procesamiento de Lenguaje Natural, 39: 289–290.

Moskowich, Isabel & Crespo García, Begoña. 2007. Presenting the Coruña Corpus: A Collection of Samples for the Historical Study of English Scientific Writing. In Pérez Guerra, Javier et al. (eds.) ‘Of Varying Language and Opposing Creed’: New Insights into Late Modern English. Bern: Peter Lang (341–357).

Moskowich-Spiegel Fandiño, Isabel & Parapar López, Javier. 2008. Writing Science, Compiling Science. The Coruña Corpus of English Scientific Writing. In Lorenzo Modia, María Jesús (ed.) Proceedings from the 31st AEDEAN Conference. A Coruña: Universidade da Coruña (531–544).

Crespo García, Begoña & Moskowich, Isabel. 2010. CETA in the Context of the Coruña Corpus. Literary and Linguistic Computing, 25(2): 153–164. doi:10.1093/llc/fqp038


Each subcorpus is published individually; the Astronomy and Philosophy subcorpora are available on CD.

Open access at


Research Group for Multidimensional Corpus-Based Studies in English (MuStE)


Each subcorpus includes a handbook (both CD and online versions).

Other corpora under compilation

CEPhiT (Corpus of English Philosophy Texts)
CELiST (Corpus of English Life Sciences Texts)
CHET (Corpus of History English Texts)
CECheT (Corpus of English Chemistry Texts)
CETeL (Corpus of English Texts on Languages)