The “fifth letter” of human DNA changes the way the genetic code is read

A Swedish-Finnish study shows that a fifth “letter” in the DNA of human cells changes the way their genetic code is read. Published in Science, the results help explain how DNA directs gene expression during human development and in the onset of disease.

The order of the letters in the human genome – A, C, G and T – has been known since 2000. In addition to these four letters, the Cs in combinations of C and G can be transformed in the cell into a “fifth letter” of the genome through a process known as methylation. Understanding the order of the DNA base pairs, or letters, is necessary for applying genome data in medicine. The strings of letters in the genes are known, but there is still little understanding of the regulatory sequences which contain instructions on when and where the gene is to be expressed.

The human body consists of many different types of cells, all of which have the same order of letters in the genome. However, the level of methylation in the CG combinations in the genome varies between cells in different tissue types. Methylation can have a major impact on the expression of the genes in the cell, and consequently on the characteristics of the cell, particularly if it takes place in the “DNA words” of the regulatory sequences, to which the regulatory proteins known as transcription factors attach.
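As a rough illustration of the idea above, the following Python sketch locates the CG combinations in a short DNA string and shows how two cell types can share the same letters while differing in which CG sites are methylated. It is a toy example, not the method used in the study: the function names `find_cg_sites` and `mark_methylated`, the use of "M" to stand for the methylated “fifth letter”, and the example sequence are all assumptions made for illustration.

```python
def find_cg_sites(sequence: str) -> list[int]:
    """Return the 0-based positions of every C that is directly followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


def mark_methylated(sequence: str, methylated_positions: set[int]) -> str:
    """Rewrite the sequence with methylated Cs shown as 'M' (illustrative notation).

    Positions that are not the C of a CG combination are ignored.
    """
    seq = sequence.upper()
    cg_sites = set(find_cg_sites(seq))
    return "".join(
        "M" if i in methylated_positions and i in cg_sites else base
        for i, base in enumerate(seq)
    )


# Hypothetical sequence: both "cell types" share exactly the same letters.
example = "ATCGTTACGCA"
sites = find_cg_sites(example)

# Hypothetical methylation patterns for two tissue types.
cell_type_a = mark_methylated(example, {sites[0]})   # only the first CG methylated
cell_type_b = mark_methylated(example, set(sites))   # every CG methylated

print(sites)
print(cell_type_a)
print(cell_type_b)
```

In this sketch the underlying sequence never changes; only the set of methylated positions differs, which mirrors the point that cells in different tissues carry the same genome but different methylation patterns.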

Researchers led by Professor Jussi Taipale (University of Helsinki and Karolinska Institutet) have previously identified the most common DNA words to which individual or paired transcription factors bind. How methylation of the CG combinations affects the way the proteins “read” the DNA words had not previously been examined. Taipale’s group therefore systematically mapped the binding of transcription factors to DNA words containing either methylated or non-methylated CG combinations.

The results reveal that methylation of the CG combinations changes the way many transcription factors bind to the DNA words. Contrary to what was previously thought, DNA words with methylated CG combinations are most commonly read by transcription factors that are present in utero and during the development of vital organs, as well as by some transcription factors associated with prostate and colon cancer.

Researchers from the Academy of Finland’s Finnish Centre of Excellence in Cancer Genetics Research, which operates under the University of Helsinki, produced pivotal information about the methylation of the CG combinations in the genome and studied the occurrence of the DNA words recognised by the transcription factors in the genomes of humans and other species.

The results of the study are highly significant for understanding individual development, tumour growth and the origins of diseases.

Reference: Yimeng Yin, Ekaterina Morgunova, Arttu Jolma, Eevi Kaasinen, Biswajyoti Sahu, Syed Khund-Sayeed, Pratyush K. Das, Teemu Kivioja, Kashyap Dave, Fan Zhong, Kazuhiro R. Nitta, Minna Taipale, Alexander Popov, Paul Adrian Ginno, Silvia Domcke, Jian Yan, Dirk Schübeler, Charles Vinson, and Jussi Taipale: “Impact of cytosine methylation on DNA binding specificities of human transcription factors”. Science, 5 May 2017.

More information:

Jussi Taipale, PhD, Professor of medical systems biology
Department of Medical Biochemistry and Biophysics, Karolinska Institutet
Tel. +46 (0)736 49 18 79
Email: jussi.taipale@ki.se