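Nucleosomes, strung like beads along the entire chromosome, are the chromosome's most basic unit. What, then, determines when, where and how nucleosomes are positioned along the DNA strand? The Weizmann Institute announced today that its researchers have deciphered the genetic code that determines how nucleosomes are positioned on the DNA strand. The findings were published in Nature on July 9.
The precise positioning of nucleosomes on the DNA plays an important role in the cell's day-to-day function. A single nucleosome contains about 150 bases, while the free region between neighboring nucleosomes is only about 20 bases long. It is in these nucleosome-free regions that processes such as the transcription of genetic information can begin.
For many years, scientists could not agree on whether the positions of nucleosomes in living cells are controlled by the genetic sequence itself. Dr. Eran Segal of the Weizmann Institute and his fellow researchers managed to prove that the DNA sequence does encode "zoning" information on where to place nucleosomes. They also characterized this code and, using the DNA sequence alone, accurately predicted the positions of a large number of nucleosomes in yeast cells.
To carry out the study, Segal and his colleagues examined the positions of about 200 different nucleosomes on the DNA and looked for what their sequences have in common. Mathematical analysis of the similarities between the nucleosome sequences eventually uncovered a specific "code word." This "code word" consists of a periodic signal that appears every 10 bases along the sequence. The regular repetition of this signal helps the DNA segment bend sharply into the spherical shape needed to form a nucleosome. To identify this nucleosome positioning code, the researchers used probabilistic models to characterize the sequences bound by nucleosomes, and developed a computer algorithm to predict the encoded organization of nucleosomes along an entire chromosome.
One puzzle that has long troubled molecular biologists is how cells direct transcription factors to their proper sites on the DNA rather than to sites that are similar but functionally irrelevant. The researchers found that basic information on the functional relevance of a binding site is partly contained in the nucleosome positioning code: the intended sites are found in the chromosome segments between nucleosomes, which allows the various transcription factors to reach them. Spurious binding sites with the same structure, which could otherwise mislead transcription factors, lie in segments that form nucleosomes and are therefore mostly inaccessible.
Because the proteins that form the core of the nucleosome are among the most evolutionarily conserved in nature, the researchers believe the genetic code they identified should also exist in many organisms, including humans. Some diseases, such as cancer, are accompanied or caused by mutations in the DNA and in the way it is organized into chromosomes. Such mutational processes may be influenced by how accessible the DNA is to different proteins and by the organization of the DNA in the cell nucleus. The researchers therefore believe that the nucleosome positioning code they discovered may help reveal the mechanisms behind many diseases.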
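DNA, the long, thin molecule that carries our genetic information, is wrapped around proteins in the cell nucleus and compressed into tiny spheres called nucleosomes. These bead-like nucleosomes are strung along the chromosome, which is itself folded and packaged to fit inside the nucleus. What determines how, when and where a nucleosome is positioned along the DNA sequence? Dr. Eran Segal and research student Yair Field of the Computer Science and Applied Mathematics Department at the Weizmann Institute of Science have succeeded in cracking the genetic code that governs where nucleosomes are positioned on the DNA strand. Their collaborators include colleagues from Northwestern University in Chicago.
The precise location of nucleosomes on the DNA molecule is thought to play a very important role in the cell's daily activity, because DNA packaged into a nucleosome is closed off to many proteins, including those responsible for some of the most basic processes of life. Among these barred proteins are factors that initiate DNA replication, transcription and DNA repair. The positioning of nucleosomes therefore determines where these processes can and cannot take place. The restrictions are considerable: most of the DNA is packaged into nucleosomes. A single nucleosome contains about 150 genetic bases, while the free region between neighboring nucleosomes is only about 20 bases long. It is in these free regions between nucleosomes that processes such as transcription can be initiated.
For many years, scientists were unable to agree on whether the placement of nucleosomes in living cells is controlled by the genetic sequence itself. Segal and his colleagues succeeded in proving that the DNA sequence does encode "zoning" information on where to place nucleosomes. They also deciphered this code and, using the DNA sequence alone, accurately predicted the positions of a large number of nucleosomes in yeast cells.
Segal and his colleagues made this discovery by studying the positions of about 200 different nucleosomes on the DNA and asking whether their sequences have anything in common. Mathematical analysis revealed similarities among the nucleosome-bound sequences and eventually uncovered a specific "code word." This "code word" consists of a periodic signal that appears every 10 bases along the sequence. The regular repetition of this signal helps the DNA segment bend sharply into the spherical shape a nucleosome requires. To identify this nucleosome positioning code, the team used probabilistic models to characterize the DNA sequences bound by nucleosomes, and then developed a computer algorithm to predict the encoded organization of nucleosomes along an entire chromosome.
The team's findings also shed light on another mystery that has long puzzled molecular biologists: how do cells direct transcription factors to their intended sites on the DNA rather than to other sites in the genome that are similar in sequence but functionally irrelevant? The short binding sites themselves do not contain enough information for transcription factors to tell them apart. The scientists showed that basic information on the functional relevance of a binding site is at least partly encoded in the nucleosome positioning code: the intended binding sites are found in the free segments between nucleosomes, where the various transcription factors can reach them. By contrast, spurious sites with the same structure tend to lie inside nucleosomes, where transcription factors can hardly reach them.
Since the proteins that form the core of the nucleosome are highly conserved in evolution, the scientists believe that the genetic code they identified should also be conserved in many organisms, including humans. Some diseases, such as cancer, are typically accompanied or caused by mutations in the DNA. Such mutational processes may be influenced by how accessible the DNA is to various proteins and by how the DNA is organized in the cell nucleus. The scientists therefore believe that the nucleosome positioning code they discovered may in the future help explain the mechanisms underlying these diseases.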
Original English text:
Scientists discover a genetic code for organizing DNA within the nucleus
DNA, the long, thin molecule that carries our hereditary material, is compressed around protein scaffolding in the cell nucleus into tiny spheres called nucleosomes. The bead-like nucleosomes are strung along the entire chromosome, which is itself folded and packaged to fit into the nucleus. What determines how, when and where a nucleosome will be positioned along the DNA sequence? Dr. Eran Segal and research student Yair Field of the Computer Science and Applied Mathematics Department at the Weizmann Institute of Science have succeeded, together with colleagues from Northwestern University in Chicago, in cracking the genetic code that sets the rules for where on the DNA strand the nucleosomes will be situated. Their findings appeared today in Nature.
The precise location of the nucleosomes along the DNA is known to play an important role in the cell's day-to-day function, since access to DNA wrapped in a nucleosome is blocked for many proteins, including those responsible for some of life's most basic processes. Among these barred proteins are factors that initiate DNA replication, transcription (the transfer of genetic information from DNA to RNA) and DNA repair. Thus, the positioning of nucleosomes defines the segments in which these processes can and can't take place. These limitations are considerable: Most of the DNA is packaged into nucleosomes. A single nucleosome contains about 150 genetic bases (the "letters" that make up a genetic sequence), while the free area between neighboring nucleosomes is only about 20 bases long. It is in these nucleosome-free regions that processes such as transcription can be initiated.
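To give a rough sense of how considerable these limitations are, the two figures quoted above can be combined into a back-of-the-envelope estimate. The sketch below assumes an idealized, perfectly regular array of nucleosomes and linkers, which real chromosomes only approximate; the function name is hypothetical.

```python
# Approximate sizes quoted in the text (both are rounded figures).
NUCLEOSOME_BASES = 150  # bases wrapped in a single nucleosome
LINKER_BASES = 20       # free bases between neighboring nucleosomes


def wrapped_fraction(nucleosome_bases: int = NUCLEOSOME_BASES,
                     linker_bases: int = LINKER_BASES) -> float:
    """Fraction of one nucleosome-plus-linker repeat that is wrapped.

    Assumes an idealized, evenly spaced array (an illustrative
    simplification, not a measured value).
    """
    return nucleosome_bases / (nucleosome_bases + linker_bases)


if __name__ == "__main__":
    print(f"{wrapped_fraction():.1%}")  # prints 88.2%
```

Under that simplification, close to nine tenths of the DNA would be wrapped, which is consistent with the statement that most of the DNA is packaged into nucleosomes.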
For many years, scientists have been unable to agree whether the placement of nucleosomes in live cells is controlled by the genetic sequence itself. Segal and his colleagues managed to prove that the DNA sequence indeed encodes "zoning" information on where to place nucleosomes. They also characterized this code and then, using the DNA sequence alone, were able to accurately predict a large number of nucleosome positions in yeast cells.
Segal and his colleagues accomplished this by examining around 200 different nucleosome sites on the DNA and asking whether their sequences have something in common. Mathematical analysis revealed similarities between the nucleosome-bound sequences and eventually uncovered a specific "code word." This "code word" consists of a periodic signal that appears every 10 bases on the sequence. The regular repetition of this signal helps the DNA segment to bend sharply into the spherical shape required to form a nucleosome. To identify this nucleosome positioning code, the research team used probabilistic models to characterize the sequences bound by nucleosomes, and they then developed a computer algorithm to predict the encoded organization of nucleosomes along an entire chromosome.
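The idea of a signal that recurs every 10 bases can be illustrated with a small sketch. It is not the team's probabilistic model or algorithm. The press release does not say which sequence features carry the signal, so the dinucleotide set used here (AA, TT, TA) and the function name are assumptions made for illustration. The score is the magnitude of a single Fourier component: it is close to 1 when the chosen dinucleotides recur in phase with the given period and close to 0 when they are spread evenly over all phases.

```python
import cmath


def periodicity_strength(seq: str,
                         motifs: tuple = ("AA", "TT", "TA"),
                         period: int = 10) -> float:
    """Score in [0, 1] for how strongly `motifs` recur every `period` bases.

    Each motif occurrence is mapped to a point on the unit circle
    according to its position modulo `period`; the score is the length
    of the mean of those points. The motif set is an illustrative
    assumption, not taken from the press release.
    """
    seq = seq.upper()
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] in motifs]
    if not positions:
        return 0.0
    total = sum(cmath.exp(-2j * cmath.pi * p / period) for p in positions)
    return abs(total) / len(positions)


if __name__ == "__main__":
    in_phase = "AAGCGCGCGC" * 15   # "AA" recurs exactly every 10 bases
    out_of_phase = "AAGCGCG" * 10  # "AA" recurs every 7 bases instead
    print(round(periodicity_strength(in_phase), 3))
    print(round(periodicity_strength(out_of_phase), 3))
```

A score like this only detects the periodicity. Predicting positions along a whole chromosome, as described above, would additionally require a model that compares candidate placements against each other.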
The team's findings provided insight into another mystery that has long been puzzling molecular biologists: How do cells direct transcription factors to their intended sites on the DNA, as opposed to the many similar but functionally irrelevant sites along the genomic sequence? The short binding sites themselves do not contain enough information for the transcription factors to discern between them. The scientists showed that basic information on the functional relevance of a binding site is at least partially encoded in the nucleosome positioning code: The intended sites are found in nucleosome-free segments, thereby allowing them to be accessed by the various transcription factors. In contrast, spurious binding sites with identical structures that could potentially sidetrack transcription factors are conveniently situated in segments that form nucleosomes, and are thus mostly inaccessible.
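The accessibility argument can be made concrete with a minimal sketch. It assumes that nucleosome start positions are already known or predicted, and that each nucleosome covers about 150 bases as stated earlier. The function name and coordinates are hypothetical and serve only to show the interval check.

```python
def is_accessible(site_start: int,
                  site_length: int,
                  nucleosome_starts: list,
                  nucleosome_bases: int = 150) -> bool:
    """Return True if a binding site lies entirely in nucleosome-free DNA.

    Coordinates are 0-based; a nucleosome starting at s is assumed to
    cover the half-open interval [s, s + nucleosome_bases).
    """
    site_end = site_start + site_length
    for start in nucleosome_starts:
        # Two half-open intervals overlap when each starts before the other ends.
        if site_start < start + nucleosome_bases and start < site_end:
            return False
    return True


if __name__ == "__main__":
    # Two hypothetical nucleosomes separated by a 20-base free region (150..169).
    nucleosomes = [0, 170]
    print(is_accessible(155, 8, nucleosomes))  # site inside the free region
    print(is_accessible(100, 8, nucleosomes))  # site wrapped in a nucleosome
```

In this toy layout, two sites with identical sequence would be treated differently purely because of where they fall relative to the nucleosomes, which is the distinction the paragraph above describes.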
Since the proteins that form the core of the nucleosome are among the most evolutionarily conserved in nature, the scientists believe the genetic code they identified should also be conserved in many organisms, including humans. Several diseases, such as cancer, are typically accompanied or caused by mutations in the DNA and the way it organizes into chromosomes. Such mutational processes may be influenced by the relative accessibility of the DNA to various proteins and by the organization of the DNA in the cell nucleus. Therefore, the scientists believe that the nucleosome positioning code they discovered may aid scientists in the future in understanding the mechanisms underlying many diseases.
Dr. Eran Segal's research is supported by the Arie and Ida Crown Memorial Charitable Fund and the Estelle Funk Foundation.
The Weizmann Institute of Science in Rehovot, Israel, is one of the world's top-ranking multidisciplinary research institutions. Noted for its wide-ranging exploration of the natural and exact sciences, the Institute is home to 2,500 scientists, students, technicians and supporting staff. Institute research efforts include the search for new ways of fighting disease and hunger, examining leading questions in mathematics and computer science, probing the physics of matter and the universe, creating novel materials and developing new strategies for protecting the environment.