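Under the guidance of Jun Yu, a researcher at the Beijing Institute of Genomics, Chinese Academy of Sciences, a group including graduate students Jiang Zhu and Fuhong He has systematically analyzed the tissue specificity of human gene expression using EST data. The results were published in a recent issue of BMC Genomics.
The adult human body consists of more than 200 cell types, each with its own cell-specific transcriptome. Tissue-specific (TS) genes are expressed only in particular tissues to carry out tissue-specific cellular functions, whereas housekeeping (HK) genes are expressed in all tissues to maintain basic cellular functions. Defining the “basal transcriptome” formed by HK genes is the basis for understanding “cell-specific transcriptomes”. Several microarray-based studies have already defined HK gene sets; although each of them estimates roughly 500 human HK genes, the overlap among these sets is small. How many genes in the human transcriptome are HK genes, and which ones, remains an open question.
The researchers integrated the existing human EST data, organized by human tissue, with a widely used and relatively complete microarray dataset, and systematically compared the two datasets across the 18 human tissues that have been studied most thoroughly. Using a set of HK genes determined from functional annotation as a reference, they showed that both data types have their own limitations. EST sequencing of most human tissues has not yet reached saturation, which limits gene detection and the study of tissue-specific expression patterns in these tissues. Microarray data have a lower average gene detection rate and therefore underestimate the number of human HK genes. The study also showed that, among currently annotated genes, about 40% are broadly expressed across human tissues, while only about 5% are expressed specifically in a particular tissue; revealing tissue-specific expression patterns therefore requires more precise and larger-scale transcriptome data. Finally, the study redefined the human HK genes. The number of human HK genes defined from EST data lies between 3,140 and 6,909, roughly ten times the size of earlier microarray-based HK gene sets. This work provides a new basis for the systematic analysis of the properties of HK and TS genes, and also provides an analysis framework and a preliminary data model for the institute's ongoing cell-based human transcriptome research (a “973” project).
Follow-up research will build on the newly defined human HK gene set to compare HK and TS genes systematically in terms of gene structure, evolutionary rate, promoter structure and other aspects, with the aim of demonstrating that HK genes differ from TS genes at every level and ultimately revealing the special role of HK genes in the composition of the transcriptome. (Source: Beijing Institute of Genomics, Chinese Academy of Sciences)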
Original source recommended by Bioon (生物谷):
BMC Genomics, doi:10.1186/1471-2164-9-172, Jiang Zhu, Jun Yu
How many human genes can be defined as housekeeping with current expression data?
Jiang Zhu, Fuhong He, Shuhui Song, Jing Wang and Jun Yu
Abstract (provisional)
Background
Housekeeping (HK) genes are ubiquitously expressed in all tissue/cell types and constitute a basal transcriptome for the maintenance of basic cellular functions. Partitioning transcriptomes into HK and tissue-specific (TS) genes relatively is fundamental for studying gene expression and cellular differentiation. Although many studies have aimed at large-scale and thorough categorization of human HK genes, a meaningful consensus has yet to be reached.
Results
We collected two latest gene expression datasets (both EST and microarray data) from public databases and analyzed the gene expression profiles in 18 human tissues that have been well-documented by both two data types. Benchmarked by a manually-curated HK gene collection (HK408), we demonstrated that present data from EST sampling was far from saturated, and the inadequacy has limited the gene detectability and our understanding of TS expressions. Due to a likely over-stringent threshold, microarray data showed higher false negative rate compared with EST data, leading to a significant underestimation of HK genes. Based on EST data, we found that 40.0% of the currently annotated human genes were universally expressed in at least 16 of 18 tissues, as compared to only 5.1% specifically expressed in a single tissue. Our current EST-based estimate on human HK genes ranged from 3,140 to 6,909 in number, a ten-fold increase in comparison with previous microarray-based estimates.
Conclusions
We concluded that a significant fraction of human genes, at least in the currently annotated data depositories, was broadly expressed. Our understanding of tissue-specific expression was still preliminary and required much more large-scale and high-quality transcriptomic data in future studies. The new HK gene list categorized in this study will be useful for genome-wide analyses on structural and functional features of HK genes.
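The breadth criterion quoted in the abstract (universally expressed in at least 16 of 18 tissues, versus specifically expressed in a single tissue) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, the gene and tissue names, and the toy data are hypothetical, and the sketch assumes that detection of each gene in each tissue has already been decided, since the abstract does not say how EST counts are turned into a detected/undetected call.

```python
from collections import Counter
from typing import Dict, Set

# Thresholds taken from the abstract: 18 tissues, "universally expressed"
# means detected in at least 16 of them, "specific" means exactly one.
N_TISSUES = 18
BROAD_MIN = 16


def classify_by_breadth(detected: Dict[str, Set[str]],
                        broad_min: int = BROAD_MIN) -> Dict[str, str]:
    """Label each gene by the number of tissues in which it was detected.

    `detected` maps a gene name to the set of tissues where it was seen.
    How detection is called is an assumption left outside this sketch.
    """
    labels = {}
    for gene, tissues in detected.items():
        breadth = len(tissues)
        if breadth >= broad_min:
            labels[gene] = "broad"
        elif breadth == 1:
            labels[gene] = "tissue-specific"
        elif breadth == 0:
            labels[gene] = "undetected"
        else:
            labels[gene] = "intermediate"
    return labels


# Hypothetical toy input; gene and tissue names are placeholders.
tissues = [f"tissue_{i:02d}" for i in range(N_TISSUES)]
detected = {
    "geneA": set(tissues),       # detected in all 18 tissues
    "geneB": set(tissues[:16]),  # detected in 16 tissues
    "geneC": {tissues[3]},       # detected in a single tissue
    "geneD": set(tissues[:7]),   # detected in 7 tissues
    "geneE": set(),              # never detected
}

labels = classify_by_breadth(detected)
counts = Counter(labels.values())
for gene in sorted(labels):
    print(gene, labels[gene])
print(dict(counts))
```

Because unsaturated EST sampling can leave an expressed gene undetected in some tissues, a breadth threshold below the full 18 tolerates a few misses; changing `broad_min` shows how sensitive the "broad" class is to that choice.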