Discovery of Primary Active Ingredients Related Genes of Cistanche deserticola and Identification of Molecular Markers
Original title: 肉苁蓉主要药用成分相关基因的挖掘及分子标记的鉴定
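Author: Wang Xiliang (王西亮)
Publication year: 2015
Degree type: Doctoral
Supervisor: Hu Songnian (胡松年)
Degree-granting institution: University of Chinese Academy of Sciences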
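Chinese abstract (translated): Cistanche deserticola is a non-photosynthetic parasitic plant distributed mainly in the desert regions of Northwest China. Its dried fleshy stem is a valued traditional Chinese medicinal material with broad medicinal uses, such as enhancing sexual function and improving immunity. However, transcriptomic and genomic data for C. deserticola remain scarce. In this study, we performed deep whole-transcriptome sequencing of the fleshy stem and obtained 80 million paired-end reads. De novo assembly with Trinity yielded 95,787 transcripts ranging from 200 bp to 15,698 bp in length, with an average length of 950 bp. Of the 63,957 expressed transcripts (FPKM ≥ 0.5), 30,098 received functional annotation through comparison with public databases (Uniprot, Genbank nt/nr, KEGG). By comparison with the KEGG database, we annotated all enzymes involved in the lignin biosynthesis pathway. Phenylalanine ammonia-lyase (PAL) is the first key enzyme in the biosynthesis of lignin and phenylethanoid glycosides (PhGs); based on sequence similarity comparison and phylogenetic analysis, at least four PAL genes were identified in C. deserticola. PhGs are the primary medicinal components of C. deserticola, and based on transcript expression data we propose, for the first time, two possible PhG biosynthesis pathways. To further study differences in medicinal components and gene expression among fleshy stems of different germplasm, we collected fleshy stems from several germplasm sources for phytochemical measurement and transcriptome sequencing. The samples differed considerably in length and diameter; the three ZuoQi samples differed little in weight, whereas the three YouQi samples differed markedly. Acteoside, echinacoside and total glycosides were measured in each sample by HPLC, and PhG content was generally higher in YouQi samples than in ZuoQi samples. In parallel, each of the six samples underwent deep whole-transcriptome sequencing; the data were pooled and assembled with Trinity, yielding 337,132 transcripts with an average length of 1,125 bp. Of the 156,877 actively expressed transcripts (FPKM ≥ 0.5), 72,186 were functionally annotated by comparison with the Genbank nt/nr databases. Comparative analysis of expressed transcripts showed that transcripts co-expressed across samples far outnumbered sample-specific ones. Pairwise differential expression analysis indicated that expression differences among YouQi samples were smaller than among ZuoQi samples. Based on clustering of the differentially expressed transcripts, 10 molecular markers whose expression clearly co-varied with the main medicinal components were selected. In addition, we analysed SSR tags in the transcripts: one third of the transcripts contained SSRs, most of which were mononucleotide repeats. Based on the transcriptome data, we also built a public database, the CISTANCHE DESERTICOLA Genome Database. This study helps in understanding the physiological processes and medicinal value of C. deserticola at the molecular level, and may also be useful for quality assessment and the selection of cultivated varieties.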
English abstract: Cistanche deserticola is a completely non-photosynthetic parasitic plant of great medicinal value, distributed mainly in the deserts of Northwest China. Its dried fleshy stem is an important tonic in traditional Chinese medicine, used mainly to improve male sexual function and strengthen immunity, but few genomic and transcriptomic resources are available. In this study, we performed deep transcriptome sequencing of the fleshy stem of C. deserticola and generated about 80 million reads using Illumina paired-end sequencing. Using the Trinity assembler, we obtained 95,787 transcript sequences ranging from 200 bp to 15,698 bp in length, with an average length of 950 bp. Of these, 63,957 transcripts were actively expressed (FPKM ≥ 0.5), and 30,098 of those were annotated with gene descriptions or Gene Ontology terms by sequence similarity searches against several public databases (Uniprot, NR and Nt at NCBI, and KEGG). Furthermore, we identified enzymes involved in lignin biosynthesis by comparison with the KEGG database. At least four genes encoding phenylalanine ammonia-lyase (PAL), the first key enzyme in the biosynthesis of lignin and phenylethanoid glycosides (PhGs), were identified by sequence comparison and phylogenetic analysis. PhGs are known to be the primary active ingredients, and two potential PhG biosynthesis pathways in C. deserticola are proposed here for the first time. To further investigate differences in primary active ingredients and gene transcription among C. deserticola resources, samples were collected from different sites for quantification of medicinal components and mRNA sequencing. The fleshy stems varied markedly in length and diameter; in terms of weight, samples from ZuoQi varied little, while samples from YouQi varied considerably. We used HPLC to quantify echinacoside, acteoside and total glycosides; PhG content was higher in samples from YouQi than in those from ZuoQi.
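The expression cutoff quoted in the abstract (FPKM ≥ 0.5) can be illustrated with a minimal sketch. It uses the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments); the function names, transcript IDs and counts are hypothetical, and the thesis does not state which tool was used to compute its FPKM values.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def expressed_transcripts(counts, lengths, total_mapped_fragments, threshold=0.5):
    """Return {transcript_id: FPKM} for transcripts at or above the threshold."""
    kept = {}
    for tid, n in counts.items():
        value = fpkm(n, lengths[tid], total_mapped_fragments)
        if value >= threshold:
            kept[tid] = value
    return kept


# Hypothetical toy data: fragment counts and transcript lengths (bp).
counts = {"t1": 400, "t2": 10, "t3": 20}
lengths = {"t1": 950, "t2": 2000, "t3": 1000}
total = 40_000_000  # assumed number of mapped fragments

print(sorted(expressed_transcripts(counts, lengths, total)))  # ['t1', 't3']
```

Here `t2` falls below the cutoff (FPKM 0.125), while `t3` sits exactly at 0.5 and is retained because the comparison is inclusive.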
We performed deep transcriptome sequencing of those six samples using Illumina paired-end sequencing, and the clean reads were pooled. Using Trinity, we obtained 337,132 transcript sequences with an average length of 1,125 bp. Of these, 156,877 transcripts were actively expressed (FPKM ≥ 0.5), and 72,186 of those were annotated with gene descriptions by sequence similarity searches against the Genbank nt/nr databases. Transcripts co-expressed among samples outnumbered those expressed specifically in a single sample. Differential expression analysis showed that expression differences among samples from YouQi were smaller than among those from ZuoQi. Ten transcripts whose FPKM values changed in concert with PhG content across samples were selected as molecular markers. Moreover, SSR tags were detected in the transcript sequences: one third of the transcripts contained SSRs, and most SSRs were mononucleotide repeats. A public database (CISTANCHE DESERTICOLA Genome Database) was built from our transcriptome dataset. Our study will advance understanding of the physiological processes and medicinal value of C. deserticola at the molecular level, and has potential for quality evaluation and cultivar selection of C. deserticola.
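As an illustration of the SSR (simple sequence repeat) scan mentioned in the abstract, the following is a minimal regex-based sketch. The minimum repeat counts per motif length are assumptions chosen for the example, as are the function name and the test sequence; the abstract does not say which SSR tool or thresholds were used.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the thesis).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" is really a mononucleotide run.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical sequence: a 12-base poly-A run and an (AG)6 dinucleotide repeat.
example = "GGC" + "A" * 12 + "TTCT" + "AG" * 6 + "C"
print(find_ssrs(example))  # [(3, 'A', 12), (19, 'AG', 6)]
```

Applying `find_ssrs` to every assembled transcript and counting those with a non-empty result would give the fraction of SSR-containing transcripts, the kind of figure the abstract reports as one third.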
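Chinese keywords: 肉苁蓉 (Cistanche deserticola); 从头拼接 (de novo assembly); 功能注释 (functional annotation); 基因挖掘 (gene discovery); 鉴定分子标记 (identification of molecular markers)
English keywords: Cistanche deserticola, de novo assembly, functional annotation, gene discovery, identification of gene markers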
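Language: Chinese
Country: China
Source subject classification: Genomics
Source institution: Beijing Institute of Genomics, Chinese Academy of Sciences
Resource type: Thesis
Item identifier: http://119.78.100.177/qdio/handle/2XILL650/287469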
Recommended citation (GB/T 7714):
王西亮. 肉苁蓉主要药用成分相关基因的挖掘及分子标记的鉴定[D]. 中国科学院大学, 2015.