Knowledge Resource Center for Ecological Environment in Arid Area
DOI | 10.1038/s41598-023-39620-6 |
Optimizing an efficient ensemble approach for high-quality de novo transcriptome assembly of Thymus daenensis | |
Ahmadi, Hosein; Sheikh-Assadi, Morteza; Fatahi, Reza; Zamani, Zabihollah; Shokrpour, Majid | |
Corresponding author | Fatahi, R |
Source journal | SCIENTIFIC REPORTS
ISSN | 2045-2322 |
Publication year | 2023 |
Volume | 13; Issue: 1 |
Abstract (English) | Non-erroneous and well-optimized transcriptome assembly is a crucial prerequisite for authentic downstream analyses. Each de novo assembler has its own algorithm-dependent pros and cons in handling assembly issues and should be specifically tested for each dataset. Here, we examined the efficiency of seven state-of-the-art assemblers on ~30 Gb of data obtained from mRNA sequencing of Thymus daenensis. In an ensemble workflow, combining the outputs of different assemblers with an additional redundancy-reducing step could generate an optimized outcome in terms of completeness, annotatability, and ORF richness. Based on the normalized scores of 16 benchmarking metrics, the assemblers ranked from best to worst as follows: EvidentialGene, BinPacker, Trinity, rnaSPAdes, CAP3, IDBA-trans, and Velvet-Oases. EvidentialGene, as the best assembler, produced a total of 316,786 transcripts, of which 235,730 (74%) were predicted to have a unique protein hit (on UniRef100), and half of its transcripts contained an ORF. The total number of unique BLAST hits for EvidentialGene was approximately three times greater than that of the worst assembler (Velvet-Oases). EvidentialGene could even capture 17% and 7% more average BLAST hits than BinPacker and Trinity, respectively. Although BinPacker and CAP3 produced longer transcripts, EvidentialGene showed a higher collinearity between transcript size and ORF length. Compared with the other programs, EvidentialGene yielded a higher number of optimal transcript sets, more full-length transcripts, and fewer possible misassemblies. Our findings corroborate that in non-model species, relying on a single assembler may not give an entirely satisfactory result. Therefore, this study proposes an ensemble approach that incorporates the EvidentialGene pipeline to acquire a superior assembly for T. daenensis. |
Type | Article |
Language | English |
Open access type | Green Published, gold |
Indexed in | SCI-E |
WOS accession number | WOS:001067885000075 |
WOS category | Multidisciplinary Sciences |
WOS research area | Science & Technology - Other Topics |
Resource type | Journal article |
Item identifier | http://119.78.100.177/qdio/handle/2XILL650/398611 |
Recommended citation (GB/T 7714) | Ahmadi, Hosein,Sheikh-Assadi, Morteza,Fatahi, Reza,et al. Optimizing an efficient ensemble approach for high-quality de novo transcriptome assembly of Thymus daenensis[J],2023,13(1). |
APA | Ahmadi, Hosein,Sheikh-Assadi, Morteza,Fatahi, Reza,Zamani, Zabihollah,&Shokrpour, Majid.(2023).Optimizing an efficient ensemble approach for high-quality de novo transcriptome assembly of Thymus daenensis.SCIENTIFIC REPORTS,13(1). |
MLA | Ahmadi, Hosein,et al."Optimizing an efficient ensemble approach for high-quality de novo transcriptome assembly of Thymus daenensis".SCIENTIFIC REPORTS 13.1(2023). |
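The abstract ranks assemblers by "normalized scores of 16 benchmarking metrics" but this record does not say how the scores were normalized or combined. The sketch below shows one common way such a ranking could be built, assuming min-max normalization per metric and an unweighted mean across metrics; both choices are assumptions, not the authors' stated method. The assembler names (`assembler_A` etc.), metric names, and toy values are hypothetical placeholders, not data from the paper. The only figures taken from the abstract are the two transcript counts used in the final percentage check.

```python
from typing import Dict, List, Tuple


def min_max_normalize(values: List[float], higher_is_better: bool = True) -> List[float]:
    """Scale values to [0, 1]; flip the scale when a lower raw value is better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All assemblers tie on this metric, so give each the same score.
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_assemblers(
    metrics: Dict[str, Dict[str, float]],
    higher_is_better: Dict[str, bool],
) -> List[Tuple[str, float]]:
    """Return (assembler, mean normalized score) pairs, best first.

    `metrics` maps metric name -> {assembler name -> raw value}.
    Assumes every metric has a value for every assembler.
    """
    names = sorted({name for per_metric in metrics.values() for name in per_metric})
    totals = {name: 0.0 for name in names}
    for metric, per_metric in metrics.items():
        normalized = min_max_normalize(
            [per_metric[name] for name in names], higher_is_better[metric]
        )
        for name, score in zip(names, normalized):
            totals[name] += score
    means = {name: total / len(metrics) for name, total in totals.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy values for illustration only (not from the paper).
toy_metrics = {
    "unique_blast_hits": {"assembler_A": 300.0, "assembler_B": 200.0, "assembler_C": 100.0},
    "possible_misassemblies": {"assembler_A": 10.0, "assembler_B": 30.0, "assembler_C": 25.0},
}
toy_direction = {"unique_blast_hits": True, "possible_misassemblies": False}

ranking = rank_assemblers(toy_metrics, toy_direction)
for name, score in ranking:
    print(f"{name}: {score:.3f}")

# Arithmetic check on the two counts quoted in the abstract.
pct_with_hit = 100 * 235_730 / 316_786
print(f"Transcripts with a unique protein hit: {pct_with_hit:.1f}%")
```

An unweighted mean treats all metrics as equally important; a weighted mean or a rank-sum would be equally plausible alternatives and could change the order when assemblers score closely.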