DOI: 10.1186/s12859-018-2384-y
TransFlow: a modular framework for assembling and assessing accurate de novo transcriptomes in non-model organisms
Seoane, Pedro (1); Espigares, Marina (1); Carmona, Rosario (2); Polonio, Alvaro (3,4); Quintana, Julia (5); Cretazzo, Enrico (6); Bota, Josefina (7); Perez-Garcia, Alejandro (3,4); de Dios Alche, Juan (2); Gomez, Luis (8,9); Gonzalo Claros, M. (1)
Corresponding author: Gonzalo Claros, M.
Journal: BMC BIOINFORMATICS
ISSN: 1471-2105
Publication year: 2018
Volume: 19
Abstract

Background: Advances in high-throughput sequencing technologies are enabling the de novo assembly of transcriptomes from an increasing number of new organisms. Some degree of automation and evaluation is required to ensure reproducibility, repeatability and the selection of the best possible transcriptome. Workflows and pipelines are becoming an absolute requirement for this purpose, but the evaluation of de novo transcriptome assemblies in organisms lacking a sequenced genome remains unsolved. This work describes TransFlow, an automated, reproducible and flexible framework for accomplishing this task.


Results: TransFlow, with its five independent modules, was designed to build different workflows depending on the nature of the original reads. This architecture enables different combinations of Illumina and Roche/454 sequencing data, and can be extended to other sequencing platforms. Its capabilities are illustrated with the selection of reliable plant reference transcriptomes and the assembly of six transcriptomes (three case studies for grapevine leaves, olive tree pollen and chestnut stem, and another three for haustorium, epiphytic structures and their combination for the phytopathogenic fungus Podosphaera xanthii). The Arabidopsis and poplar transcriptomes proved to be the best references. A common result across the de novo assemblies is that Illumina paired-end reads of 100 nt in length assembled with OASES can provide reliable transcriptomes, while the contribution of longer reads is noticeable only when they complement a set of short, single reads.


Conclusions: TransFlow can handle up to 181 different assembly strategies. Evaluation based on principal component analysis allows it to adapt to different sets of reads and to provide a suitable transcriptome for each combination of reads and assemblers. As a result, each case study shows its own behaviour and prioritises its own evaluation parameters, and the framework gives an objective and automated way of detecting the best transcriptome within a pool of candidates. The most influential factors are sequencing data type and quantity (preferably several hundred million reads of 2 x 100 nt or longer), assemblers (OASES for Illumina; MIRA4 and EULER-SR reconciled with CAP3 for Roche/454) and strategy (preferably scaffolding with OASES, and probably merging with Roche/454 when available).


Keywords: Transcriptome Assembling Workflow pipeline PCA Non-model organism
Type: Article
Language: English
Country: Spain; USA
Indexed in: SCI-E
WOS accession number: WOS:000454362600008
WOS keywords: RNA; ANNOTATION; PLANT
WOS categories: Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
WOS research areas: Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
Resource type: Journal article
Item identifier: http://119.78.100.177/qdio/handle/2XILL650/208172
Author affiliations:
1. Univ Malaga, Dept Biol Mol & Bioquim, Campus Teatinos S-N, E-29071 Malaga, Spain;
2. CSIC, Estn Expt Zaidin, Dept Biochem Cell & Mol Biol Plants, Plant Reprod Biol Lab, Prof Albareda 1, Granada 18160, Spain;
3. Univ Malaga, Consejo Super Invest Cient IHSM UMA CSIC, Dept Microbiol, Campus Teatinos S-N, E-29071 Malaga, Spain;
4. Univ Malaga, Consejo Super Invest Cient IHSM UMA CSIC, Inst Hortofruticultura Subtrop & Mediterranea La, Campus Teatinos S-N, E-29071 Malaga, Spain;
5. Worcester Polytech Inst, Dept Chem & Biochem, 100 Inst Rd, Worcester, MA 01609 USA;
6. Inst Andaluz Invest & Formac Agr IFAPA, Ctr Churriana, Cortijo de la Cruz S-N, Churriana 29140, Spain;
7. Univ Illes Balears, Dept Biol, Grp Recerca Biol Plantes Cond Mediterranies, Carretera Valldemossa, Km 7-5, Palma De Mallorca 07122, Spain;
8. Univ Politecn Madrid, ETSI Forestal Montes & Medio Nat, Dept Sistemas & Recursos Nat, Ciudad Univ, E-28040 Madrid, Spain;
9. Univ Politecn Madrid, INIA, CBGP, Campus Montegancedo, Pozuelo De Alarcon 28223, Spain
Recommended citation
GB/T 7714: Seoane, Pedro, Espigares, Marina, Carmona, Rosario, et al. TransFlow: a modular framework for assembling and assessing accurate de novo transcriptomes in non-model organisms[J], 2018, 19.
APA: Seoane, Pedro, Espigares, Marina, Carmona, Rosario, Polonio, Alvaro, Quintana, Julia, ... & Gonzalo Claros, M. (2018). TransFlow: a modular framework for assembling and assessing accurate de novo transcriptomes in non-model organisms. BMC BIOINFORMATICS, 19.
MLA: Seoane, Pedro, et al. "TransFlow: a modular framework for assembling and assessing accurate de novo transcriptomes in non-model organisms." BMC BIOINFORMATICS 19 (2018).