HybPiper: Extracting coding sequence and introns for phylogenetics from high‐throughput sequencing reads using target enrichment

Johnson, Matthew; Gardner, Elliot; Liu, Yang; Medina Bujalance, Rafael; Goffinet, Bernard; Shaw, Jonathan; Zerega, Nyree; Wickett, Norman

Official URL

https://doi.org/10.3732/apps.1600016

Publication date

2016

Authors

Johnson, Matthew

Gardner, Elliot

Liu, Yang

Medina Bujalance, Rafael

Goffinet, Bernard

Shaw, Jonathan

Zerega, Nyree

Wickett, Norman

Publisher

Botanical Society of America

URI

https://hdl.handle.net/20.500.14352/95646

Citation

Johnson, Matthew G., et al. "HybPiper: Extracting Coding Sequence and Introns for Phylogenetics from High‐throughput Sequencing Reads Using Target Enrichment". Applications in Plant Sciences, vol. 4, no. 7, July 2016, p. 1600016. https://doi.org/10.3732/apps.1600016.

Abstract

Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae).

Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus.

Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.

Description

This research was funded by National Science Foundation grants to A.J.S. (DEB-1239980), B.G. (DEB-1240045 and DEB-1146295), N.J.W. (DEB-1239992), and N.J.C.Z. (DEB-0919119), and by a grant from the Northwestern University Institute for Sustainability and Energy (N.J.C.Z.). Data generated for this study can be found at www.artocarpusresearch.org, www.datadryad.org (http://dx.doi.org/10.5061/dryad.3293r), and the NCBI Sequence Read Archive (SRA; BioProject PRJNA301299).

UCM subjects

Molecular biology (Biology)

Unesco subjects

2415.02 Plant Molecular Biology

Collections

Articles
