ezTree: An automated pipeline for identifying phylogenetic marker genes and inferring evolutionary relationships among uncultivated prokaryotic draft genomes

Research output: Contribution to journal (Article)

8 Citations (Scopus)

Abstract

Background: Inferring phylogenetic trees for newly recovered genomes from metagenomic samples is very useful in determining the identities of uncultivated microorganisms. Even though 16S ribosomal RNA small subunit genes have been established as "gold standard" markers for inferring phylogenetic trees, they usually cannot be assembled very well in metagenomes due to shared regions among 16S genes. Using single-copy marker genes to build genome trees has become increasingly popular for uncultivated species. Predefined marker gene sets were discovered and have been applied in various genomic studies; however these gene sets might not be adequate for novel, uncultivated, draft, or incomplete genomes. The automatic identification of marker gene sets among a set of genomes with different assembly qualities has thus become a very important task for inferring reliable phylogenetic relationships for microbial populations. Results: A computational pipeline, ezTree, was developed to automatically identify single-copy marker genes for a group of genomes and build phylogenetic trees from the marker genes. Testing ezTree on a group of proteobacteria species revealed that ezTree was highly effective in pinpointing marker genes and constructing reliable trees for different groups of bacterial genomes. Applying ezTree to genomes that were recently recovered from metagenomes also showed that ezTree can help elucidate taxonomic relationships among newly recovered genomes and existing ones. Conclusions: The development of ezTree can help scientists build reliable phylogenetic trees for uncultivated species retrieved from environmental samples. The uncovered single-copy marker genes may also provide crucial hints for understanding shared features of a group of microbes. The ezTree pipeline is freely available at https://github.com/yuwwu/ezTree under a GNU GPLv3 license.
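As a rough illustration of what "single-copy marker genes for a group of genomes" means, the sketch below keeps only the gene families that occur exactly once in every genome of a set. It is a minimal example under stated assumptions, not ezTree's actual implementation: the function name `single_copy_markers`, the genome labels, and the family identifiers are all hypothetical, and the abstract does not say how ezTree assigns genes to families.

```python
from collections import Counter


def single_copy_markers(genome_to_families):
    """Return the gene families found exactly once in every genome.

    genome_to_families maps a genome name to the list of gene-family
    identifiers detected in it (one entry per gene, so repeats mean
    multiple copies). Both are hypothetical inputs for illustration.
    """
    shared = None
    for families in genome_to_families.values():
        counts = Counter(families)
        # Families present in exactly one copy in this genome.
        once = {fam for fam, n in counts.items() if n == 1}
        # Keep only families that were single-copy in all genomes so far.
        shared = once if shared is None else shared & once
    return shared if shared is not None else set()


# Hypothetical toy input: famZ is duplicated in genome_A and famY in genome_C.
genomes = {
    "genome_A": ["famX", "famY", "famZ", "famZ"],
    "genome_B": ["famX", "famY", "famZ"],
    "genome_C": ["famX", "famY", "famY", "famZ"],
}

print(sorted(single_copy_markers(genomes)))  # → ['famX']
```

A strict "exactly once in every genome" rule is the simplest choice. For draft or incomplete genomes, such as those the abstract targets, a real pipeline would likely need to tolerate missing genes, which this sketch does not attempt.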

Original language: English
Article number: 921
Journal: BMC Genomics
Volume: 19
DOI: 10.1186/s12864-017-4327-9
Publication status: Published - Jan 19 2018

Fingerprint

Genome
Genes
Metagenome
16S Ribosomal RNA
Bacterial Genomes
Metagenomics
Proteobacteria
Licensure
Population

Keywords

  • Marker gene
  • Phylogenetic tree
  • Uncultivated species

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

@article{457800ba4ea5409e8cbad9fcca110c34,
title = "ezTree: An automated pipeline for identifying phylogenetic marker genes and inferring evolutionary relationships among uncultivated prokaryotic draft genomes",
abstract = "Background: Inferring phylogenetic trees for newly recovered genomes from metagenomic samples is very useful in determining the identities of uncultivated microorganisms. Even though 16S ribosomal RNA small subunit genes have been established as {"}gold standard{"} markers for inferring phylogenetic trees, they usually cannot be assembled very well in metagenomes due to shared regions among 16S genes. Using single-copy marker genes to build genome trees has become increasingly popular for uncultivated species. Predefined marker gene sets were discovered and have been applied in various genomic studies; however these gene sets might not be adequate for novel, uncultivated, draft, or incomplete genomes. The automatic identification of marker gene sets among a set of genomes with different assembly qualities has thus become a very important task for inferring reliable phylogenetic relationships for microbial populations. Results: A computational pipeline, ezTree, was developed to automatically identify single-copy marker genes for a group of genomes and build phylogenetic trees from the marker genes. Testing ezTree on a group of proteobacteria species revealed that ezTree was highly effective in pinpointing marker genes and constructing reliable trees for different groups of bacterial genomes. Applying ezTree to genomes that were recently recovered from metagenomes also showed that ezTree can help elucidate taxonomic relationships among newly recovered genomes and existing ones. Conclusions: The development of ezTree can help scientists build reliable phylogenetic trees for uncultivated species retrieved from environmental samples. The uncovered single-copy marker genes may also provide crucial hints for understanding shared features of a group of microbes. The ezTree pipeline is freely available at https://github.com/yuwwu/ezTree under a GNU GPLv3 license.",
keywords = "Marker gene, Phylogenetic tree, Uncultivated species",
author = "Wu, {Yu Wei}",
year = "2018",
month = "1",
day = "19",
doi = "10.1186/s12864-017-4327-9",
language = "English",
volume = "19",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central Ltd.",

}

TY - JOUR

T1 - ezTree

T2 - An automated pipeline for identifying phylogenetic marker genes and inferring evolutionary relationships among uncultivated prokaryotic draft genomes

AU - Wu, Yu Wei

PY - 2018/1/19

Y1 - 2018/1/19

N2 - Background: Inferring phylogenetic trees for newly recovered genomes from metagenomic samples is very useful in determining the identities of uncultivated microorganisms. Even though 16S ribosomal RNA small subunit genes have been established as "gold standard" markers for inferring phylogenetic trees, they usually cannot be assembled very well in metagenomes due to shared regions among 16S genes. Using single-copy marker genes to build genome trees has become increasingly popular for uncultivated species. Predefined marker gene sets were discovered and have been applied in various genomic studies; however these gene sets might not be adequate for novel, uncultivated, draft, or incomplete genomes. The automatic identification of marker gene sets among a set of genomes with different assembly qualities has thus become a very important task for inferring reliable phylogenetic relationships for microbial populations. Results: A computational pipeline, ezTree, was developed to automatically identify single-copy marker genes for a group of genomes and build phylogenetic trees from the marker genes. Testing ezTree on a group of proteobacteria species revealed that ezTree was highly effective in pinpointing marker genes and constructing reliable trees for different groups of bacterial genomes. Applying ezTree to genomes that were recently recovered from metagenomes also showed that ezTree can help elucidate taxonomic relationships among newly recovered genomes and existing ones. Conclusions: The development of ezTree can help scientists build reliable phylogenetic trees for uncultivated species retrieved from environmental samples. The uncovered single-copy marker genes may also provide crucial hints for understanding shared features of a group of microbes. The ezTree pipeline is freely available at https://github.com/yuwwu/ezTree under a GNU GPLv3 license.

KW - Marker gene

KW - Phylogenetic tree

KW - Uncultivated species

UR - http://www.scopus.com/inward/record.url?scp=85040796775&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85040796775&partnerID=8YFLogxK

U2 - 10.1186/s12864-017-4327-9

DO - 10.1186/s12864-017-4327-9

M3 - Article

C2 - 29363425

AN - SCOPUS:85040796775

VL - 19

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 921

ER -