The phylogenomic species concept, which combines phylogenetic and genomic analyses, can be used to circumscribe species
James T. Staley
Jim Staley is Professor Emeritus in the Department of Microbiology, University of Washington, Seattle. He can be reached at
● A phylogenomic species concept that relies on phylogenetic and genomic approaches for circumscribing species is proposed.
● Although phylogenetic analyses of 16S rRNA sequences are currently used to ascertain the taxonomy of Bacteria and Archaea at higher taxonomic levels, less highly conserved genes must be used for species.
● Current applications of the phylogenomic species concept, such as multiple-locus sequence analysis, are already being used to identify clades that can be classified as species.
● Horizontal gene transfers pose a major challenge for any taxonomy, but genomic approaches will help resolve this issue.
● The phylogenomic species concept could apply universally to all organisms.
Bacteriologists have not yet adopted a concept for a species. Bacterial and archaeal species are defined on the basis of phenotypic properties and whole-genome DNA-DNA hybridization. Each species must have unique phenotypic properties and exhibit more than 70% DNA hybridization among strains. This combination of phenotype and genotype, sometimes referred to as the polyphasic species definition, was a breakthrough in bacterial taxonomy and has served microbiologists very well by stabilizing the field and bringing uniformity to classifying species of Bacteria and Archaea.
The 1990s brought DNA, RNA, and protein sequencing to the fore, and they soon were adapted for use in phylogenetic analysis. The advantage of sequencing approaches from a taxonomic viewpoint is that sequences can be used to infer the evolution of lineages. The highly conserved 16S rRNA gene became the primary macromolecule for phylogeny because of its fidelity in deducing the relatedness of Bacteria and Archaea at taxonomic levels at and above the genus level. As a result, the entire second edition of Bergey's Manual of Systematic Bacteriology uses the phylogenetic approach for classifying Bacteria and Archaea (www.bergeys.org). Therefore, for the first time, there is a complete hierarchical taxonomy for the Bacteria and Archaea from the domain down to the genus. However, despite the wide acceptance of the phylogenetic approach for higher taxa, it has not yet been successfully applied at the species level.
The Species Dilemma
Initially there was great anticipation that 16S rRNA gene sequences could be used to define species. However, virtually identical 16S sequences can be found in two organisms that are different species based on the polyphasic definition, so bacterial taxonomists have retained the current definition. Although 16S rRNA gene sequencing cannot differentiate species, it has some benefit in bacterial taxonomy at the species level by placing constraints on what comprises a species. Studies using both DNA hybridization and 16S rRNA gene sequence data illustrate that if two strains show less than 97% 16S rRNA gene sequence similarity, they are separate species. Indeed, Jyoti Keswani and William Whitman at the University of Georgia in Athens find that for most species, similarity values as high as 99% distinguish species with confidence. Therefore, it is unnecessary to carry out the somewhat arduous task of DNA hybridization for those organisms. But, as helpful as this might be for those who have isolated a novel strain, it is of no help in determining what constitutes a species.
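The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a published tool: it assumes the two 16S rRNA gene sequences have already been aligned to equal length, it skips gapped columns, and the function names and the default cutoff parameter are hypothetical. The 97% cutoff comes from the text; a higher value such as 99% could be passed for groups where that is justified.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def screen_16s(seq_a: str, seq_b: str, cutoff: float = 97.0) -> str:
    """Apply the 16S screen: below the cutoff, the strains are separate species.

    At or above the cutoff the 16S gene cannot decide, so DNA-DNA
    hybridization (or another method) is still required.
    """
    if percent_identity(seq_a, seq_b) < cutoff:
        return "separate species"
    return "unresolved: DNA-DNA hybridization needed"
```

Note that the rule is one-sided: a value below the cutoff separates two strains, but a value at or above it says nothing about whether they belong to the same species.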
The major dilemma facing microbial taxonomists is whether to retain the current polyphasic species definition, recognizing that it is not based on the evolutionary process of speciation, or whether to adopt a concept that is consistent with how species evolve.
How Do Organisms Form Species?
Animals and plants speciate primarily due to geographic separation by a process called allopatric speciation. For example, when a group of animals of one species becomes physically separated from other members of the species by an event such as continental drift, the two isolated populations begin to evolve separately by mutation, selection, and/or genetic drift. This is how humans and chimpanzees are thought to have diverged from their common ancestor. Humans evolved in the drier, more open savannah that was formed following the great rift in east Africa, while the chimpanzees evolved independently in the original jungle setting.
Another process, referred to as sympatric speciation, occurs when changes in the local environment no longer allow interbreeding within the species, resulting in divergence and, eventually, two separate species. This form of speciation, which is driven by physical, chemical, or biological factors and is therefore ecological in nature, results from modifications of the habitat that allow for the selection of appropriate mutants and the evolution of novel species. An example is a new plant species that evolved in the same vicinity as its ancestral lineage after acid leached from a mine changed its breeding season.
Bacterial and archaeal speciation is less well understood. However, ample evidence suggests that ecological factors, analogous to sympatry, play a major role in bacterial speciation. For example, specific bacterial diseases of plants and animals are caused by particular species or subspecies of bacteria.
Allopatric speciation is more controversial for microbiologists because of the difficulty in providing evidence for its occurrence. However, recent studies support the importance of the role of geography (i.e., allopatry) in bacterial and archaeal speciation. In particular, studies of the archaeon Sulfolobus islandicus in Rachel Whitaker's lab at the University of Illinois provide the most solid evidence. Numerous strains of Sulfolobus islandicus were isolated from hot springs in Iceland, Yellowstone National Park in Wyoming (Fig. 1), Lassen National Park in California, and two areas in Kamchatka, Russia. Although 16S rRNA gene sequences could not resolve any differences among the strains from the different locations, sequences of less highly conserved housekeeping genes showed that the strains from Iceland clustered separately from those of North America, while strains from both of these habitats were different from those in Kamchatka. These geographic clusters could be regarded as separate species if diagnosable characters, including sequence differences, can be found that indicate each geographic cluster has unique characteristics.
Universality of Species Concepts
Among the numerous species concepts that have been proposed, not many have the potential to be universal, i.e., applicable to all organisms. Among the most favored concepts are the biological, morphological, and phylogenetic species concepts. The biological concept promoted by Ernst Mayr defines a species as a group of organisms that can interbreed to produce fertile progeny. This concept is not broadly applicable to procaryotes or even many types of eucaryotes. Although bacterial genetic exchange occurs through transformation, transduction, and conjugation, it is not confined to interbreeding species. Indeed, it can extend to genera and, in some cases, domains. Furthermore, eucaryotic microbiologists and zoologists who study simple invertebrate animals find it difficult to determine whether sexuality is important for reproduction for a wide range of species. Therefore, although it is widely regarded, the biological concept cannot be seriously considered as a universal species concept.
By far the most widely used concept is the morphological concept based on the structural features of an organism. While the morphological concept is used by many zoologists, botanists, eucaryotic microbiologists, and paleontologists, it is of no practical use for bacteria and archaea because their cell shapes are too simple for this to be used as the basis of a universal species concept.
I have amended the phylogenetic concept to include genomic analyses. It is referred to here as the phylogenomic species concept (PSC) rather than the genomic-phylogenetic species concept as originally proposed in 2006. Genomes provide taxonomists not only with extensive phylogenetic information but also with other genomic information, such as synteny, as well as hybridization and gene expression analyses that enable further comparison among different strains. The strengths of the PSC are that it implies the evolutionary history of an organism through sequence and genomic analyses of its macromolecules, it is practical to apply, the sequences are archival, and the sequence information can be readily distributed and shared with others. Perhaps most importantly, the PSC is applicable to all cultivable microorganisms and has the potential to be extended to all living organisms.
Applying the Phylogenomic Species Concept
Bacteriologists already use the PSC to identify clusters of strains of Bacteria and Archaea. For example, Brian Spratt at Imperial College London and his collaborators have used multiple-locus sequence analysis (MLSA), a phylogenetic approach designed to infer relatedness among strains of various bacterial pathogens. Their approach typically entails choosing five to eight genes depending on the genus of interest. These are sequenced, and the resultant individual gene sequences are linked in tandem, or concatenated, before using them for phylogenetic analysis.
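The concatenation step can be illustrated with a short Python sketch. It is a toy example under stated assumptions, not the pipeline used by any of the groups named here: the locus names, strain names, and sequences are invented, each locus is assumed to be aligned already, and a simple proportion of differing sites (p-distance) stands in for the tree-building methods that would follow in a real analysis.

```python
from itertools import combinations

# Invented toy data: {locus: {strain: aligned sequence}}.
# Real MLSA would use five to eight housekeeping genes, not two short strings.
toy_alignments = {
    "locusA": {"s1": "ACGT", "s2": "ACGT", "s3": "ACGA"},
    "locusB": {"s1": "GGCC", "s2": "GGCA", "s3": "TGCA"},
}


def concatenate_loci(alignments, locus_order):
    """Link each strain's aligned gene sequences in tandem, in a fixed locus order."""
    strains = set(alignments[locus_order[0]])
    for locus in locus_order:
        if set(alignments[locus]) != strains:
            raise ValueError(f"locus {locus!r} does not cover the same strains")
    return {
        strain: "".join(alignments[locus][strain] for locus in locus_order)
        for strain in sorted(strains)
    }


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns that contain a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal aligned length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def pairwise_distances(concatenated):
    """Distance for every unordered pair of strains, keyed by (strain, strain)."""
    return {
        (s, t): p_distance(concatenated[s], concatenated[t])
        for s, t in combinations(sorted(concatenated), 2)
    }


concatenated = concatenate_loci(toy_alignments, ["locusA", "locusB"])
distances = pairwise_distances(concatenated)
```

In this toy set, strains s1 and s2 differ at one of eight concatenated sites, while s3 is more distant from both. A distance matrix of this kind is only the input to clustering or tree building, where clades that might be classified as species are then identified.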
For example, MLSA was used for distinguishing pathogens in the Streptococcus pneumoniae group, including strains of S. mitis and S. oralis that could not be separated on the basis of phenotypic tests (Fig. 2). Moreover, MLSA confirmed DNA hybridization tests indicating that S. pseudopneumoniae is a new species. In analyzing strains of Sulfolobus islandicus, Rachel Whitaker also applied the MLSA approach, in part because 16S rRNA gene sequencing analysis could not resolve the contributions of geographic separation to speciation. Although the question of whether strains from the different locations are sufficiently different to be considered separate species remains open, her study supports the PSC approach for identifying novel lineages that might be new species. Genomic analysis provides not only DNA sequence information but also insights into genome organization and gene hybridization and expression data, and will help in resolving questions about horizontal gene transfer and biogeography. Kostas Konstantinidis and James Tiedje at Michigan State University in East Lansing have applied genomic approaches to species in several genera, including Burkholderia, Shewanella, Escherichia, Staphylococcus, and Streptococcus.
The PSC can also be applied to symbioses. For instance, Buchnera aphidicola bacteria are obligate symbionts of aphids. The aphids provide the bacteria a home and nutrition, while the bacteria furnish aphids with essential amino acids. Phylogenetic analysis and the fossil record support this symbiosis as a coevolutionary process, according to Nancy Moran at the University of Arizona, Paul and Linda Baumann at the University of California, Davis, and their collaborators.
Logically, from a taxonomic standpoint, if this is a coevolutionary association, then it must also be considered a cospeciation process. Yet, although many families, genera, and species of aphids are identified and named, only a single species of the bacterial symbiont, B. aphidicola, is named! With about 4,000 known species of aphids, this single bacterial species is likely a stand-in for at least 4,000 species, many genera, several families, and one order of the Bacteria. Therefore, potentially this single symbiosis could raise the number of named species of Bacteria and Archaea to more than 11,000. A related question is whether some commensal bacteria form coevolutionary partners with their host organisms. Because commensal bacteria are common in animals, if even a small fraction coevolve, very large numbers of new bacterial species await discovery.
Phylogenies, Phenotypes, and Horizontal Gene Transfer
A key issue confronting bacterial taxonomists is horizontal gene transfer, a problem that surfaces at all taxonomic levels. At the highest levels, the Tree of Life may be more a Web of Life because of extensive horizontal gene transfers between the major phyla and domains, according to Ford Doolittle at Dalhousie University in Halifax, Nova Scotia, Canada. For example, bacteria from the Prosthecobacter genus carry homologs of eucaryotic tubulin genes that likely came from a member of the Eucarya via an interdomain horizontal gene transfer (Fig. 3).
In general, horizontal gene transfers tend to occur between closely related organisms and thus occur more extensively between genera and species than at the level of phylum or family. The "fuzzy species" that cannot be resolved using MLSA are excellent examples. Although many strains of Neisseria meningitidis, N. lactamica, and N. gonorrhoeae can be differentiated using seven-gene MLSA, some strains fall between N. lactamica and N. meningitidis. Because those strains are not resolved, these two species are considered "fuzzy."
Horizontal gene transfer poses a major problem for a taxonomy that is based strictly on phylogeny or phenotypic properties. For instance, relying on genes that are commonly exchanged among species would give rise to a misleading phylogeny. Similarly, horizontal gene transfers also confuse taxonomies that rely on phenotype. Genomic approaches should prove helpful in clarifying these issues.
In addition, genomics could help resolve seeming contradictions between phenotype and phylogeny in taxonomy. Does the single gene or an undetected pathway of multiple genes exist in all strains of a species? If so, then the difference between strains may be due to whether a single gene is expressed. Expression arrays can be designed to assess whether some strains express specific genes under different conditions from other strains.
Toward a Universal Species Concept for Biology
Increasingly, evidence indicates that all organisms (Bacteria, Archaea, and Eucarya) speciate in response to ecological and geographic factors through a process that can be inferred using phylogenomic approaches. Therefore, we can develop a universal concept to define species. In order to develop a universal concept, bacteriologists will need to work closely with other biologists to reach agreement on which species concept should be adopted. If the phylogenomic species concept or a comparable concept is approved, all species would be classified by the same criteria, which could help unite biology by completing the taxonomy of the entire Tree of Life from Domain to Species.
I thank Frank Harold especially for suggesting the term "phylogenomic," as well as Micah Krichevsky, Robert G. E. Murray, Eugene Nester, Brian Spratt, and William Whitman for their many helpful suggestions for improvement of the manuscript.
Bishop, C. J., D. M. Aanensen, G. E. Jordan, M. Kilian, W. P. Hanage, and B. G. Spratt. 2009. Assigning strains to bacterial species via the internet. BMC Biol. 7:3.
Doolittle, W. F. 1999. Phylogenetic classification and the universal tree. Science 284:2124-2128.
Hanage, W. P., C. Fraser, and B. G. Spratt. 2005. Fuzzy species among recombinogenic bacteria. BMC Biol. 3:6-13.
Jenkins, C., R. Samudrala, I. Anderson, B. P. Hedlund, G. Petroni, N. Michailova, N. Pinel, R. Overbeek, G. Rosati, and J. T. Staley. 2002. Genes for the cytoskeletal protein tubulin in the bacterial genus Prosthecobacter. Proc. Natl. Acad. Sci. USA 99:17049-17054.
Keswani, J., and W. B. Whitman. 2001. Relationship of 16S rRNA sequence similarity to DNA hybridization in prokaryotes. Int. J. System. Evol. Microbiol. 51:667-678.
Konstantinidis, K. T., A. Ramette, and J. M. Tiedje. 2006. The bacterial species definition in the genomic era. Phil. Trans. R. Soc. B 361:1929-1940.
Moran, N. A., M. A. Munson, P. Baumann, and H. Ishikawa. 1993. A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts. Phil. Trans. R. Soc. B 253:167-171.
Staley, J. T. 1999. Bacterial biodiversity: a time for place. ASM News 10:1-6.
Staley, J. T. 2006. The bacterial species dilemma and the genomic-phylogenetic species concept. Phil. Trans. R. Soc. B 361:1899-1909.
Whitaker, R. J., D. W. Grogan, and J. W. Taylor. 2003. Geographic barriers isolate endemic populations of hyperthermophilic Archaea. Science 301:976-978.