ABSTRACT: The most widely used evolutionary model for phylogenetic trees is the equal-rates Markov (ERM) model. A problem is that the ERM model predicts less imbalance than observed for trees inferred from real data; in fact, the observed imbalance tends to fall between the values predicted by the ERM model and those predicted by the proportional-to-distinguishable-arrangements (PDA) model. Here, a continuous multi-rate (MR) family of evolutionary models is presented which contains entire subfamilies corresponding to both the PDA and ERM models. Furthermore, this MR family covers an entire range from 'completely balanced' to 'completely unbalanced' models. In particular, the MR family contains other known evolutionary models. The MR family is very versatile and virtually free of assumptions on the character of evolution; yet it is highly amenable to rigorous analysis. In particular, such analyses help to uncover adaptability, quasi-stabilization and prolonged stasis as major possible causes of the imbalance. However, the MR model is functionally simple and requires only three parameters to reproduce the observed imbalance.
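To make the notion of imbalance under the ERM model concrete, the following is a minimal sketch, not the authors' method. It grows a tree topology by repeatedly splitting a uniformly chosen leaf, which is the usual way to simulate ERM (Yule) shapes, and scores it with the Colless index, one common imbalance statistic that the abstract does not name. The function names and the list-based tree encoding are assumptions made for illustration only.

```python
import random


def simulate_erm_tree(n_tips, rng):
    """Grow a rooted binary topology under the ERM model (illustrative sketch).

    children[i] is None for a leaf, or a (left, right) pair of node ids.
    """
    children = [None]          # start from a single root leaf
    leaves = [0]
    while len(leaves) < n_tips:
        idx = rng.randrange(len(leaves))   # every extant leaf is equally likely to split
        parent = leaves[idx]
        left, right = len(children), len(children) + 1
        children.extend([None, None])
        children[parent] = (left, right)
        leaves[idx] = left                 # the parent is replaced by its two daughters
        leaves.append(right)
    return children


def colless_index(children):
    """Sum over internal nodes of |left clade size - right clade size|."""
    size = [1] * len(children)
    total = 0
    # Daughters always receive larger ids than their parent,
    # so a reverse pass visits daughters before parents.
    for node in range(len(children) - 1, -1, -1):
        if children[node] is not None:
            left, right = children[node]
            size[node] = size[left] + size[right]
            total += abs(size[left] - size[right])
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [colless_index(simulate_erm_tree(50, rng)) for _ in range(200)]
    print("mean Colless index over 200 ERM trees:", sum(scores) / len(scores))
```

For n tips the Colless index ranges from 0 for a fully balanced tree (when n is a power of two) to (n-1)(n-2)/2 for a caterpillar tree, so simulated ERM values can be placed on that scale and compared with values computed from empirical trees.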
Project description:Tens of thousands of phylogenetic trees, describing the evolutionary relationships between hundreds of thousands of taxa, are readily obtainable from various databases. From such trees, inferences can be made about the underlying macroevolutionary processes, yet remarkably these processes are still poorly understood. Simple and widely used evolutionary null models are problematic: Empirical trees show very different imbalance between the sizes of the daughter clades of ancestral taxa compared to what models predict. Obtaining a simple evolutionary model that is both biologically plausible and produces the imbalance seen in empirical trees is a challenging problem, to which none of the existing models provide a satisfying answer. Here we propose a simple, biologically plausible macroevolutionary model in which the rate of speciation decreases with species age, whereas extinction rates can vary quite generally. We show that this model provides a remarkable fit to the thousands of trees stored in the online database TreeBase. The biological motivation for the identified age-dependent speciation process may be that recently evolved taxa often colonize new regions or niches and may initially experience little competition. These new taxa are thus more likely to give rise to further new taxa than a taxon that has remained largely unchanged and is, therefore, well adapted to its niche. We show that age-dependent speciation may also be the result of different within-species populations following the same laws of lineage splitting to produce new species. As the fit of our model to the tree database shows, this simple biological motivation provides an explanation for a long standing problem in macroevolution.
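The idea of a speciation rate that decreases with species age can be sketched as follows. This is an illustration under explicit assumptions, not the model fitted in the study: waiting times to speciation are drawn from a Weibull distribution with shape below 1 (which gives a decreasing hazard), both daughter species start at age zero, and extinction is ignored. The function names, the Weibull choice and the parameter values are all hypothetical.

```python
import heapq
import random


def simulate_age_dependent_tree(n_tips, shape, rng, scale=1.0):
    """Pure-birth tree where each species waits a Weibull time before splitting.

    Assumptions (illustrative only): no extinction, and both daughters are
    treated as new species of age zero. shape < 1 gives a speciation rate
    that decreases with age; shape == 1 is the constant-rate case.
    """
    children = [None]  # None marks a leaf, otherwise (left, right) node ids
    # Heap of (absolute time at which the species speciates, node id).
    pending = [(rng.weibullvariate(scale, shape), 0)]
    while len(pending) < n_tips:           # len(pending) == number of extant species
        now, parent = heapq.heappop(pending)
        left, right = len(children), len(children) + 1
        children.extend([None, None])
        children[parent] = (left, right)
        for child in (left, right):
            wait = rng.weibullvariate(scale, shape)
            heapq.heappush(pending, (now + wait, child))
    return children


def colless_index(children):
    """Sum over internal nodes of |left clade size - right clade size|."""
    size = [1] * len(children)
    total = 0
    for node in range(len(children) - 1, -1, -1):  # daughters have larger ids
        if children[node] is not None:
            left, right = children[node]
            size[node] = size[left] + size[right]
            total += abs(size[left] - size[right])
    return total


if __name__ == "__main__":
    rng = random.Random(1)
    for shape in (1.0, 0.5):
        scores = [colless_index(simulate_age_dependent_tree(50, shape, rng))
                  for _ in range(200)]
        print(f"shape={shape}: mean Colless index {sum(scores) / len(scores):.1f}")
```

Comparing the mean Colless index for shape 1.0 (constant rate) against a shape below 1 gives a rough sense of how age dependence shifts tree balance in this simplified setting.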
Project description:Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth-death and multispecies coalescent model can explain the difference in empirical trees and birth-death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion.
Project description:The shapes of evolutionary trees are influenced by the nature of the evolutionary process but comparisons of trees from different processes are hindered by the challenge of completely describing tree shape. We present a full characterization of the shapes of rooted branching trees in a form that lends itself to natural tree comparisons. We use this characterization to define a metric, in the sense of a true distance function, on tree shapes. The metric distinguishes trees from random models known to produce different tree shapes. It separates trees derived from tropical versus USA influenza A sequences, which reflect the differing epidemiology of tropical and seasonal flu. We describe several metrics based on the same core characterization, and illustrate how to extend the metric to incorporate trees' branch lengths or other features such as overall imbalance. Our approach allows us to construct addition and multiplication on trees, and to create a convex metric on tree shapes which formally allows computation of average tree shapes.
Project description:Cancer combination therapy to treat tumors with different therapeutic approaches can efficiently improve treatment efficacy and reduce side effects. Herein, we develop a theranostic nano-platform based on polydopamine (PDA) nanoparticles, which then are exploited as a versatile carrier to allow simultaneous loading of indocyanine green (ICG), doxorubicin (DOX) and manganese ions (PDA-ICG-PEG/DOX(Mn)), to enable imaging-guided chemo & photothermal cancer therapy. In this system, ICG acts as a photothermal agent, which shows red-shifted near-infrared (NIR) absorbance and enhanced photostability compared with free ICG. DOX, a model chemotherapy drug, is then loaded onto the surface of PDA-ICG-PEG with high efficiency. With Mn(2+) ions intrinsically chelated, PDA-ICG-PEG/DOX(Mn) is able to offer contrast under T1-weighted magnetic resonance (MR) imaging. In a mouse tumor model, the MR imaging-guided combined chemo- & photothermal therapy achieves a remarkable synergistic therapeutic effect compared with the respective single treatment modality. This work demonstrates that PDA nanoparticles could serve as a versatile molecular loading platform for MR imaging guided combined chemo- & photothermal therapy with minimal side effects, showing great potential for cancer theranostics.
Project description:Phylogenetic comparative methods are widely used to understand and quantify the evolution of phenotypic traits, based on phylogenetic trees and trait measurements of extant species. Such analyses depend crucially on the underlying model. Gaussian phylogenetic models like Brownian motion and Ornstein-Uhlenbeck processes are the workhorses of modeling continuous-trait evolution. However, these models fit large trees poorly, because they neglect the heterogeneity of the evolutionary process in different lineages of the tree. Previous work has addressed this issue by introducing shifts in the evolutionary model occurring at inferred points in the tree. However, for computational reasons, in all current implementations, these shifts are "intramodel," meaning that they allow jumps in 1 or 2 model parameters, keeping all other parameters "global" for the entire tree. There is no biological reason to restrict a shift to a single model parameter or, even, to a single type of model. Mixed Gaussian phylogenetic models (MGPMs) incorporate the idea of jointly inferring different types of Gaussian models associated with different parts of the tree. Here, we propose an approximate maximum-likelihood method for fitting MGPMs to comparative data comprising possibly incomplete measurements for several traits from extant and extinct phylogenetically linked species. We applied the method to the largest published tree of mammal species with body- and brain-mass measurements, showing strong statistical support for an MGPM with 12 distinct evolutionary regimes. Based on this result, we state a hypothesis for the evolution of the brain-body-mass allometry over the past 160 million years.
Project description:The merlin-ERM (ezrin, radixin, moesin) family of proteins plays a central role in linking the cellular membranes to the cortical actin cytoskeleton. Merlin regulates contact inhibition and is an integral part of cell-cell junctions, while ERM proteins, ezrin, radixin and moesin, assist in the formation and maintenance of specialized plasma membrane structures and membrane vesicle structures. These two protein families share a common evolutionary history, having arisen and separated via gene duplication near the origin of metazoa. During approximately 0.5 billion years of evolution, the merlin and ERM family proteins have maintained both sequence and structural conservation to an extraordinary level. Comparing crystal structures of merlin-ERM proteins and their complexes, a picture emerges of the merlin-ERM proteins acting as switchable interaction hubs, assembling protein complexes on cellular membranes and linking them to the actin cytoskeleton. Given the high level of structural conservation between the merlin and ERM family proteins we speculate that they may function together.
Project description:Background: Because phylogenetic inference is an important basis for answering many evolutionary problems, a large number of algorithms have been developed. Some of these algorithms have been improved by integrating gene evolution models with the expectation of accommodating the hierarchy of evolutionary processes. To the best of our knowledge, however, there still is no single unifying model or algorithm that can take all evolutionary processes into account through a stepwise or simultaneous method. Results: On the basis of three existing phylogenetic inference algorithms, we built an integrated pipeline for inferring the evolutionary history of a given gene family; this pipeline can model gene sequence evolution, gene duplication-loss, gene transfer and multispecies coalescent processes. As a case study, we applied this pipeline to the STIMATE (TMEM110) gene family, which has recently been reported to play an important role in store-operated Ca²⁺ entry (SOCE) mediated by ORAI and STIM proteins. We inferred their phylogenetic trees in 69 sequenced chordate genomes. Conclusions: By integrating three tree reconstruction algorithms with diverse evolutionary models, a pipeline for inferring the evolutionary history of a gene family was developed, and its application was demonstrated.
Project description:Pancreatic ductal adenocarcinoma (PDA) has a 5-year survival rate of less than 5%, and therapeutic advances have been hampered by gaps in our understanding of cell-cycle control in the adult pancreas. Previously, we reported that basic Helix-Loop-Helix (bHLH) transcription factors regulate cell fate specification in the pancreas. In the present study, we found that a repressor of bHLH activity, Id3, was profoundly upregulated in ductal cells in murine models of pancreatitis and pancreatic intraepithelial neoplasia (PanIN). Id3 was also pervasively expressed in neoplastic lesions in human PDA in situ. We hypothesized that an imbalance in bHLH versus Id activity controlled cell growth in PDA. Consistent with this model, cell-cycle progression in PDA cells was impeded by siRNA-mediated depletion of Id3 or overexpression of the bHLH protein E47. The precursors of human PDA are normally quiescent duct cells which do not proliferate in response to high serum or growth factors. The finding that Id3 was expressed in pancreatitis, as well as PDA, suggested that Id3 might induce cell-cycle entry in ducts. To test this hypothesis, primary human pancreatic duct cells were transduced with an adenovirus-expressing Id3. Remarkably, Id3 expression alone was sufficient to trigger efficient cell-cycle entry, as manifested by expression of the proliferation markers Ki67, phospho-cyclin E, and phospho-histone H3. Collectively, the data establish dysregulation of the Id/bHLH axis as an early and sustained feature of ductal pathogenesis and mark this axis as a potential therapeutic target for intervention in pancreatitis and PDA.
Project description:TGF-β is an important tumor suppressor in pancreatic ductal adenocarcinoma (PDA), yet inactivation of TGF-β pathway components occurs in only half of PDA cases. TGF-β cooperates with oncogenic RAS signaling to trigger epithelial-to-mesenchymal transition (EMT) in premalignant pancreatic epithelial progenitors, which is coupled to apoptosis owing to an imbalance of SOX4 and KLF5 transcription factors. We report that PDAs that develop with the TGF-β pathway intact avert this apoptotic effect via ID1. ID1 family members are expressed in PDA progenitor cells and encode components of a set of core transcriptional regulators shared by PDAs. PDA progression selects against TGF-β-mediated repression of ID1. The sustained expression of ID1 uncouples EMT from apoptosis in PDA progenitors. AKT signaling and mechanisms linked to low-frequency genetic events converge on ID1 to preserve its expression in PDA. Our results identify ID1 as a crucial node and potential therapeutic target in PDA. SIGNIFICANCE: Half of PDAs escape TGF-β-induced tumor suppression without inactivating the TGF-β pathway. We report that ID1 expression is selected for in PDAs and that ID1 uncouples TGF-β-induced EMT from apoptosis. ID1 thus emerges as a crucial regulatory node and a target of interest in PDA. This article is highlighted in the In This Issue feature, p. 1.
Project description:Evolutionary relationships are typically inferred from molecular sequence data using a statistical model of the evolutionary process. When the model accurately reflects the underlying process, probabilistic phylogenetic methods recover the correct relationships with high accuracy. There is ample evidence, however, that models commonly used today do not adequately reflect real-world evolutionary dynamics. Virtually all contemporary models assume that relatively fast-evolving sites are fast across the entire tree, whereas slower sites always evolve at relatively slower rates. Many molecular sequences, however, exhibit site-specific changes in evolutionary rates, called "heterotachy." Here we examine the accuracy of 2 phylogenetic methods for incorporating heterotachy, the mixed branch length model--which incorporates site-specific rate changes by summing likelihoods over multiple sets of branch lengths on the same tree--and the covarion model, which uses a hidden Markov process to allow sites to switch between variable and invariable as they evolve. Under a variety of simple heterogeneous simulation conditions, the mixed model was dramatically more accurate than homotachous models, which were subject to topological biases as well as biases in branch length estimates. When data were simulated with strong versions of the types of heterotachy observed in real molecular sequences, the mixed branch length model was more accurate than homotachous techniques. Analyses of empirical data sets confirmed that the mixed branch length model can improve phylogenetic accuracy under conditions that cause homotachous models to fail. In contrast, the covarion model did not improve phylogenetic accuracy compared with homotachous models and was sometimes substantially less accurate. We conclude that a mixed branch length approach, although not the solution to all phylogenetic errors, is a valuable strategy for improving the accuracy of inferred trees.
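The phrase "summing likelihoods over multiple sets of branch lengths on the same tree" can be written out as a mixture likelihood. The notation below is an assumption introduced for illustration and does not come from the abstract: T is the tree topology, x_i is the data at site i, N is the number of sites, b_k is the k-th of K branch-length sets, w_k is its mixture weight, and θ collects the remaining substitution-model parameters.

```latex
% Mixed branch length likelihood (illustrative notation, not taken from the abstract)
L(T) \;=\; \prod_{i=1}^{N} \; \sum_{k=1}^{K} w_k \, \Pr\!\left(x_i \mid T, \mathbf{b}_k, \theta\right),
\qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1 .
```

With K = 1 this reduces to the ordinary homotachous likelihood, in which every site evolves along a single set of branch lengths.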