Project description: Background: Despite recent advances, reliable tools to simultaneously handle different types of sequencing data (e.g., target capture, genome skimming) for phylogenomics are still scarce. Here, we evaluate the performance of the recently developed pipeline Captus in comparison with the well-known target capture pipelines HybPiper and SECAPR. As test data, we analyzed newly generated sequences for the genus Thladiantha (Cucurbitaceae), for which no well-resolved phylogeny estimate has been available so far, as well as simulated reads derived from the genome of Arabidopsis thaliana. Results: Our pipeline comparisons are based on (1) the time needed for data assembly and locus extraction, (2) locus recovery per sample, (3) the number of informative sites in nucleotide alignments, and (4) the topology of the nuclear and plastid phylogenies. Additionally, the simulated reads derived from the genome of Arabidopsis thaliana were used to evaluate the accuracy and completeness of the recovered loci. In terms of computation time, locus recovery per sample, and informative sites, Captus outperforms HybPiper and SECAPR. The resulting topologies of Captus and SECAPR are identical for coalescent trees but differ when trees are inferred from concatenated alignments. The HybPiper phylogeny is similar to that of Captus under both methods. The nuclear genes recover a deep split of Thladiantha into two clades, but this is not supported by the plastid data. Conclusions: Captus is the best choice among the three pipelines in terms of computation time and locus recovery. Even though there is no significant topological difference between the Thladiantha species trees produced by the three pipelines, Captus yields a higher number of gene trees in agreement with the topology of the species tree (i.e., fewer genes in conflict with the species tree topology).
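Criterion (3) above, the number of informative sites in a nucleotide alignment, can be illustrated with a minimal sketch. It assumes that "informative" means parsimony-informative (at least two different bases, each present in at least two sequences) and that gaps and ambiguity codes are ignored; the description does not state which definition or tool was used, and the function name and toy alignment are hypothetical.

```python
from collections import Counter


def count_informative_sites(alignment):
    """Count parsimony-informative columns in a list of aligned sequences.

    Assumption: a column is informative when at least two different
    unambiguous bases (A, C, G, T) each occur in at least two sequences.
    Gaps and ambiguity codes are skipped.
    """
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all sequences must have the same aligned length")

    informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in (c.upper() for c in column)
                         if base in "ACGT")
        # Number of distinct bases that appear in two or more sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
    return informative


# Hypothetical four-taxon toy alignment, not data from the study.
toy_alignment = ["ACGTA", "ACGTC", "ATGAC", "ATGAA"]
print(count_informative_sites(toy_alignment))
```

Comparing this count across alignments built from the same samples by different pipelines is one way to express the "informative sites" comparison numerically.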
Project description: Thladiantha nudiflora Hemsl. ex F.B.Forbes & Hemsl. 1887 (Cucurbitaceae) is widely known as a traditional medicinal plant. In this study, we sequenced, assembled, and annotated the complete chloroplast genome of T. nudiflora. The chloroplast genome of T. nudiflora is 156,824 base pairs (bp) in length, comprising a large single-copy region of 86,566 bp and a small single-copy region of 18,070 bp, separated by a pair of inverted repeats of 26,094 bp each. The chloroplast genome contains 132 genes, including 87 protein-coding, 37 transfer RNA, and eight ribosomal RNA genes. Phylogenetic analysis of the chloroplast genome showed that species of the genus Thladiantha cluster together in the phylogenetic trees. This study will not only shed light on T. nudiflora's evolutionary position but also provide valuable chloroplast genomic information for future studies into the origins and diversification of the genus Thladiantha and the Cucurbitaceae family.
| S-EPMC10812858 | biostudies-literature
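The region lengths reported for the T. nudiflora chloroplast genome can be checked against the stated total, since a quadripartite plastome consists of one large single-copy region, one small single-copy region, and two copies of the inverted repeat. The short sketch below uses only the figures given in the description; the function name is hypothetical.

```python
def plastome_length(lsc_bp, ssc_bp, ir_bp):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Region lengths as reported in the description above.
total_bp = plastome_length(lsc_bp=86_566, ssc_bp=18_070, ir_bp=26_094)
print(total_bp)  # 156824, matching the reported genome length
```

The same arithmetic applies to the gene counts: 87 protein-coding, 37 transfer RNA, and 8 ribosomal RNA genes sum to the reported 132.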
Project description:DNA sequencing data of Thladiantha nudiflora
Project description: This study aims to investigate DNA methylation patterns at transcription factor (TF) binding regions and their evolutionary conservation with respect to divergence in binding activity. We combined newly generated bisulfite-sequencing experiments in livers of five mammals (human, macaque, mouse, rat and dog) with matched publicly available ChIP-sequencing data for five transcription factors (CEBPA, HNF4a, CTCF, ONECUT1 and FOXA1). To study the chromatin contexts of TF binding subject to distinct evolutionary pressures, we integrated publicly available active promoter, active enhancer and primed enhancer calls determined by profiling genome-wide patterns of H3K27ac, H3K4me3 and H3K4me1.
Project description: Whole-genome sequencing of the Arabidopsis thaliana dot5-1 transposon insertion line described in Petricka et al. (2008), The Plant Journal 56(2): 251-263.