Extensive and biased intergenomic nonreciprocal DNA exchanges shaped a nascent polyploid genome, Gossypium (cotton).
ABSTRACT: Genome duplication is thought to be central to the evolution of morphological complexity, and some polyploids enjoy a variety of capabilities that transgress those of their diploid progenitors. Comparison of genomic sequences from several tetraploid (AtDt) Gossypium species and genotypes with putative diploid A- and D-genome progenitor species revealed that unidirectional DNA exchanges between homeologous chromosomes were the predominant mechanism responsible for allelic differences between the Gossypium tetraploids and their diploid progenitors. Homeologous gene conversion events (HeGCEs) gradually subsided, declining to rates similar to random mutation during radiation of the polyploid into multiple clades and species. Despite occurring in a common nucleus, preservation of HeGCE is asymmetric in the two tetraploid subgenomes. At-to-Dt conversion is far more abundant than the reciprocal, is enriched in heterochromatin, is highly correlated with GC content and transposon distribution, and may silence abundant A-genome-derived retrotransposons. Dt-to-At conversion is abundant in euchromatin and genes, frequently reversing losses of gene function. The long-standing observation that the nonspinnable-fibered D-genome contributes to the superior yield and quality of tetraploid cotton fibers may be explained by accelerated Dt to At conversion during cotton domestication and improvement, increasing dosage of alleles from the spinnable-fibered A-genome. HeGCE may provide an alternative to (rare) reciprocal DNA exchanges between chromosomes in heterochromatin, where genes have approximately five times greater abundance of Dt-to-At conversion than does adjacent intergenic DNA. Spanning exon-to-gene-sized regions, HeGCE is a natural noninvasive means of gene transfer with the precision of transformation, potentially important in genetic improvement of many crop plants.
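The site-level comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the labels, and the example bases are hypothetical. It also assumes, from the abstract's statement that Dt-to-At conversion increases the dosage of A-genome alleles, that "Dt-to-At" denotes a site where both subgenomes carry the A-type base.

```python
def classify_site(a_diploid, d_diploid, at, dt):
    """Label one aligned site by comparing subgenome bases with progenitor bases.

    Assumed reading: "Dt_to_At" means the Dt copy now carries the A-type base,
    "At_to_Dt" means the At copy now carries the D-type base.
    """
    if a_diploid == d_diploid:
        return "uninformative"      # progenitors do not differ at this site
    if at == a_diploid and dt == d_diploid:
        return "no_conversion"      # each subgenome kept its progenitor's base
    if at == a_diploid and dt == a_diploid:
        return "Dt_to_At"           # both copies carry the A-type base
    if at == d_diploid and dt == d_diploid:
        return "At_to_Dt"           # both copies carry the D-type base
    return "other"                  # e.g. a new mutation in either subgenome


# Hypothetical sites: (A diploid, D diploid, At, Dt)
sites = [
    ("A", "G", "A", "G"),
    ("A", "G", "A", "A"),
    ("C", "T", "T", "T"),
    ("G", "G", "G", "G"),
    ("A", "G", "C", "G"),
]
labels = [classify_site(*s) for s in sites]
print(labels)
```

A real analysis would also need outgroup information and runs of adjacent sites to separate conversion tracts from independent mutations; the sketch only shows the per-site logic.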
Project description: We report genetic maps for diploid (D) and tetraploid (AtDt) Gossypium genomes composed of sequence-tagged sites (STS) that foster structural, functional, and evolutionary genomic studies. The maps include, respectively, 2584 loci at 1.72-cM (approximately 600 kb) intervals based on 2007 probes (AtDt) and 763 loci at 1.96-cM (approximately 500 kb) intervals detected by 662 probes (D). Both diploid and tetraploid cottons exhibit negative crossover interference; i.e., double recombinants are unexpectedly abundant. We found no major structural changes between Dt and D chromosomes, but confirmed two reciprocal translocations between At chromosomes and several inversions. Concentrations of probes in corresponding regions of the various genomes may represent centromeres, while genome-specific concentrations may represent heterochromatin. Locus duplication patterns reveal all 13 expected homeologous chromosome sets and lend new support to the possibility that a more ancient polyploidization event may have predated the A-D divergence of 6-11 million years ago. Identification of SSRs within 312 RFLP sequences plus direct mapping of 124 SSRs and exploration for CAPS and SNPs illustrate the "portability" of these STS loci across populations and detection systems useful for marker-assisted improvement of the world's leading fiber crop. These data provide new insights into polyploid evolution and represent a foundation for assembly of a finished sequence of the cotton genome.
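Negative crossover interference, as reported above, is conventionally expressed through the coefficient of coincidence: the observed frequency of double recombinants divided by the frequency expected if the two intervals recombined independently. The sketch below uses hypothetical counts and recombination fractions, not values from the study, and the function names are illustrative.

```python
def coefficient_of_coincidence(n_double, n_total, r1, r2):
    """Observed double-recombinant frequency over the independent expectation r1 * r2."""
    observed = n_double / n_total
    expected = r1 * r2
    return observed / expected


def interference(coc):
    """Interference I = 1 - c; a negative value means an excess of double recombinants."""
    return 1.0 - coc


# Hypothetical example: 12 double recombinants among 1000 progeny,
# with recombination fractions of 0.10 in each adjacent interval.
coc = coefficient_of_coincidence(n_double=12, n_total=1000, r1=0.10, r2=0.10)
print(f"c = {coc:.2f}, I = {interference(coc):.2f}")
```

A coefficient above 1 (interference below 0) corresponds to the "unexpectedly abundant" double recombinants described in the abstract.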
Project description: Non-specific lipid transfer proteins (nsLTPs) have previously been isolated from cotton fiber, but their functions have remained unclear. Bioinformatic analysis of the tetraploid cotton genome database identified 138 nsLTP genes, falling into the 11 groups reported previously. Unlike in Arabidopsis, cacao, and other crops, cotton type XI genes were considerably expanded and diverged earlier on chromosomes At11, Dt11, and Dt08. Correspondingly, the type XI proteins (GhLtpXIs) all contained an extra N-terminal cap, resulting in a larger molecular weight. Expression of type XI genes was dramatically increased in fibers of tetraploid cotton compared with the two diploid progenitors. High-level GhLtpXI expression was observed in long-fibered cotton cultivars during fiber elongation. Ectopic expression of GhLtpXIs in Arabidopsis significantly enhanced trichome length, suggesting that GhLtpXIs promote fiber elongation. Overall, these findings provide insights into the phenotypic evolution of Gossypium species and the regulatory mechanism of nsLTPs during fiber development. HIGHLIGHT: A specific group, type XI nsLTPs, was identified with predominant expression in elongating fibers of Gossypium hirsutum based on evolutionary, transcriptional, and functional analyses.
Project description: Cotton (Gossypium spp.) is an important crop plant that is widely grown to produce both natural textile fibers and cottonseed oil. Cotton fibers, the economically more important product of the cotton plant, are seed trichomes derived from individual cells of the epidermal layer of the seed coat. It has long been known that large numbers of genes determine the development of cotton fiber, and more recently it has been determined that these genes are distributed across the At and Dt subgenomes of tetraploid AD cottons. In the present study, the organization and evolution of the fiber development genes were investigated through the construction of an integrated genetic and physical map of fiber development genes whose functions have been verified and confirmed. A total of 535 cotton fiber development genes, including 103 fiber transcription factors, 259 fiber development genes, and 173 SSR-containing fiber ESTs, were analyzed at the subgenome level. A total of 499 fiber-related contigs were selected and assembled. Together these contigs covered about 151 Mb in physical length, or about 6.7% of the tetraploid cotton genome. Among the 499 contigs, 397 were anchored onto individual chromosomes. The distribution patterns of the fiber development genes and transcription factors between the At and Dt subgenomes showed that more transcription factors were from the Dt subgenome than from At, whereas more fiber development genes were from the At subgenome than from Dt. Combining our mapping results with previous reports that more fiber QTLs were mapped in the Dt subgenome than in the At subgenome, the results suggested a new functional hypothesis for tetraploid cotton. After the merging of the two diploid Gossypium genomes, the At subgenome has provided most of the genes for fiber development, because it continues to function similarly to its fiber-producing diploid A-genome ancestor. On the other hand, the Dt subgenome, with its non-fiber-producing D-genome ancestor, provides more transcription factors that regulate the expression of the fiber genes in the At subgenome. This hypothesis would explain previously published mapping results. At the same time, this integrated map of fiber development genes would provide a framework to clone individual full-length fiber genes, to elucidate the physiological mechanisms of fiber differentiation, elongation, and maturation, and to systematically study the functional network of these genes that interact during the process of fiber development in the tetraploid cottons.
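The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the implied genome size is a back-calculation from the rounded figures "about 151 Mb" and "about 6.7%", not a value reported by the authors.

```python
# Gene classes reported in the abstract
gene_classes = {
    "transcription_factors": 103,
    "fiber_development_genes": 259,
    "ssr_containing_ests": 173,
}
total_genes = sum(gene_classes.values())

# Contig coverage: about 151 Mb, stated to be about 6.7% of the tetraploid genome
contig_span_mb = 151.0
genome_fraction = 0.067
implied_genome_mb = contig_span_mb / genome_fraction  # back-calculated, approximate

# Share of the 499 contigs anchored to individual chromosomes
anchored_fraction = 397 / 499

print(total_genes)
print(round(implied_genome_mb))
print(f"{anchored_fraction:.1%}")
```

The three gene classes sum to the stated 535, the coverage figures imply a tetraploid genome of roughly 2.25 Gb, and about 80% of the contigs were anchored.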
Project description: BACKGROUND: Fluorescence in situ hybridization (FISH) is an efficient cytogenetic technology for studying chromosome structure. Transposable elements (TEs) are an important component of eukaryotic genomes and can provide insights into their structure and evolution. RESULTS: A FISH probe derived from bacterial artificial chromosome (BAC) clone 299N22 generated striking signals on all 26 chromosomes of the cotton diploid A genome (AA, 2x=26) but very few on the diploid D genome (DD, 2x=26). All 26 chromosomes of the A subgenome (At) of tetraploid cotton (AADD, 2n=4x=52) also gave positive signals with this FISH probe, whereas very few signals were observed on the D subgenome (Dt). Sequencing and annotation of BAC clone 299N22 revealed a novel Ty3/gypsy transposon family, which was named 'CICR'. This family is a significant contributor to size expansion in the A (sub)genome but not in the D (sub)genome. Further FISH analysis with the LTR of CICR as a probe revealed that CICR is lineage-specific, since massive repeats were found in the A and B genomic groups, but not in the C-G genomic groups within the Gossypium genus. Molecular evolutionary analysis of CICR suggested that tetraploid cottons evolved after silencing of the transposon family 1-1.5 million years ago (Mya). Furthermore, A genomes are more homologous with B genomes, and the C, E, F, and G genomes likely diverged from a common ancestor prior to 3.5-4 Mya, the time when CICR appeared. The genomic variation caused by the insertion of CICR in the A (sub)genome may have played an important role in the speciation of organisms with A genomes. CONCLUSIONS: The CICR family is highly repetitive in the A and B genomes of Gossypium, but not amplified in the C-G genomes. The differential amount of the CICR family in At and Dt will aid in partitioning subgenome sequences for chromosome assemblies during tetraploid genome sequencing, and the proportion of CICR elements in the resulting pseudochromosome sequences offers a way to assess the accuracy of tetraploid genome assemblies. The timeline of the expansion of the CICR family provides a new reference for cotton evolutionary analysis, while the impact of CICR insertions on gene function will be a target for further analysis investigating phenotypic differences between A-genome and D-genome species.
Project description: A high-density linkage map was constructed using 1,885 newly obtained loci and 3,747 previously published loci; the map included 5,152 loci with a total length of 4696.03 cM and a mean distance of 0.91 cM between loci. Homology analysis in the cotton genome further confirmed the 13 expected homologous chromosome pairs and revealed an obvious inversion on Chr10 or Chr20 and repeated inversions on Chr07 or Chr16. In addition, two reciprocal translocations, between Chr02 and Chr03 and between Chr04 and Chr05, were confirmed. Comparative genomics between the tetraploid cotton and the diploid cottons showed that no major structural changes exist between DT and D chromosomes, whereas such changes do exist between AT and A chromosomes. Blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that most AD chromosomes, whether from the AT or the DT genome, preferentially matched the corresponding homologous chromosome in the diploid A genome, and then the corresponding homologous chromosome in the diploid D genome, indicating that the diploid D genome underwent converted evolution by the diploid A genome to form the DT genome during polyploidization. In addition, the results reflected that a series of chromosomal translocations occurred among Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, Chr04/Chr22, and Chr05/Chr19.
Project description: Genetic linkage maps play fundamental roles in understanding genome structure, explaining genome formation events during evolution, and discovering the genetic bases of important traits. A high-density cotton (Gossypium spp.) genetic map was developed using representative sets of simple sequence repeat (SSR) and the first public set of single nucleotide polymorphism (SNP) markers to genotype 186 recombinant inbred lines (RILs) derived from an interspecific cross between Gossypium hirsutum L. (TM-1) and G. barbadense L. (3-79). The genetic map comprised 2072 loci (1825 SSRs and 247 SNPs) and covered 3380 centiMorgan (cM) of the cotton genome (AD) with an average marker interval of 1.63 cM. The allotetraploid cotton genome produced equivalent recombination frequencies in its two subgenomes (At and Dt). Of the 2072 loci, 1138 (54.9%) were mapped to 13 At-subgenome chromosomes, covering 1726.8 cM (51.1%), and 934 (45.1%) mapped to 13 Dt-subgenome chromosomes, covering 1653.1 cM (48.9%). The genetically smallest homeologous chromosome pair was Chr. 04 (A04) and 22 (D04), and the largest was Chr. 05 (A05) and 19 (D05). Duplicate loci between and within homeologous chromosomes were identified that facilitate investigations of chromosome translocations. The map augments evidence of reciprocal rearrangement between ancestral forms of Chr. 02 and 03 versus segmental homeologs 14 and 17, as centromeric regions show homeology between Chr. 02 (A02) and 17 (D02), as well as between Chr. 03 (A03) and 14 (D03). This research represents an important foundation for studies on polyploid cottons, including germplasm characterization, gene discovery, and genome sequence assembly.
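The summary statistics in this abstract follow directly from the per-subgenome counts. The sketch below recomputes them from the numbers given in the text; the variable names are illustrative, and the mean interval is approximated as total map length divided by the number of loci, which is an assumption about how the reported 1.63 cM was derived.

```python
# Per-subgenome figures as stated in the abstract
loci = {"At": 1138, "Dt": 934}
length_cm = {"At": 1726.8, "Dt": 1653.1}

total_loci = sum(loci.values())
total_cm = sum(length_cm.values())

# Assumed definition: map length divided by number of loci
mean_interval_cm = total_cm / total_loci

# Percentage of loci and of map length contributed by each subgenome
share = {
    sg: (100 * loci[sg] / total_loci, 100 * length_cm[sg] / total_cm)
    for sg in loci
}

print(total_loci, round(total_cm, 1), round(mean_interval_cm, 2))
for sg, (pct_loci, pct_len) in share.items():
    print(f"{sg}: {pct_loci:.1f}% of loci, {pct_len:.1f}% of map length")
```

The recomputed values agree with the abstract: 2072 loci, about 3380 cM, a 1.63 cM mean interval, and 54.9%/45.1% of loci with 51.1%/48.9% of map length for At and Dt respectively.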
Project description: Cultivated peanut and synthetics are allotetraploids (2n = 4x = 40) with two homeologous sets of chromosomes. Meiosis in allotetraploid peanut is generally thought to show diploid-like behavior. However, a recent study pointed out the occurrence of recombination between homeologous chromosomes, especially when synthetic allotetraploids are used, challenging the view of disomic inheritance in peanut. In this study, we investigated the meiotic behavior of allotetraploid peanut using 380 SSR markers and 90 F2 progeny derived from the cross between Arachis hypogaea cv Fleur 11 (AABB) and ISATGR278-18 (AAKK), a synthetic allotetraploid that harbors a K-genome that was reported to pair with the cultivated B-genome during meiosis. Segregation analysis of SSR markers showed 42 codominant SSRs with unexpected null bands among some progeny. Chi-square tests showed that these loci deviated from the 1:2:1 Mendelian ratio expected under disomic inheritance. A linkage map of 357 codominant loci aligned on 20 linkage groups (LGs) with a total length of 1728 cM, averaging 5.1 cM between markers, was developed. Among the 10 homeologous sets of LGs, one set consisted of markers that all segregated in a polysomic-like pattern, six in a likely disomic pattern, and the remaining three in a mixed pattern with disomic and polysomic loci clustered on the same LG. Moreover, we observed a substitution of homeologous chromosomes in some progeny. Our results suggest that homeologous recombination events occurred between the A and K genomes in the newly synthesized allotetraploid and were revealed in the progeny. Homeologous exchanges are rarely observed in tetraploid peanut and have not yet been reported for AAKK and AABB genomes. The implications of these results for peanut breeding are discussed.
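The chi-square test against a 1:2:1 ratio mentioned above can be sketched for a single codominant locus as follows. The genotype counts are hypothetical (only the F2 population size of 90 comes from the abstract), the function name is illustrative, and the sketch ignores the null-band class that the study reports. With three genotype classes the test has 2 degrees of freedom, for which the chi-square survival function is exactly exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_1_2_1(n_aa, n_ab, n_bb):
    """Goodness-of-fit test of F2 genotype counts against a 1:2:1 ratio (2 df)."""
    n = n_aa + n_ab + n_bb
    observed = (n_aa, n_ab, n_bb)
    expected = (n / 4, n / 2, n / 4)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom, P(X >= stat) = exp(-stat / 2) exactly.
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Hypothetical counts for 90 F2 progeny
stat_fit, p_fit = chi_square_1_2_1(22, 46, 22)     # close to 1:2:1
stat_skew, p_skew = chi_square_1_2_1(40, 40, 10)   # strongly distorted

print(f"near 1:2:1 -> chi2 = {stat_fit:.3f}, p = {p_fit:.3f}")
print(f"distorted  -> chi2 = {stat_skew:.3f}, p = {p_skew:.2e}")
```

A locus like the second example would be flagged as deviating from disomic expectations; in practice a multiple-testing correction across hundreds of markers would also be appropriate.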
Project description: COBRA-Like (COBL) genes, which encode a plant-specific glycosylphosphatidylinositol (GPI)-anchored protein, have been shown to be key regulators of the orientation of cell expansion and of cellulose crystallinity status. Genome-wide analyses have been performed in A. thaliana, O. sativa, Z. mays and S. lycopersicum, but little has been done in Gossypium. Here we identified 19, 18 and 33 candidate COBL genes from three sequenced cotton species, diploid cotton G. raimondii, G. arboreum and tetraploid cotton G. hirsutum acc. TM-1, respectively. These COBL members were anchored onto 10 chromosomes in G. raimondii and could be divided into two subgroups. Expression patterns of COBL genes showed strong developmental and spatial regulation in G. hirsutum acc. TM-1. Among them, GhCOBL9 and GhCOBL13 were preferentially expressed at the secondary cell wall stage of fiber development and were significantly co-upregulated with the cellulose synthase genes GhCESA4, GhCESA7 and GhCESA8. In addition, GhCOBL9 Dt and GhCOBL13 Dt were co-localized with previously reported cotton fiber quality quantitative trait loci (QTLs), and the favorable allele types of GhCOBL9 Dt had significantly positive correlations with fiber quality traits, indicating that these two genes might play an important role in fiber development.
Project description: Next generation sequencing (RNA-seq) technology was used to evaluate the effects of the Ligon lintless-2 (Li2) short fiber mutation on transcriptomes of both subgenomes of allotetraploid cotton (Gossypium hirsutum L.) as compared to its near-isogenic wild type. Sequencing was performed on 4 libraries from developing fibers of Li2 mutant and wild type near-isogenic lines at the peak of elongation, followed by mapping and PolyCat categorization of RNA-seq data to the reference D5 genome (G. raimondii) for homeologous gene expression analysis. The majority of homeologous genes, 83.6% according to the reference genome, were expressed during fiber elongation. Our results revealed: 1) approximately two times more genes were induced in the AT subgenome compared with the DT subgenome in wild type and mutant fiber; 2) the subgenome expression bias was significantly reduced in the Li2 fiber transcriptome; 3) Li2 had a significantly greater effect on the DT than on the AT subgenome. Transcriptional regulators and cell wall homeologous genes significantly affected by the Li2 mutation were reviewed in detail. This is the first report to explore the effects of a single mutation on homeologous gene expression in allotetraploid cotton. These results provide deeper insights into the evolution of allotetraploid cotton gene expression and cotton fiber development.
Project description: The activity of genome-specific repetitive sequences is the main cause of genome variation between Gossypium A and D genomes. Through comparative analysis of the two genomes, we retrieved a repetitive element termed the ICRd motif, which appears frequently in the diploid Gossypium raimondii (D5) genome but rarely in the diploid Gossypium arboreum (A2) genome. We further explored the existence of the ICRd motif in chromosomes of G. raimondii, G. arboreum, and two tetraploid (AADD) cotton species, Gossypium hirsutum and Gossypium barbadense, by fluorescence in situ hybridization (FISH), and observed that the ICRd motif exists in the D5 genome and D-subgenomes but not in the A2 genome and A-subgenomes. The ICRd motif comprises two components, a variable tandem repeat (TR) region and a conservative sequence (CS). The two constituents each have hundreds of repeats that are evenly distributed across the 13 chromosomes of the D5 genome. The ICRd motif (and its repeats) was revealed as the common conservative region harbored by ancient Long Terminal Repeat Retrotransposons. Identification and investigation of the ICRd motif promotes the study of A and D genome differences, facilitates research on Gossypium genome evolution, and assists subgenome identification and genome assembly.