The Li2 mutation results in reduced subgenome expression bias in elongating fibers of allotetraploid cotton (Gossypium hirsutum L.).
ABSTRACT: Next-generation sequencing (RNA-seq) technology was used to evaluate the effects of the Ligon lintless-2 (Li2) short fiber mutation on transcriptomes of both subgenomes of allotetraploid cotton (Gossypium hirsutum L.) as compared to its near-isogenic wild type. Sequencing was performed on 4 libraries from developing fibers of Li2 mutant and wild-type near-isogenic lines at the peak of elongation, followed by mapping and PolyCat categorization of RNA-seq data to the reference D5 genome (G. raimondii) for homeologous gene expression analysis. The majority of homeologous genes, 83.6% according to the reference genome, were expressed during fiber elongation. Our results revealed: 1) approximately two times more genes were induced in the AT subgenome compared to the DT subgenome in wild-type and mutant fiber; 2) the subgenome expression bias was significantly reduced in the Li2 fiber transcriptome; 3) Li2 had a significantly greater effect on the DT than on the AT subgenome. Transcriptional regulators and cell wall homeologous genes significantly affected by the Li2 mutation were reviewed in detail. This is the first report to explore the effects of a single mutation on homeologous gene expression in allotetraploid cotton. These results provide deeper insights into the evolution of allotetraploid cotton gene expression and cotton fiber development.
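As a rough illustration of what subgenome expression bias means for a single homeologous gene pair, the Python sketch below labels a pair as At-biased, Dt-biased, or unbiased from the read counts assigned to each homeolog. The function name, the pseudocount, the twofold threshold, and the example counts are all assumptions made for illustration; the abstract does not describe the study's own criteria, so this should not be read as the authors' method.

```python
import math


def classify_bias(at_reads, dt_reads, min_log2_ratio=1.0, pseudocount=1.0):
    """Label a homeolog pair by the log2 ratio of At to Dt read counts.

    The pseudocount avoids division by zero; the threshold of 1.0
    corresponds to a twofold difference. Both are illustrative choices.
    """
    log2_ratio = math.log2((at_reads + pseudocount) / (dt_reads + pseudocount))
    if log2_ratio >= min_log2_ratio:
        return "At-biased"
    if log2_ratio <= -min_log2_ratio:
        return "Dt-biased"
    return "unbiased"


# Hypothetical (At, Dt) read counts for three homeolog pairs.
example_pairs = {"geneA": (30, 10), "geneB": (10, 10), "geneC": (3, 40)}
labels = {gene: classify_bias(at, dt) for gene, (at, dt) in example_pairs.items()}
print(labels)
```

A real analysis would normally add a statistical test on replicated libraries rather than rely on a fixed ratio cutoff.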
Project description:BACKGROUND: Upland cotton (G. hirsutum L.) is the leading fiber crop worldwide. Genetic improvement of fiber quality and yield is facilitated by a variety of genomics tools. An integrated genetic and physical map is needed to better characterize quantitative trait loci and to allow for the positional cloning of valuable genes. However, developing integrated genomic tools for complex allotetraploid genomes, like that of cotton, is highly experimental. In this report, we describe an effective approach for developing an integrated physical framework that allows the two subgenomes of cotton to be distinguished. RESULTS: A physical map has been developed with 220 and 115 BAC contigs for homeologous chromosomes 12 and 26, respectively, covering 73.49 Mb and 34.23 Mb in physical length. Approximately one half of the 220 contigs were anchored to the At subgenome only, while 48 of the 115 contigs were allocated to the Dt subgenome only. Between the two chromosomes, 67 contigs were shared, with an estimated overall physical similarity between the two chromosomal homeologs of 40.0%. A total of 401 fiber unigenes plus 214 non-fiber unigenes were located to chromosome 12, while 207 fiber unigenes plus 183 non-fiber unigenes were allocated to chromosome 26. Anchoring was done through an overgo hybridization approach, and all anchored ESTs were functionally annotated via BLAST analysis. CONCLUSION: This integrated genomic map describes the first pair of homoeologous chromosomes of an allotetraploid genome in which BAC contigs were identified and partially separated through the use of chromosome-specific probes and locus-specific genetic markers. The approach used in this study should prove useful in the construction of genome-wide physical maps for polyploid plant genomes including Upland cotton. 
The identification of gene-rich islands in the integrated map provides a platform for positional cloning of important genes and the targeted sequencing of specific genomic regions.
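The abstract does not say how the 40.0% physical similarity between chromosomes 12 and 26 was derived. One calculation that happens to reproduce the figure from the reported contig counts is a Dice-style coefficient: twice the number of shared contigs divided by the total number of contigs on both chromosomes. The short Python sketch below shows that arithmetic; treating it as the authors' formula is an assumption, and the variable names are illustrative.

```python
# Contig counts reported in the abstract.
contigs_chr12 = 220
contigs_chr26 = 115
shared_contigs = 67

# Assumed Dice-style similarity: 2 * shared / (total on both chromosomes).
similarity = 2 * shared_contigs / (contigs_chr12 + contigs_chr26)
print(f"{similarity:.1%}")
```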
Project description:Genetic linkage maps play fundamental roles in understanding genome structure, explaining genome formation events during evolution, and discovering the genetic bases of important traits. A high-density cotton (Gossypium spp.) genetic map was developed using representative sets of simple sequence repeat (SSR) markers and the first public set of single nucleotide polymorphism (SNP) markers to genotype 186 recombinant inbred lines (RILs) derived from an interspecific cross between Gossypium hirsutum L. (TM-1) and G. barbadense L. (3-79). The genetic map comprised 2072 loci (1825 SSRs and 247 SNPs) and covered 3380 centimorgans (cM) of the cotton genome (AD) with an average marker interval of 1.63 cM. The allotetraploid cotton genome produced equivalent recombination frequencies in its two subgenomes (At and Dt). Of the 2072 loci, 1138 (54.9%) were mapped to 13 At-subgenome chromosomes, covering 1726.8 cM (51.1%), and 934 (45.1%) mapped to 13 Dt-subgenome chromosomes, covering 1653.1 cM (48.9%). The genetically smallest homeologous chromosome pair was Chr. 04 (A04) and 22 (D04), and the largest was Chr. 05 (A05) and 19 (D05). Duplicate loci between and within homeologous chromosomes were identified that facilitate investigations of chromosome translocations. The map augments evidence of reciprocal rearrangement between ancestral forms of Chr. 02 and 03 versus segmental homeologs 14 and 17, as centromeric regions show homeology between Chr. 02 (A02) and 17 (D02), as well as between Chr. 03 (A03) and 14 (D03). This research represents an important foundation for studies on polyploid cottons, including germplasm characterization, gene discovery, and genome sequence assembly.
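The summary statistics quoted for this genetic map can be cross-checked with simple arithmetic. The Python sketch below recomputes the average marker interval and the At/Dt shares of loci and map length from the reported totals. The variable names are illustrative, and the average interval is taken here as total map length divided by number of loci, which is an assumption about how the figure was calculated.

```python
# Values reported in the abstract.
total_loci = 2072
ssr_loci, snp_loci = 1825, 247
map_length_cm = 3380.0
at_loci, dt_loci = 1138, 934
at_length_cm, dt_length_cm = 1726.8, 1653.1

# Marker and subgenome counts should both add up to the total.
assert ssr_loci + snp_loci == total_loci
assert at_loci + dt_loci == total_loci

# Assumed definition: average interval = map length / number of loci.
avg_interval_cm = map_length_cm / total_loci

# Percentage of loci and of map length in each subgenome.
at_loci_pct = 100 * at_loci / total_loci
dt_loci_pct = 100 * dt_loci / total_loci
subgenome_length_cm = at_length_cm + dt_length_cm
at_length_pct = 100 * at_length_cm / subgenome_length_cm
dt_length_pct = 100 * dt_length_cm / subgenome_length_cm

print(f"average interval: {avg_interval_cm:.2f} cM")
print(f"At loci: {at_loci_pct:.1f}%, Dt loci: {dt_loci_pct:.1f}%")
print(f"At length: {at_length_pct:.1f}%, Dt length: {dt_length_pct:.1f}%")
```

The recomputed values round to the figures given in the abstract, and the two subgenome lengths sum to about 3379.9 cM, consistent with the stated 3380 cM.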
Project description:BACKGROUND: Cotton fiber length is an important quality attribute to the textile industry and longer fibers can be more efficiently spun into yarns to produce superior fabrics. There is typically a negative correlation between yield and fiber quality traits such as length. An understanding of the regulatory mechanisms controlling fiber length can potentially provide a valuable tool for cotton breeders to improve fiber length while maintaining high yields. The cotton (Gossypium hirsutum L.) fiber mutation Ligon lintless-2 is controlled by a single dominant gene (Li2) that results in significantly shorter fibers than a wild-type. In a near-isogenic state with a wild-type cotton line, Li2 is a model system with which to study fiber elongation. RESULTS: Two near-isogenic lines of Ligon lintless-2 (Li2) cotton, one mutant and one wild-type, were developed through five generations of backcrosses (BC5). An F2 population was developed from a cross between the two Li2 near-isogenic lines and used to develop a linkage map of the Li2 locus on chromosome 18. Five simple sequence repeat (SSR) markers were closely mapped around the Li2 locus region with two of the markers flanking the Li2 locus at 0.87 and 0.52 centimorgan. No apparent differences in fiber initiation and early fiber elongation were observed between the mutant ovules and the wild-type ones. Gene expression profiling using microarrays suggested roles of reactive oxygen species (ROS) homeostasis and cytokinin regulation in the Li2 mutant phenotype. Microarray gene expression data led to successful identification of an EST-SSR marker (NAU3991) that displayed complete linkage to the Li2 locus. CONCLUSIONS: In the field of cotton genomics, we report the first successful conversion of gene expression data into an SSR marker that is associated with a genomic region harboring a gene responsible for a fiber trait. 
The EST-derived SSR marker NAU3991 displayed complete linkage to the Li2 locus on chromosome 18 and resided in a gene with similarity to a putative plectin-related protein. The complete linkage suggests that this expressed sequence may be the Li2 gene.
Project description:Flowering time is an important ecological trait that determines the transition from vegetative to reproductive growth. Flowering time in cotton is controlled by short-day photoperiods, with strict photoperiod sensitivity. As the CO-FT (CONSTANS-FLOWERING LOCUS T) module regulates photoperiodic flowering in several plants, we selected eight CONSTANS-like (COL) genes in group I to detect their expression patterns in long-day and short-day conditions. Further, we individually cloned and sequenced their homologs from 25 different cotton accessions and one outgroup. Finally, we studied their structures, phylogenetic relationships, and molecular evolution in both the coding region and three characteristic domains. All eight COLs in group I show diurnal expression. In the orthologous and homeologous loci, each gene structure is highly conserved across cotton species, while length variation has occurred due to insertions/deletions in intron and/or exon regions. Six genes, COL2 to COL5, COL7 and COL8, exhibit higher nucleotide diversity in the D-subgenome than in the A-subgenome. In all allotetraploid cotton species examined, 98.37% of the Ks values were higher in the A-D and At-Dt comparisons than in the A-At and D-Dt comparisons, and the Pearson's correlation coefficient (r) of Ks between A vs. D and At vs. Dt also showed positive, high correlations, with a correlation coefficient of at least 0.797. The nucleotide polymorphism in wild species is significantly higher than in G. hirsutum and G. barbadense, indicating a genetic bottleneck associated with the domesticated cotton species. Three characteristic domains in the eight COLs exhibit different evolutionary rates, with the CCT domain highly conserved, while the B-box and Var domains are much more variable in allotetraploid species. Taken together, COL1, COL2 and COL8 endured greater selective pressures during the domestication process. 
The study improves our understanding of domestication-related genes and traits during the cotton evolutionary process.
Project description:Cotton (Gossypium spp.) is an important crop plant that is widely grown to produce both natural textile fibers and cottonseed oil. Cotton fibers, the economically more important product of the cotton plant, are seed trichomes derived from individual cells of the epidermal layer of the seed coat. It has been known for a long time that large numbers of genes determine the development of cotton fiber, and more recently it has been determined that these genes are distributed across the At and Dt subgenomes of tetraploid AD cottons. In the present study, the organization and evolution of the fiber development genes were investigated through the construction of an integrated genetic and physical map of fiber development genes whose functions have been verified and confirmed. A total of 535 cotton fiber development genes, including 103 fiber transcription factors, 259 fiber development genes, and 173 SSR-containing fiber ESTs, were analyzed at the subgenome level. A total of 499 fiber-related contigs were selected and assembled. Together these contigs covered about 151 Mb in physical length, or about 6.7% of the tetraploid cotton genome. Among the 499 contigs, 397 were anchored onto individual chromosomes. Results from our studies on the distribution patterns of the fiber development genes and transcription factors between the At and Dt subgenomes showed that more transcription factors were from the Dt subgenome than the At, whereas more fiber development genes were from the At subgenome than the Dt. Combining our mapping results with previous reports that more fiber QTLs were mapped in the Dt subgenome than the At subgenome, the results suggested a new functional hypothesis for tetraploid cotton. After the merging of the two diploid Gossypium genomes, the At subgenome has provided most of the genes for fiber development, because it continues to function similarly to its fiber-producing diploid A genome ancestor. 
On the other hand, the Dt subgenome, with its non-fiber producing D genome ancestor, provides more transcription factors that regulate the expression of the fiber genes in the At subgenome. This hypothesis would explain previously published mapping results. At the same time, this integrated map of fiber development genes would provide a framework to clone individual full-length fiber genes, to elucidate the physiological mechanisms of the fiber differentiation, elongation, and maturation, and to systematically study the functional network of these genes that interact during the process of fiber development in the tetraploid cottons.
Project description:Cotton cultivars have evolved to produce extensive, long, seed-borne fibers important for the textile industry, but we know little about the molecular mechanism underlying spinnable fiber formation. Here, we report how PACLOBUTRAZOL RESISTANCE 1 (PRE1) in cotton, which encodes a basic helix-loop-helix (bHLH) transcription factor, is a target gene of spinnable fiber evolution. Differential expression of homoeologous genes in polyploids is thought to be important to plant adaptation and novel phenotypes. PRE1 expression is specific to cotton fiber cells, upregulated during their rapid elongation stage and A-homoeologous biased in allotetraploid cultivars. Transgenic studies demonstrated that PRE1 is a positive regulator of fiber elongation. We determined that the natural variation of the canonical TATA-box, a regulatory element commonly found in many eukaryotic core promoters, is necessary for subgenome-biased PRE1 expression, representing a mechanism underlying the selection of homoeologous genes. Thus, variations in the promoter of the cell elongation regulator gene PRE1 have contributed to spinnable fiber formation in cotton. Overexpression of GhPRE1 in transgenic cotton yields longer fibers with improved quality parameters, indicating that this bHLH gene is useful for improving cotton fiber quality.
Project description:Gene expression profiles of wild-type and Li2 mutant plants were investigated with the Affymetrix Cotton Genome Array, focusing primarily on time points related to fiber elongation, specifically 8 and 12 days post anthesis (DPA). These time points typically represent the peak rate of elongation and the beginning of the transition stage, when fibers are still elongating but at lower rates. An additional time point, day of anthesis (DOA), was added to the microarray experiment to serve as a reference and to confirm that there were no significant differences in fiber initiation at the gene expression level.
Project description:Reactive oxygen species (ROS) are important molecules in plants and are involved in many biological processes, including fiber development and adaptation to abiotic stress in cotton. We carried out transcription analysis to determine the evolution of the ROS genes and analyzed their expression levels in various tissues of the cotton plant under abiotic stress conditions. A total of 515, 260, and 261 ROS network genes were identified in Gossypium hirsutum (AD1 genome), G. arboreum (A genome), and G. raimondii (D genome), respectively. The ROS network genes were found to be distributed across all the cotton chromosomes, but with a tendency to aggregate on either the lower or upper arms of the chromosomes. Moreover, all the cotton ROS network genes were grouped into 17 families by phylogenetic tree analysis. A total of 243 gene pairs were orthologous in G. arboreum and G. raimondii. There were 240 gene pairs that were orthologous in G. arboreum, G. raimondii, and G. hirsutum. The synonymous substitution value (Ks) peaks of orthologous gene pairs between the At subgenome and the A progenitor genome (G. arboreum), and between the Dt subgenome and the D progenitor genome (G. raimondii), were 0.004 and 0.015, respectively. The Ks peaks of ROS network orthologous gene pairs between the two progenitor genomes (A and D genomes) and between the two subgenomes (At and Dt subgenomes) were 0.045. The majority of Ka/Ks values of orthologous gene pairs between the A and D genomes and the two subgenomes of TM-1 were lower than 1.0. RNA-seq analysis and RT-qPCR validation showed that CSD1,2,3,5,6; FSD1,2; MSD1,2; APX3,11; FRO5.6; and RBOH6 played a major role in fiber development, while CSD1, APX1, APX2, MDAR1, GPX4-6-7, FER2, RBOH6, RBOH11, and FRO5 were integral to enhancing salt stress tolerance in cotton. The ROS network-mediated signal pathway enhances the mechanism of fiber development and the regulation of abiotic stress in Gossypium. 
This study will enhance the understanding of the ROS network and lay a foundation for exploring how the ROS network is involved in fiber development and the regulation of abiotic stress in cotton.
Project description:Long noncoding RNAs (lncRNAs) have several known functions in plant development, but their possible roles in responding to plant disease remain largely unresolved. In this study, we describe comprehensive disease-responsive lncRNA profiles in defence against Verticillium dahliae, a fungal pathogen of cotton. We further reveal the conserved and species-specific features of the disease response in two cotton species. In both species, we found expression dominance of induced lncRNAs in the Dt subgenome, indicating a biased induction pattern in the co-existing subgenomes of allotetraploid cotton. Comparative analysis of lncRNA expression and their proposed functions in resistant Gossypium barbadense cv. '7124' versus susceptible Gossypium hirsutum cv. 'YZ1' revealed their distinct disease response mechanisms. Species-specific (LS) lncRNAs containing more SNPs were induced more strongly post-infection than the species-conserved (core) lncRNAs. Gene Ontology enrichment of LS lncRNAs and core lncRNAs indicates distinct roles in the response to biotic stimulus. Further functional analysis showed that seedlings in which either of two core lncRNAs, GhlncNAT-ANX2 or GhlncNAT-RLP7, was silenced displayed enhanced resistance towards V. dahliae and Botrytis cinerea, possibly associated with the increased expression of LOX1 and LOX2. This study represents the first characterization of lncRNAs involved in resistance to fungal disease and provides new clues to elucidate the cotton disease response mechanism.
Project description:SNPs are the most abundant polymorphism type, and have been explored in many crop genomic studies, including rice and maize. SNP discovery in allotetraploid cotton genomes has lagged behind that of other crops due to their complexity and polyploidy. In this study, genome-wide SNPs are detected systematically using next-generation sequencing and efficient SNP genotyping methods, and used to construct a linkage map and characterize the structural variations in polyploid cotton genomes. We construct an ultra-dense inter-specific genetic map comprising 4,999,048 SNP loci distributed unevenly in 26 allotetraploid cotton linkage groups and covering 4,042 cM. The map is used to order tetraploid cotton genome scaffolds for accurate assembly of G. hirsutum acc. TM-1. Recombination rates and hotspots are identified across the cotton genome by comparing the assembled draft sequence and the genetic map. Using this map, genome rearrangements and centromeric regions are identified in tetraploid cotton by combining information from the publicly available G. raimondii genome with fluorescent in situ hybridization analysis. We report the genotyping-by-sequencing method used to identify millions of SNPs between G. hirsutum and G. barbadense. We construct and use an ultra-dense SNP map to correct sequence mis-assemblies, merge scaffolds into pseudomolecules corresponding to chromosomes, detect genome rearrangements, and identify centromeric regions in allotetraploid cottons. We find that the centromeric retro-element sequence of tetraploid cotton derived from the D subgenome progenitor might have invaded the A subgenome centromeres after allotetraploid formation. This study serves as a valuable genomic resource for genetic research and breeding of cotton.