ABSTRACT: BACKGROUND: Limited DNA sequence and DNA marker resources have been developed for Iris (Iridaceae), a monocot genus of 200-300 species in the Asparagales, several of which are horticulturally important. We mined an I. brevicaulis-I. fulva EST database for simple sequence repeats (SSRs) and developed ortholog-specific EST-SSR markers for genetic mapping and other genotyping applications in Iris. Here, we describe the abundance and other characteristics of SSRs identified in the transcript assembly (EST database) and the cross-species utility and polymorphisms of I. brevicaulis-I. fulva EST-SSR markers among wild collected ecotypes and horticulturally important cultivars. RESULTS: Collectively, 6,530 ESTs were produced from normalized leaf and root cDNA libraries of I. brevicaulis (IB72) and I. fulva (IF174), and assembled into 4,917 unigenes (1,066 contigs and 3,851 singletons). We identified 1,447 SSRs in 1,162 unigenes and developed 526 EST-SSR markers, each tracing a different unigene. Three-fourths of the EST-SSR markers (399/526) amplified alleles from IB72 and IF174 and 84% (335/399) were polymorphic between IB25 and IF174, the parents of I. brevicaulis x I. fulva mapping populations. Forty EST-SSR markers were screened for polymorphisms among 39 ecotypes or cultivars of seven species: 100% amplified alleles from wild collected ecotypes of Louisiana Iris (I. brevicaulis, I. fulva, I. nelsonii, and I. hexagona), whereas 42-52% amplified alleles from cultivars of three horticulturally important species (I. pseudacorus, I. germanica, and I. sibirica). Ecotypes and cultivars were genetically diverse: the number of alleles/locus ranged from two to 18 and mean heterozygosity was 0.76. CONCLUSION: Nearly 400 ortholog-specific EST-SSR markers were developed for comparative genetic mapping and other genotyping applications in Iris, were highly polymorphic among ecotypes and cultivars, and have broad utility for genotyping applications within the genus.
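The abstract above reports a mean heterozygosity of 0.76 without stating the estimator. As a minimal sketch, assuming the common expected-heterozygosity form He = 1 - sum(p_i^2) and using hypothetical allele sizes (not data from the study), the per-locus value could be computed as follows:

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Return He = 1 - sum(p_i^2) for the allele calls observed at one locus.

    `alleles` is a flat list of allele calls (for example fragment sizes in bp),
    two per diploid individual. This estimator is an assumption; the abstract
    does not say which one was used.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no allele calls supplied")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical allele sizes (bp) for four diploid individuals at one locus.
locus_calls = [152, 152, 156, 160, 160, 164, 168, 168]
print(round(expected_heterozygosity(locus_calls), 3))
```

Averaging this value over all screened loci would give a mean heterozygosity comparable in kind to the figure reported above.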
Project description:<h4>Background</h4>Linkage maps are useful tools for examining both the genetic architecture of quantitative traits and the evolution of reproductive incompatibilities. We describe the generation of two genetic maps using reciprocal interspecific backcross 1 (BC1) mapping populations from crosses between Iris brevicaulis and Iris fulva. These maps were constructed using expressed sequence tag (EST)-derived codominant microsatellite markers. This codominant marker system allowed us to link the two reciprocal maps and to compare the patterns of transmission ratio distortion observed in each.<h4>Results</h4>Linkage mapping resulted in markers that coalesced into 21 linkage groups for each of the reciprocal backcross maps, presumably corresponding to the 21 haploid chromosomes of I. brevicaulis and I. fulva. The composite map was 1190.0-cM long, spanned 81% of the I. brevicaulis and I. fulva genomes, and had a mean density of 4.5 cM per locus. Transmission ratio distortion (TRD) was observed in 138 (48.5%) loci distributed in 19 of the 21 LGs in BCIB, BCIF, or both BC1 mapping populations. Of the distorted markers identified, I. fulva alleles were detected at consistently higher-than-expected frequencies in both mapping populations.<h4>Conclusions</h4>The observation that I. fulva alleles are overrepresented in both mapping populations suggests that I. fulva alleles are favored to introgress into I. brevicaulis genetic backgrounds, while I. brevicaulis alleles would tend to be prevented from introgressing into I. fulva. These data are consistent with the previously observed patterns of introgression in natural hybrid zones, where I. fulva alleles have been consistently shown to introgress across species boundaries.
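Transmission ratio distortion in a BC1 population is usually assessed against the 1:1 Mendelian expectation for heterozygotes versus recurrent-parent homozygotes. The abstract above does not name the test it used, so the following is only a sketch assuming a one-degree-of-freedom chi-square goodness-of-fit test; the genotype counts are hypothetical.

```python
import math


def trd_chi_square(n_het, n_hom):
    """Chi-square test of a 1:1 segregation ratio at one backcross locus.

    n_het: individuals heterozygous at the locus
    n_hom: individuals homozygous for the recurrent-parent allele
    Returns (chi2, p_value) with one degree of freedom.
    """
    total = n_het + n_hom
    if total == 0:
        raise ValueError("no genotypes supplied")
    expected = total / 2.0
    chi2 = ((n_het - expected) ** 2 + (n_hom - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square survival function reduces to
    # erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 70 heterozygotes and 30 homozygotes at one locus.
chi2, p = trd_chi_square(70, 30)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

When many loci are tested, as in a genome-wide map, a multiple-testing correction would normally be applied to the per-locus p-values; which correction (if any) the study used is not stated.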
Project description:The Louisiana iris species Iris brevicaulis and I. fulva are morphologically and karyotypically distinct yet frequently hybridize in nature. A group of high-copy-number TY3/gypsy-like retrotransposons was characterized from these species and used to develop molecular markers that take advantage of the abundance and distribution of these elements in the large iris genome. The copy number of these IRRE elements (for iris retroelement) is approximately 1 x 10^5, accounting for approximately 6-10% of the approximately 10,000-Mb haploid Louisiana iris genome. IRRE elements are transcriptionally active in I. brevicaulis and I. fulva and their F1 and backcross hybrids. The LTRs of the elements are more variable than the coding domains and can be used to define several distinct IRRE subfamilies. Transposon display or S-SAP markers specific to two of these subfamilies have been developed and are highly polymorphic among wild-collected individuals of each species. As IRRE elements are present in each of 11 iris species tested, the marker system has the potential to provide valuable comparative data on the dynamics of retrotransposition in large plant genomes.
Project description:Identifying processes that promote or limit gene flow can help define the ecological and evolutionary history of a species. Furthermore, defining those factors that make up "species boundaries" can provide a definition of the independent evolutionary trajectories of related taxa. For many species, the historic processes that account for their distribution of genetic variation remain unresolved. In this study, we examine the geographic distribution of genetic diversity for two species of Louisiana Irises, Iris brevicaulis and Iris fulva. Specifically, we asked how populations are structured and if population structure coincides with potential barriers to gene flow. We also asked whether there is evidence of hybridization between these two species outside Louisiana hybrid zones. We used a genotyping-by-sequencing approach and sampled a large number of single nucleotide polymorphisms across these species' genomes. Two different population assignment methods were used to resolve population structure in I. brevicaulis; however, there was considerably less population structure in I. fulva. We used a species tree approach to infer phylogenies both within and between populations and species. For I. brevicaulis, the geography of the collection locality was reflected in the phylogeny. The I. fulva phylogeny reflected much less structure than detected for I. brevicaulis. Lastly, combining both species into a phylogenetic analysis resolved two of six populations of I. brevicaulis that shared alleles with I. fulva. Taken together, our results suggest major differences in the level and pattern of connectivity among populations of these two Louisiana Iris species.
Project description:Simple sequence repeats (SSRs) are among the most important markers for population analysis and have been widely used in plant genetic mapping and molecular breeding. Expressed sequence tag-SSR (EST-SSR) markers, located in the coding regions, are potentially more efficient for QTL mapping, gene targeting, and marker-assisted breeding. In this study, we investigated 51,694 nonredundant unigenes, assembled from clean reads from deep transcriptome sequencing with a Solexa/Illumina platform, for identification and development of EST-SSRs in Chinese cabbage. In total, 10,420 EST-SSRs with over 12 bp were identified and characterized, among which 2744 EST-SSRs are new and 2317 are known ones showing polymorphism with previously reported SSRs. A total of 7877 PCR primer pairs for 1561 EST-SSR loci were designed, and primer pairs for twenty-four EST-SSRs were selected for primer evaluation. In nineteen EST-SSR loci (79.2%), amplicons were successfully generated with high quality. Seventeen (89.5%) showed polymorphism in twenty-four cultivars of Chinese cabbage. The polymorphic alleles of each polymorphic locus were sequenced, and the results showed that most polymorphisms were due to variations of SSR repeat motifs. The EST-SSRs identified and characterized in this study have important implications for developing new tools for genetics and molecular breeding in Chinese cabbage.
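Several of these abstracts describe scanning unigenes for SSRs above a minimum length. The sketch below illustrates the general idea with a regular-expression scan; it is not the pipeline used in any of the studies. The function name `find_ssrs`, the motif-size range of 2-6 bp, and the reading of the length threshold as "at least 12 bp" are all assumptions made for illustration, and the test sequence is made up.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_length=12):
    """Return (start, motif, repeat_count) for perfect SSRs in `seq`.

    Motifs of min_unit..max_unit bp are reported when the whole repeat tract
    is at least `min_length` bp long. Motifs that are themselves repeats of a
    shorter unit (e.g. AGAG, or a homopolymer such as AAA) are skipped so that
    each tract is reported once, under its shortest motif.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # Smallest repeat count whose tract reaches min_length (ceiling division).
        min_repeats = max(2, -(-min_length // unit))
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            is_compound = any(
                unit % k == 0 and motif == motif[:k] * (unit // k)
                for k in range(1, unit)
            )
            if is_compound:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


# Made-up sequence containing an (AG)8 and an (AAG)5 tract.
example = "GGCTA" + "AG" * 8 + "TTGCC" + "AAG" * 5 + "CGT"
for start, motif, repeats in find_ssrs(example):
    print(start, motif, repeats)
```

Dedicated SSR-mining tools typically apply a separate minimum repeat count to each motif size rather than a single length cutoff, so counts from this sketch would not be directly comparable to published figures.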
Project description:Chrysanthemum morifolium is a well-known flowering plant worldwide and has high commercial, floricultural, and medicinal value. In this study, simple-sequence repeat (SSR) markers were generated from EST datasets and were applied to assess the genetic diversity among 32 cultivars. A total of 218 in silico SSR loci were identified from 7300 C. morifolium ESTs retrieved from GenBank. Of all SSR loci, 61.47% (134) were hexa-nucleotide repeats, followed by tri-nucleotide repeats (17.89%), di-nucleotide repeats (12.39%), tetra-nucleotide repeats (4.13%), and penta-nucleotide repeats (4.13%). In this study, 17 novel EST-SSR markers were verified. Along with 38 SSR markers reported previously, 55 C. morifolium SSR markers were selected for further genetic diversity analysis. PCR amplification of these EST-SSRs produced 1319 fragments, 1306 of which showed polymorphism. The average polymorphism information content of the SSR primer pairs was 0.972 (0.938-0.993), which showed high genetic diversity among C. morifolium cultivars. Based on SSR markers, the 32 C. morifolium cultivars were separated into two main groups by partitioning the clusters of an unweighted pair group method with arithmetic mean dendrogram, a grouping further supported by a principal coordinate analysis plot. The phylogenetic relationships among C. morifolium cultivars revealed by SSR markers were highly consistent with the classification of medicinal C. morifolium populations according to their origin and ecological distribution. Our results demonstrated that SSR markers were highly reproducible and informative, and could be used to evaluate genetic diversity and relationships among medicinal C. morifolium cultivars.
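The polymorphism information content (PIC) quoted above is a function of allele frequencies at a marker. The abstract does not give the formula it used; the sketch below assumes the widely used Botstein et al. form, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and the frequencies shown are hypothetical.

```python
def polymorphism_information_content(freqs):
    """Return PIC for one marker from its allele frequencies.

    Assumes the Botstein et al. (1980) definition:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    `freqs` must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross_term


# Hypothetical marker with four equally frequent alleles.
print(polymorphism_information_content([0.25, 0.25, 0.25, 0.25]))
```

Under this definition PIC approaches 1 only when a marker has many alleles at low, even frequencies.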
Project description:Germplasm collections of tree crop species are fundamental tools for conserving diversity and key steps toward its characterization and evaluation. For the olive tree, several collections have been created all over the world, but only a few of them have been fully characterized and molecularly identified. The olive collection of Perugia University (UNIPG), established in the 1960s, represents one of the first attempts to gather and safeguard olive diversity, keeping together cultivars from different countries. In the present study, a set of 370 previously uncharacterized olive trees was screened with 10 standard simple sequence repeats (SSRs) and nine new EST-SSR markers, to correctly and thoroughly identify all genotypes, verify their representativeness of the entire cultivated olive variation, and validate the effectiveness of the new markers in comparison to standard genotyping tools. The SSR analysis revealed the presence of 59 genotypes, corresponding to 72 well known cultivars, 13 of which are present exclusively in this collection. The new EST-SSRs showed values of diversity parameters quite similar to those of the best standard SSRs. When compared to hundreds of Mediterranean cultivars, the UNIPG olive accessions were split among the three main populations (East, Center and West Mediterranean), confirming that the collection is well representative of the entire olive variability. Furthermore, Bayesian analysis of the 59 genotypes of the collection, performed with both sets of markers, split them into four clusters, with well-balanced membership obtained with EST-SSRs relative to standard SSRs. The new OLEST (Olea expressed sequence tags) SSR markers proved as effective as the best standard markers. The information obtained from this study represents a highly valuable tool for ex situ conservation and management of olive genetic resources, useful for building a common database of worldwide olive cultivar collections, also based on recently developed markers.
Project description:Ramie (Boehmeria nivea L. Gaud) is one of the most important natural fiber crops, and improvement of fiber yield and quality is the main goal in efforts to breed superior cultivars. However, efforts aimed at enhancing the understanding of ramie genetics and developing more effective breeding strategies have been hampered by the shortage of simple sequence repeat (SSR) markers. In our previous study, we had assembled de novo 43,990 expressed sequence tags (ESTs). In the present study, we searched these previously assembled ESTs for SSRs and identified 1,685 ESTs (3.83%) containing 1,878 SSRs. Next, we designed 1,827 primer pairs complementary to regions flanking these SSRs, and these regions were designated as SSR markers. Among these markers, dinucleotide and trinucleotide repeat motifs were the most abundant types (36.4% and 36.3%, respectively), whereas tetranucleotide, pentanucleotide, and hexanucleotide motifs represented <10% of the markers. The motif AG/CT was the most abundant, accounting for 28.74% of the markers. One hundred EST-SSR markers (97 SSRs located in genes encoding transcription factors and 3 SSRs in genes encoding cellulose synthases) were tested by polymerase chain reaction on 24 ramie varieties. Of these 100 markers, 98 were successfully amplified and 81 were polymorphic, with 2-6 alleles among the 24 varieties. Analysis of the genetic diversity of all 24 varieties revealed similarity coefficients that ranged from 0.51 to 0.80. The EST-SSRs developed in this study represent the first large-scale development of SSR markers for ramie. These SSR markers could be used for development of genetic and physical maps, quantitative trait loci mapping, genetic diversity studies, association mapping, and cultivar fingerprinting.
Project description:BACKGROUND: Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developing EST-SSR markers, necessitating the development of a bioinformatic framework that can keep pace with the increasing quality and quantity of sequence data produced. We describe an open scheme for analyzing ESTs and developing EST-SSR markers from reads collected by Sanger sequencing and pyrosequencing of sugi (Cryptomeria japonica). RESULTS: We collected 141,097 sequence reads by Sanger sequencing and 1,333,444 by pyrosequencing. After trimming contaminant and low quality sequences, 118,319 Sanger and 1,201,150 pyrosequencing reads were passed to the MIRA assembler, generating 81,284 contigs that were analysed for SSRs. 4,059 SSRs were found in 3,694 (4.54%) contigs, giving an SSR frequency lower than that in seven other plant species with gene indices (5.4-21.9%). The average GC content of the SSR-containing contigs was 41.55%, compared to 40.23% for all contigs. Tri-SSRs were the most common SSRs; the most common motif was AT, which was found in 655 (46.3%) di-SSRs, followed by the AAG motif, found in 342 (25.9%) tri-SSRs. Most (72.8%) tri-SSRs were in coding regions, but 55.6% of the di-SSRs were in non-coding regions; the AT motif was most abundant in 3' untranslated regions. Gene ontology (GO) annotations showed that six GO terms were significantly overrepresented within SSR-containing contigs. Forty-four EST-SSR markers were developed from 192 primer pairs using two pipelines: read2Marker and the newly-developed CMiB, which combines several open tools. 
Markers resulting from both pipelines showed no differences in PCR success rate and polymorphisms, but PCR success and polymorphism were significantly affected by the expected PCR product size and number of SSR repeats, respectively. EST-SSR markers exhibited less polymorphism than genomic SSRs. CONCLUSIONS: We have created a new open pipeline for developing EST-SSR markers and applied it in a comprehensive analysis of EST-SSRs and EST-SSR markers in C. japonica. The results will be useful in genomic analyses of conifers and other non-model species.
Project description:Curcuma alismatifolia is widely used as an ornamental plant in Thailand and Cambodia. This herbaceous perennial species of the Zingiberaceae family includes cultivars with a wide range of colours and long postharvest life, and is used as an ornamental cut flower, as a potted plant, and in exterior landscapes. However, little genomic information and no specific molecular markers are available to support further genetic improvement. The present study used Illumina sequencing and de novo transcriptome assembly of two C. alismatifolia cvs, 'Chiang Mai Pink' and 'UB Snow 701', to develop simple sequence repeat markers for genetic diversity studies. After de novo assembly, 62,105 unigenes were generated and 48,813 (78.60%) showed significant similarity to entries in six functional protein databases. In addition, 9,351 expressed sequence tag-simple sequence repeats (EST-SSRs) were identified with a distribution frequency of 12.5% of total unigenes. Out of 8,955 designed EST-SSR primers, 150 primers were selected for the development of potential molecular markers. Among these markers, 17 EST-SSR markers presented a moderate level of genetic diversity among three C. alismatifolia cultivars, one hybrid, three Curcuma, and two Zingiber species. Three different genetic groups within these species were revealed using EST-SSR markers, indicating that the markers developed in this study can be effectively applied to the population genetic analysis of Curcuma and Zingiber species. This report describes the first analysis of transcriptome data from important ornamental ginger cultivars and provides a valuable resource for gene discovery and marker development in the genus Curcuma.
Project description:<h4>Background</h4>Date palm (Phoenix dactylifera L.) is an important tree in the Middle East and North Africa due to the nutritional value of its fruit. Molecular breeding would accelerate genetic improvement of fruit trees through marker-assisted selection. However, the lack of molecular markers in date palm restricts the application of molecular breeding.<h4>Results</h4>In this study, we analyzed 28,889 EST sequences from the date palm genome database to identify simple-sequence repeats (SSRs) and to develop gene-based markers, i.e. expressed sequence tag-SSRs (EST-SSRs). We identified 4,609 ESTs as containing SSRs, among which trinucleotide motifs (69.7%) were the most common, followed by tetranucleotide (10.4%) and dinucleotide motifs (9.6%). The motif AG (85.7%) was most abundant in dinucleotides, while motifs AGG (26.8%), AAG (19.3%), and AGC (16.1%) were most common among trinucleotides. A total of 4,967 primer pairs were designed for EST-SSR markers from the computational data. In a follow-up laboratory study, we tested a sample of 20 randomly selected primer pairs for amplification and polymorphism detection using genomic DNA from date palm cultivars. Nearly one-third of these primer pairs detected DNA polymorphisms that differentiated the twelve date palm cultivars used. Functional categorization of EST sequences containing SSRs revealed that 3,108 (67.4%) of such ESTs had homology with known proteins.<h4>Conclusion</h4>Date palm EST sequences are a good resource for developing gene-based markers. The genic markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in date palm, such as diversity studies, QTL mapping, and molecular breeding.