Comparative mapping in the Fagaceae and beyond with EST-SSRs.
ABSTRACT: BACKGROUND: Genetic markers and linkage mapping are basic prerequisites for comparative genetic analyses, QTL detection and map-based cloning. A large number of mapping populations have been developed for oak, but few gene-based markers are available for constructing integrated genetic linkage maps and comparing gene order and QTL location across related species. RESULTS: We developed a set of 573 expressed sequence tag-derived simple sequence repeats (EST-SSRs) and located 397 markers (EST-SSRs and genomic SSRs) on the 12 oak chromosomes (2n?=?2x?=?24) on the basis of Mendelian segregation patterns in 5 full-sib mapping pedigrees of two species: Quercus robur (pedunculate oak) and Quercus petraea (sessile oak). Consensus maps for the two species were constructed and aligned. They showed a high degree of macrosynteny between these two sympatric European oaks. We assessed the transferability of EST-SSRs to other Fagaceae genera and a subset of these markers was mapped in Castanea sativa, the European chestnut. Reasonably high levels of macrosynteny were observed between oak and chestnut. We also obtained diversity statistics for a subset of EST-SSRs, to support further population genetic analyses with gene-based markers. Finally, based on the orthologous relationships between the oak, Arabidopsis, grape, poplar, Medicago, and soybean genomes and the paralogous relationships between the 12 oak chromosomes, we propose an evolutionary scenario of the 12 oak chromosomes from the eudicot ancestral karyotype. CONCLUSIONS: This study provides map locations for a large set of EST-SSRs in two oak species of recognized biological importance in natural ecosystems. This first step toward the construction of a gene-based linkage map will facilitate the assignment of future genome scaffolds to pseudo-chromosomes. This study also provides an indication of the potential utility of new gene-based markers for population genetics and comparative mapping within and beyond the Fagaceae.
Project description:A comparative genetic and QTL mapping was performed between Quercus robur L. and Castanea sativa Mill., two major forest tree species belonging to the Fagaceae family. Oak EST-derived markers (STSs) were used to align the 12 linkage groups of the two species. Fifty-one and 45 STSs were mapped in oak and chestnut, respectively. These STSs, added to SSR markers previously mapped in both species, provided a total number of 55 orthologous molecular markers for comparative mapping within the Fagaceae family. Homeologous genomic regions identified between oak and chestnut allowed us to compare QTL positions for three important adaptive traits. Colocation of the QTL controlling the timing of bud burst was significant between the two species. However, conservation of QTL for height growth was not supported by statistical tests. No QTL for carbon isotope discrimination was conserved between the two species. Putative candidate genes for bud burst can be identified on the basis of colocations between EST-derived markers and QTL.
Project description:<h4>Background</h4>Expressed Sequence Tags (ESTs) are a source of simple sequence repeats (SSRs) that can be used to develop molecular markers for genetic studies. The availability of ESTs for Quercus robur and Quercus petraea provided a unique opportunity to develop microsatellite markers to accelerate research aimed at studying adaptation of these long-lived species to their environment. As a first step toward the construction of a SSR-based linkage map of oak for quantitative trait locus (QTL) mapping, we describe the mining and survey of EST-SSRs as well as a fast and cost-effective approach (bin mapping) to assign these markers to an approximate map position. We also compared the level of polymorphism between genomic and EST-derived SSRs and address the transferability of EST-SSRs in Castanea sativa (chestnut).<h4>Results</h4>A catalogue of 103,000 Sanger ESTs was assembled into 28,024 unigenes from which 18.6% presented one or more SSR motifs. More than 42% of these SSRs corresponded to trinucleotides. Primer pairs were designed for 748 putative unigenes. Overall 37.7% (283) were found to amplify a single polymorphic locus in a reference full-sib pedigree of Quercus robur. The usefulness of these loci for establishing a genetic map was assessed using a bin mapping approach. Bin maps were constructed for the male and female parental tree for which framework linkage maps based on AFLP markers were available. The bin set consisting of 14 highly informative offspring selected based on the number and position of crossover sites. The female and male maps comprised 44 and 37 bins, with an average bin length of 16.5 cM and 20.99 cM, respectively. A total of 256 EST-SSRs were assigned to bins and their map position was further validated by linkage mapping. EST-SSRs were found to be less polymorphic than genomic SSRs, but their transferability rate to chestnut, a phylogenetically related species to oak, was higher.<h4>Conclusion</h4>We have generated a bin map for oak comprising 256 EST-SSRs. This resource constitutes a first step toward the establishment of a gene-based map for this genus that will facilitate the dissection of QTLs affecting complex traits of ecological importance.
Project description:BACKGROUND: The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the Quercus family. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks, are among the most important deciduous forest tree species in Europe. Despite their ecological and economical importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity. RESULTS: We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking lead us to eliminate 19,941 sequences. Finally a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. Blast search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoids biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these traits. Comparative orthologous sequences (COS) with other plant gene models were identified and allow to unravel the oak paleo-history. Simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 52,834 SSRs and 36,411 SNPs. All of these are available through the Oak Contig Browser http://genotoul-contigbrowser.toulouse.inra.fr:9092/Quercus_robur/index.html. CONCLUSIONS: This genomic resource provides a unique tool to discover genes of interest, study the oak transcriptome, and develop new markers to investigate functional diversity in natural populations.
Project description:BACKGROUND: Genetic markers and linkage mapping are basic prerequisites for marker-assisted selection and map-based cloning. In the case of the key grassland species Lolium spp., numerous mapping populations have been developed and characterised for various traits. Although some genetic linkage maps of these populations have been aligned with each other using publicly available DNA markers, the number of common markers among genetic maps is still low, limiting the ability to compare candidate gene and QTL locations across germplasm. RESULTS: A set of 204 expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers has been assigned to map positions using eight different ryegrass mapping populations. Marker properties of a subset of 64 EST-SSRs were assessed in six to eight individuals of each mapping population and revealed 83% of the markers to be polymorphic in at least one population and an average number of alleles of 4.88. EST-SSR markers polymorphic in multiple populations served as anchor markers and allowed the construction of the first comprehensive consensus map for ryegrass. The integrated map was complemented with 97 SSRs from previously published linkage maps and finally contained 284 EST-derived and genomic SSR markers. The total map length was 742 centiMorgan (cM), ranging for individual chromosomes from 70 cM of linkage group (LG) 6 to 171 cM of LG 2. CONCLUSIONS: The consensus linkage map for ryegrass based on eight mapping populations and constructed using a large set of publicly available Lolium EST-SSRs mapped for the first time together with previously mapped SSR markers will allow for consolidating existing mapping and QTL information in ryegrass. Map and markers presented here will prove to be an asset in the development for both molecular breeding of ryegrass as well as comparative genetics and genomics within grass species.
Project description:We analyzed the genetic mosaic of speciation in two hybridizing Mediterranean white oaks from the Iberian Peninsula (Quercus faginea Lamb. and Quercus pyrenaica Willd.). The two species show ecological divergence in flowering phenology, leaf morphology and composition, and in their basic or acidic soil preferences. Ninety expressed sequence tag-simple sequence repeats (EST-SSRs) and eight nuclear SSRs were genotyped in 96 trees from each species. Genotyping was designed in two steps. First, we used 69 markers evenly distributed over the 12 linkage groups (LGs) of the oak linkage map to confirm the species genetic identity of the sampled genotypes, and searched for differentiation outliers. Then, we genotyped 29 additional markers from the chromosome bins containing the outliers and repeated the multilocus scans. We found one or two additional outliers within four saturated bins, thus confirming that outliers are organized into clusters. Linkage disequilibrium (LD) was extensive; even for loosely linked and for independent markers. Consequently, score tests for association between two-marker haplotypes and the 'species trait' showed a broad genomic divergence, although substantial variation across the genome and within LGs was also observed. We discuss the influence of several confounding effects on neutrality tests and review the evolutionary processes leading to extensive LD. Finally, we examine how LD analyses within regions that contain outlier clusters and quantitative trait loci can help to identify regions of divergence and/or genomic hitchhiking in the light of predictions from ecological speciation theory.
Project description:Chloroplast DNA (cpDNA) is frequently used for species demography, evolution, and species discrimination of plants. However, the lack of efficient and universal markers often brings particular challenges for genetic studies across different plant groups. In this study, chloroplast genomes from two closely related species (Quercus rubra and Castanea mollissima) in Fagaceae were compared to explore universal cpDNA markers for the Chinese oak species in Quercus subgenus Quercus, a diverse species group without sufficient molecular differentiation. With the comparison, nine and 14 plastid markers were selected as barcoding and phylogeographic candidates for the Chinese oaks. Five (psbA-trnH, matK-trnK, ycf3-trnS, matK, and ycf1) of the nine plastid candidate barcodes, with the addition of newly designed ITS and a single-copy nuclear gene (SAP), were then tested on 35 Chinese oak species employing four different barcoding approaches (genetic distance-, BLAST-, character-, and tree-based methods). The four methods showed different species identification powers with character-based method performing the best. Of the seven barcodes tested, a barcoding gap was absent in all of them across the Chinese oaks, while ITS and psbA-trnH provided the highest species resolution (30.30%) with the character- and BLAST-based methods, respectively. The six-marker combination (psbA-trnH + matK-trnK + matK + ycf1 + ITS + SAP) showed the best species resolution (84.85%) using the character-based method for barcoding the Chinese oaks. The barcoding results provided additional implications for taxonomy of the Chinese oaks in subg. Quercus, basically identifying three major infrageneric clades of the Chinese oaks (corresponding to Groups Quercus, Cerris, and Ilex) referenced to previous phylogenetic classification of Quercus. While the morphology-based allocations proposed for the Chinese oaks in subg. Quercus were challenged. A low variation rate of the chloroplast genome, and complex speciation patterns involving incomplete lineage sorting, interspecific hybridization and introgression, possibly have negative impacts on the species assignment and phylogeny of oak species.
Project description:Genic microsatellite markers, also known as functional markers, are preferred over anonymous markers as they reveal the variation in transcribed genes among individuals. In this study, we developed a total of 707 expressed sequence tag-derived simple sequence repeat markers (EST-SSRs) and used for development of a high-density integrated map using four individual mapping populations of B. rapa. This map contains a total of 1426 markers, consisting of 306 EST-SSRs, 153 intron polymorphic markers, 395 bacterial artificial chromosome-derived SSRs (BAC-SSRs), and 572 public SSRs and other markers covering a total distance of 1245.9 cM of the B. rapa genome. Analysis of allelic diversity in 24 B. rapa germplasm using 234 mapped EST-SSR markers showed amplification of 2 alleles by majority of EST-SSRs, although amplification of alleles ranging from 2 to 8 was found. Transferability analysis of 167 EST-SSRs in 35 species belonging to cultivated and wild brassica relatives showed 42.51% (Sysimprium leteum) to 100% (B. carinata, B. juncea, and B. napus) amplification. Our newly developed EST-SSRs and high-density linkage map based on highly transferable genic markers would facilitate the molecular mapping of quantitative trait loci and the positional cloning of specific genes, in addition to marker-assisted selection and comparative genomic studies of B. rapa with other related species.
Project description:The 35S ribosomal DNA (rDNA) units, repeated in tandem at one or more chromosomal loci, are separated by an intergenic spacer (IGS) containing functional elements involved in the regulation of transcription of downstream rRNA genes. In the present work, we have compared the IGS molecular organizations in two divergent species of Fagaceae, Fagus sylvatica and Quercus suber, aiming to comprehend the evolution of the IGS sequences within the family. Self- and cross-hybridization FISH was done on representative species of the Fagaceae. The IGS length variability and the methylation level of 18 and 25S rRNA genes were assessed in representatives of three genera of this family: Fagus, Quercus and Castanea. The intergenic spacers in Beech and Cork Oak showed similar overall organizations comprising putative functional elements needed for rRNA gene activity and containing a non-transcribed spacer (NTS), a promoter region, and a 5'-external transcribed spacer. In the NTS: the sub-repeats structure in Beech is more organized than in Cork Oak, sharing some short motifs which results in the lowest sequence similarity of the entire IGS; the AT-rich region differed in both spacers by a GC-rich block inserted in Cork Oak. The 5'-ETS is the region with the higher similarity, having nonetheless different lengths. FISH with the NTS-5'-ETS revealed fainter signals in cross-hybridization in agreement with the divergence between genera. The diversity of IGS lengths revealed variants from ? 2 kb in Fagus, and Quercus up to 5.3 kb in Castanea, and a lack of correlation between the number of variants and the number of rDNA loci in several species. Methylation of 25S Bam HI site was confirmed in all species and detected for the first time in the 18S of Q. suber and Q. faginea. These results provide important clues for the evolutionary trends of the rDNA 25S-18S IGS in the Fagaceae family.
Project description:Simple sequence repeats (SSRs) are among the most important markers for population analysis and have been widely used in plant genetic mapping and molecular breeding. Expressed sequence tag-SSR (EST-SSR) markers, located in the coding regions, are potentially more efficient for QTL mapping, gene targeting, and marker-assisted breeding. In this study, we investigated 51,694 nonredundant unigenes, assembled from clean reads from deep transcriptome sequencing with a Solexa/Illumina platform, for identification and development of EST-SSRs in Chinese cabbage. In total, 10,420 EST-SSRs with over 12?bp were identified and characterized, among which 2744 EST-SSRs are new and 2317 are known ones showing polymorphism with previously reported SSRs. A total of 7877 PCR primer pairs for 1561 EST-SSR loci were designed, and primer pairs for twenty-four EST-SSRs were selected for primer evaluation. In nineteen EST-SSR loci (79.2%), amplicons were successfully generated with high quality. Seventeen (89.5%) showed polymorphism in twenty-four cultivars of Chinese cabbage. The polymorphic alleles of each polymorphic locus were sequenced, and the results showed that most polymorphisms were due to variations of SSR repeat motifs. The EST-SSRs identified and characterized in this study have important implications for developing new tools for genetics and molecular breeding in Chinese cabbage.
Project description:The availability of large expressed sequence tag (EST) and whole genome databases of oil palm enabled the development of a data base of microsatellite markers. For this purpose, an EST database consisting of 40,979 EST sequences spanning 27?Mb and a chromosome-wise whole genome databases were downloaded. A total of 3,950 primer pairs were identified and developed from EST sequences. The tri and tetra nucleotide repeat motifs were most prevalent (each 24.75%) followed by di-nucleotide repeat motifs. Whole genome-wide analysis found a total of 245,654 SSR repeats across the 16 chromosomes of oil palm, of which 38,717 were compound microsatellite repeats. A web application, OpSatdb, the first microsatellite database of oil palm, was developed using the PHP and MySQL database ( https://ssr.icar.gov.in/index.php ). It is a simple and systematic web-based search engine for searching SSRs based on repeat motif type, repeat type, and primer details. High synteny was observed between oil palm and rice genomes. The mapping of ESTs having SSRs by Blast2GO resulted in the identification of 19.2% sequences with gene ontology (GO) annotations. Randomly, a set of ten genic SSRs and five genomic SSRs were used for validation and genetic diversity on 100 genotypes belonging to the world oil palm genetic resources. The grouping pattern was observed to be broadly in accordance with the geographical origin of the genotypes. The identified genic and genome-wide SSRs can be effectively useful for various genomic applications of oil palm, such as genetic diversity, linkage map construction, mapping of QTLs, marker-assisted selection, and comparative population studies.