Project description: The availability of well-assembled genome sequences and reduced sequencing costs have enabled the resequencing of many additional accessions in several crops, thus facilitating the rapid discovery and development of simple sequence repeat (SSR) markers. Although the genome sequence of inbred spinach line Sp75 is available, previous efforts have resulted in a limited number of useful SSR markers. Identification of additional polymorphic SSR markers will support genetics and breeding research in spinach. This study aimed to use the available genomic resources to mine and catalog a large number of polymorphic SSR markers. A search of the six chromosome sequences of the Sp75 reference genome using GMATA identified a total of 42,155 SSR loci with repeat motifs of two to six nucleotides. Whole-genome sequences (30x) of 21 additional accessions were aligned against the chromosome sequences of the reference genome and genotyped in silico using the HipSTR program by comparing repeat-number variation at the SSR loci among the accessions. The SSR genotype data generated by HipSTR were filtered to remove monomorphic loci and loci with a high proportion of missing data, yielding a final set of 5,986 polymorphic SSR loci. The polymorphic SSR loci were present at a density of 12.9 SSRs/Mb and were physically mapped. Of 36 SSR loci randomly selected for validation, two failed to amplify, while the remaining 34 were all polymorphic in a set of 48 spinach accessions from 34 countries. Genetic diversity analysis of the 48 spinach accessions using the SSR allele score data revealed three main population groups. This strategy of mining and developing polymorphic SSR markers through comparative analysis of the genome sequences of multiple accessions and computational genotyping of the candidate SSR loci eliminates the need for laborious experimental screening.
Our approach increased the efficiency of discovering a large set of novel polymorphic SSR markers, as demonstrated in this report.
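The first step described above, scanning a reference sequence for perfect SSR loci with motifs of two to six nucleotides, can be illustrated with a minimal Python sketch. This is not GMATA or HipSTR; the function name `find_ssrs` and the `min_repeats=5` threshold are assumptions chosen for illustration only.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=5):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs.

    min_repeats is a hypothetical threshold, not a value taken from the study.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. ATAT, AAA),
            # so each locus is reported once under its shortest motif.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGG" + "AT" * 6 + "CCC" + "AAG" * 5 + "T"
    for hit in find_ssrs(demo):
        print(hit)
```

Polymorphism would then be judged by comparing the repeat count at each locus across accessions, which is the part of the workflow the study delegates to HipSTR.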
Project description: Cannabis has been used as a source of nutrition, medicine, and fiber. However, the lack of genomic simple sequence repeat (SSR) markers has limited genetic research on Cannabis species. In the present study, 92,409 motifs were identified, and 63,707 complementary SSR primer pairs were developed. The most abundant SSR motifs had six repeat units (36.60%). The most abundant type of motif was dinucleotides (70.90%), followed by trinucleotides, tetranucleotides, and pentanucleotides. We randomly selected 80 pairs of genomic SSR markers, of which 69 (86.25%) were amplified successfully; 59 (73.75%) of these were polymorphic. Genetic diversity and population structure were estimated using the 59 (72 loci) validated polymorphic SSRs and three phenotypic markers. Three hundred ten alleles were identified. The major allele frequency ranged from 0.26 to 0.85 (average: 0.56), Nei's genetic diversity ranged from 0.28 to 0.82 (average: 0.56), and the expected heterozygosity ranged from 0.28 to 0.81 (average: 0.56). The polymorphism information content ranged from 0.25 to 0.79 (average: 0.50), the observed number of alleles ranged from 2 to 8 (average: 4.13), and the effective number of alleles ranged from 0.28 to 0.81 (average: 0.5). Analysis under the infinite allele model indicated that the Cannabis population was not in mutation-drift equilibrium. A cluster analysis was performed using the unweighted pair group method with arithmetic mean (UPGMA) based on genetic distances. Population structure analysis divided the germplasms into two subgroups. These results provide guidance for the molecular breeding and further investigation of Cannabis.
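The per-locus statistics reported above (major allele frequency, expected heterozygosity, effective number of alleles, polymorphism information content) have standard textbook definitions. The following Python sketch computes them from allele frequencies at a single locus; the function name `diversity_stats` is hypothetical, and the formulas are the common definitions (PIC following Botstein et al., 1980), assumed rather than confirmed to be those used in the study.

```python
def diversity_stats(freqs, tol=1e-6):
    """Summary statistics for one locus from its allele frequencies.

    freqs: iterable of allele frequencies that should sum to 1.
    """
    freqs = list(freqs)
    if abs(sum(freqs) - 1.0) > tol:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    # Expected heterozygosity (gene diversity, without small-sample correction).
    he = 1.0 - homozygosity
    # Effective number of alleles: reciprocal of expected homozygosity.
    ne = 1.0 / homozygosity
    # PIC subtracts the share of uninformative matings from He.
    cross = sum(2.0 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(n) for j in range(i + 1, n))
    pic = he - cross
    return {"major_allele_freq": max(freqs), "He": he, "Ne": ne, "PIC": pic}


if __name__ == "__main__":
    print(diversity_stats([0.5, 0.3, 0.2]))
```

Under these definitions the effective number of alleles is never below 1, and PIC is never above the expected heterozygosity.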
Project description: Background: Maize (Zea mays ssp. mays L.), a staple food for several million people and an important source of animal feed and bioenergy, is widely cultivated around the world. Simple sequence repeats (SSRs) are widely used as molecular markers in maize genetics and breeding, but only two thousand pairs of SSRs have been published to date, which hardly satisfies the increasing needs of geneticists and breeders. Furthermore, a growing number of studies have revealed that SSRs also play a vital role in functional regulation and evolution. Fortunately, advances in sequencing technology and bioinformatics software provide the basis for the characterization and development of SSRs in maize. Results: In this study, MISA was used to identify a total of 179,681 SSRs in the maize reference genome B73, with an average distance of 11.46 kbp. Their distribution across genomic regions was non-random, and density decreased in the order UTR, promoter, intron, intergenic region, and CDS. In addition, 82,694 (46.02%) SSRs with unique flanking sequences were selected and used to analyze polymorphism in next-generation sequencing data from 345 maize inbred lines together with the maize reference genome B73. Length information was obtained in ten or more genomes for 58,946 SSRs, accounting for 71.28% of the SSRs with unique flanking sequences, and 55,621 SSRs were polymorphic, with an average PIC value of 0.498. A total of 250 pairs of SSR primers from different genomic regions covering all maize chromosomes were randomly chosen for experimental validation, giving an average PIC value of 0.63 in 11 elite maize inbred lines. Conclusions: Our work provided insight into the non-random distribution patterns and compositions of SSRs in different regions of the maize genome, and also developed more polymorphic SSR markers using next-generation sequencing reads.
The genome-wide polymorphic SSR markers could be useful for genetic analysis and marker-assisted selection in breeding practice, and next-generation sequencing reads proved to be a highly efficient basis for molecular marker development.
Project description: Background: Microsatellites or simple sequence repeats (SSRs) have become the most significant DNA marker technology used in genetic research. The availability of complete draft genomes for a number of Palmae species has made it possible to perform genome-wide analysis of SSRs in these species. Palm trees are tropical and subtropical plants with agricultural and economic importance due to the nutritional value of their fruit cultivars. Objective: This is the first comprehensive study examining and comparing microsatellites in completely sequenced draft genomes of Palmae species. Methods: We identified and compared perfect SSRs with 1-6 bp nucleotide motifs to characterize microsatellites in Palmae species using PERF v0.2.5. We analyzed their relative abundance, relative density, and GC content in five palm species: Phoenix dactylifera, Cocos nucifera, Calamus simplicifolius, Elaeis oleifera, and Elaeis guineensis. Results: A total of 118241, 328189, 450753, 176608, and 70694 SSRs were identified, respectively. The six repeat types were not evenly distributed across the five genomes. Mono- and dinucleotide SSRs were the most abundant, and GC content was highest in tri- and hexanucleotide SSRs. Conclusion: We envisage that this analysis will support more in-depth computational, biochemical, and molecular studies on the roles SSRs may play in the genome organization of the palm species. The current study contributes a detailed characterization of simple sequence repeats in palm genomes.
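The three summary measures named in the Methods can be sketched in Python as follows. The definitions used here are assumptions based on common usage in SSR surveys (relative abundance as SSR loci per Mb of sequence, relative density as SSR base pairs per Mb, GC content as the G+C fraction of SSR bases); the abstract does not state them, and `ssr_summary` is a hypothetical helper, not part of PERF.

```python
def ssr_summary(ssrs, genome_length_bp):
    """Summarize perfect SSRs given as (motif, repeat_count) pairs.

    Assumed definitions (not taken from the abstract):
      relative_abundance = number of SSR loci per Mb of analyzed sequence
      relative_density   = total SSR length in bp per Mb of analyzed sequence
      gc_content         = fraction of SSR bases that are G or C
    """
    mb = genome_length_bp / 1e6
    total_len = sum(len(motif) * count for motif, count in ssrs)
    gc_bases = sum((motif.upper().count("G") + motif.upper().count("C")) * count
                   for motif, count in ssrs)
    return {
        "relative_abundance": len(ssrs) / mb,
        "relative_density": total_len / mb,
        "gc_content": gc_bases / total_len if total_len else 0.0,
    }


if __name__ == "__main__":
    toy = [("AT", 10), ("GC", 5), ("AAG", 10)]
    print(ssr_summary(toy, genome_length_bp=2_000_000))
```

Normalizing by sequence length is what makes the counts comparable across genomes of different sizes, such as the five palm assemblies above.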
Project description: Wheat genotypes should be improved through available germplasm genetic diversity to ensure food security. This study investigated the molecular diversity and population structure of a set of Türkiye bread wheat genotypes using 120 microsatellite markers. Based on the results, 651 polymorphic alleles were evaluated to determine genetic diversity and population structure. The number of alleles ranged from 2 to 19, with an average of 5.44 alleles per locus. Polymorphic information content (PIC) ranged from 0.031 to 0.915 with a mean of 0.43. In addition, the gene diversity index ranged from 0.03 to 0.92 with an average of 0.46. The expected heterozygosity ranged from 0.00 to 0.359 with a mean of 0.124. The unbiased expected heterozygosity ranged from 0.00 to 0.319 with an average of 0.112. The mean values of the number of effective alleles (Ne), genetic diversity of Nei (H) and Shannon's information index (I) were estimated at 1.190, 1.049 and 0.168, respectively. The highest genetic diversity (GD) was estimated between genotypes G1 and G27. In the UPGMA dendrogram, the 63 genotypes were grouped into three clusters. The first three principal coordinates explained 12.64, 6.38 and 4.90% of the genetic diversity, respectively. AMOVA revealed diversity within populations at 78% and between populations at 22%. The current populations were found to be highly structured. Model-based cluster analyses classified the 63 genotypes studied into three subpopulations. The values of F-statistic (Fst) for the identified subpopulations were 0.253, 0.330 and 0.244, respectively. In addition, the expected values of heterozygosity (He) for these subpopulations were recorded as 0.45, 0.46 and 0.44, respectively. Therefore, SSR markers can be useful not only in genetic diversity and association analysis of wheat but also in characterizing its germplasm for various agronomic traits or mechanisms of tolerance to environmental stresses.
Project description: Japanese quail are still used as a model for poultry research because of their usefulness as laying, meat, and laboratory animals. Microsatellite markers are the most widely used molecular markers, due to their relative ease of scoring and high levels of polymorphism. The objective of the research was to determine the genetic diversity and population genetic structure of Japanese quail lines selected over 15 generations (high body weight 1 [HBW1], HBW2, low body weight [LBW], and layer [L]) and an unselected control (C). A total of 69 individuals from the five quail lines were genotyped with fifteen microsatellite markers. Across markers, the observed (Ho) and expected (He) heterozygosity ranged from 0.04 (GUJ0027) to 0.64 (GUJ0087) and from 0.21 (GUJ0027) to 0.84 (GUJ0037), respectively. Across lines, Ho and He ranged from 0.30 (L and LBW) to 0.33 (C and HBW2) and from 0.52 (HBW2) to 0.58 (L and LBW), respectively. The mean polymorphic information content (PIC) ranged from 0.46 (HBW2) to 0.52 (L). Approximately half of the markers were informative (PIC≥0.50). Genetic distances ranged from 0.09 (HBW1 and HBW2) to 0.33 (C and L). The phylogenetic dendrogram showed that the quail lines were clearly separated by the microsatellite markers used here. Bayesian model-based clustering supported the results from the phylogenetic tree. These results indicate that the set of studied markers can be used effectively to capture the magnitude of genetic variability in selected Japanese quail lines. Further generations of selection are required to identify markers and alleles that are specific to the divergent lines.
Project description: Background: Pear (Pyrus spp.) is an economically important temperate fruit tree worldwide. In the past decade, significant progress has been made in pear molecular genetics based on DNA research, but the number of molecular markers is still quite limited, which hardly satisfies the increasing needs of geneticists and breeders. Results: In this study, a total of 156,396 simple sequence repeat (SSR) loci were identified from a genome sequence of Pyrus bretschneideri 'Dangshansuli'. A total of 101,694 pairs of SSR primers were designed from the SSR loci, and 80,415 of the SSR loci were successfully located on 17 linkage groups (LGs). A total of 534 primer pairs were synthesized and preliminarily screened in four pear cultivars, and of these, 332 primer pairs were selected as clear, stable, and polymorphic SSR markers. Eighteen polymorphic SSR markers were randomly selected from the 332 polymorphic SSR markers in order to perform a further analysis of the genetic diversity among 44 pear cultivars. The 14 European pears and their hybrid materials were clustered into one group (European pear group); 29 Asian pear cultivars were clustered into one group (Asian pear group); and the Zangli pear cultivar 'Deqinli' from Yunnan Province, China, was grouped in an independent group, which suggested that the cultivar 'Deqinli' is a distinct and valuable germplasm resource. The population structure analysis partitioned the 44 cultivars into two populations, Pop 1 and Pop 2. Pop 2 was further divided into two subpopulations. Results from the population structure analysis were generally consistent with the results from the UPGMA cluster analysis. Conclusions: The results of the present study showed that the use of next-generation sequencing to develop SSR markers is fast and effective, and the developed SSR markers can be utilized by researchers and breeders for future pear improvement.
Project description: Tropical rainforests in Southeast Asia are enriched by multifarious biota dominated by Dipterocarpaceae. In this family, Shorea robusta is an ecologically sensitive and economically important timber species whose genomic diversity and phylogeny remain understudied due to a lack of datasets on genetic resources. The scarcity of molecular markers impedes population genetic studies, indicating a need to develop genomic databases and species-specific markers in S. robusta. Accordingly, the present study focused on de novo low-depth genome sequencing, identification of reliable microsatellite markers, and their validation in various populations of S. robusta in the Uttarakhand Himalayas. Illumina paired-end sequencing generated a library of ~10 gigabases (Gb) of sequence data, with 69.88 million raw reads assembled into 197,489 contigs (93.2% of reads mapped) and a genome size of 357.11 Mb (29× coverage). From 57,702 microsatellite repeats, a total of 35,049 simple sequence repeat (SSR) primer pairs were developed. Among 60 randomly selected primer pairs, 50 showed successful amplification and 24 were found to be polymorphic. Of these, nine polymorphic loci were further used for genetic analysis in 16 genotypes each from three different geographical locations of Uttarakhand (India). The average number of alleles per locus (Na), observed heterozygosity (Ho), expected heterozygosity (He), and polymorphism information content (PIC) were 2.44, 0.324, 0.277 and 0.252, respectively. The availability of sequence information and novel SSR markers enriches the current knowledge of the genomic background of S. robusta, and these resources can be utilized in various genetic studies of species in the tribe Shoreae.
Project description: Background: Cucumber (Cucumis sativus L.) is an important vegetable crop worldwide. Until very recently, cucumber genetic and genomic resources, especially molecular markers, have been very limited, impeding progress of cucumber breeding efforts. Microsatellites are short tandemly repeated DNA sequences, which are frequently favored as genetic markers due to their high level of polymorphism and codominant inheritance. Data from previously characterized genomes have shown that these repeats vary in frequency, motif sequence, and genomic location across taxa. During the last year, the genomes of two cucumber genotypes were sequenced: the Chinese fresh market type inbred line '9930' and the North American pickling type inbred line 'Gy14'. These sequences provide a powerful tool for developing markers on a large scale. In this study, we surveyed and characterized the distribution and frequency of perfect microsatellites in 203 Mbp of assembled Gy14 DNA sequences, representing 55% of its nuclear genome, and in cucumber EST sequences. Similar analyses were performed on genomic and EST data from seven other plant species, and the results were compared with those of cucumber. Results: A total of 112,073 perfect repeats were detected in the Gy14 cucumber genome sequence, accounting for 0.9% of the assembled Gy14 genome, with an overall density of 551.9 SSRs/Mbp. While tetranucleotides were the most frequent microsatellites in genomic DNA sequence, dinucleotide repeats, which had more repeat units than any other SSR type, had the highest cumulative sequence length. Coding regions (ESTs) of the cucumber genome had fewer microsatellites compared to its genomic sequence, with trinucleotides predominating in EST sequences. AAG was the most frequent repeat in cucumber ESTs. Overall, AT-rich motifs prevailed in both genomic and EST data.
Compared to the other species examined, cucumber genomic sequence had the highest density of SSRs (although comparable to the density of poplar, grapevine and rice), and was richest in AT dinucleotides. Using an electronic PCR strategy, we investigated the polymorphism between 9930 and Gy14 at 1,006 SSR loci, and found an unexpectedly high degree of polymorphism (48.3%) between the two genotypes. The level of polymorphism seems to be positively associated with the number of repeat units in the microsatellite. The in silico PCR results were validated empirically in 660 of the 1,006 SSR loci. In addition, primer sequences for more than 83,000 newly discovered cucumber microsatellites, and their exact positions in the Gy14 genome assembly, were made publicly available. Conclusions: The cucumber genome is rich in microsatellites; AT and AAG are the most abundant repeat motifs in genomic and EST sequences of cucumber, respectively. Considering all the species investigated, some commonalities were noted, especially within the monocot and dicot groups, although the distribution of motifs and the frequency of certain repeats were characteristic of the species examined. The large number of SSR markers developed from this study should be a significant contribution to the cucurbit research community.
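The electronic PCR comparison between two genotypes can be illustrated with a simplified Python sketch. It assumes exact primer matches and a single binding site per primer, whereas real e-PCR tools tolerate mismatches and handle multiple hits; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence (uppercase A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, fwd_primer, rev_primer):
    """Predicted product size for exact primer matches, or None if no product.

    Simplification: only the first forward site, and the first reverse site
    downstream of it, are considered.
    """
    start = template.find(fwd_primer)
    if start < 0:
        return None
    rev_site = revcomp(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end < 0:
        return None
    return end + len(rev_site) - start


def is_polymorphic(template_a, template_b, fwd_primer, rev_primer):
    """True if both templates yield a product and the product sizes differ."""
    a = amplicon_length(template_a, fwd_primer, rev_primer)
    b = amplicon_length(template_b, fwd_primer, rev_primer)
    return a is not None and b is not None and a != b


if __name__ == "__main__":
    fwd, rev = "ACGTAC", "TTGGCA"
    line_a = fwd + "AT" * 5 + revcomp(rev)   # toy locus with 5 AT repeats
    line_b = fwd + "AT" * 8 + revcomp(rev)   # same locus with 8 AT repeats
    print(amplicon_length(line_a, fwd, rev),
          amplicon_length(line_b, fwd, rev),
          is_polymorphic(line_a, line_b, fwd, rev))
```

A difference in predicted amplicon size between the two assemblies is what flags a locus as a candidate polymorphic marker, which is then confirmed empirically as described above.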