Project description: The tomato hind (Cephalopholis sonnerati) has emerged in recent years as an economically important grouper. With the increasing maturity of sequencing technologies and assembly methodologies, a higher-quality reference genome has become both accessible and necessary. In this study, we present two telomere-to-telomere (T2T) gap-free haplotype assemblies of the tomato hind with lengths of 1039.53 Mb (YSFRI_Csonn_HA_1.0, N50 43.83 Mb) and 1039.91 Mb (YSFRI_Csonn_HB_1.0, N50 44.09 Mb). Reads from next-generation sequencing, ONT ultra-long sequencing, and PacBio HiFi sequencing exhibited mapping rates exceeding 99.8% when aligned to these two assemblies. Evaluation using Merqury indicated high accuracy for both assemblies, with average quality values of 51.80 and 51.83, respectively. Complete BUSCOs accounted for 97.9% and 97.8% of the two assemblies, in which 23,270 and 23,184 protein-coding genes were inferred, respectively. Moreover, telomere identification, centromere prediction, and repetitive sequence annotation were also successfully performed. These two assemblies provide a robust foundation for genetic analysis and the development of molecular genetic breeding technologies in C. sonnerati.
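To put the Merqury quality values reported above in perspective, the following minimal sketch converts a Phred-scaled consensus quality value (QV) into an approximate per-base error probability, using the standard relation QV = -10 log10(p). The function names are hypothetical and chosen for illustration only; this is not part of Merqury or of the authors' pipeline.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # Average QVs reported for the two haplotype assemblies.
    for name, qv in [("YSFRI_Csonn_HA_1.0", 51.80), ("YSFRI_Csonn_HB_1.0", 51.83)]:
        p = qv_to_error_rate(qv)
        print(f"{name}: QV {qv} ~ {p:.2e} errors per base (about 1 per {1 / p:,.0f} bases)")
```

Under this relation, a QV near 51.8 corresponds to roughly one consensus error per 150 kb.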
Project description: Simple Summary: C. sonnerati is an important marine fish species in coral reef ecosystems and has high commercial value. This species serves as an excellent research model due to its complex social structures and behavioral mechanisms. Nevertheless, owing to the lack of genomic resources, molecular genetic studies and genomic breeding remain unexplored in this species. Therefore, it is important to obtain more genome sequences of Cephalopholis grouper species for research on the classification, evolution, genetics, and biology of groupers. In the present study, we assembled, for the first time, a high-quality, chromosome-level C. sonnerati genome, providing a valuable genome resource for further studies of the genetic conservation, resistance breeding, and evolution of C. sonnerati. Abstract: The tomato hind, Cephalopholis sonnerati, is a bottom-dwelling coral reef fish that is widely distributed in the Indo-Pacific and Red Sea. C. sonnerati also features complex social structures and behavioral mechanisms. Here, we present a high-quality, chromosome-level genome assembly for C. sonnerati that was derived using PacBio sequencing and Hi-C technologies. The assembled genome is 1043.66 Mb long with an N50 length of 2.49 Mb and comprises 795 contigs assembled into 24 chromosomes. Overall, 97.2% of the complete BUSCOs were identified in the genome. A total of 26,130 protein-coding genes were predicted, of which 94.26% were functionally annotated. Evolutionary analysis revealed that C. sonnerati diverged from its common ancestor with Epinephelus lanceolatus and E. akaara approximately 41.7 million years ago. In addition, comparative genome analyses indicated that the expanded gene families were highly enriched in the sensory system. Finally, we identified 8108 genes with tissue-specific expression, and these genes were highly enriched in the brain.
In brief, the high-quality, chromosome-level reference genome will provide a valuable genome resource for studies of the genetic conservation, resistance breeding, and evolution of C. sonnerati.
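Both descriptions above summarize assembly contiguity with the N50 statistic. As a minimal sketch of how that number is commonly computed (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly), the following function may help. The function name and the toy lengths are hypothetical and are not taken from either assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the cumulative
        # length to at least half of the total assembly size.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: cumulative length always reaches the total")


if __name__ == "__main__":
    # Toy example: total = 20, and 6 + 5 = 11 covers at least half, so N50 = 5.
    print(n50([2, 3, 4, 5, 6]))
```

In practice the lengths would be read from the assembly FASTA or its index. Because the value depends on whether contigs or scaffolds are counted, N50 figures from different assemblies are only comparable when they refer to the same sequence level.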