Dataset Information

PGD: a machine learning-based photosynthetic-related gene detection approach.


ABSTRACT:

Background

The primary determinant of crop yield is photosynthetic capacity, which is under the control of photosynthesis-related genes. Mining genes involved in photosynthesis is therefore important for the study of photosynthesis. MapMan Mercator 4 is a powerful annotation tool for assigning genes to appropriate functional categories; however, in maize, the functions of approximately 22.15% (9520) of genes remain unclear and are labeled "not assigned". This category may include photosynthesis-related genes that have not yet been identified. The rapidly growing use of machine learning to solve biological problems provides a new opportunity to identify novel photosynthetic genes among the functionally "not assigned" genes in maize.

Results

In this study, we show that an ensemble learning model using a voting scheme eliminates the preferences of single machine learning models. Based on this evaluation, we implemented an ensemble-based machine learning (ML) method using a majority voting scheme, and observed that including RNA-seq data from multiple photosynthetic mutants, rather than from only a single mutant, could increase prediction accuracy. We call this approach "A Machine Learning-based Photosynthetic-related Gene Detection approach (PGD)". Finally, we predicted 716 photosynthesis-related genes from the "not assigned" category of the maize MapMan annotation. Protein localization prediction (TargetP) and the expression trends of these genes across maize leaf sections indicated that the prediction was reliable and robust. The approach is available online, based on Google Colab.
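
The majority voting scheme mentioned above can be illustrated with a minimal sketch. This is not the PGD implementation: the function name `majority_vote`, the three example models, their labels, and the tie-breaking rule are all assumptions made for illustration, since the abstract does not specify which base classifiers are combined or how ties are resolved.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> list[int]:
    """Combine per-model labels by majority vote.

    predictions[m][g] is the label that model m assigns to gene g
    (here 1 = photosynthesis-related, 0 = not; a hypothetical encoding).
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_genes = len(predictions[0])
    if any(len(p) != n_genes for p in predictions):
        raise ValueError("all models must label the same number of genes")

    combined = []
    # zip(*predictions) yields, for each gene, the votes from every model.
    for gene_votes in zip(*predictions):
        counts = Counter(gene_votes)
        top = max(counts.values())
        winners = [label for label, c in counts.items() if c == top]
        # Assumption: ties resolve to the smallest label (0), i.e. the
        # conservative "not photosynthesis-related" call.
        combined.append(min(winners))
    return combined


# Hypothetical labels from three models for four genes.
model_a = [1, 0, 1, 0]
model_b = [1, 1, 0, 0]
model_c = [0, 0, 1, 0]
print(majority_vote([model_a, model_b, model_c]))  # prints [1, 0, 1, 0]
```

With an odd number of models and binary labels, no ties occur, so the final call for each gene is whatever at least two of the three models agree on; a single model's idiosyncratic prediction is outvoted.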

Conclusions

This study reveals a new approach for mining novel genes related to a specific functional category and provides candidate genes for researchers to experimentally define their biological functions.

SUBMITTER: Wang Y 

PROVIDER: S-EPMC9112524 | biostudies-literature | 2022 May

REPOSITORIES: biostudies-literature

Publications

PGD: a machine learning-based photosynthetic-related gene detection approach.

Wang Yunchuan, Dai Xiuru, Fu Daohong, Li Pinghua, Du Baijuan

BMC Bioinformatics, 2022-05-17, 1


Similar Datasets

| S-EPMC8336856 | biostudies-literature
| S-EPMC2585161 | biostudies-literature
2022-09-14 | E-MTAB-11607 | biostudies-arrayexpress
| S-EPMC7591033 | biostudies-literature
| S-EPMC7732714 | biostudies-literature
2024-12-27 | GSE246294 | GEO
| S-EPMC7189237 | biostudies-literature
| S-EPMC10871075 | biostudies-literature
| S-EPMC8372004 | biostudies-literature
| S-EPMC11655100 | biostudies-literature