Project description:Aquatic toxicity is an important issue in pesticide development. In this study, using nine molecular fingerprints to describe pesticides, binary and ternary classification models were constructed to predict aquatic toxicity of pesticides via six machine learning methods: Naïve Bayes (NB), Artificial Neural Network (ANN), k-Nearest Neighbor (kNN), Classification Tree (CT), Random Forest (RF) and Support Vector Machine (SVM). For the binary models, local models were obtained with 829 pesticides on rainbow trout (RT) and 151 pesticides on lepomis (LP), and global models were constructed on the basis of 1258 diverse pesticides on RT and LP and 278 on other fish species. Analysis of the local binary models showed that fish species influenced model accuracy. Considering the data size and predictive range, the 1258 pesticides were also used to build global ternary models. The best local binary models were Maccs_ANN for RT and Maccs_SVM for LP, each with an accuracy of 0.90. For global binary models, the best model was Graph_SVM with an accuracy of 0.89. The best global ternary model, Graph_SVM, reached an accuracy of 0.81, slightly lower than that of the best global binary model. In addition, several substructural alerts were identified, including nitrobenzene, chloroalkene and nitrile, which could significantly correlate with pesticide aquatic toxicity. This study provides a useful tool for an early evaluation of pesticide aquatic toxicity in environmental risk assessment.
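To make the fingerprint-plus-classifier idea above concrete, here is a minimal sketch of one of the six listed methods, Naïve Bayes, applied to binary fingerprints. It is not the study's implementation: the four-bit fingerprints, the class labels, and the function names (`train_bernoulli_nb`, `predict`, `accuracy`) are all hypothetical, and a Bernoulli variant with Laplace smoothing is assumed because the abstract does not say which variant was used.

```python
import math
from collections import defaultdict

def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on fixed-length 0/1 fingerprints."""
    n_bits = len(fingerprints[0])
    class_counts = defaultdict(int)                 # compounds seen per class
    bit_on = defaultdict(lambda: [0] * n_bits)      # per class: how often each bit is set
    for fp, label in zip(fingerprints, labels):
        class_counts[label] += 1
        for i, bit in enumerate(fp):
            bit_on[label][i] += bit
    total = len(labels)
    model = {}
    for label, n in class_counts.items():
        log_prior = math.log(n / total)
        # Laplace-smoothed probability that bit i is set within this class
        p_on = [(bit_on[label][i] + alpha) / (n + 2 * alpha) for i in range(n_bits)]
        model[label] = (log_prior, p_on)
    return model

def predict(model, fp):
    """Return the class with the highest log-posterior for one fingerprint."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, p_on) in model.items():
        score = log_prior
        for bit, p in zip(fp, p_on):
            score += math.log(p if bit else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, fingerprints, labels):
    """Fraction of fingerprints whose predicted class matches the label."""
    hits = sum(predict(model, fp) == y for fp, y in zip(fingerprints, labels))
    return hits / len(labels)

# Hypothetical toy data: four-bit fingerprints, NOT real pesticide descriptors.
TRAIN_FPS = [
    [1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0],
    [0, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1],
]
TRAIN_LABELS = ["toxic", "toxic", "toxic", "nontoxic", "nontoxic", "nontoxic"]
MODEL = train_bernoulli_nb(TRAIN_FPS, TRAIN_LABELS)
```

A ternary model would use the same code with three label values; only the labelling of the training set changes.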
Project description:Cosmetic residues have been found in water resources, especially traces of the precursors, couplers, and pigments of hair dyes, which are indiscriminately disposed of in the sewage system. These contaminants are persistent, bioactive, and bioaccumulative, and may pose risks to living beings. Thus, the present study assessed the ecotoxicity of two types of effluents generated in beauty salons after the hair dyeing process. The toxicity of effluent from hair washing with water, shampoo, and conditioner (complete effluent, CE) and effluent not associated with these products (dye effluent, DE) was evaluated by tests carried out with the aquatic organisms Artemia salina, Daphnia similis, and Danio rerio. The bioindicators were exposed to pure samples and different dilutions of both effluents. The results showed toxicity in D. similis (EC50 of 3.43% and 0.54% for CE and DE, respectively); A. salina (LC50 of 8.327% and 3.874% for CE and DE, respectively); and D. rerio (LC50 of 4.25-4.59% and 7.33-8.18% for CE and DE, respectively). Given these results, we can infer that hair dyes, even at low concentrations, have a high toxic potential for aquatic biota, as they induced deleterious effects in all tested bioindicators.
Project description:Iron is a common pollutant in waters near coal and hard rock mine disturbances. The current 1000 µg/L total recoverable chronic criterion for iron (Fe) for protection of aquatic life in the United States was developed using very limited data in 1976 and has not been revised since. To develop a more scientifically based criterion, several chronic laboratory toxicity experiments (> 30 days) were conducted with ferric Fe at circumneutral pH on a taxonomically diverse group of organisms including brown trout (Salmo trutta), mountain whitefish (Prosopium williamsoni), boreal toad tadpoles (Bufo boreas), the oligochaete worm Lumbriculus variegatus, the mayfly Hexagenia limbata, and the planarian Dugesia dorotocephala. Results of these tests and those of previously published toxicity data were used to derive a Final Chronic Value (FCV) of 499 µg/L by using the US Environmental Protection Agency's recommended methods based on single species toxicity tests. In addition to single species toxicity tests, ferric Fe toxicity experiments (10 days) were performed on mesocosms containing naturally colonized communities of benthic macroinvertebrates. Fourteen genera in the mesocosms occurred at sufficient densities to estimate an iron concentration resulting in 20% reduction in abundance (EC20). Three of these taxa had EC20s less than the FCV of 499 µg/L derived from single species tests: the mayfly Epeorus sp. (335 µg/L), the caddisfly Micrasema sp. (356 µg/L), and midge Tanytarsini (234 µg/L). When mesocosm results were included, the FCV was lowered to 251 µg/L. These findings support the suggestion that modernization of water quality criteria should include data generated from mesocosm experiments and other lines of evidence.
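The abstract above does not spell out how the Final Chronic Value was calculated. As a sketch only, the following assumes the 5th-percentile calculation from the 1985 US EPA aquatic life criteria guidelines (based on the four lowest genus mean values) is applied directly to genus mean chronic values. The function name and the input values are hypothetical and are not the study's data.

```python
import math

def final_chronic_value(genus_means):
    """Estimate the 5th-percentile value from genus mean chronic values (µg/L).

    Assumed procedure: rank the genus means, assign cumulative probability
    P = rank / (N + 1), and fit a line through ln(value) versus sqrt(P)
    for the four lowest-ranked genera.
    """
    n = len(genus_means)
    if n < 4:
        raise ValueError("need at least four genus mean values")
    # With fewer than 59 genera, the four values closest to P = 0.05
    # are always the four lowest, which is what this sketch assumes.
    lowest4 = sorted(genus_means)[:4]
    ln_vals = [math.log(v) for v in lowest4]
    probs = [rank / (n + 1) for rank in range(1, 5)]
    sqrt_p = [math.sqrt(p) for p in probs]

    numerator = sum(x * x for x in ln_vals) - sum(ln_vals) ** 2 / 4
    denominator = sum(probs) - sum(sqrt_p) ** 2 / 4
    s = math.sqrt(max(0.0, numerator) / denominator)   # slope; guard against rounding below zero
    l = (sum(ln_vals) - s * sum(sqrt_p)) / 4            # intercept
    a = s * math.sqrt(0.05) + l                         # ln(value) at P = 0.05
    return math.exp(a)

# Hypothetical genus mean chronic values in µg/L, for illustration only.
HYPOTHETICAL_GENUS_MEANS = [300.0, 450.0, 800.0, 1200.0, 2500.0, 4000.0, 9000.0, 15000.0]
print(round(final_chronic_value(HYPOTHETICAL_GENUS_MEANS), 1))
```

Because only the four lowest-ranked genera enter the fit, adding sensitive taxa (for example, mesocosm EC20s below the existing values) changes which genera are used and can pull the estimate down, which is the kind of shift the abstract reports.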
Project description:Aquatic toxicity is a crucial endpoint for evaluating the adverse effects of chemicals on ecosystems. Therefore, we developed in silico methods for the prediction of chemical aquatic toxicity in the marine environment. First, a diverse data set covering different crustacean species was constructed. We then built local binary models using Mysidae data and global binary models using Mysidae, Palaemonidae, and Penaeidae data. Molecular fingerprints and descriptors were employed to represent chemical structures separately. All the models were built by six machine learning methods. The AUC (area under the receiver operating characteristic curve) values of the better local and global models were around 0.8 and 0.9 for the test sets, respectively. We also identified several chemicals with selective toxicity on different species. The analysis of selective toxicity could support the design of greener chemicals for specific environments. Finally, to understand and interpret the models, we explored the relationships between chemical aquatic toxicity and the molecular descriptors. Our study should help in gaining further insights into marine organisms, predicting chemical aquatic toxicity and prioritizing environmental hazard assessment.
Project description:Machine learning approaches to predict essential genes have gained a lot of traction in recent years. These approaches predominantly make use of sequence and network-based features to predict essential genes. However, the scope of network-based features used by the existing approaches is very narrow. Further, many of these studies focus on predicting essential genes within the same organism, which cannot be readily used to predict essential genes across organisms. Therefore, there is clearly a need for a method that is able to predict essential genes across organisms, by leveraging network-based features. In this study, we extract several sets of network-based features from protein-protein association networks available from the STRING database. Our network features include some common measures of centrality, and also some novel recursive measures recently proposed in social network literature. We extract hundreds of network-based features from networks of 27 diverse organisms to predict the essentiality of 87000+ genes. Our results show that network-based features are statistically significantly better at classifying essential genes across diverse bacterial species, compared to the current state-of-the-art methods, which use mostly sequence and a few 'conventional' network-based features. Our diverse set of network properties gave an AUROC of 0.847 and a precision of 0.320 across 27 organisms. When we augmented the complete set of network features with sequence-derived features, we achieved an improved AUROC of 0.857 and a precision of 0.335. We also constructed a reduced set of 100 sequence and network features, which gave a comparable performance. Further, we show that our features are useful for predicting essential genes in new organisms by using leave-one-species-out validation. 
Our network features capture the local, global and neighbourhood properties of the network and are hence effective for prediction of essential genes across diverse organisms, even in the absence of other complex biological knowledge. Our approach can be readily exploited to predict essentiality for organisms in interactome databases such as STRING, where both network and sequence are readily available. All code is available at https://github.com/RamanLab/nbfpeg.
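The evaluation terms used in this abstract (AUROC, precision, leave-one-species-out validation) can be sketched in a few lines. This is an illustration under stated assumptions, not code from the linked repository: the function names and the toy scores are hypothetical, and AUROC is computed by direct pairwise comparison, which is only practical for small inputs.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def precision(scores, labels, threshold):
    """Fraction of genes scored at or above the threshold that are truly essential."""
    called = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(called) / len(called) if called else 0.0

def leave_one_species_out(species):
    """Yield (held_out_species, train_indices, test_indices) for each species."""
    for held_out in sorted(set(species)):
        train = [i for i, sp in enumerate(species) if sp != held_out]
        test = [i for i, sp in enumerate(species) if sp == held_out]
        yield held_out, train, test

# Hypothetical scores and labels (1 = essential), for illustration only.
SCORES = [0.9, 0.8, 0.3, 0.1]
LABELS = [1, 0, 1, 0]
print(auroc(SCORES, LABELS), precision(SCORES, LABELS, threshold=0.5))
```

In a leave-one-species-out run, a model would be trained on the `train` indices and scored on the `test` indices of each fold, so every prediction is made for an organism the model has not seen.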
Project description:A k-nearest neighbor (k-NN) classification model was constructed for 118 RDT NEDO (Repeated Dose Toxicity New Energy and industrial technology Development Organization; currently known as the Hazard Evaluation Support System (HESS)) database chemicals, employing two acute toxicity (LD50)-based classes as a response and using a series of eight PaDEL software-derived fingerprints as predictor variables. A model developed using Estate type fingerprints correctly predicted the LD50 classes for 70 of 94 training set chemicals and 19 of 24 test set chemicals. An individual category was formed for each of the chemicals by extracting its corresponding k-analogs that were identified by k-NN classification. These categories were used to perform the read-across study for prediction of the chronic toxicity, i.e., Lowest Observed Effect Levels (LOEL). We have successfully predicted the LOELs of 54 of 70 training set chemicals (77%) and 14 of 19 test set chemicals (74%) to within an order of magnitude from their experimental LOEL values. Given the success thus far, we conclude that if the k-NN model predicts LD50 classes correctly for a certain chemical, then the k-analogs of such a chemical can be successfully used for data gap filling for the LOEL. This model should support the in silico prediction of repeated dose toxicity.
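The read-across step described above can be sketched as follows. Several details are assumptions, since the abstract does not state them: Tanimoto similarity on fingerprint bit sets is used to pick the k analogs, the prediction is the geometric mean of the analogs' LOELs, and the library entries are hypothetical values rather than HESS data.

```python
import math

def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two sets of 'on' fingerprint bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def k_analogs(query_bits, library, k=3):
    """Return the k library chemicals most similar to the query."""
    ranked = sorted(library, key=lambda item: tanimoto(query_bits, item["bits"]), reverse=True)
    return ranked[:k]

def read_across_loel(query_bits, library, k=3):
    """Predict a LOEL as the geometric mean of the k analogs' LOELs (assumed rule)."""
    analogs = k_analogs(query_bits, library, k)
    mean_log = sum(math.log10(a["loel"]) for a in analogs) / len(analogs)
    return 10 ** mean_log

def within_order_of_magnitude(predicted, observed):
    """True if the prediction is within a factor of 10 of the observed value."""
    return abs(math.log10(predicted) - math.log10(observed)) <= 1.0

# Hypothetical library: bit positions and LOELs (mg/kg/day) are made up.
LIBRARY = [
    {"name": "chem_A", "bits": {1, 2, 3}, "loel": 10.0},
    {"name": "chem_B", "bits": {1, 2, 4}, "loel": 100.0},
    {"name": "chem_C", "bits": {7, 8, 9}, "loel": 1000.0},
]
PREDICTED = read_across_loel({1, 2, 3, 4}, LIBRARY, k=2)
```

Averaging on the log scale matches the abstract's success criterion, which is also defined on a log scale (within an order of magnitude of the experimental LOEL).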
Project description:Modern industrialization has led to the creation of a wide range of organic chemicals, especially in the form of multicomponent mixtures, thus making the evaluation of environmental pollution by conventional methods more difficult. In this paper, we attempt to use forward stepwise multiple linear regression (MLR) and nonlinear radial basis function neural networks (RBFNN) to establish quantitative structure-activity relationship models (QSARs) to predict the toxicity of 79 binary mixtures to aquatic organisms using different hypothetical descriptors. To search for the proper mixture descriptors, 11 mixture rules were applied and tested based on preliminary modeling results. The statistical parameters of the best derived MLR model were Ntrain = 62, R2 = 0.727, RMS = 0.494, F = 159.537, Q2LOO = 0.727, and Q2pred = 0.725 for the training set; and Ntest = 17, R2 = 0.721, RMS = 0.508, F = 38.773, and q2ext = 0.720 for the external test set. The RBFNN model gave the following statistical results: Ntrain = 62, R2 = 0.956, RMS = 0.199, F = 1279.919, Q2LOO = 0.955, and Q2pred = 0.855 for the training set; and Ntest = 17, R2 = 0.880, RMS = 0.367, F = 110.980, and q2ext = 0.853 for the external test set. The quality of the models was assessed by validating the relevant parameters, and the final results showed that the developed models are predictive and can be used for the toxicity prediction of binary mixtures within their applicability domain.
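For readers unfamiliar with the validation statistics quoted above, the following sketch shows how R2, RMS error and leave-one-out Q2 are commonly computed. It assumes the usual definitions (Q2LOO = 1 - PRESS / total sum of squares, RMS as root mean square error) and uses a single-descriptor linear fit with made-up data; the abstract's models use several mixture descriptors, and its exact formulas are not given.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def rms_error(ys, preds):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

def q2_loo(xs, ys):
    """Leave-one-out Q2: each point is predicted by a model fitted without it."""
    loo_preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        loo_preds.append(slope * xs[i] + intercept)
    return r_squared(ys, loo_preds)

# Hypothetical descriptor values and toxicities, for illustration only.
XS = [1.0, 2.0, 3.0, 4.0, 5.0]
YS = [1.1, 1.9, 3.2, 3.8, 5.1]
SLOPE, INTERCEPT = fit_line(XS, YS)
FITTED = [SLOPE * x + INTERCEPT for x in XS]
print(r_squared(YS, FITTED), rms_error(YS, FITTED), q2_loo(XS, YS))
```

For a least squares model, Q2LOO cannot exceed the training R2, so a large gap between the two is a common sign of overfitting.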
Project description:The application of layered double hydroxide (LDH) nanomaterials as catalysts has attracted great interest due to their unique structural features. It also triggered the need to study their fate and behavior in the aquatic environment. In the present study, Zn-Fe nanolayered double hydroxides (Zn-Fe LDHs) were synthesized using a co-precipitation method and characterized by X-ray diffraction (XRD), Fourier transform infrared spectroscopy (FT-IR), scanning electron microscopy (SEM), and nitrogen adsorption-desorption analyses. The toxicity of the home-made Zn-Fe LDHs catalyst was examined by employing a variety of aquatic organisms from different trophic levels, namely the marine photobacterium Vibrio fischeri, the freshwater microalga Pseudokirchneriella subcapitata, the freshwater crustacean Daphnia magna, and the duckweed Spirodela polyrhiza. From the experimental results, it was evident that the acute toxicity of the catalyst depended on the exposure time and type of selected test organism. Zn-Fe LDHs toxicity was also affected by its physical state in suspension, chemical composition, as well as interaction with the bioassay test medium.
Project description:Members of the family Coronaviridae are evolutionarily related and play an important role in human and veterinary medicine. Taxonomic classification is based on the ultrastructure and morphogenesis of viral particles and on biochemical and molecular features. The family Coronaviridae belongs to the order Nidovirales, and is divided into two subfamilies: Coronavirinae and Torovirinae. The number of coronaviruses isolated from aquatic organisms is negligible; indeed, coronaviruses have only been identified in aquatic mammals, including harbor seal (genus Alphacoronavirus), bottlenose dolphin and beluga whale (genus Gammacoronavirus). White bream virus, isolated from the teleost Blicca bjoerkna (L.), is the type species of the genus Bafinivirus within the subfamily Torovirinae.