Project description:Disasters caused by mine water inflows significantly threaten the safety of coal mining operations. Deep mining complicates the acquisition of hydrogeological parameters, the mechanics of water inrush, and the prediction of sudden changes in mine water inflow. Traditional models and standalone machine learning approaches often fail to accurately forecast abrupt shifts in mine water inflows. This study introduces a novel coupled decomposition-optimization-deep learning model that integrates Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN), Northern Goshawk Optimization (NGO), and Long Short-Term Memory (LSTM) networks. We evaluate three types of mine water inflow forecasting methods: a standalone time series prediction model, a decomposition-prediction coupled model, and a decomposition-optimization-prediction coupled model, assessing their ability to capture sudden changes in data trends and their prediction accuracy. Results show that the standalone prediction model performs best with a sliding input step of 3 and a maximum of 400 training epochs. Compared to the CEEMDAN-LSTM model, the CEEMDAN-NGO-LSTM model demonstrates superior performance in predicting local extreme shifts in mine water inflow volumes. Specifically, the CEEMDAN-NGO-LSTM model achieves an MAE of 96.578, a MAPE of 1.471%, an RMSE of 122.143, and an NSE of 0.958, representing average performance improvements of 44.950% and 19.400% over the LSTM and CEEMDAN-LSTM models, respectively. Additionally, this model provides the most accurate predictions of mine water inflow volumes over the next five days. The decomposition-optimization-prediction coupled model therefore offers a novel technical solution for the safety monitoring of smart mines, with significant theoretical and practical value for ensuring safe mining operations.
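The sliding input step described above can be illustrated with a minimal sketch (not the authors' code): a step of 3 turns a daily inflow series into supervised (X, y) pairs, where each sample's features are the previous three observations. The inflow values below are invented for illustration.

```python
def make_windows(series, step=3):
    """Return (X, y) where each X[i] holds `step` past values
    and y[i] is the next value to forecast."""
    X, y = [], []
    for i in range(len(series) - step):
        X.append(series[i:i + step])
        y.append(series[i + step])
    return X, y

# Toy daily inflow series (m^3/h); a sliding step of 3 means each
# training sample uses the previous 3 days to predict day 4.
inflows = [510.0, 512.5, 530.1, 611.8, 604.2, 598.7, 595.0]
X, y = make_windows(inflows, step=3)
```

Each (X[i], y[i]) pair would then be fed to the LSTM as one training example.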
Project description:This paper addresses an important materials engineering question: how can one identify the complete space (or as much of it as possible) of microstructures that are theoretically predicted to yield the combination of properties demanded by a selected application? We present a problem involving the design of magnetoelastic Fe-Ga alloy microstructures for enhanced elastic, plastic, and magnetostrictive properties. While theoretical models for computing properties from a given microstructure are known for this alloy, inverting these relationships to obtain microstructures that lead to desired properties is challenging, primarily because of the high dimensionality of the microstructure space, the multi-objective design requirements, and the non-uniqueness of solutions. These challenges render traditional search-based optimization methods ineffective in terms of both search efficiency and result optimality. In this paper, we propose a machine learning methodology to address these challenges. A systematic framework consisting of random data generation, feature selection, and classification algorithms is developed. Experiments with five design problems that involve identifying microstructures satisfying both linear and nonlinear property constraints show that our framework outperforms traditional optimization methods, reducing average running time by as much as 80% and reaching optima that would not be achieved otherwise.
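The core idea of such a framework, replacing exhaustive search with cheap learned classification, can be sketched roughly as follows. This is a toy illustration, not the paper's method: a nearest-centroid classifier trained on previously evaluated designs pre-screens random candidates before any expensive property computation, and all data and dimensions are invented.

```python
import random

def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(good, bad):
    """Fit a nearest-centroid classifier from evaluated designs."""
    return centroid(good), centroid(bad)

def screen(candidates, model):
    """Keep only candidates classified as property-satisfying."""
    c_good, c_bad = model
    return [c for c in candidates if dist2(c, c_good) < dist2(c, c_bad)]

random.seed(0)
# Invented 2-D "microstructure descriptors": one cluster satisfies the
# property constraints, the other violates them.
good = [[0.9 + 0.05 * random.random(), 0.1] for _ in range(20)]
bad = [[0.1 + 0.05 * random.random(), 0.9] for _ in range(20)]
model = train(good, bad)

pool = [[random.random(), random.random()] for _ in range(1000)]
shortlist = screen(pool, model)
# Only the shortlist would be passed to the expensive property model.
```

The speedup in the paper comes from this kind of pruning: the property model is evaluated only on the shortlist, not the full random pool.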
Project description:Global food prices have surged to historic highs, and there is no academic consensus on the causes of this round of price increases. Based on theoretical analysis, this study uses monthly data from January 2000 to May 2022 and machine learning models to examine the root causes of the global food price surge over that period and the global food security situation. The results show that: first, the increase in the supply of US dollars and the rise in oil prices during the pandemic are the two most important variables affecting food prices. The unlimited quantitative easing of US monetary policy is the primary factor driving the global food price surge, and the alternating impact of oil prices and excessive US dollar liquidity is a key feature of the surge. Second, in the context of a global food shortage, the impact of reduced food production and expected demand growth on food prices will increase further. Third, attention should be paid to potential agricultural import supply chain risks arising from international uncertainties such as the ongoing Russia-Ukraine conflict, which has profoundly affected the global agricultural supply chain; crude oil and fertilizers have gradually become the main drivers of rising food prices.
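Variable-importance rankings of the kind reported above (dollar supply and oil prices as the top drivers) are commonly obtained with techniques such as permutation importance. The following toy sketch uses synthetic data and an assumed already-fitted model, not the study's dataset:

```python
import random

random.seed(1)
# Synthetic data: the target depends strongly on x1 and weakly on x2.
X = [[random.random(), random.random()] for _ in range(200)]
y = [3.0 * x1 + 0.1 * x2 for x1, x2 in X]

model = lambda row: 3.0 * row[0] + 0.1 * row[1]   # stands in for a fitted ML model

def mse(m, X, y):
    return sum((m(row) - t) ** 2 for row, t in zip(X, y)) / len(X)

base = mse(model, X, y)

def importance(j):
    """Error increase when feature j is randomly scrambled."""
    perm = [row[:] for row in X]
    col = [row[j] for row in perm]
    random.shuffle(col)
    for row, v in zip(perm, col):
        row[j] = v
    return mse(model, perm, y) - base

imp = [importance(j) for j in range(2)]
# imp[0] >> imp[1]: the model relies far more on the first variable.
```

A large error increase after scrambling marks a variable (e.g. dollar supply) as an important driver.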
Project description:There has been extensive research on pricing and lot-sizing practices under different payment methods; however, most of it has focused on the buyer's perspective. While accepting buyers' credit conditions positively impacts sales, requesting advance payments from purchasers tends to have a negative effect. Requiring a down payment has been found to generate interest revenue for the supplier without introducing default risk, whereas extending the credit period and offering delayed payment options can increase sales volume, albeit with an elevated risk of default. Taking these payment schemes into account, this study investigates and compares the per-unit seller profit across three distinct payment methods: advance payment, cash payment, and credit payment. The consumption rate of the product varies non-linearly not only with the duration of the different payment options but also with the price and greenness level of the product. The primary objective of this work is to determine the optimal payment-scheme durations, selling price, green level, and replenishment period that maximize the seller's profit. The Teaching-Learning-Based Optimization Algorithm (TLBOA) is applied to solve three numerical examples, each corresponding to a distinct scenario of the considered payment schemes. Sensitivity analyses confirm that the seller's profit is markedly influenced by the environmental sustainability level of the product. Furthermore, the seller's profitability is affected more by the selling-price index than by the indices of payment-scheme duration and green level in the demand structure.
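A bare-bones sketch of TLBO on a stand-in, two-variable concave profit function might look like this; the paper's profit model over price, green level, and payment durations is far richer, and the population size, iteration count, and profit function here are all illustrative assumptions:

```python
import random

def tlbo(f, bounds, pop_size=20, iters=50, seed=0):
    """Maximize f over box `bounds` with Teaching-Learning-Based Optimization."""
    rnd = random.Random(seed)
    dim = len(bounds)
    clip = lambda x: [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]
    pop = [[rnd.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(iters):
        mean = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        teacher = pop[fit.index(max(fit))]
        for i in range(pop_size):
            # Teacher phase: move each learner toward the current best.
            tf = rnd.choice([1, 2])   # teaching factor
            cand = clip([pop[i][d] + rnd.random() * (teacher[d] - tf * mean[d])
                         for d in range(dim)])
            if f(cand) > fit[i]:
                pop[i], fit[i] = cand, f(cand)
            # Learner phase: move toward a better peer, away from a worse one.
            j = rnd.randrange(pop_size)
            sign = 1 if fit[j] > fit[i] else -1
            cand = clip([pop[i][d] + rnd.random() * sign * (pop[j][d] - pop[i][d])
                         for d in range(dim)])
            if f(cand) > fit[i]:
                pop[i], fit[i] = cand, f(cand)
    best = fit.index(max(fit))
    return pop[best], fit[best]

# Stand-in concave profit in (price, green level), peaking at (10, 0.5).
profit = lambda x: -(x[0] - 10.0) ** 2 - 5.0 * (x[1] - 0.5) ** 2
x_best, p_best = tlbo(profit, bounds=[(0.0, 20.0), (0.0, 1.0)])
```

Greedy acceptance in both phases guarantees the best profit never decreases across iterations.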
Project description:This study aims to address the limited accuracy and efficiency of existing sprint pattern recognition methods in order to optimize athletes' training and competition strategies. First, the data come from high-precision sensors and computer simulation and cover key biomechanical parameters of sprinting, such as step frequency, stride length, and acceleration. The dataset spans multiple tests of multiple athletes, ensuring sample diversity. Second, an optimized machine learning algorithm based on decision trees is adopted. It combines the strengths of Random Forest (RF) and Gradient Boosting Trees (GBT), improving the accuracy and efficiency of sprint pattern recognition by adaptively adjusting hyperparameters and tree structure. Specifically, by introducing adaptive feature selection and ensemble learning, the decision tree algorithm improves recognition of different athletes and movement states, thereby reducing over-fitting and improving generalization. During model training, cross-validation and grid search are used to ensure reasonable hyperparameter selection. Moreover, the model's superiority is verified by comparison with commonly used algorithms such as Support Vector Machine (SVM) and Convolutional Neural Network (CNN): its accuracy on the test set is 94.9%, higher than that of SVM (87.0%) and CNN (92.0%). The optimized decision tree algorithm also performs well in computational efficiency. However, the training data come from a simulation environment and may deviate from real competition data; future research can verify the model's generalization ability with more real-world data.
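The grid search with cross-validation step can be sketched with a toy one-parameter threshold classifier standing in for the optimized decision tree; the data below are synthetic stride-length values, not real sensor recordings:

```python
import random

def k_fold_accuracy(threshold, data, k=5):
    """Mean accuracy of the rule `stride > threshold -> sprint` over k folds."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        correct = sum(1 for x, label in fold if (x > threshold) == (label == 1))
        scores.append(correct / len(fold))
    return sum(scores) / k

random.seed(2)
# Feature: stride length in metres; label 1 = "sprint", 0 = "jog" (toy labels).
data = [(random.gauss(2.3, 0.1), 1) for _ in range(100)] + \
       [(random.gauss(1.5, 0.1), 0) for _ in range(100)]
random.shuffle(data)

grid = [1.0, 1.4, 1.9, 2.4, 2.8]   # candidate threshold hyperparameters
best = max(grid, key=lambda t: k_fold_accuracy(t, data))
```

The grid search simply scores every candidate hyperparameter by its cross-validated accuracy and keeps the best; real grid search does the same over many hyperparameters at once.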
Project description:Stem cell organoids are powerful models for studying organ development, disease modeling, drug screening, and regenerative medicine applications. The convergence of organoid technology, tissue engineering, and artificial intelligence (AI) could potentially enhance our understanding of the design principles of organoid engineering. In this study, we utilized micropatterning techniques to create a designer library of 230 cardiac organoids with 7 geometric designs (Circle 200, Circle 600, Circle 1000, Rectangle 1:1, Rectangle 1:4, Star 1:1, and Star 1:4). We employed manifold learning techniques to analyze single-organoid heterogeneity based on 10 physiological parameters. Using unsupervised machine learning, we successfully clustered and refined our cardiac organoids by functional similarity, elucidating the unique functionalities associated with each geometric design. We also highlighted the critical role of calcium rising time in distinguishing organoids by geometric pattern and clustering result. This integration of organoid engineering and machine learning enhances our understanding of structure-function relationships in cardiac organoids, paving the way for more controlled and optimized organoid design.
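The unsupervised clustering step can be illustrated with plain k-means on two synthetic per-organoid features standing in for the 10 physiological parameters; the data and feature choices below are invented for illustration:

```python
import random

def kmeans(points, k=2, iters=20):
    """Plain k-means; centres are initialized with evenly spaced points."""
    dim = len(points[0])
    centers = [points[(i * len(points)) // k][:] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((p[d] - centers[c][d]) ** 2 for d in range(dim)))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # recompute each non-empty cluster's centroid
                centers[j] = [sum(p[d] for p in g) / len(g) for d in range(dim)]
    return centers, groups

rnd = random.Random(4)
# Two synthetic features per organoid: calcium rising time and beat amplitude.
fast = [[rnd.gauss(0.2, 0.02), rnd.gauss(1.0, 0.05)] for _ in range(30)]
slow = [[rnd.gauss(0.8, 0.02), rnd.gauss(1.0, 0.05)] for _ in range(30)]
centers, groups = kmeans(fast + slow, k=2)
```

Here the clusters separate cleanly along the calcium-rising-time axis, mirroring the role that parameter played in the study's clustering.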
Project description:This article addresses the problem of interval pricing for auction items by constructing an auction item price prediction model based on an adaptive learning algorithm. First, considering the easily confused price classes of auction items, a dynamic inter-class-distance adaptive learning model is developed: differences in the predicted values of target-domain samples across multiple classifiers are used to compute classification distances, distinguish the confusing classes, and make similar target-domain samples cluster more tightly. Second, a deep clustering algorithm is constructed that integrates the temporal characteristics and numerical differences of auction item prices, using dynamic time warping (DTW)-based K-medoids and fuzzy C-means (FCM) algorithms for fine-grained clustering. Finally, the KF-LSTM auction item interval price prediction model is constructed from long short-term memory (LSTM) networks and the dual clustering. Experimental results show that the proposed KF-LSTM model significantly improves the prediction accuracy of auction item prices during fluctuation periods, with an average accuracy of 90.23% and an average MAPE of only 5.41%. Additionally, at confidence levels of 80%, 85%, and 90%, the KF-LSTM model achieves interval coverage of over 85% of actual auction item prices, significantly enhancing the accuracy of auction item price predictions. The experiments demonstrate the stability and accuracy of the proposed model across different sets of auction items, providing a valuable reference for research on auction item price prediction.
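The DTW distance at the core of the DTW-based K-medoids step is the classic dynamic-programming recurrence; a from-scratch sketch (not the authors' implementation) with absolute-difference local cost:

```python
def dtw(a, b):
    """DTW distance between two numeric sequences with |x - y| local cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# Two price curves with the same shape but shifted in time align cheaply,
# which is why DTW suits clustering price series of differing timing.
flat_shift = dtw([1, 2, 3, 4], [1, 1, 2, 3, 4])
```

K-medoids then only needs this pairwise distance, which is why DTW plugs in directly where Euclidean distance would mis-rank time-shifted price curves.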
Project description:Purpose: The purpose is to accurately identify women at high risk of developing cervical cancer so as to optimize cervical screening strategies and make better use of medical resources. However, the predictive models currently in use require clinical physiological and biochemical indicators, which limits their scope of application. Stacking-integrated machine learning (SIML) is an advanced machine learning technique that combines multiple learning algorithms to improve predictive performance. This study aimed to develop a stacking-integrated model to identify women at high risk of developing cervical cancer based on their demographic, behavioral, and historical clinical factors. Methods: Data from 858 women screened for cervical cancer at a Venezuelan hospital were used to develop the SIML algorithm. The screening data were randomly split into training data (80%) used to develop the algorithm and testing data (20%) used to validate its accuracy. A random forest (RF) model and univariate logistic regression were used to identify features predictive of developing cervical cancer. Twelve well-known ML algorithms were selected, and their performance in predicting cervical cancer was compared. A correlation coefficient matrix was used to cluster the models based on their performance, and the SIML was then developed using the best-performing techniques. The sensitivity, specificity, and area under the curve (AUC) of all models were calculated. Results: The RF model identified 18 features predictive of developing cervical cancer. Use of hormonal contraceptives was the most important risk factor, followed by the number of pregnancies, years of smoking, and the number of sexual partners.
The SIML algorithm had the best overall performance compared with the other methods, reaching an AUC, sensitivity, and specificity of 0.877, 81.8%, and 81.9%, respectively. Conclusion: This study shows that SIML can be used to accurately identify women at high risk of developing cervical cancer. The model could be used to personalize the screening program by optimizing the screening interval and care plan for high- and low-risk patients based on their demographics, behavioral patterns, and clinical data.
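The stacking idea behind SIML, where base learners' outputs are combined by a meta-learner, can be sketched with toy rule-based "models" and an accuracy-weighted vote as the meta-learner; the data, rules, and weighting scheme are all illustrative stand-ins for the study's twelve algorithms:

```python
# Each sample: (feature_1, feature_2) normalized to [0, 1]; label 1 = high risk.
# All values are invented toy data, not the Venezuelan screening dataset.
data = [((0.9, 0.8), 1), ((0.8, 0.3), 1), ((0.3, 0.9), 1), ((0.7, 0.6), 1),
        ((0.2, 0.1), 0), ((0.1, 0.4), 0), ((0.4, 0.2), 0), ((0.6, 0.35), 0),
        ((0.55, 0.5), 0)]

base_models = [
    lambda x: int(x[0] > 0.5),            # rule on feature 1
    lambda x: int(x[1] > 0.5),            # rule on feature 2
    lambda x: int(x[0] + x[1] > 1.0),     # rule on both features
]

def accuracy(model):
    return sum(model(x) == y for x, y in data) / len(data)

# The "meta-learner": weight each base model by its accuracy.
weights = [accuracy(m) for m in base_models]

def stacked_predict(x):
    score = sum(w * m(x) for w, m in zip(weights, base_models))
    return int(score > sum(weights) / 2)

stacked_acc = sum(stacked_predict(x) == y for x, y in data) / len(data)
```

Even with these crude rules, the weighted combination matches or beats every individual base rule, which is the motivation for stacking.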
Project description:Background: The Department of Rehabilitation Medicine is key to improving patients' quality of life. Driven by chronic diseases and an aging population, there is a need to enhance the efficiency and resource allocation of outpatient facilities. This study analyzes the treatment preferences of outpatient rehabilitation patients using visit data and a grading tool to establish predictive models, with the goal of improving patient visit efficiency and optimizing resource allocation. Methods: Data were collected from 38 Chinese institutions, covering 4,244 patients visiting outpatient rehabilitation clinics. Data processing was conducted in Python. The pandas library was used for cleaning and preprocessing 68 categorical and 12 continuous variables; the steps included handling missing values, data normalization, and encoding conversion. The data were divided into 80% training and 20% test sets using scikit-learn to ensure model independence and prevent overfitting. Performance comparisons among XGBoost, random forest, and logistic regression were conducted using metrics including accuracy and receiver operating characteristic (ROC) curves. The SMOTE technique from the imbalanced-learn library was used to address sample imbalance during model training. The models were refined using confusion matrices and feature importance analysis, and partial dependence plots (PDP) were used to analyze the key influencing factors. Results: XGBoost achieved the highest overall accuracy, 80.21%, with high precision and recall in Category 1. Random forest showed a similar overall accuracy. Logistic regression had significantly lower accuracy, indicating difficulty with nonlinear data. The key influencing factors identified include distance to medical institutions, arrival time, length of hospital stay, and specific diseases, such as cardiovascular, pulmonary, oncological, and orthopedic conditions.
The tiered diagnosis and treatment tool effectively helped doctors assess patients' conditions and recommend suitable medical institutions based on rehabilitation grading. Conclusion: This study confirmed that ensemble learning methods, particularly XGBoost, outperform single models in classification tasks involving complex datasets. Addressing class imbalance and enhancing feature engineering can further improve model performance. Understanding patient preferences and the factors influencing medical institution selection can guide healthcare policies to optimize resource allocation, improve service quality, and enhance patient satisfaction. Tiered diagnosis and treatment tools play a crucial role in helping doctors evaluate patient conditions and make informed recommendations for appropriate medical care.
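The SMOTE oversampling used during model training can be sketched from scratch: synthetic minority samples are interpolated between a minority point and one of its nearest minority neighbours. The 2-D toy data below are not the outpatient dataset:

```python
import random

def smote(minority, n_new, k=3, seed=5):
    """Generate n_new synthetic samples by interpolating each chosen
    minority point with one of its k nearest minority neighbours."""
    rnd = random.Random(seed)
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    synthetic = []
    for _ in range(n_new):
        p = rnd.choice(minority)
        neighbours = sorted((q for q in minority if q is not p),
                            key=lambda q: dist2(p, q))[:k]
        q = rnd.choice(neighbours)
        t = rnd.random()   # interpolation factor in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(p, q)])
    return synthetic

# Toy imbalanced 2-D data: 90 majority vs 10 minority samples.
majority = [[0.0, float(i)] for i in range(90)]
minority = [[5.0 + 0.1 * i, 0.5 * i] for i in range(10)]
new = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new
```

Because synthetic points lie between real minority samples rather than duplicating them, the classifier sees a denser but still plausible minority region.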
Project description:Quantum computing and artificial intelligence, combined, may revolutionize future technologies. A significant school of thought in artificial intelligence is based on generative models. Here, we propose a general quantum algorithm for machine learning based on a quantum generative model. We prove that the proposed model is more capable of representing probability distributions than classical generative models and offers exponential speedup in learning and inference, at least for some instances, provided a quantum computer cannot be efficiently simulated classically. Our result opens a new direction for quantum machine learning and offers a remarkable example in which a quantum algorithm shows exponential improvement over classical algorithms in an important application field.
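The generative principle can be illustrated classically for a tiny case: a quantum state assigns amplitudes to bitstrings, and measurement draws samples with Born-rule probabilities equal to the squared amplitudes. This 2-qubit toy only illustrates the sampling idea, not the paper's algorithm, which targets distributions that are hard to represent classically:

```python
import random

# A 2-qubit state with real amplitudes (normalized: 0.36 + 0.64 = 1).
amplitudes = {"00": 0.6, "01": 0.0, "10": 0.0, "11": 0.8}

# Born rule: each bitstring is observed with probability |amplitude|^2.
probs = {s: a * a for s, a in amplitudes.items()}

def born_sample(rnd):
    """Draw one bitstring by inverse-transform sampling over probs."""
    r, acc = rnd.random(), 0.0
    for s, p in probs.items():
        acc += p
        if r < acc:
            return s
    return s  # guard against floating-point round-off at the tail

rnd = random.Random(6)
draws = [born_sample(rnd) for _ in range(10000)]
freq_11 = draws.count("11") / len(draws)   # approaches 0.64 in the long run
```

A quantum generative model exploits states whose probability tables, unlike this 4-entry dictionary, cannot be written down efficiently by any classical model.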