Project description:Detection of subsurface hydrodynamic anomalies plays a significant role in groundwater resource management and environmental monitoring. In this paper, based on groundwater level, atmospheric pressure, and precipitation data from the Chengdu area of China, a method for detecting outliers that accounts for the factors affecting groundwater levels is proposed. By analyzing the factors affecting groundwater levels at the monitoring site and eliminating their influence, simplified groundwater data are obtained. sl-Pauta (self-learning-based Pauta), iForest (Isolation Forest), OCSVM (One-Class SVM), and KNN are applied to synthetic data with known outliers to test and evaluate the effectiveness of the four techniques. Finally, the four methods are applied to the detection of outliers in the simplified groundwater levels. The results show that in the detection of outliers in the synthetic data, the OCSVM method has the best detection performance, with a precision of 88.89%, a recall of 91.43%, an F1 score of 90.14%, and an AUC of 95.66%. In the detection of outliers in simplified groundwater levels, a qualitative analysis of the displacement data within the field of view indicates that the outlier detection performance of iForest and OCSVM is better than that of KNN. The proposed method, which accounts for the factors affecting groundwater levels, can improve the efficiency and accuracy of detecting outliers in groundwater level data.
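As a rough illustration of the evaluation described in this abstract, the sketch below applies the plain Pauta (3-sigma) criterion to a small synthetic series with one known outlier and then scores the result with precision, recall, and F1. It is not the sl-Pauta variant from the paper, whose self-learning step is not described here; the function names and the synthetic series are assumptions made for illustration only.

```python
from statistics import mean, stdev

def pauta_outliers(values, k=3.0):
    """Plain Pauta criterion: flag values more than k sample standard
    deviations away from the mean (not the paper's sl-Pauta variant)."""
    mu = mean(values)
    sigma = stdev(values)
    return [abs(v - mu) > k * sigma for v in values]

def precision_recall_f1(predicted, actual):
    """Score boolean outlier flags against known outlier labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_score(precision, recall)
    return precision, recall, f1

def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Hypothetical synthetic groundwater levels (m) with one injected outlier.
levels = [10.0 + 0.1 * ((i % 5) - 2) for i in range(30)]
levels.insert(15, 12.0)
truth = [i == 15 for i in range(len(levels))]

flags = pauta_outliers(levels)
scores = precision_recall_f1(flags, truth)

# Consistency check on the reported OCSVM figures:
# precision 88.89% and recall 91.43% give F1 of about 90.14%.
reported_f1 = f1_score(0.8889, 0.9143)
```

The last line reproduces the reported F1 score from the reported precision and recall, which confirms that the three figures in the abstract are mutually consistent.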
Project description:Contamination of drinking water by nitrate is a growing problem in many agricultural areas of the country. Ingested nitrate can lead to the endogenous formation of N-nitroso compounds, potent carcinogens. We developed a predictive model for nitrate concentrations in private wells in Iowa. Using 34,084 measurements of nitrate in private wells, we trained and tested random forest models to predict log nitrate levels by systematically assessing the predictive performance of 179 variables in 36 thematic groups (well depth, distance to sinkholes, location, land use, soil characteristics, nitrogen inputs, meteorology, and other factors). The final model contained 66 variables in 17 groups. Some of the most important variables were well depth, slope length within 1 km of the well, year of sample, and distance to nearest animal feeding operation. The correlation between observed and estimated nitrate concentrations was excellent in the training set (r-square=0.77) and was acceptable in the testing set (r-square=0.38). The random forest model had substantially better predictive performance than a traditional linear regression model or a regression tree. Our model will be used to investigate the association between nitrate levels in drinking water and cancer risk in the Iowa participants of the Agricultural Health Study cohort.
Project description:Groundwater nitrate contamination poses a potential threat to human health and environmental safety globally. This study proposes an interpretable stacking ensemble learning (SEL) framework for enhancing and interpreting groundwater nitrate spatial predictions by integrating the two-level heterogeneous SEL model and SHapley Additive exPlanations (SHAP). In the SEL model, five commonly used machine learning models were utilized as base models (gradient boosting decision tree, extreme gradient boosting, random forest, extremely randomized trees, and k-nearest neighbor), whose outputs were taken as input data for the meta-model. When applied to the agricultural intensive area, the Eden Valley in the UK, the SEL model outperformed the individual models in predictive performance and generalization ability. It reveals a mean groundwater nitrate level of 2.22 mg/L-N, with 2.46% of sandstone aquifers exceeding the drinking standard of 11.3 mg/L-N. Alarmingly, 8.74% of areas with high groundwater nitrate remain outside the designated nitrate vulnerable zones. Moreover, SHAP identified that transmissivity, baseflow index, hydraulic conductivity, the percentage of arable land, and the C:N ratio in the soil were the top five key driving factors of groundwater nitrate. With nitrate threatening groundwater globally, this study presents a high-accuracy, interpretable, and flexible modeling framework that enhances our understanding of the mechanisms behind groundwater nitrate contamination. It implies that the interpretable SEL framework has great promise for providing valuable evidence for environmental management, water resource protection, and sustainable development, particularly in the data-scarce area.
Project description:For the designation of nitrate vulnerable zones under the EU Nitrate Directive, some German federal states use inverse distance weighting (IDW) as interpolation method. Our study quantifies the accuracy of IDW with respect to the designation of areas with a groundwater nitrate concentration above the threshold of 50 mg NO3/l using a dataset of 5790 groundwater monitoring sites in Bavaria. The results show that the absolute differences of nitrate concentrations between the monitoring sites are only weakly correlated within a range of no more than 0.4 km. The IDW cross-validated nitrate concentration of measurement sites shows a mean absolute error of 7.0 mg NO3/l and the number of measurement sites above 50 mg NO3/l is 44% too low by interpolation for all sites as a whole. The corresponding values for interpolation separately for the 18 hydrogeological regions in Bavaria are 7.1 mg NO3/l and 38%. The sensitivity and the accuracy of nitrate concentration maps due to the variation of IDW parameters and the position of sampling points are analysed by Monte Carlo IDW interpolations using a Random Forest modelled map as reference spatial distribution. Compared to this reference map, the area with a concentration above 50 mg NO3/l in groundwater is estimated by IDW to be 46% too low for the best IDW parametrization. Overall, IDW interpolation systematically underrates the occurrence of higher range nitrate concentrations. In view of these underestimations, IDW does not appear to be a suitable regionalization method for the designation of nitrate vulnerable zones, neither when applied for a federal state as a whole nor when interpolated separately for hydrogeological regions.
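The cross-validation described in this abstract can be sketched as follows: each site is left out in turn, its value is estimated by IDW from the remaining sites, and the mean absolute error and the number of sites above the threshold are compared. This is a minimal sketch, not the study's implementation; the power parameter of 2, the use of all remaining sites as neighbours, and the five sample sites are assumptions.

```python
from math import hypot

def idw_predict(x, y, sites, power=2.0):
    """Inverse distance weighted estimate at (x, y).
    sites is a list of (x, y, value) tuples."""
    num = 0.0
    den = 0.0
    for sx, sy, value in sites:
        d = hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exact hit on a monitoring site
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def leave_one_out(sites, threshold=50.0, power=2.0):
    """Leave-one-out cross-validation of IDW.
    Returns (mean absolute error, observed count above threshold,
    predicted count above threshold)."""
    abs_errors = []
    observed_above = 0
    predicted_above = 0
    for i, (x, y, value) in enumerate(sites):
        others = sites[:i] + sites[i + 1:]
        estimate = idw_predict(x, y, others, power)
        abs_errors.append(abs(estimate - value))
        observed_above += int(value > threshold)
        predicted_above += int(estimate > threshold)
    return sum(abs_errors) / len(abs_errors), observed_above, predicted_above

# Hypothetical sites (km, km, mg NO3/l): one high value among low ones.
sites = [(0, 0, 10.0), (2, 0, 12.0), (0, 2, 8.0), (2, 2, 14.0), (1, 1, 60.0)]
mae, n_observed, n_predicted = leave_one_out(sites)
```

In this toy data set the single site above 50 mg NO3/l is surrounded by low values, so its cross-validated estimate falls below the threshold. This is the same smoothing effect that the abstract reports as a systematic undercount of high concentrations.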
Project description:This study was carried out to develop a conceptual framework for determining the best interpolation method, which is mainly employed to calculate variability maps of electrical conductivity (EC) in neighboring regions. The case study covers parts of Khorasan Razavi province, Iran (including five aquifers: Kashmar, Fariman, Doruneh, Sarakhs and Joveyn). In the first step, the empirical variogram (semi-variogram) was computed for the study area. The semi-variogram function was measured from how the variable varies with spatial or temporal distance. In the next step, the best variogram model (e.g. spherical, exponential or Gaussian) was considered in the Geographic Information System (GIS) environment and the Geostatistics for the Environmental Sciences (GS+) software. By plotting the semi-variogram in the GS+ program based on different methods, namely Global Polynomial Interpolation (GPI), Inverse Distance Weighting (IDW), Radial Basis Function (RBF), Kriging, and Local Polynomial Interpolation (LPI), the best variogram model was fitted to the spatial structure of the EC. Finally, by considering the acceptable range for the different parameters that affect EC and evaluating their impacts by scaling, the best interpolation method was selected for that area for use in its neighboring basins. The results indicated that where precipitation lies within the range of 140 to 180 mm, RBF has priority. This process is continued for all 14 parameters, and eventually one method gets the most points.
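The first step named in this abstract, computing the empirical semi-variogram, can be sketched with the classical estimator, which averages half the squared differences of all point pairs whose separation falls in a given lag class. The abstract does not state which estimator GS+ was configured to use, so the estimator, the lag binning, and the sample EC values below are assumptions.

```python
from collections import defaultdict
from math import hypot

def empirical_semivariogram(points, lag_width, max_lag):
    """Classical semi-variogram estimator.
    points is a list of (x, y, z) tuples. Each pair is assigned to the
    nearest multiple of lag_width; pairs farther apart than max_lag are
    skipped. Returns {lag distance: semivariance}."""
    sq_diff_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = hypot(xi - xj, yi - yj)
            if d > max_lag:
                continue
            lag_bin = int(d / lag_width + 0.5)  # nearest lag class
            sq_diff_sums[lag_bin] += (zi - zj) ** 2
            pair_counts[lag_bin] += 1
    return {
        lag_bin * lag_width: sq_diff_sums[lag_bin] / (2 * pair_counts[lag_bin])
        for lag_bin in sorted(pair_counts)
    }

# Hypothetical EC samples along a transect: (x km, y km, EC value).
samples = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0), (3, 0, 7.0)]
gamma = empirical_semivariogram(samples, lag_width=1.0, max_lag=3.0)
```

A model such as the spherical, exponential, or Gaussian variogram would then be fitted to these lag and semivariance pairs; that fitting step is not shown here.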
Project description:Nitrate pollution in groundwater is a serious problem in many parts of the world. However, because several potential non-point pollution sources are diffuse and commonly overlap spatially, it is often difficult to distinguish the main nitrate sources responsible for the pollution. For this purpose, we present a novel methodology applied to groundwater in an intensely polluted area. Groundwater samples were collected monthly from April 2017 to March 2018 in Shimabara City, Nagasaki, Japan. Soil samples were collected seasonally at the soil surface and at 50 cm depth at 10 locations during the same period. Sequential extraction by water and extract agents was performed using calcium phosphate for anions and strontium chloride for cations. The mean nitrate concentration in groundwater close to a livestock waste disposal site (hereinafter called "LWDS") was 14.2 mg L⁻¹, which exceeds the Japanese drinking water standard (10 mg L⁻¹). We used the concentration of coprostanol, a fecal pollution indicator, to identify pollution sources related to livestock waste. For this purpose, we measured coprostanol (5β) and cholestanol (5α) and then calculated the sterol ratio (5β/(5β + 5α)). The ratios for three groundwater sampling sites were 0.28, 0.26, and 0.10, respectively. The sterol ratios indicated no pollution (< 0.3). However, the detection of coprostanol originating from animal and human waste showed that groundwater was clearly affected by this pollution source. Nitrate levels in the soil were relatively high in samples collected close to the LWDS, and coprostanol contents were affected by livestock waste. Soil and groundwater nitrate concentrations displayed a complex but strong relationship. Nitrate was shown to be transported downstream from source areas in both soil and groundwater.
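The sterol ratio used in this abstract, 5β/(5β + 5α), is simple enough to write out directly. The sketch below computes it and compares it with the 0.3 cutoff quoted in the abstract; the function names and the input concentrations are hypothetical and chosen only so that the result matches one of the reported ratios.

```python
NO_POLLUTION_CUTOFF = 0.3  # ratios below this indicated no pollution in the abstract

def sterol_ratio(coprostanol, cholestanol):
    """Sterol ratio 5β / (5β + 5α), with coprostanol as 5β and
    cholestanol as 5α, both in the same concentration unit."""
    total = coprostanol + cholestanol
    if total <= 0:
        raise ValueError("sterol concentrations must sum to a positive value")
    return coprostanol / total

def below_cutoff(ratio, cutoff=NO_POLLUTION_CUTOFF):
    """True when the ratio alone would be read as 'no pollution'."""
    return ratio < cutoff

# Hypothetical concentrations chosen to reproduce the reported ratio of 0.28.
ratio = sterol_ratio(coprostanol=28.0, cholestanol=72.0)
reported = [0.28, 0.26, 0.10]
all_below = all(below_cutoff(r) for r in reported)
```

All three reported ratios fall below the cutoff, which is why the abstract relies on the detection of coprostanol itself, rather than the ratio, as evidence of the pollution source.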
Project description:Nitrate ingested from drinking water has been linked to adverse health outcomes (e.g., cancer, birth defects) at levels as low as ∼2 mg/L NO3-N, far below the regulatory limits of 10 mg/L. In many areas, groundwater is a common drinking water source and may contain elevated nitrate, but limited data on the patterns and concentrations are available. Using an extensive regulatory data set of over 100,000 nitrate drinking water well samples, we developed new maps of groundwater nitrate concentrations from 76,724 wells in Michigan's Lower Peninsula, USA for the 2006-2015 period. Kriging, a geostatistical method, was used to interpolate concentrations and quantify probability of exceeding relevant thresholds (>0.4 [common detection limit], >2 mg/L NO3-N). We summarized this probability in small watersheds (∼80 km2) to identify correlated variables using the machine learning method classification and regression trees (CARTs). We found 79% of wells had concentrations below the detection limit in this analysis (<0.4 mg/L NO3-N). In the shallow aquifer (focus of study), 13% of wells exceeded 2 mg/L NO3-N and 2% exceeded the EPA maximum contaminant level of 10 mg/L. CART explained 40%-45% of variation in each model and identified three categories of critical correlated variables: source (high agricultural nitrogen inputs), vulnerable soil conditions (low soil organic carbon and high hydraulic conductivity), and transport mechanisms (high aquifer recharge). These findings add to the body of literature seeking to identify communities at risk of elevated nitrate and study associated adverse health outcomes.
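One common way to turn a kriging result into a probability of exceeding a threshold is to treat the kriging prediction and its variance as the mean and variance of a normal distribution. The abstract does not say how the exceedance probabilities were computed (indicator kriging is another option), so the Gaussian assumption, the function name, and the example numbers below are all assumptions.

```python
from math import erf, sqrt

def exceedance_probability(kriged_mean, kriging_variance, threshold):
    """P(Z > threshold) assuming Z is normally distributed with the
    kriged mean and kriging variance at that location."""
    if kriging_variance <= 0:
        # No uncertainty: the estimate is either above the threshold or not.
        return 1.0 if kriged_mean > threshold else 0.0
    z = (threshold - kriged_mean) / sqrt(kriging_variance)
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

# Hypothetical location: kriged nitrate 3.0 mg/L NO3-N with variance 1.0,
# tested against the 2 mg/L NO3-N threshold named in the abstract.
p_above_2 = exceedance_probability(3.0, 1.0, 2.0)
p_at_mean = exceedance_probability(2.0, 1.0, 2.0)
```

Probabilities computed per location in this way could then be averaged within each small watershed, which matches the summarizing step described in the abstract.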
Project description:Groundwater is a critical resource in India for the supply of drinking water and for irrigation. Its usage is limited not only by its quantity but also by its quality. Among the most important contaminants of groundwater in India is arsenic, which naturally accumulates in some aquifers. In this study we create a random forest model with over 145,000 arsenic concentration measurements and over two dozen predictor variables of surface environmental parameters to produce hazard and exposure maps of the areas and populations potentially exposed to high arsenic concentrations (>10 µg/L) in groundwater. Statistical relationships found between the predictor variables and arsenic measurements are broadly consistent with major geochemical processes known to mobilize arsenic in aquifers. In addition to known high arsenic areas, such as along the Ganges and Brahmaputra rivers, we have identified several other areas around the country that have hitherto not been identified as potential arsenic hotspots. Based on recent reported rates of household groundwater use for rural and urban areas, we estimate that between about 18-30 million people in India are currently at risk of high exposure to arsenic through their drinking water supply. The hazard models here can be used to inform prioritization of groundwater quality testing and environmental public health tracking programs.
Project description:The European Community asks its Member States to provide a comprehensive and coherent overview of their groundwater chemical status. It is stated that simple conceptual models are necessary to allow assessments of the risks of failing to meet quality objectives. In The Netherlands, two monitoring networks (one for agriculture and one for nature) are operational, providing results which can be used for an overview. Two regression models, based upon simple conceptual models, link measured nitrate concentrations to data from remote sensing images of land use, the national forest inventory, the national cattle inventory, fertiliser use statistics, atmospheric N deposition, soil maps and weather monitoring. The models are used to draw a nitrate leaching map and to estimate the size of the area exceeding the EU limit value in the early 1990s. The 95% confidence interval for the fraction of nature and agricultural areas where the EU limit value for nitrate (50 mg/l) was exceeded was 0.77-0.85, while the lower 97.5% confidence limit for the fraction of agricultural area where the EU limit value was exceeded was 0.94. Although the two conceptual models can be regarded as simple, using the models to give an overview proved complex in practice.
Project description:Throughout the world, nitrogen (N) losses from intensive agricultural production may end up as undesirably high concentrations of nitrate in groundwater with a long-term impact on groundwater quality. This has human and environmental health consequences, due to the use of groundwater as a drinking water resource, and causes eutrophication of groundwater-dependent ecosystems such as wetlands, rivers and near-coastal areas. At national scale, the measured nitrate concentrations and trends in Danish oxic groundwater in the last 70 years correlate well with the annual agricultural N surpluses. We also show that the N use efficiency of agriculture is related to the groundwater nitrate concentrations. We demonstrate an inverted U-shape of annual nitrate concentrations as a function of economic growth from 1948 to 2014. Our analyses evidence a clear trend of a reversal at the beginning of the 1980s towards a more sustainable agricultural N management. This appears to be primarily driven by societal demand for groundwater protection linked to economic prosperity and an increased environmental awareness. However, the environmental and human health thresholds are still exceeded in many locations. Groundwater protection is of fundamental global importance, and this calls for further development of environmentally and economically sustainable N management in agriculture worldwide.