Project description:Much ecological research relies on existing multispecies distribution datasets. Such datasets, however, can vary considerably in quality, extent, resolution or taxonomic coverage. We provide a framework for a spatially explicit evaluation of geographical representation within large-scale species distribution datasets, using the comparison of an occurrence atlas with a range atlas dataset as a working example. Specifically, we compared occurrence maps for 3773 taxa from the widely used Atlas Florae Europaeae (AFE) with digitised range maps for 2049 taxa of the lesser-known Atlas of North European Vascular Plants. We calculated the level of agreement at a 50-km spatial resolution using average latitudinal and longitudinal species range, and area of occupancy. Agreement in species distribution was calculated and mapped using the Jaccard similarity index and a reduced major axis (RMA) regression analysis of species richness between the entire atlases (5221 taxa in total) and between co-occurring species (601 taxa). We found no difference in distribution ranges or in the area of occupancy frequency distribution, indicating that the atlases overlapped sufficiently for a valid comparison. The similarity index map showed high levels of agreement for central, western, and northern Europe. The RMA regression confirmed that geographical representation of AFE was low in areas with a sparse data recording history (e.g., Russia, Belarus and Ukraine). For co-occurring species in south-eastern Europe, however, the Atlas of North European Vascular Plants showed remarkably higher richness estimates. Geographical representation of atlas data can be much more heterogeneous than often assumed. Level of agreement between datasets can be used to evaluate geographical representation within datasets. Merging atlases into a single dataset is worthwhile in spite of methodological differences, and helps to fill gaps in our knowledge of species distribution ranges.
Species distribution dataset mergers, such as the one exemplified here, can serve as a baseline towards comprehensive species distribution datasets.
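The two agreement measures named in this abstract can be illustrated with a minimal sketch. It assumes each grid cell is represented as a set of species names and that richness is a per-cell species count; the function names, the species lists and the richness values are hypothetical and are not taken from either atlas. The RMA slope uses the standard textbook form, sign(r) * sd(y) / sd(x), which may differ in detail from the authors' implementation.

```python
from statistics import mean, stdev


def jaccard(a, b):
    """Jaccard similarity of two species sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty cells agree fully.
        return 1.0
    return len(a & b) / len(union)


def rma_regression(x, y):
    """Reduced major axis fit; returns (slope, intercept, r)."""
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    r = cov / (sx * sy)
    # RMA slope: ratio of standard deviations, signed by the correlation.
    slope = (1 if r >= 0 else -1) * sy / sx
    intercept = my - slope * mx
    return slope, intercept, r


# Hypothetical species lists for the same 50-km cell in two atlases.
cell_atlas_a = {"Betula nana", "Salix herbacea", "Dryas octopetala"}
cell_atlas_b = {"Betula nana", "Salix herbacea", "Carex bigelowii"}
print(jaccard(cell_atlas_a, cell_atlas_b))  # 2 shared of 4 in total: 0.5

# Hypothetical per-cell species richness in the two atlases.
richness_a = [10, 20, 30, 40]
richness_b = [12, 22, 32, 42]
slope, intercept, r = rma_regression(richness_a, richness_b)
print(slope, intercept, r)
```

Unlike ordinary least squares, the RMA fit treats both variables as measured with error, which suits a comparison where neither atlas is the reference.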
Project description:The mobilization of large-scale datasets of specimen images and metadata through herbarium digitization provides a rich environment for the application and development of machine learning techniques. However, limited access to computational resources and uneven progress in digitization, especially for small herbaria, still present barriers to the wide adoption of these new technologies. Using deep learning to extract representations of herbarium specimens useful for a wide variety of applications, so-called "representation learning," could help remove these barriers. Despite its recent popularity for camera trap and natural world images, representation learning is not yet as popular for herbarium specimen images. We investigated the potential of representation learning with specimen images by building three neural networks using a publicly available dataset of over 2 million specimen images spanning multiple continents and institutions. We compared the extracted representations and tested their performance in application tasks relevant to research carried out with herbarium specimens. We found that a triplet network, a type of neural network that learns distances between images, produced the representations that transferred best across all applications investigated. Our results demonstrate that it is possible to learn representations of specimen images useful in different applications, and we identify some further steps that we believe are necessary for representation learning to harness the rich information held in the world's herbaria.
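The idea of a network that "learns distances between images" is commonly trained with a triplet loss. The sketch below shows the standard margin-based form on plain Euclidean distances; it is an illustration only, since the abstract does not state which loss variant, distance or margin the authors used. The embedding vectors and the margin value are hypothetical.

```python
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D embeddings of three specimen images.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # assumed to share the anchor's label
negative = [3.0, 4.0]   # assumed to carry a different label

print(triplet_loss(anchor, positive, negative))  # 1 - 5 + 1 < 0, so 0.0
print(triplet_loss(anchor, negative, positive))  # 5 - 1 + 1 = 5.0
```

Some formulations use squared distances or a cosine distance instead; the structure of the loss stays the same.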
Project description:The refuge strategy is used worldwide to delay the evolution of pest resistance to insecticides that are either sprayed or produced by transgenic Bacillus thuringiensis (Bt) crops. This strategy is based on the idea that refuges of host plants where pests are not exposed to an insecticide promote survival of susceptible pests. Despite widespread adoption of this approach, large-scale tests of the refuge strategy have been problematic. Here we tested the refuge strategy with 8 y of data on refuges and resistance to the insecticide pyriproxyfen in 84 populations of the sweetpotato whitefly (Bemisia tabaci) from cotton fields in central Arizona. We found that spatial variation in resistance to pyriproxyfen within each year was not affected by refuges of melons or alfalfa near cotton fields. However, resistance was negatively associated with the area of cotton refuges and positively associated with the area of cotton treated with pyriproxyfen. A statistical model based on the first 4 y of data, incorporating the spatial distribution of cotton treated and not treated with pyriproxyfen, adequately predicted the spatial variation in resistance observed in the last 4 y of the study, confirming that cotton refuges delayed resistance and treated cotton fields accelerated resistance. By providing a systematic assessment of the effectiveness of refuges and the scale of their effects, the spatially explicit approach applied here could be useful for testing and improving the refuge strategy in other crop-pest systems.
Project description:Conservation agriculture (CA) is widely promoted as a sustainable agricultural management strategy with the potential to alleviate some of the adverse effects of modern, industrial agriculture such as large-scale soil erosion, nutrient leaching and overexploitation of water resources. Moreover, agricultural land managed under CA is proposed to contribute to climate change mitigation and adaptation through reduced emission of greenhouse gases, increased solar radiation reflection, and the sustainable use of soil and water resources. Due to the lack of official reporting schemes, the amount of agricultural land managed under CA systems is uncertain, and the spatially explicit information about the distribution of CA that various modeling studies require is missing. Here, we present an approach to downscale present-day national-level estimates of CA to a 5-arcminute regular grid, based on multicriteria analysis. We provide a best estimate of CA distribution and an uncertainty range in the form of a low and high estimate of CA distribution, reflecting the inconsistency in CA definitions. We also design two scenarios of the potential future development of CA combining present-day data and an assessment of the potential for implementation using biophysical and socioeconomic factors. By our estimates, 122-215 Mha or 9%-15% of global arable land is currently managed under CA systems. The lower end of the range represents CA as an integrated system of permanent no-tillage, crop residue management and crop rotations, while the high estimate includes a wider range of areas primarily devoted to temporary no-tillage or reduced tillage operations. Our scenario analysis suggests a future potential of CA in the range of 533-1130 Mha (38%-81% of global arable land). Our estimates can be used in various ecosystem modeling applications and are expected to help identify more realistic climate mitigation and adaptation potentials of agricultural practices.
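Downscaling a national total to grid cells can be pictured as a weighted allocation. The sketch below is a deliberately simplified illustration and not the authors' multicriteria procedure: it assumes each cell has a cropland area and a single suitability score in [0, 1], and it distributes the national CA area in proportion to their product. The function name, the inputs and the weighting rule are all hypothetical.

```python
def downscale(national_total, cropland, score):
    """Allocate a national CA area to cells in proportion to
    cropland area times a suitability score (assumed weighting rule)."""
    weights = [area * s for area, s in zip(cropland, score)]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("no cell has both cropland and a positive score")
    return [national_total * w / total_weight for w in weights]


# Hypothetical country with three grid cells (areas in kha).
cropland = [100.0, 50.0, 50.0]
score = [0.5, 1.0, 0.0]          # hypothetical suitability scores
allocated = downscale(60.0, cropland, score)
print(allocated)                 # weights 50, 50, 0 give [30.0, 30.0, 0.0]
```

This sketch conserves the national total but does not cap a cell's allocation at its cropland area; a fuller implementation would need to redistribute any excess.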
Project description:The conservation of large carnivores often requires precise and accurate estimates of their populations. Because large carnivores are cryptic and occur at low population densities, obtaining unbiased population estimates for them is difficult. To overcome the uncertainties in the conventional capture-recapture (CR) methods used to estimate large carnivore densities, more robust methods such as the spatially explicit capture-recapture (SECR) framework are now widely used. We modeled the CR data of tiger (Panthera tigris tigris) and leopard (Panthera pardus fusca) in the SECR framework with biotic and abiotic covariates believed to influence their densities. An effort of 2,211 trap nights resulted in the capture of 33 individual tigers and 38 individual leopards. A total of 95 tiger detections and 74 leopard detections were obtained using 35 pairs of camera traps. Tiger and leopard densities were estimated at 4.71 ± 1.20 (3.05-5.11) and 3.03 ± 0.78 (1.85-4.99) per 100 km², respectively. Our results show that leopard density increased with road density, terrain ruggedness, and the percentage of cropland and natural vegetation in the habitat. Tiger density was positively influenced by the mosaic of cropland and natural vegetation. This study provides the first robust density estimates of tiger and leopard within the study area. Our results support the notion that large carnivores can attain moderate densities within human-dominated regions around protected areas by relying on domestic livestock. Broader management strategies aimed at maintaining wild prey in the human-dominated areas around protected areas are necessary to sustain large and endangered carnivores in these buffer zones.
Project description:We present a new hierarchical Bayesian method using multilocus genotypes to estimate recent seed and pollen migration rates in a spatially explicit framework that incorporates distance effects separately for each type of dispersal. The method additionally estimates population allelic frequencies, population divergence values, individual inbreeding coefficients, individual maternal and paternal ancestries, and allelic dropout rates. We conduct a numerical simulation analysis that indicates that the method can provide reliable estimates of seed and pollen migration rates and allow accurate inference of spatial effects on migration, at affordable sample sizes (25-50 individuals/population) when population genetic divergence is not low (FST≥0.05), or by increasing sampling (to at least 100 individuals/population) under weaker levels of divergence (FST=0.025). Simulations also show that the accuracy provided by assays with about one thousand unlinked polymorphic SNP loci may approach, for a given sample size, the theoretical maximum achievable under categorical origin discrimination. We apply our method to Taxus baccata data, revealing low but significant seed and pollen migration among nearby population remnants during the last generation, with a negative effect of interpopulation distance on migration that was detectable for pollen but not for seeds.
Project description:The retrieval of ancient DNA from osteological material provides direct evidence of human genetic diversity in the past. Ancient DNA samples are often used to investigate whether there was population continuity in the settlement history of an area. Methods based on the serial coalescent algorithm have been developed to test whether the population continuity hypothesis can be statistically rejected by analysing DNA samples from the same region but of different ages. Rejection of this hypothesis is indicative of a large genetic shift, possibly due to immigration occurring between the two sampling times. However, this approach is only able to reject a model of full continuity (a total absence of genetic input from outside), whereas admixture between local and immigrant populations may lead to partial continuity. We have recently developed a method to test for population continuity that explicitly considers the spatial and temporal dynamics of populations. Here, we extended this approach to estimate the proportion of genetic continuity between two populations, using ancient genetic samples. We applied our original approach to the question of the Neolithic transition in Central Europe. Our results confirmed the rejection of full continuity, but our approach represents an important step forward by estimating the relative contribution of immigrant farmers and of local hunter-gatherers to the final Central European Neolithic genetic pool. Furthermore, we show that a substantial proportion of genes brought by the farmers in this region were assimilated from other hunter-gatherer populations along the way from Anatolia, which was not detectable by previous continuity tests. Our approach is also able to jointly estimate demographic parameters, as we show here by finding both low density and low migration rate for pre-Neolithic hunter-gatherers.
It provides a useful tool for the analysis of the numerous ancient DNA data sets that are currently being produced for many different species.
Project description:We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on functions defined on higher dimensional domains such as images. Through simulation studies, we demonstrate that accounting for the spatial correlation in our modeling leads to improved functional regression performance. Applied to a high-throughput spatially correlated copy number dataset, the model identifies genetic markers not identified by comparable methods that ignore spatial correlations.
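As background for the CAR building block, the sketch below constructs the precision matrix of a standard (scalar, "proper") CAR model on a small lattice, Q = tau * (D - rho * W), and the implied conditional mean of one unit given its neighbours. This is the textbook scalar model only and not the functional, nonseparable construction described in the abstract; the adjacency matrix, rho, tau and the observed values are hypothetical.

```python
def car_precision(adjacency, rho=0.9, tau=1.0):
    """Precision matrix Q = tau * (D - rho * W) of a proper CAR model.

    W is the symmetric adjacency (weight) matrix and D is diagonal
    with the row sums of W.
    """
    n = len(adjacency)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                q[i][j] = tau * sum(adjacency[i])
            elif adjacency[i][j]:
                q[i][j] = -tau * rho * adjacency[i][j]
    return q


def conditional_mean(adjacency, values, i, rho=0.9):
    """E[x_i | rest] = rho * weighted mean of the neighbours of i.

    Assumes unit i has at least one neighbour.
    """
    total_weight = sum(adjacency[i])
    weighted_sum = sum(w * values[j] for j, w in enumerate(adjacency[i]))
    return rho * weighted_sum / total_weight


# Hypothetical lattice: three areal units in a row (1 - 2 - 3).
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

for row in car_precision(W, rho=0.5, tau=2.0):
    print(row)

# Middle unit given hypothetical neighbour values 2.0 and 4.0.
print(conditional_mean(W, [2.0, 0.0, 4.0], 1, rho=0.5))  # 0.5 * 6 / 2 = 1.5
```

In the scalar model a single rho governs the whole lattice; the abstract's construction instead lets the spatial covariance parameters vary across the functional domain.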