Project description:Reference sequence databases are critical to the accurate detection and identification of fungi in the environment. As repositories of large numbers of well-curated specimens, herbaria and fungal culture collections have the material resources to generate sequence data for large numbers of taxa, and could therefore help fill the taxonomic gaps often present in reference sequence databases. However, the financial resources to do so are often lacking, so recent efforts have focused on decreasing sequencing costs by increasing the number of multiplexed samples per sequencing run while maintaining high sequence quality. Following a previous study that decreased sequencing costs for lichen specimens by generating fungal ITS barcodes for 96 specimens with PacBio amplicon sequencing, we present a method that further decreases lichen specimen metabarcoding costs. A total of 384 mixed DNA extracts obtained from lichen herbarium specimens, mostly from the four genera Buellia, Catillaria, Endocarpon and Parmotrema, were used to generate new fungal ITS sequences on a Sequel I sequencing platform using the PacBio M13 barcoded primers. The average success rate across all taxa was high (86.5%), with particularly high rates for the crustose saxicolous taxa (Buellia, Catillaria and others; 93.3%) and the terricolous squamulose taxa (Endocarpon and others; 96.5%). By contrast, the success rate for the foliose genus Parmotrema was lower (60.4%). With this taxon sampling, greater specimen age did not appear to impact sequencing success; in fact, the 1966-1980 collection date category showed the highest success rate (97.3%). Compared to the previous study, the abundance-based sequence denoising method showed some limitations, but the cost of generating ITS barcodes was further decreased thanks to the higher multiplexing level. In addition to contributing new ITS barcodes for specimens of four lichen genera of interest, this study further highlights the potential and challenges of using new sequencing technologies on collection specimens to generate DNA sequences for reference databases.
Project description:Agriculture is one of the largest threats to global biodiversity. However, most studies have focused only on the direct effects of agriculture on biodiversity, and few have addressed the indirect effects, potentially over- or under-estimating the overall impacts of agriculture on biodiversity. The indirect effect is the response not to the agricultural cover types or operations per se, but rather to the way that agriculture influences the extent and configuration of different types of natural land cover in the landscape. We used structural equation modelling (SEM) to evaluate the direct, indirect, and total effects of agriculture on species richness of three bird guilds: forest birds, shrub-edge birds, and open country birds. We found that forest bird richness was driven by the negative indirect effect of cropland via forest loss. Shrub-edge and open country bird richness increased with the amount of agricultural land cover; however, importantly, we found negative indirect effects of agriculture on both guilds via a reduction in more natural land covers. This latter result highlights how we would have over-estimated the positive effects of agriculture on shrub-edge and open country bird richness had we not measured both direct and indirect effects (i.e., the total effect size is less than the direct effect size). Overall, our results suggest that a bird-friendly agricultural landscape in our region would have forest that is configured to maximize forest edge, and a high proportion of perennial forage within the agricultural portion of the landscape. Supplementary information: The online version contains supplementary material available at 10.1007/s10531-023-02559-1.
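The direct/indirect/total decomposition described above can be illustrated with a minimal piecewise path-analysis sketch (the linear core of SEM). The variable names and input file below are hypothetical placeholders, not the study's actual data or model specification.

```python
# Sketch of the direct/indirect/total effect decomposition performed by a
# linear path analysis. Column names (cropland, forest, forest_bird_richness)
# and the input file are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

landscapes = pd.read_csv("landscapes.csv")  # one row per landscape (hypothetical file)

# Path a: agriculture -> amount of natural (forest) cover
m_forest = smf.ols("forest ~ cropland", data=landscapes).fit()
a = m_forest.params["cropland"]

# Paths b and c': forest cover and agriculture -> bird richness
m_birds = smf.ols("forest_bird_richness ~ cropland + forest", data=landscapes).fit()
b = m_birds.params["forest"]
direct = m_birds.params["cropland"]   # c': direct effect of agriculture

indirect = a * b                      # effect routed through forest loss
total = direct + indirect             # total effect of agriculture

print(f"direct={direct:.3f}, indirect={indirect:.3f}, total={total:.3f}")
```

In a full SEM, the same decomposition is obtained by fitting all paths simultaneously and summing the products of coefficients along each causal route.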
Project description:Understanding the basis of brain function requires knowledge of cortical operations over wide spatial scales and the quantitative analysis of brain activity in well-defined brain regions. Matching an anatomical atlas to brain functional data requires substantial labor and expertise. Here, we developed an automated machine learning-based registration and segmentation approach for quantitative analysis of mouse mesoscale cortical images. A deep learning model identifies nine cortical landmarks using only a single raw fluorescent image. Another fully convolutional network was adapted to delimit brain boundaries. This anatomical alignment approach was extended by adding three functional alignment approaches that use sensory maps or spatial-temporal activity motifs. We present this methodology as MesoNet, a robust and user-friendly analysis pipeline using pre-trained models to segment brain regions as defined in the Allen Mouse Brain Atlas. This Python-based toolbox can also be combined with existing methods to facilitate high-throughput data analysis.
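As a rough, hypothetical sketch of the landmark-based alignment idea (not MesoNet's actual implementation, which is documented in its repository), the snippet below fits a 2D affine transform that maps landmarks predicted on a raw cortical image onto the corresponding atlas landmarks and then applies it to points of interest.

```python
# Least-squares affine registration from predicted landmarks to atlas landmarks.
# The coordinates below are made-up placeholders.
import numpy as np

def fit_affine(src, dst):
    """Fit a 2D affine transform mapping src (N,2) onto dst (N,2)."""
    n = src.shape[0]
    A = np.hstack([src, np.ones((n, 1))])               # (N,3): [x, y, 1]
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)     # (3,2) affine matrix
    return coeffs

def apply_affine(coeffs, pts):
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((pts.shape[0], 1))]) @ coeffs

# Hypothetical example: four of the nine landmarks, in image vs. atlas coordinates
predicted = np.array([[102.0, 88.0], [240.0, 90.0], [98.0, 300.0], [245.0, 298.0]])
atlas     = np.array([[100.0, 90.0], [238.0, 92.0], [101.0, 297.0], [241.0, 299.0]])

T = fit_affine(predicted, atlas)
print(apply_affine(T, predicted))  # predicted landmarks mapped into atlas space
```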
Project description:When a dense suspension is squeezed from a nozzle, droplet detachment can occur similar to that of pure liquids. While in pure liquids the process of droplet detachment is well characterized through self-similar profiles and known scaling laws, we show here that the simple presence of particles causes suspensions to break up in a new fashion. Using high-speed imaging, we find that detachment of a suspension drop is described by a power law, with the neck minimum radius, r_m, scaling as a power of the time τ remaining before breakup. We demonstrate data collapse in a variety of particle/liquid combinations, packing fractions, solvent viscosities, and initial conditions. We argue that this scaling is a consequence of particles deforming the neck surface, thereby creating a pressure that is balanced by inertia, and show how it emerges from topological constraints that relate particle configurations with macroscopic Gaussian curvature. This new type of scaling, uniquely enforced by geometry and regulated by the particles, displays memory of its initial conditions, fails to be self-similar, and has implications for the pressure at generic suspension interfaces.
Project description:Statistical modeling is commonly used to relate the performance of potato (Solanum tuberosum L.) to fertilizer requirements. Prescribing optimal nutrient doses is challenging because many variables are involved, including weather, soils, land management, genotypes, and the severity of pests and diseases. Where sufficient data are available, machine learning algorithms can be used to predict crop performance. The objective of this study was to determine an optimal model for predicting the nitrogen, phosphorus and potassium requirements for high tuber yield and quality (size and specific gravity) as impacted by weather, soil and land management variables. We exploited a data set of 273 field experiments conducted from 1979 to 2017 in Quebec (Canada). We developed, evaluated and compared predictions from a hierarchical Mitscherlich model, k-nearest neighbors, random forest, neural networks and Gaussian processes. Machine learning models returned R2 values of 0.49-0.59 for marketable tuber yield prediction, higher than the Mitscherlich model R2 of 0.37. The models predicted medium-size tubers (R2 = 0.60-0.69) and tuber specific gravity (R2 = 0.58-0.67) more accurately than large-size tubers (R2 = 0.55-0.64) and marketable yield. Response surfaces from the Mitscherlich model, neural networks and Gaussian processes were smooth and agreed better with the available evidence than the discontinuous curves derived from the k-nearest neighbors and random forest models. When the models were conditioned on constant weather, soil and land management conditions to obtain optimal dosages from the dose-response surfaces, some disagreements occurred between them. Due to their built-in ability to develop recommendations within a probabilistic risk-assessment framework, Gaussian processes stood out as the most promising algorithm to support decisions that minimize economic or agronomic risks.
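A minimal sketch of the kind of Gaussian-process yield model compared above is shown below: it predicts marketable yield from N-P-K doses plus weather and soil covariates and returns a predictive standard deviation that supports probabilistic risk assessment. The feature names and input file are hypothetical placeholders.

```python
# Gaussian-process regression sketch for fertilizer dose-response prediction.
# Column names and the CSV file are hypothetical, not the study's data set.
import pandas as pd
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

trials = pd.read_csv("potato_trials.csv")          # one row per plot-year
features = ["dose_N", "dose_P", "dose_K", "precip", "gdd", "clay", "soc"]
X, y = trials[features].values, trials["marketable_yield"].values

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)

kernel = ConstantKernel() * RBF(length_scale=[1.0] * len(features)) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(scaler.transform(X_train), y_train)

# Predictive mean and standard deviation enable risk-aware dose recommendations
mean, std = gp.predict(scaler.transform(X_test), return_std=True)
print("R2 on held-out plots:", gp.score(scaler.transform(X_test), y_test))
```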
Project description:Numerosity, the set size of a group of items, helps guide behaviour and decisions. Non-symbolic numerosities are represented by the approximate number system (ANS). However, distinct behavioural performance suggests that small numerosities, i.e. those in the subitizing range, are implemented differently in the brain than larger numerosities. Prior work has shown that neural populations responding selectively to small numerosities (as measured by hemodynamic responses) are organized into a network of topographical maps. Here, we investigate how neural populations respond to large numerosities, well into the ANS range. Using 7 T fMRI and biologically-inspired analyses, we found a network of neural populations tuned to both small and large numerosities, organized within the same topographic maps. These results demonstrate a continuum of numerosity preferences that progressively covers both the subitizing range and beyond within the same numerosity map, suggesting a single neural mechanism. We hypothesize that differences in map properties, such as cortical magnification and tuning width, underlie known differences in behaviour.
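One common way to formalize numerosity tuning is sketched below, under the assumption of a log-Gaussian tuning model (not necessarily the study's exact pipeline): each recording site's response is modelled as a Gaussian function of the logarithm of numerosity, and its preferred numerosity and tuning width are estimated by nonlinear least squares. The response values are synthetic placeholders.

```python
# Fit a log-Gaussian numerosity tuning curve to (synthetic) response data.
import numpy as np
from scipy.optimize import curve_fit

def log_gaussian_tuning(n, amplitude, preferred, width, baseline):
    """Gaussian tuning over log(numerosity)."""
    return amplitude * np.exp(-0.5 * ((np.log(n) - np.log(preferred)) / width) ** 2) + baseline

numerosity = np.array([1, 2, 3, 4, 5, 6, 7, 10, 15, 20, 30, 50], dtype=float)
response = np.array([0.2, 0.5, 0.9, 1.2, 1.1, 0.9, 0.7, 0.5, 0.35, 0.3, 0.25, 0.2])

params, _ = curve_fit(
    log_gaussian_tuning, numerosity, response,
    p0=(1.0, 4.0, 0.5, 0.2), maxfev=10000,
)
print(f"preferred numerosity = {params[1]:.1f}, tuning width (log units) = {params[2]:.2f}")
```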
Project description:Land Use and Land Cover (LULC) maps are important tools for environmental planning and social-ecological modeling, as they provide critical information for evaluating risks, managing natural resources, and facilitating effective decision-making. This study aimed to generate a very high spatial resolution (0.5 m) and detailed (21 classes) LULC map for the greater Mariño watershed (Peru) in 2019, using the MORINGA processing chain. This new method for LULC mapping consisted of a supervised object-based LULC classification using the random forest algorithm, along with multi-sensor satellite imagery from which spectral and textural predictors were derived (a very high spatial resolution Pléiades image and a time series of high spatial resolution Sentinel-2 images). The random forest classifier showed very good performance, and the LULC map was further improved through additional post-treatment steps that included cross-checking against external GIS data sources and manual correction by photointerpretation, resulting in a more accurate and reliable map. The final LULC map provides new information for environmental management and monitoring in the greater Mariño watershed. With this study we contribute to efforts to develop standardized and replicable methodologies for high-resolution and high-accuracy LULC mapping, which is crucial for informed decision-making and conservation strategies.
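For illustration, a minimal version of the supervised classification step might look like the sketch below: a random forest trained on per-object spectral and textural predictors and evaluated on held-out reference objects. The column names and input file are hypothetical stand-ins for the features produced by the processing chain.

```python
# Object-based random forest classification sketch on hypothetical features.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, cohen_kappa_score
from sklearn.model_selection import train_test_split

segments = pd.read_csv("segment_features.csv")     # one row per image object
predictors = [c for c in segments.columns if c not in ("segment_id", "lulc_class")]

X_train, X_test, y_train, y_test = train_test_split(
    segments[predictors], segments["lulc_class"], stratify=segments["lulc_class"],
    test_size=0.3, random_state=42,
)

rf = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=42)
rf.fit(X_train, y_train)

pred = rf.predict(X_test)
print(classification_report(y_test, pred))
print("kappa:", cohen_kappa_score(y_test, pred))
```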
Project description:Despite decades of research, effective treatments for most cancers remain elusive. One reason is that different instances of cancer result from different combinations of multiple genetic mutations (hits). Therefore, treatments that may be effective in some cases are not effective in others. We previously developed an algorithm for identifying combinations of carcinogenic genes with mutations (multi-hit combinations), which could suggest a likely cause for individual instances of cancer. Most cancers are estimated to require three or more hits. However, the computational complexity of the algorithm scales exponentially with the number of hits, making it impractical for identifying combinations of more than two hits. To identify combinations of more than two hits, we used a compressed binary matrix representation and optimized the algorithm for parallel execution on an NVIDIA V100 graphics processing unit (GPU). With these enhancements, the optimized GPU implementation of the 3-hit algorithm was on average an estimated 12,144 times faster than the original integer-matrix-based CPU implementation, allowing us to identify 3-hit combinations. The 3-hit combinations identified using a training set were able to differentiate between tumor and normal samples in a separate test set with 90% overall sensitivity and 93% overall specificity. We illustrate how the distribution of mutations in tumor and normal samples across the multi-hit gene combinations can suggest potential driver mutations for further investigation. With experimental validation, these combinations may provide insight into the etiology of cancer and a rational basis for targeted combination therapy.
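The compressed binary-matrix idea can be sketched as follows: pack the gene-by-sample mutation matrix into bits so that the samples carrying mutations in all genes of a candidate 3-hit combination can be counted with bitwise AND plus popcount. This toy NumPy version only illustrates the data layout; the GPU implementation would perform the equivalent AND/popcount work in parallel kernels.

```python
# Bit-packed mutation matrix and popcount-based scoring of a 3-gene combination.
# The matrix here is random toy data, not the study's mutation calls.
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_samples = 1000, 512
mutations = rng.random((n_genes, n_samples)) < 0.05       # True = gene mutated in sample

packed = np.packbits(mutations, axis=1)                    # 1 bit per sample, 8x smaller

def samples_hit_by_combination(packed_rows):
    """Number of samples mutated in *all* genes of the combination."""
    combined = packed_rows[0]
    for row in packed_rows[1:]:
        combined = combined & row                          # bitwise intersection
    return int(np.unpackbits(combined).sum())              # popcount

# Score one hypothetical 3-hit combination (genes 10, 42, 97)
print(samples_hit_by_combination(packed[[10, 42, 97]]))
```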
Project description:The adoption of conservation agriculture methods, such as conservation tillage and cover cropping, is a viable alternative to conventional farming practices for improving soil health and reducing soil carbon losses. Despite their significance for mitigating climate change, very few studies have assessed the overall spatial distribution of cover crops and tillage practices in relation to farms' pedoclimatic and topographic characteristics. Hence, the primary objective of this study was to use multiple satellite-derived indices and environmental drivers to infer the level of tillage intensity and identify the presence of cover crops in eastern South Dakota (SD). We used a machine learning classifier trained with in situ field samples and environmental drivers acquired from different remote sensing datasets for 2022 and 2023 to map these conservation agriculture practices. Our classification accuracies (>80%) indicate that the employed satellite spectral indices and environmental variables could successfully detect the presence of cover crops and the tillage intensity in the study region. Our analysis revealed that 4% of the corn (Zea mays) and soybean (Glycine max) fields in eastern SD had a cover crop during either the fall of 2022 or the spring of 2023. We also found that environmental factors, specifically seasonal precipitation, growing degree days, and surface texture, significantly influenced the use of conservation practices. The methods developed through this research may provide a viable means of tracking and documenting farmers' agricultural management techniques. Our study contributes to developing a measurement, reporting, and verification (MRV) solution that could help monitor various climate-smart agricultural practices.
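As an example of the satellite-derived indices involved (a hedged sketch assuming Sentinel-2 shortwave-infrared bands, not necessarily the exact predictor set used here), the Normalized Difference Tillage Index (NDTI) contrasts the two SWIR bands; over bare fields, lower NDTI generally indicates less crop residue and hence more intensive tillage.

```python
# NDTI = (SWIR1 - SWIR2) / (SWIR1 + SWIR2), here from hypothetical per-field
# mean Sentinel-2 B11/B12 surface reflectances.
import numpy as np

def ndti(swir1, swir2):
    """Normalized Difference Tillage Index, NaN where the band sum is zero."""
    swir1 = np.asarray(swir1, dtype=float)
    swir2 = np.asarray(swir2, dtype=float)
    denom = swir1 + swir2
    return np.where(denom != 0, (swir1 - swir2) / denom, np.nan)

# Toy per-field mean reflectances (0-1 scale)
b11 = np.array([0.28, 0.31, 0.25])   # SWIR1
b12 = np.array([0.24, 0.22, 0.23])   # SWIR2
print(ndti(b11, b12))                 # higher values suggest more residue cover
```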
Project description:Metabolic rate, heart rate, lifespan, and many other physiological properties vary with body mass in systematic and interrelated ways. Present empirical data suggest that these scaling relationships take the form of power laws with exponents that are simple multiples of one quarter. A compelling explanation of this observation was put forward a decade ago by West, Brown, and Enquist (WBE). Their framework elucidates the link between metabolic rate and body mass by focusing on the dynamics and structure of resource distribution networks-the cardiovascular system in the case of mammals. Within this framework the WBE model is based on eight assumptions from which it derives the well-known observed scaling exponent of 3/4. In this paper we clarify that this result only holds in the limit of infinite network size (body mass) and that the actual exponent predicted by the model depends on the sizes of the organisms being studied. Failure to clarify and to explore the nature of this approximation has led to debates about the WBE model that were at cross purposes. We compute analytical expressions for the finite-size corrections to the 3/4 exponent, resulting in a spectrum of scaling exponents as a function of absolute network size. When accounting for these corrections over a size range spanning the eight orders of magnitude observed in mammals, the WBE model predicts a scaling exponent of 0.81, seemingly at odds with data. We then proceed to study the sensitivity of the scaling exponent with respect to variations in several assumptions that underlie the WBE model, always in the context of finite-size corrections. Here too, the trends we derive from the model seem at odds with trends detectable in empirical data. Our work illustrates the utility of the WBE framework in reasoning about allometric scaling, while at the same time suggesting that the current canonical model may need amendments to bring its predictions fully in line with available datasets.
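The dependence of the predicted exponent on network size can be illustrated with a toy calculation for the idealized WBE network (branching ratio n, area-preserving radii, space-filling lengths, invariant capillaries). This sketch ignores the pulsatile-to-Poiseuille flow crossover included in the paper's full finite-size treatment, so it does not reproduce the 0.81 figure; it only shows that the local exponent d ln B / d ln M reaches 3/4 only in the limit of many branching levels.

```python
# Toy finite-size calculation for the idealized WBE network:
# blood volume V_b = sum_k N_k * pi * r_k^2 * l_k with N_k = n^k vessels,
# r_k ~ n^{(N-k)/2}, l_k ~ n^{(N-k)/3} in capillary units; M ~ V_b, B ~ n^N.
import numpy as np

n = 2.0                                     # branching ratio
levels = np.arange(1, 31)                   # number of branching levels N

def blood_volume(N, n):
    k = np.arange(N + 1)
    return np.sum(n ** k * n ** (N - k) * n ** ((N - k) / 3.0))

mass = np.array([blood_volume(N, n) for N in levels])    # M proportional to V_b
metabolic_rate = n ** levels                              # B proportional to n^N

# Local scaling exponent between successive network sizes
exponent = np.diff(np.log(metabolic_rate)) / np.diff(np.log(mass))
for N, a in zip(levels[1:], exponent):
    if N in (5, 10, 20, 30):
        print(f"N = {N:2d}: effective exponent ~ {a:.3f}")   # approaches 0.75 only as N grows
```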