Project description:Cough assessment is central to the clinical management of respiratory diseases, including tuberculosis (TB), but strategies to objectively and unobtrusively measure cough are lacking. Acoustic epidemiology is an emerging field that uses technology to detect cough sounds and analyze cough patterns to improve health outcomes among people with respiratory conditions linked to cough. This field is increasingly exploring the potential of artificial intelligence (AI) for more advanced applications, such as analyzing cough sounds as a biomarker for disease screening. While much of the data are preliminary, objective cough assessment could potentially transform disease control programs, including TB, and support individual patient management. Here, we present an overview of recent advances in this field and describe how cough assessment, if validated, could support public health programs at various stages of the TB care cascade.
Project description:A new item response theory model for count data is introduced. In contrast to models in common use, it does not assume a fixed distribution for the responses as, for example, the Poisson count model and its extensions do. The distribution of responses is determined by difficulty functions, which reflect the characteristics of items in a flexible way. Sparse parameterizations are obtained by choosing fixed parametric difficulty functions; more general versions use an approximation by basis functions. The model can be seen as constructed from binary response models such as the Rasch model or the normal-ogive model, to which it reduces if responses are dichotomized. It is demonstrated that the model competes well with advanced count data models. Simulations demonstrate that parameters and response distributions are recovered well. An application shows the flexibility of the model to account for strongly varying distributions of responses.
Project description:Studies in ecology often describe observed variation in an ecological phenomenon using environmental explanatory variables. A common problem is that the numerical nature of the ecological phenomenon does not always fit the assumptions underlying traditional statistical tests. A textbook example comes from pollination ecology, where flower visits are normally reported as frequencies: number of visits per flower per unit time. Using visitation frequencies in statistical analyses comes with two major caveats: the error distribution of the frequency is unknown, and the frequency does not include all the information found in the data, since 10 visits to 20 flowers are treated the same as 100 visits to 200 flowers. We simulated datasets with various "flower visitation distributions" over various numbers of flowers observed (exposure) and with different types of effects inducing variation in the data. The different datasets were then analyzed first with the traditional approach using number of visits per flower and then by using count data models. The analysis of count data gave a much better chance of detecting effects than the traditionally used frequency approach. We conclude that if the data structure, statistical analyses and interpretations of results are mixed up, valuable information can be lost.
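The point about 10 visits in 20 flowers versus 100 visits in 200 flowers can be made concrete with a minimal sketch. It assumes, purely for illustration, that total visits follow a Poisson distribution with a constant per-flower rate; the function name `rate_and_se` is hypothetical and does not come from the study.

```python
import math


def rate_and_se(visits, flowers):
    """Return the per-flower visitation rate and its standard error.

    Assumption (illustrative only): total visits ~ Poisson(flowers * rate),
    so the estimated standard error of visits / flowers is sqrt(visits) / flowers.
    """
    rate = visits / flowers
    se = math.sqrt(visits) / flowers
    return rate, se


# Same frequency, very different amounts of information.
small = rate_and_se(10, 20)
large = rate_and_se(100, 200)

print(f"10 visits / 20 flowers:   rate={small[0]:.2f}, se={small[1]:.3f}")
print(f"100 visits / 200 flowers: rate={large[0]:.2f}, se={large[1]:.3f}")
```

Both samples give the same frequency, but the larger one pins it down far more precisely. A count model that carries the exposure (number of flowers) keeps that difference, whereas an analysis of the bare frequency discards it.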
Project description:Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
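The restrictive mean-variance structure mentioned above can be illustrated with the standard moments of the multinomial distribution. This is a sketch of a textbook formula, not code from the article: once the total count and the category probabilities are fixed, every variance and covariance is determined, and all covariances between categories are negative. The helper name `multinomial_moments` is hypothetical.

```python
def multinomial_moments(n, probs):
    """Means and covariance matrix of a multinomial(n, probs) count vector.

    Var(Y_j) = n * p_j * (1 - p_j) and Cov(Y_j, Y_k) = -n * p_j * p_k,
    so the mean vector alone fixes the whole covariance structure.
    """
    means = [n * p for p in probs]
    cov = [
        [n * pj * (1 - pj) if j == k else -n * pj * pk
         for k, pk in enumerate(probs)]
        for j, pj in enumerate(probs)
    ]
    return means, cov


means, cov = multinomial_moments(100, [0.5, 0.3, 0.2])
print("means:", means)
for row in cov:
    print(row)
```

Over-dispersed counts, or counts that are positively correlated across categories, cannot be represented by this structure, which is the motivation for more flexible models.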
Project description:We present a novel decomposition of nonnegative functional count data that draws on concepts from nonnegative matrix factorization. Our decomposition, which we refer to as NARFD (nonnegative and regularized function decomposition), enables the study of patterns in variation across subjects in a highly interpretable manner. Prototypic modes of variation are estimated directly on the observed scale of the data, are local, and are transparently added together to reconstruct observed functions. This contrasts with generalized functional principal component analysis, an alternative approach that estimates functional principal components on a transformed scale, produces components that typically vary across the entire functional domain, and reconstructs observations using complex patterns of cancellation and multiplication of functional principal components. NARFD is implemented using an alternating minimization algorithm, and we evaluate our approach in simulations. We apply NARFD to an accelerometer dataset comprising observations of physical activity for healthy older Americans.
Project description:Count data, or number of events per time interval, are discrete data arising from repeated time to event observations. Their mean count, or piecewise constant event rate, can be evaluated by discrete probability distributions from the Poisson model family. Clinical trial data characterization often involves population count analysis. This tutorial presents the basics and diagnostics of count modeling and simulation in the context of pharmacometrics. Consideration is given to overdispersion, underdispersion, autocorrelation, and inhomogeneity.
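As a rough illustration of the overdispersion and underdispersion diagnostics mentioned above, the sketch below computes the variance-to-mean ratio of a set of counts, which equals 1 in expectation under a Poisson model. The function name and the example counts are made up for illustration and are not taken from the tutorial.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean.

    Values well above 1 suggest overdispersion relative to Poisson;
    values well below 1 suggest underdispersion.
    """
    return variance(counts) / mean(counts)


# Hypothetical event counts per time interval (illustrative data only).
bursty = [0, 1, 0, 7, 2, 0, 9, 1, 0, 5]
regular = [2, 2, 3, 2, 3, 2, 3, 3]

print(f"bursty:  {dispersion_index(bursty):.2f}")
print(f"regular: {dispersion_index(regular):.2f}")
```

The ratio is only a first screening step; it does not by itself distinguish overdispersion from autocorrelation or a time-varying event rate.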
Project description:Statistical analyses are an integral component of scientific research, and for decades, biologists have applied transformations to data to meet the normal error assumptions for F and t tests. Over the years, there has been a movement from data transformation toward model reformation, that is, the use of non-normal error structures within the framework of the generalized linear model (GLM). The principal advantage of model reformation is that parameters are estimated on the original, rather than the transformed, scale. However, data transformation has been shown to give better control over type I error for simulated data with known error structures. We conducted a literature review of statistical textbooks directed toward biologists and of journal articles published in the primary literature to determine temporal trends in both the text recommendations and the practice in the refereed literature over the past 35 years. In this review, a trend of increasing use of reformation in the primary literature was evident, moving from no use of reformation before 1996 to >50% of the articles reviewed applying GLM after 2006. However, no such trend was observed in the recommendations in statistical textbooks. We then undertook 12 analyses based on published datasets in which we compared the type I error estimates, residual plot diagnostics, and coefficients yielded by analyses using square root transformations, log transformations, and the GLM. All analyses yielded acceptable residual versus fit plots and had similar p-values within each analysis, but as expected, the coefficient estimates differed substantially. Furthermore, no consensus could be found in the literature regarding a procedure to back-transform the coefficient estimates obtained from linear models performed on transformed datasets. This lack of consistency among coefficient estimates constitutes a major argument for model reformation over data transformation in biology.
Project description:Introduction: Physicians need to be able to communicate the myriad of management options clearly to patients and engage them in their health care decisions, even in the fast-paced environment of the emergency department. Shared decision making (SDM) is an effective communication strategy for physicians to share diagnostic uncertainty, avoid potentially harmful tests, and solicit patients' preferences for their care. Role-playing with just-in-time feedback is an effective method to learn and practice SDM before having these conversations with patients. Methods: This flipped classroom workshop featured precourse materials and an in-class session incorporating a short lecture outlining a framework for SDM, followed by role-playing through patient scenarios. Learners took turns playing the physician or patient role and received feedback on their communication skills while in the physician role. A faculty examiner subsequently assessed skill attainment using a simulated patient encounter and checklist of critical actions. Results: The workshop was an interactive and effective way to teach SDM to 28 PGY 1 and PGY 2 emergency medicine residents. Two months after attending the workshop, over 75% of the first-year residents were able to complete all the elements of the SDM process in a simulated patient encounter; four residents required no prompting by the examiner. Discussion: A communications workshop that incorporates role-playing with different patient encounters is an interactive way to teach SDM for the emergency setting. Residents early in their clinical training can benefit from learning and practicing SDM in a simulated setting.
Project description:Nomadic pastoralists are among the world's hardest-to-reach and least served populations. Pastoralist communities are difficult to capture in household surveys because of factors including their high degree of mobility over remote terrain, fluid domestic arrangements, and cultural barriers. Most surveys use census-based sampling frames, which do not accurately capture the demographic and health parameters of nomadic populations. As a result, pastoralists are "invisible" in population data such as the Demographic and Health Surveys (DHS). By combining remote sensing and geospatial analysis, we developed a sampling strategy designed to capture the current distribution of nomadic populations. We then implemented this sampling frame to survey a population of mobile pastoralists in southwest Ethiopia, focusing on maternal and child health (MCH) indicators. Using standardized instruments from DHS questionnaires, we compared our results with regional and national DHS data and found disparities in core MCH indicators, including vaccination coverage, skilled birth attendance, and nutritional status. Our field validation demonstrates that this method is a logistically feasible alternative to conventional sampling frames and may be used at the population level. Geospatial sampling methods provide affordable and logistically feasible strategies for sampling mobile populations, a crucial first step toward reaching these groups with health services.
Project description:One of the most important geographical bottlenecks for migrating raptors in the east African-Palearctic migration system is situated between the easternmost tip of the Black Sea and the Lesser Caucasus, just north of Batumi, in the Republic of Georgia. Since 2008, citizen scientists of the Batumi Raptor Count (BRC) have monitored the autumn raptor passage daily from mid-August until mid-October, also collecting detailed information about the age and sex of focal species. The full BRC dataset was recently made available through the Global Biodiversity Information Facility (GBIF). Here we describe how count data were collected, managed, and processed for trend analysis over the past 10 years. This dataset offers a unique baseline for monitoring the state of migrant raptor populations in the east African-Palearctic flyway in the 21st century. We discuss potential pitfalls for users and hope that the open access publication of our data will stimulate flyway-scale and continent-wide collaboration for raptor migration monitoring in the Old World.