Project description:We propose a novel Bayesian methodology for analyzing nonstationary time series that exhibit oscillatory behavior. We approximate the time series using a piecewise oscillatory model with unknown periodicities, where our goal is to estimate the change-points while simultaneously identifying the potentially changing periodicities in the data. Our proposed methodology is based on a trans-dimensional Markov chain Monte Carlo algorithm that simultaneously updates the change-points and the periodicities relevant to any segment between them. We show that the proposed methodology successfully identifies time-changing oscillatory behavior in two applications relevant to e-Health and sleep research: the occurrence of ultradian oscillations in human skin temperature during night rest, and the detection of instances of sleep apnea in plethysmographic respiratory traces. Supplementary materials for this article are available online.
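To make the piecewise oscillatory model concrete, the following is a minimal sketch, not the trans-dimensional MCMC sampler described above. It simulates a series with one change-point and a different frequency in each segment, then recovers each segment's dominant frequency from a naive periodogram. The change-point is assumed known here, and all names (`simulate_piecewise_sinusoid`, `dominant_frequency`) and parameter values are illustrative assumptions.

```python
import math
import random


def simulate_piecewise_sinusoid(n, change_point, freqs, amps, sigma, seed=0):
    """Simulate a noisy sinusoid whose frequency and amplitude switch at change_point."""
    rng = random.Random(seed)
    series = []
    for t in range(n):
        seg = 0 if t < change_point else 1
        signal = amps[seg] * math.cos(2 * math.pi * freqs[seg] * t)
        series.append(signal + rng.gauss(0.0, sigma))
    return series


def dominant_frequency(segment):
    """Return the Fourier frequency (cycles per sample) with the largest periodogram value."""
    n = len(segment)
    mean = sum(segment) / n
    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k / n
        re = sum((x - mean) * math.cos(2 * math.pi * freq * t)
                 for t, x in enumerate(segment))
        im = sum((x - mean) * math.sin(2 * math.pi * freq * t)
                 for t, x in enumerate(segment))
        power = (re * re + im * im) / n
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq


# Hypothetical settings: 200 samples, change-point at t = 100,
# frequency 0.05 before the change and 0.2 after it.
series = simulate_piecewise_sinusoid(200, 100, (0.05, 0.2), (2.0, 1.5), 0.5)
freq_before = dominant_frequency(series[:100])
freq_after = dominant_frequency(series[100:])
print(freq_before, freq_after)
```

In the methodology described above, both the number and locations of the change-points and the segment frequencies are unknown and sampled jointly; this sketch only illustrates the model structure being fitted.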
Project description:The time-varying power spectrum of a time series process is a bivariate function that quantifies the magnitude of oscillations at different frequencies and times. To obtain low-dimensional, parsimonious measures from this functional parameter, applied researchers consider collapsed measures of power within local bands that partition the frequency space. Frequency bands commonly used in the scientific literature were historically derived, but they are not guaranteed to be optimal or justified for adequately summarizing information from a given time series process under current study. There is a dearth of methods for empirically constructing statistically optimal bands for a given signal. The goal of this article is to provide a standardized, unifying approach for deriving and analyzing customized frequency bands. A consistent, frequency-domain, iterative cumulative sum based scanning procedure is formulated to identify frequency bands that best preserve nonstationary information. A formal hypothesis testing procedure is also developed to test which, if any, frequency bands remain stationary. The proposed method is used to analyze heart rate variability of a patient during sleep and uncovers a refined partition of frequency bands that best summarize the time-varying power spectrum.
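The idea of a collapsed measure of power within frequency bands can be sketched as follows. The function name `collapse_to_bands`, the toy spectrum values, and the band edges (chosen to resemble conventional low- and high-frequency heart rate variability bands) are assumptions for illustration; the scanning procedure that selects the band edges from the data is not shown.

```python
def collapse_to_bands(tv_spectrum, freqs, band_edges):
    """Average power within each [lo, hi) band at every time point.

    tv_spectrum: list of rows, one per time point, each giving power at freqs.
    band_edges:  increasing list of edges that partition the frequency axis.
    """
    collapsed = []
    for row in tv_spectrum:
        band_means = []
        for lo, hi in zip(band_edges[:-1], band_edges[1:]):
            vals = [p for f, p in zip(freqs, row) if lo <= f < hi]
            band_means.append(sum(vals) / len(vals) if vals else float("nan"))
        collapsed.append(band_means)
    return collapsed


# Toy time-varying spectrum: two time points, four frequencies (illustrative values).
freqs = [0.05, 0.10, 0.20, 0.30]
tv_spectrum = [
    [4.0, 2.0, 1.0, 1.0],
    [6.0, 4.0, 1.0, 3.0],
]
band_edges = [0.04, 0.15, 0.40]  # assumed edges, giving two bands

band_power = collapse_to_bands(tv_spectrum, freqs, band_edges)
print(band_power)
```

Each row of `band_power` is a low-dimensional summary of the spectrum at one time point; how well it preserves the nonstationary information depends on where the band edges are placed.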
Project description:Many studies of biomedical time series signals aim to measure the association between frequency-domain properties of time series and clinical and behavioral covariates. However, the time-varying dynamics of these associations are largely ignored due to a lack of methods that can assess the changing nature of the relationship through time. This article introduces a method for the simultaneous and automatic analysis of the association between the time-varying power spectrum and covariates, which we refer to as conditional adaptive Bayesian spectrum analysis (CABS). The procedure adaptively partitions the grid of time and covariate values into an unknown number of approximately stationary blocks and nonparametrically estimates local spectra within blocks through penalized splines. CABS is formulated in a fully Bayesian framework, in which the number and locations of partition points are random, and fit using reversible jump Markov chain Monte Carlo techniques. Estimation and inference averaged over the distribution of partitions allow for the accurate analysis of spectra with both smooth and abrupt changes. The proposed methodology is used to analyze the association between the time-varying spectrum of heart rate variability and self-reported sleep quality in a study of older adults serving as the primary caregiver for their ill spouse.
Project description:This article introduces a nonparametric approach to spectral analysis of a high-dimensional multivariate nonstationary time series. The procedure is based on a novel frequency-domain factor model that provides a flexible yet parsimonious representation of spectral matrices from a large number of simultaneously observed time series. Real and imaginary parts of the factor loading matrices are modeled independently using a prior that is formulated from the tensor product of penalized splines and multiplicative gamma process shrinkage priors, allowing for infinitely many factors with loadings increasingly shrunk towards zero as the column index increases. Formulated in a fully Bayesian framework, the time series is adaptively partitioned into approximately stationary segments, where both the number and locations of partition points are assumed unknown. Stochastic approximation Monte Carlo (SAMC) techniques are used to accommodate the unknown number of segments, and a conditional Whittle likelihood-based Gibbs sampler is developed for efficient sampling within segments. By averaging over the distribution of partitions, the proposed method can approximate both abrupt and slowly varying changes in spectral matrices. Performance of the proposed model is evaluated by extensive simulations and demonstrated through the analysis of high-density electroencephalography.
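As a rough illustration of the Whittle likelihood mentioned above, the sketch below evaluates it for a single univariate stationary segment. This is a simplification: the method described works with spectral matrices of multivariate series, whereas this example uses a scalar spectrum. The function names, the periodogram normalization (squared DFT modulus divided by the segment length), and the simulated white-noise data are all assumptions.

```python
import cmath
import math
import random


def periodogram(x):
    """Periodogram at Fourier frequencies 2*pi*k/n, excluding frequencies 0 and pi."""
    n = len(x)
    mean = sum(x) / n
    values = []
    for k in range(1, (n - 1) // 2 + 1):
        omega = 2 * math.pi * k / n
        dft = sum((v - mean) * cmath.exp(-1j * omega * t) for t, v in enumerate(x))
        values.append(abs(dft) ** 2 / n)
    return values


def whittle_loglik(pgram, spec):
    """Whittle log-likelihood (up to a constant) of a candidate spectrum."""
    return -sum(math.log(f) + i / f for i, f in zip(pgram, spec))


# Simulated white noise segment (illustrative data only).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(64)]
pgram = periodogram(x)

# Among constant spectra, the Whittle likelihood peaks at the mean periodogram.
c_hat = sum(pgram) / len(pgram)
ll_hat = whittle_loglik(pgram, [c_hat] * len(pgram))
ll_double = whittle_loglik(pgram, [2 * c_hat] * len(pgram))
ll_half = whittle_loglik(pgram, [0.5 * c_hat] * len(pgram))
print(ll_hat, ll_double, ll_half)
```

Within a segment, a sampler would evaluate a likelihood of this form for spline-based candidate spectra rather than constants; the constant case is used here only because its maximizer is known in closed form.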
Project description:We present the AdaptSPEC-X method for the joint analysis of a panel of possibly nonstationary time series. The approach is Bayesian and uses a covariate-dependent infinite mixture model to incorporate multiple time series, with mixture components parameterized by a time-varying mean and log spectrum. The mixture components are based on AdaptSPEC, a nonparametric model which adaptively divides the time series into an unknown number of segments and estimates the local log spectra by smoothing splines. AdaptSPEC-X extends AdaptSPEC in three ways. First, through the infinite mixture, it applies to multiple time series linked by covariates. Second, it can handle missing values, a common feature of time series which can cause difficulties for nonparametric spectral methods. Third, it allows for a time-varying mean. Through these extensions, AdaptSPEC-X can estimate time-varying means and spectra at observed and unobserved covariate values, allowing for predictive inference. Estimation is performed by Markov chain Monte Carlo (MCMC) methods, combining data augmentation, reversible jump, and Riemann manifold Hamiltonian Monte Carlo techniques. We evaluate the methodology using simulated data, and describe applications to Australian rainfall data and measles incidence in the US. Software implementing the method proposed in this paper is available in the R package BayesSpec.
Project description:This article introduces a flexible nonparametric approach for analyzing the association between covariates and power spectra of multivariate time series observed across multiple subjects, which we refer to as multivariate conditional adaptive Bayesian power spectrum analysis (MultiCABS). The proposed procedure adaptively collects time series with similar covariate values into an unknown number of groups and nonparametrically estimates group-specific power spectra through penalized splines. A fully Bayesian framework is developed in which the number of groups and the covariate partition defining the groups are random and fit using Markov chain Monte Carlo techniques. MultiCABS offers accurate estimation and inference on power spectra of multivariate time series with both smooth and abrupt dynamics across covariates by averaging over the distribution of covariate partitions. Performance of the proposed method compared with existing methods is evaluated in simulation studies. The proposed methodology is used to analyze the association between fear of falling and power spectra of center-of-pressure trajectories of postural control while standing in people with Parkinson's disease.
Project description:The genetic architecture of adaptation in natural populations has not yet been resolved: it is not clear to what extent the spread of beneficial mutations (selective sweeps) or the response of many quantitative trait loci drive adaptation to environmental changes. Although much attention has been given to the genomic footprint of selective sweeps, the importance of selection on quantitative traits is still not well studied, as the associated genomic signature is extremely difficult to detect. We propose 'Evolve and Resequence' as a promising tool to study polygenic adaptation of quantitative traits in evolving populations. Simulating replicated time series data, we show that adaptation to a new intermediate trait optimum has three characteristic phases that are reflected on the genomic level: (1) directional frequency changes towards the new trait optimum, (2) plateauing of allele frequencies when the new trait optimum has been reached and (3) subsequent divergence between replicated trajectories ultimately leading to the loss or fixation of alleles while the trait value does not change. We explore these three phase characteristics for relevant population genetic parameters to provide expectations for various experimental evolution designs. Remarkably, over a broad range of parameters the trajectories of selected alleles display a pattern across replicates, which differs both from neutrality and directional selection. We conclude that replicated time series data from experimental evolution studies provide a promising framework to study polygenic adaptation from whole-genome population genetics data.
Project description:Background: Microarray time series studies are essential to understand the dynamics of molecular events. In order to limit the analysis to those genes that change expression over time, a first necessary step is to select differentially expressed transcripts. A variety of methods have been proposed for this purpose; however, these methods are seldom applicable in practice since they require a large number of replicates, often available only for a limited number of samples. In this data-poor context, we evaluate the performance of three selection methods, using synthetic data, over a range of experimental conditions. Application to real data is also discussed. Results: Three methods are considered to assess differentially expressed genes in data-poor conditions. Method 1 uses a threshold on individual samples based on a model of the experimental error. Method 2 calculates the area of the region bounded by the time series expression profiles, and considers the gene differentially expressed if the area exceeds a threshold based on a model of the experimental error. These two methods are compared to Method 3, recently proposed in the literature, which exploits splines fit to compare time series profiles. Application of the three methods to synthetic data indicates that Method 2 outperforms the other two both in Precision and Recall when short time series are analyzed, while Method 3 outperforms the other two for long time series. Conclusion: These results help to address the choice of the algorithm to be used in data-poor time series expression studies, depending on the length of the time series.
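The area-based selection rule of Method 2 can be sketched roughly as below. This is an illustration only: the trapezoidal approximation, the function names, the example profiles, and in particular the fixed threshold value are assumptions, since the error model from which the threshold is derived is not specified here.

```python
def area_between_profiles(times, profile_a, profile_b):
    """Trapezoidal approximation of the area bounded by two expression profiles.

    Uses absolute differences at the sampled time points, so it is only
    approximate when the profiles cross between two samples.
    """
    diffs = [abs(a - b) for a, b in zip(profile_a, profile_b)]
    area = 0.0
    for i in range(len(times) - 1):
        area += 0.5 * (diffs[i] + diffs[i + 1]) * (times[i + 1] - times[i])
    return area


def is_differentially_expressed(times, profile_a, profile_b, threshold):
    """Flag a gene when the bounded area exceeds the given threshold."""
    return area_between_profiles(times, profile_a, profile_b) > threshold


# Hypothetical log-expression profiles for one gene under two conditions.
times = [0.0, 1.0, 2.0, 4.0]
treated = [1.0, 2.0, 3.0, 3.0]
control = [1.0, 1.0, 1.0, 1.0]

area = area_between_profiles(times, treated, control)
# The threshold of 2.0 is a placeholder, not a value derived from an error model.
flagged = is_differentially_expressed(times, treated, control, threshold=2.0)
print(area, flagged)
```

Because the area pools evidence across all time points, a rule of this kind does not need replicates at each sample, which is the data-poor setting considered above.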
Project description:We describe a new approach to analyze chirp syllables of free-tailed bats from two regions of Texas in which they are predominant: Austin and College Station. Our goal is to characterize any systematic regional differences in the mating chirps and assess whether individual bats have signature chirps. The data are analyzed by modeling spectrograms of the chirps as responses in a Bayesian functional mixed model. Given the variable chirp lengths, we compute the spectrograms on a relative time scale interpretable as the relative chirp position, using a variable window overlap based on chirp length. We use 2D wavelet transforms to capture correlation within the spectrogram in our modeling and obtain adaptive regularization of the estimates and inference for the region-specific spectrograms. Our model includes random effect spectrograms at the bat level to account for correlation among chirps from the same bat, and to assess relative variability in chirp spectrograms within and between bats. The modeling of spectrograms using functional mixed models is a general approach for the analysis of replicated nonstationary time series, such as our acoustical signals, to relate aspects of the signals to various predictors, while accounting for between-signal structure. This can be done on raw spectrograms when all signals are of the same length, and can be done using spectrograms defined on a relative time scale for signals of variable length in settings where the idea of defining correspondence across signals based on relative position is sensible.
Project description:Impulse response functions (IRFs) are useful for characterizing systems' dynamic behavior and gaining insight into their underlying processes, based on sensor data streams of their inputs and outputs. However, current IRF estimation methods typically require restrictive assumptions that are rarely met in practice, including that the underlying system is homogeneous, linear, and stationary, and that any noise is well behaved. Here, I present data-driven, model-independent, nonparametric IRF estimation methods that relax these assumptions, and thus expand the applicability of IRFs in real-world systems. These methods can accurately and efficiently deconvolve IRFs from signals that are substantially contaminated by autoregressive moving average (ARMA) noise or nonstationary ARIMA noise. They can also simultaneously deconvolve and demix the impulse responses of individual components of heterogeneous systems, based on their combined output (without needing to know the outputs of the individual components). This deconvolution-demixing approach can be extended to characterize nonstationary coupling between inputs and outputs, even if the system's impulse response changes so rapidly that different impulse responses overlap one another. These techniques can also be extended to estimate IRFs for nonlinear systems in which different input intensities yield impulse responses with different shapes and amplitudes, which are then overprinted on one another in the output. I further show how one can efficiently quantify multiscale impulse responses using piecewise linear IRFs defined at unevenly spaced lags. All of these methods are implemented in an R script that can efficiently estimate IRFs over hundreds of lags, from noisy time series of thousands or even millions of time steps.
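For orientation, the sketch below shows the classical baseline that the methods above generalize: ordinary least-squares deconvolution of an impulse response from input and output series, under exactly the restrictive assumptions (linear, stationary, homogeneous, noise-free) that the abstract says are rarely met. It is written in Python rather than R and is not the author's script; the function names and the simulated data are assumptions.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_irf(x, y, n_lags):
    """Least-squares regression of y[t] on x[t], x[t-1], ..., x[t-n_lags+1]."""
    rows = [[x[t - k] for k in range(n_lags)] for t in range(n_lags - 1, len(x))]
    targets = y[n_lags - 1:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n_lags)]
           for i in range(n_lags)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, targets)) for i in range(n_lags)]
    return solve_linear_system(xtx, xty)


# Simulated input and a noise-free output from an assumed three-lag impulse response.
rng = random.Random(1)
true_irf = [0.5, 0.3, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y = [sum(true_irf[k] * x[t - k] for k in range(3) if t - k >= 0)
     for t in range(300)]

irf_hat = estimate_irf(x, y, n_lags=3)
print(irf_hat)
```

With ARMA or ARIMA noise, heterogeneous components, or a response that changes over time, this plain regression is no longer adequate, which is the gap the methods described above are intended to fill.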