
Dataset Information


Nunchaku: optimally partitioning data into piece-wise contiguous segments.


ABSTRACT:

Motivation

When analyzing 1D time series, scientists are often interested in identifying regions where one variable depends linearly on the other. Typically, they use an ad hoc and therefore often subjective method to do so.

Results

Here, we develop a statistically rigorous, Bayesian approach to infer the optimal partitioning of a dataset not only into contiguous piece-wise linear segments, but also into contiguous segments described by linear combinations of arbitrary basis functions. We therefore present a general solution to the problem of identifying discontinuous change points. Focusing on microbial growth, we use the algorithm to find the range of optical density where this density is linearly proportional to the number of cells and to automatically find the regions of exponential growth for both Escherichia coli and Saccharomyces cerevisiae. For budding yeast, we are consequently able to infer the Monod constant for growth on fructose. Our algorithm lends itself to automation and high-throughput studies, increases reproducibility, and should facilitate data analyses for a broad range of scientists.
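To make the partitioning problem concrete, the following is a minimal sketch of splitting a 1D dataset into contiguous, possibly discontinuous, linear segments. It is not the Bayesian method described above and does not use the Nunchaku package's API: it substitutes a plain least-squares cost with dynamic programming, and it assumes the number of segments is given rather than inferred. The function names (`segment_cost`, `best_partition`), the `min_len` parameter, and the synthetic data are all hypothetical and exist only for illustration.

```python
import numpy as np


def segment_cost(x, y):
    """Residual sum of squares of a least-squares straight line through (x, y)."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def best_partition(x, y, n_segments, min_len=3):
    """Return the interior boundary indices that minimise the total RSS.

    Illustrative stand-in only: least squares plus dynamic programming,
    not the Bayesian inference used by the paper.
    """
    n = len(x)
    # cost[i, j]: RSS of a single line fitted to points i .. j-1
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + min_len, n + 1):
            cost[i, j] = segment_cost(x[i:j], y[i:j])

    # best[k, j]: minimal cost of splitting the first j points into k segments
    best = np.full((n_segments + 1, n + 1), np.inf)
    prev = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(1, n + 1):
            candidates = best[k - 1, :j] + cost[:j, j]
            i = int(np.argmin(candidates))
            best[k, j] = candidates[i]
            prev[k, j] = i

    # Backtrack from the full dataset to recover each segment's start index.
    starts = []
    j = n
    for k in range(n_segments, 0, -1):
        j = int(prev[k, j])
        starts.append(j)
    return sorted(starts[:-1])  # drop the start of the first segment (index 0)


# Synthetic data (an assumption for illustration): three linear regimes
# separated by jumps, with a small amount of seeded noise.
rng = np.random.default_rng(0)
x = np.arange(30.0)
y = np.concatenate([x[:10], np.full(10, 20.0), 50.0 - 2.0 * x[20:]])
y = y + rng.normal(scale=0.1, size=x.size)

boundaries = best_partition(x, y, n_segments=3)
print(boundaries)
```

Each returned index marks the first point of a new segment. Unlike this sketch, a Bayesian treatment can also compare different numbers of segments on a principled footing, which is the part an ad hoc least-squares fit leaves to the user.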

Availability and implementation

The corresponding Python package, entitled Nunchaku, is available at PyPI: https://pypi.org/project/nunchaku.

SUBMITTER: Huo Y 

PROVIDER: S-EPMC10697733 | biostudies-literature | 2023 Dec

REPOSITORIES: biostudies-literature


Publications

Nunchaku: optimally partitioning data into piece-wise contiguous segments.

Huo Y, Li H, Wang X, Du X, Swain PS

Bioinformatics (Oxford, England), 2023 Dec 1, issue 12



Similar Datasets

| S-EPMC6992702 | biostudies-literature
| S-EPMC4652579 | biostudies-literature
| S-EPMC2693521 | biostudies-literature
| S-EPMC8497621 | biostudies-literature
| S-EPMC4035737 | biostudies-literature
| S-EPMC4965793 | biostudies-literature
| S-EPMC9138039 | biostudies-literature
| S-EPMC8751086 | biostudies-literature
| S-EPMC7039547 | biostudies-literature
| S-EPMC7743982 | biostudies-literature