Dataset Information

Clumppling: cluster matching and permutation program with integer linear programming.


ABSTRACT:

Motivation

In the mixed-membership unsupervised clustering analyses commonly used in population genetics, multiple replicate data analyses can differ in their clustering solutions. Combinatorial algorithms assist in aligning clustering outputs from multiple replicates so that clustering solutions can be interpreted and combined across replicates. Although several algorithms have been introduced, challenges exist in achieving optimal alignments and performing alignments in reasonable computation time.

Results

We present Clumppling, a method for aligning replicate solutions in mixed-membership unsupervised clustering. The method uses integer linear programming for finding optimal alignments, embedding the cluster alignment problem in standard combinatorial optimization frameworks. In example analyses, we find that it achieves solutions with preferred values of a desired objective function relative to those achieved by Pong and that it proceeds with less computation time than Clumpak. It is also the first method to permit alignments across replicates with multiple arbitrary values of the number of clusters K.
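The alignment problem described above can be illustrated on a toy case. The sketch below is not Clumppling's implementation: it assumes a squared-distance cost between cluster columns (the abstract does not state the objective) and, instead of calling an integer linear programming solver, it enumerates every permutation by brute force. That search covers the same feasible set as the assignment-type program "minimize sum_ij c_ij x_ij subject to each row and each column of the binary matrix x summing to 1", so it returns the same optimum for small K. The names align_clusters, permute_columns, q_ref, and q_rep are hypothetical.

```python
from itertools import permutations


def align_clusters(q_ref, q_rep):
    """Find the column permutation of q_rep that best matches q_ref.

    q_ref and q_rep are membership matrices (rows: individuals,
    columns: clusters) with the same number of clusters.
    Returns (perm, cost), where perm[k] is the column of q_rep
    matched to column k of q_ref.
    """
    n_clusters = len(q_ref[0])
    # cost[i][j]: squared distance between column i of q_ref and column j of q_rep
    # (assumed cost function, chosen only for illustration).
    cost = [
        [
            sum((row_a[i] - row_b[j]) ** 2 for row_a, row_b in zip(q_ref, q_rep))
            for j in range(n_clusters)
        ]
        for i in range(n_clusters)
    ]
    best_perm, best_cost = None, float("inf")
    # Brute force over all K! permutations; an ILP solver would search the
    # same feasible set without enumerating it explicitly.
    for perm in permutations(range(n_clusters)):
        total = sum(cost[i][perm[i]] for i in range(n_clusters))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost


def permute_columns(q, perm):
    """Reorder the columns of q so that new column k is old column perm[k]."""
    return [[row[j] for j in perm] for row in q]


# Reference replicate: three individuals, K = 3 clusters.
q_ref = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.2, 0.8],
]
# Second replicate: the same solution with cluster labels shuffled.
q_rep = [
    [0.1, 0.0, 0.9],
    [0.7, 0.1, 0.2],
    [0.2, 0.8, 0.0],
]

perm, cost = align_clusters(q_ref, q_rep)
aligned = permute_columns(q_rep, perm)
```

Brute force scales as K! and is practical only for small K, which is one reason to embed the problem in a standard combinatorial optimization framework as the abstract describes.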

Availability and implementation

Clumppling is available at https://github.com/PopGenClustering/Clumppling.

SUBMITTER: Liu X 

PROVIDER: S-EPMC10766593 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature

Publications

Clumppling: cluster matching and permutation program with integer linear programming.

Xiran Liu, Naama M. Kopelman, Noah A. Rosenberg

Bioinformatics (Oxford, England), 2024-01-01, issue 1


Similar Datasets

| S-EPMC6573476 | biostudies-literature
| S-EPMC4016706 | biostudies-literature
| S-EPMC8599758 | biostudies-literature
| S-EPMC7148046 | biostudies-literature
| S-EPMC9794831 | biostudies-literature
| S-EPMC4392707 | biostudies-literature
| S-EPMC3683041 | biostudies-literature
| S-EPMC10628435 | biostudies-literature
| S-EPMC6156097 | biostudies-literature
| S-EPMC2396433 | biostudies-literature