
Dataset Information


Descriptive statistics and visualization of data from the R datasets package with implications for clusterability.


ABSTRACT: The manuscript describes and visualizes datasets from the datasets package in the R statistical software, focusing on descriptive statistics and visualizations that provide insights into the clusterability of these datasets. These publicly available datasets are included with the R software system, which can be downloaded at https://www.r-project.org/, with documentation provided at https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html. Further information on clusterability is found in the companion to this article, To Cluster or Not to Cluster: An Analysis of Clusterability Methods? (https://doi.org/10.1016/j.patcog.2018.10.026). The variables in each dataset are briefly described and summarized by their means, extrema, quartiles, standard deviations, and standard errors, with accompanying graphs. Two-dimensional plots are provided for each pair of variables. Original references to the datasets are included when available. Further, each dataset is reduced to a single dimension by each of two methods: pairwise distances and principal component analysis. For the latter, only the first component is used. Histograms of the reduced data are included for every dataset under both methods.
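The two one-dimensional reductions named in the abstract can be illustrated with a minimal sketch. This is not the authors' code, which works with the R datasets package. It is a standard-library Python illustration under stated assumptions: Euclidean distance for the pairwise reduction, no rescaling of variables before principal component analysis, and power iteration as a stand-in for a full eigendecomposition. The function names and the toy rows are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(rows):
    """Reduce a dataset to 1-D as the Euclidean distances between all pairs of rows."""
    return [math.dist(a, b) for a, b in combinations(rows, 2)]


def first_pc_scores(rows, iterations=500):
    """Reduce a dataset to 1-D as scores on the first principal component.

    Assumption: variables are centered but not rescaled. The leading
    eigenvector of the sample covariance matrix is approximated by power
    iteration, which can fail if the start vector happens to be orthogonal
    to that eigenvector.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 + j for j in range(p)]  # arbitrary, deliberately asymmetric start
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case: no variance along the current direction
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(p)) for c in centered]


def histogram_counts(values, bins=10):
    """Count values in equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * bins
    for x in values:
        k = min(int((x - lo) / width), bins - 1)  # put the maximum in the last bin
        counts[k] += 1
    return counts


# Hypothetical toy rows (not taken from any R dataset): three collinear points.
toy = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
distances = pairwise_distances(toy)
scores = first_pc_scores(toy)
print(histogram_counts(distances, bins=2))
print(histogram_counts(scores, bins=2))
```

A histogram of either reduced vector is the kind of input that clusterability tests examine, typically for signs of multimodality. In R itself, `dist()` and `prcomp()` would be the usual tools for the two reductions.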

SUBMITTER: Brownstein NC 

PROVIDER: S-EPMC6612012 | biostudies-literature | 2019 Aug

REPOSITORIES: biostudies-literature


Publications

Descriptive statistics and visualization of data from the R datasets package with implications for clusterability.

Brownstein NC, Adolfsson A, Ackerman M

Data in Brief, 2019-05-24



Similar Datasets

| S-EPMC10186671 | biostudies-literature
| S-EPMC10511070 | biostudies-literature
| S-EPMC10518075 | biostudies-literature
| S-EPMC8714909 | biostudies-literature
| S-EPMC9513427 | biostudies-literature
| S-EPMC11349193 | biostudies-literature
| S-EPMC5064610 | biostudies-literature
2016-11-08 | GSE73638 | GEO
| S-EPMC5870797 | biostudies-literature
| S-EPMC7347224 | biostudies-literature