Dataset Information

Eight practices for data management to enable team data science.


ABSTRACT:

Introduction

In clinical and translational research, data science is often, and fortuitously, integrated with data collection. This contrasts with the typical position of data scientists in other settings, where they are isolated from data collectors. Because of this integration, effective use of data science techniques to resolve translational questions requires innovation in the organization and management of these data.

Methods

We propose an operational framework that respects this important difference in how research teams are organized. To maximize the accuracy and speed of the clinical and translational data science enterprise under this framework, we define a set of eight best practices for data management.

Results

In our own work at the University of Rochester, we have striven to apply these practices in a customized version of the open-source LabKey platform for integrated data management and collaboration. We have applied this platform to cohorts that longitudinally track multidomain data from more than 3000 subjects.

Conclusions

We argue that this approach has made analytical datasets more readily available and lowered the barrier to interdisciplinary collaboration, enabling a team-based data science that is unique to the clinical and translational setting.

SUBMITTER: McDavid A 

PROVIDER: S-EPMC8057476 | biostudies-literature | 2020 Jun

REPOSITORIES: biostudies-literature

Publications

Eight practices for data management to enable team data science.

McDavid Andrew, Corbett Anthony M, Dutra Jennifer L, Straw Andrew G, Topham David J, Pryhuber Gloria S, Caserta Mary T, Gill Steven R, Scheible Kristin M, Holden-Wiltse Jeanne

Journal of clinical and translational science, 2020-06-23, 1


Similar Datasets

| S-EPMC10603362 | biostudies-literature
| S-EPMC7309241 | biostudies-literature
| S-EPMC10346083 | biostudies-literature
| S-EPMC4183174 | biostudies-other
| S-EPMC6638994 | biostudies-literature
| S-EPMC6284825 | biostudies-literature
| S-EPMC11340681 | biostudies-literature
2017-08-25 | GSE103063 | GEO
| S-EPMC11374999 | biostudies-literature
| S-EPMC5263095 | biostudies-literature