Dataset Information

The Biological Object Notation (BON): a structured file format for biological data.


ABSTRACT: The large size and high complexity of biological data can represent a major methodological challenge for the analysis and exchange of data sets between computers and applications. There has also been a substantial increase in the amount of metadata associated with biological data sets, which is increasingly being incorporated into existing data formats. Despite the existence of structured formats based on XML, biological data sets are mainly stored in unstructured file formats, and incorporating metadata makes parsing routines increasingly complex and therefore more error-prone. To overcome these problems, we present the "biological object notation" (BON) format, a new way to exchange and parse nearly all biological data sets more efficiently and with fewer errors than other currently available formats. Based on JavaScript Object Notation (JSON), BON simplifies parsing by clearly separating the biological data from its metadata and reduces complexity compared to XML-based formats. The ability to selectively compress data by up to 87% compared to other file formats, together with the reduced complexity, results in improved transfer times and less error-prone applications.
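
The separation of data from metadata described in the abstract can be illustrated with a minimal Python sketch. The record layout, the key names ("metadata", "data", "encoding", "value") and the helper functions make_record and read_sequence are hypothetical, chosen only for illustration; they are not the published BON schema. The zlib and base64 step is likewise just one assumed way to compress the data part selectively while leaving the metadata readable.

```python
import base64
import json
import zlib

# Hypothetical record layout: the key names used here are illustrative
# assumptions, not the published BON schema.


def make_record(sequence, metadata, compress=False):
    """Build a JSON-serialisable record that keeps data and metadata apart."""
    if compress:
        # Compress only the data payload; metadata stays human-readable.
        packed = zlib.compress(sequence.encode("ascii"))
        data = {
            "encoding": "zlib+base64",
            "value": base64.b64encode(packed).decode("ascii"),
        }
    else:
        data = {"encoding": "plain", "value": sequence}
    return {"metadata": metadata, "data": data}


def read_sequence(record):
    """Recover the sequence without inspecting the metadata at all."""
    data = record["data"]
    if data["encoding"] == "zlib+base64":
        raw = zlib.decompress(base64.b64decode(data["value"]))
        return raw.decode("ascii")
    return data["value"]


sequence = "ACGT" * 250
record = make_record(
    sequence, {"id": "seq1", "organism": "example"}, compress=True
)

text = json.dumps(record)      # the text that would be stored or transferred
restored = json.loads(text)    # a standard JSON parser is all that is needed

assert read_sequence(restored) == sequence
print(restored["metadata"]["id"], len(text), len(sequence))
```

Because the metadata and the data sit under separate keys, a reader that only needs the metadata can skip decompression entirely, and a reader that only needs the sequence never has to parse free-form header text.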

SUBMITTER: Buchmann JP 

PROVIDER: S-EPMC6018389 | biostudies-literature | 2018 Jun

REPOSITORIES: biostudies-literature

Publications

The Biological Object Notation (BON): a structured file format for biological data.

Buchmann JP, Fourment M, Holmes EC

Scientific Reports, 2018 Jun 25, issue 1


Similar Datasets

| S-EPMC8487582 | biostudies-literature
| S-EPMC4874736 | biostudies-literature
| S-EPMC8648559 | biostudies-literature
| S-EPMC2945790 | biostudies-literature
| S-EPMC9237710 | biostudies-literature
| S-EPMC2655813 | biostudies-literature
| S-EPMC7265431 | biostudies-literature
| S-EPMC1913543 | biostudies-literature