Dataset Information

Linking scientific instruments and computation: Patterns, technologies, and experiences.


ABSTRACT: Powerful detectors at modern experimental facilities routinely collect data at multiple GB/s. Online analysis methods are needed to enable the collection of only interesting subsets of such massive data streams, such as by explicitly discarding some data elements or by directing instruments to relevant areas of experimental space. Thus, methods are required for configuring and running distributed computing pipelines (what we call flows) that link instruments, computers (e.g., for analysis, simulation, artificial intelligence [AI] model training), edge computing (e.g., for analysis), data stores, metadata catalogs, and high-speed networks. We review common patterns associated with such flows and describe methods for instantiating these patterns. We present experiences with the application of these methods to the processing of data from five different scientific instruments, each of which engages powerful computers for data inversion, model training, or other purposes. We also discuss implications of such methods for operators and users of scientific facilities.

SUBMITTER: Vescovi R 

PROVIDER: S-EPMC9583115 | biostudies-literature | 2022 Oct

REPOSITORIES: biostudies-literature

Publications

Linking scientific instruments and computation: Patterns, technologies, and experiences.

Rafael Vescovi, Ryan Chard, Nickolaus D. Saint, Ben Blaiszik, Jim Pruyne, Tekin Bicer, Alex Lavens, Zhengchun Liu, Michael E. Papka, Suresh Narayanan, Nicholas Schwarz, Kyle Chard, Ian T. Foster

Patterns (New York, N.Y.), 2022-10-14, issue 10



Similar Datasets

| S-EPMC5003368 | biostudies-literature
| S-EPMC7728532 | biostudies-literature
| 2366841 | ecrin-mdr-crc
| S-EPMC6731666 | biostudies-literature
| S-EPMC8298562 | biostudies-literature
| S-EPMC11397671 | biostudies-literature
| S-EPMC6615640 | biostudies-literature
| S-EPMC6965551 | biostudies-literature
| S-EPMC11676554 | biostudies-literature
| S-EPMC1924584 | biostudies-literature