Dataset Information

SparkBLAST: scalable BLAST processing using in-memory operations.

ABSTRACT:

Background

The demand for processing ever-increasing amounts of genomic data has raised new challenges for the implementation of highly scalable and efficient computational systems. In this paper we propose SparkBLAST, a parallelization of a sequence alignment application (BLAST) that employs cloud computing for the provisioning of computational resources and Apache Spark as the coordination framework. As a proof of concept, several radionuclide-resistant bacterial genomes were selected for similarity analysis.

Results

Experiments in Google and Microsoft Azure clouds demonstrated that SparkBLAST outperforms an equivalent system implemented on Hadoop in terms of speedup and execution times.

Conclusions

The superior performance of SparkBLAST is mainly due to the in-memory operations available through the Spark framework, which reduce the number of local I/O operations required for distributed BLAST processing.
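The abstract does not describe how the in-memory processing is implemented, so the following is only a minimal sketch of the general idea, using the Python standard library rather than Spark: query sequences are split into partitions held in memory, and each partition is streamed to an external program through stdin/stdout instead of being written to intermediate files on local disk. All names here (`split_fasta`, `run_partition`, `align_all`, `STAND_IN_CMD`) are hypothetical, and the stand-in command only reports sequence lengths; a real deployment would invoke a BLAST executable in its place.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a BLAST command line (assumption, not from the paper):
# it reads FASTA from stdin and prints one "id<TAB>length" line per sequence.
STAND_IN_CMD = [
    sys.executable, "-c",
    "import sys\n"
    "name, n = None, 0\n"
    "for line in sys.stdin:\n"
    "    line = line.strip()\n"
    "    if line.startswith('>'):\n"
    "        if name is not None: print(name, n, sep='\\t')\n"
    "        name, n = line[1:], 0\n"
    "    elif line:\n"
    "        n += len(line)\n"
    "if name is not None: print(name, n, sep='\\t')\n",
]


def split_fasta(text, n_parts):
    """Split FASTA text into at most n_parts chunks of whole records (round-robin)."""
    # Assumes '>' appears only at the start of header lines.
    records = [">" + r.strip() + "\n" for r in text.split(">") if r.strip()]
    parts = [records[i::n_parts] for i in range(n_parts)]
    return ["".join(p) for p in parts if p]


def run_partition(fasta_chunk, cmd=STAND_IN_CMD):
    """Stream one in-memory partition through an external process via pipes."""
    done = subprocess.run(
        cmd, input=fasta_chunk, capture_output=True, text=True, check=True
    )
    return done.stdout.splitlines()


def align_all(fasta_text, n_parts=2):
    """Process all partitions concurrently and merge the output lines."""
    chunks = split_fasta(fasta_text, n_parts)
    if not chunks:
        return []
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        results = list(pool.map(run_partition, chunks))
    return sorted(line for part in results for line in part)


if __name__ == "__main__":
    queries = ">a\nACGT\n>b\nAC\nGT\nA\n>c\nG\n"
    for line in align_all(queries, n_parts=2):
        print(line)
```

In this sketch no partition or result ever touches the local file system; a file-based pipeline would instead write each query chunk and each result to disk before the next stage could read it.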

SUBMITTER: de Castro MR

PROVIDER: S-EPMC5488373 | biostudies-literature | 2017 Jun

REPOSITORIES: biostudies-literature

Publications

SparkBLAST: scalable BLAST processing using in-memory operations.

de Castro, Marcelo Rodrigo; Tostes, Catherine Dos Santos; Dávila, Alberto M R; Senger, Hermes; da Silva, Fabricio A B

BMC Bioinformatics, 2017-06-27, issue 1

PMID: 28655296

Similar Datasets

Project description: Exposure to blast overpressure has been a pervasive feature of combat-related injuries. Studies exploring the neurological correlates of repeated low-level blast exposure in career "breachers" demonstrated higher levels of tumor necrosis factor alpha (TNFα) and interleukin (IL)-6 and decreases in IL-10 within brain-derived extracellular vesicles (BDEVs). The current pilot study was initiated in partnership with the U.S. Special Operations Command (USSOCOM) to explore whether neuroinflammation is seen within special operators with prior blast exposure. Data were analyzed from 18 service members (SMs), inclusive of 9 blast-exposed special operators with an extensive career history of repeated blast exposures and 9 controls matched by age and duration of service. Neuroinflammation was assessed utilizing positron emission tomography (PET) imaging with [18F]DPA-714. Serum was acquired to assess inflammatory biomarkers within whole serum and BDEVs. The Blast Exposure Threshold Survey (BETS) was acquired to determine blast history. Both self-report and neurocognitive measures were acquired to assess cognition. Similarity-driven Multi-view Linear Reconstruction (SiMLR) was used for joint analysis of acquired data. Analysis of BDEVs indicated significant positive associations with a generalized blast exposure value (GBEV) derived from the BETS. SiMLR-based analyses of neuroimaging demonstrated exposure-related relationships between GBEV, PET-neuroinflammation, cortical thickness, and volume loss within special operators. Affected brain networks included regions associated with memory retrieval and executive functioning, as well as visual and heteromodal processing. Post hoc assessments of cognitive measures failed to demonstrate significant associations with GBEV. This emerging evidence suggests neuroinflammation may be a key feature of the brain response to blast exposure over a career in operational personnel. The common thread of neuroinflammation observed in blast-exposed populations requires further study.
