Sensitivity analysis of agent-based simulation utilizing massively parallel computation and interactive data visualization.
ABSTRACT: An essential step in the analysis of agent-based simulation is sensitivity analysis, which examines how simulation results depend on parameter values. Although a number of approaches have been proposed for sensitivity analysis, they still have limitations in exhaustivity and interpretability. In this study, we propose a novel methodology for sensitivity analysis of agent-based simulation, MASSIVE (Massively parallel Agent-based Simulations and Subsequent Interactive Visualization-based Exploration). MASSIVE takes a unique paradigm, completely different from those of sensitivity analysis methods developed so far. By combining massively parallel computation and interactive data visualization, MASSIVE enables us to inspect a broad parameter space intuitively. We demonstrated the utility of MASSIVE by applying it to a cancer evolution simulation, which successfully identified conditions that generate heterogeneous tumors. We believe that our approach could become a de facto standard for sensitivity analysis of agent-based simulation in an era of ever-growing computational technology. All the results from our MASSIVE analysis are available at https://www.hgc.jp/~niiyan/massive.
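The core of this paradigm, running the simulation at every point of a dense parameter grid in parallel and then exploring the resulting summary statistics visually, can be sketched as follows. This is a minimal illustration, not the MASSIVE implementation; the `simulate` function, its two parameters, and the "heterogeneity" summary statistic are hypothetical stand-ins.

```python
from itertools import product
from multiprocessing import Pool

def simulate(params):
    """Toy stand-in for one agent-based simulation run at fixed parameters."""
    growth, mutation = params
    # Hypothetical summary statistic derived from the run.
    heterogeneity = growth * mutation
    return {"growth": growth, "mutation": mutation,
            "heterogeneity": heterogeneity}

def sweep(growth_rates, mutation_rates, workers=2):
    """Exhaustively evaluate every parameter combination in parallel.

    The returned list of result records is what an interactive
    visualization front end would then let the analyst explore.
    """
    grid = list(product(growth_rates, mutation_rates))
    with Pool(workers) as pool:
        return pool.map(simulate, grid)
```

For example, `sweep([0.1, 0.2, 0.5], [1e-3, 1e-2])` evaluates all six combinations; a real sweep would distribute far larger grids over a cluster.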
Project description: Neuronal network models and corresponding computer simulations are invaluable tools to aid the interpretation of the relationship between neuron properties, connectivity, and measured activity in cortical tissue. Spatiotemporal patterns of activity propagating across the cortical surface, as observed experimentally, can for example be described by neuronal network models with layered geometry and distance-dependent connectivity. In order to cover the surface area captured by today's experimental techniques and to achieve sufficient self-consistency, such models contain millions of nerve cells. The interpretation of the resulting stream of multi-modal and multi-dimensional simulation data calls for integrating interactive visualization steps into existing simulation-analysis workflows. Here, we present a set of interactive visualization concepts called views for the visual analysis of activity data in topological network models, and a corresponding reference implementation, VIOLA (VIsualization Of Layer Activity). The software is a lightweight, open-source, web-based, and platform-independent application combining and adapting modern interactive visualization paradigms, such as coordinated multiple views, for massively parallel neurophysiological data. For a use-case demonstration we consider spiking activity data of a two-population, layered point-neuron network model incorporating distance-dependent connectivity, subject to a spatially confined excitation originating from an external population. With the multiple coordinated views, an explorative and qualitative assessment of the spatiotemporal features of neuronal activity can be performed ahead of a detailed quantitative analysis of specific aspects of the data. Interactive multi-view analysis therefore complements existing data analysis workflows.
Furthermore, ongoing efforts including the European Human Brain Project aim at providing online user portals for integrated model development, simulation, analysis, and provenance tracking, in which interactive visual analysis tools are one component. Browser-compatible, web-technology-based solutions are therefore required. Within this scope, VIOLA provides a first prototype.
Project description: Agent-based models (ABMs) are widely used to study immune systems, providing a procedural and interactive view of the underlying system. The interactions of components and the behavior of individual objects are described procedurally as functions of internal states and local interactions, which are often stochastic in nature. Such models typically have complex structures and a large number of modeling parameters. Determining the key modeling parameters that govern the outcomes of the system is very challenging. Sensitivity analysis plays a vital role in quantifying the impact of modeling parameters in massively interacting systems, including large complex ABMs. The high computational cost of executing simulations impedes running experiments with exhaustive parameter settings. Existing techniques for analyzing such complex systems typically focus on local sensitivity analysis, i.e., one parameter at a time, or a close "neighborhood" of particular parameter settings. However, such methods are not adequate to measure the uncertainty and sensitivity of parameters accurately because they overlook the global impacts of parameters on the system. In this article, we develop novel experimental design and analysis techniques to perform both global and local sensitivity analysis of large-scale ABMs. The proposed method can efficiently identify the most significant parameters and quantify their contributions to outcomes of the system. We demonstrate the proposed methodology for the ENteric Immune SImulator (ENISI), a large-scale ABM environment, using a computational model of immune responses to Helicobacter pylori colonization of the gastric mucosa.
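The contrast between local and global sensitivity analysis can be illustrated with a crude variance-based estimate of a first-order sensitivity index, Var(E[Y|Xi])/Var(Y), which measures how much of the outcome variance is attributable to a single parameter varied over its whole range. This sketch is not the method proposed in the article; the toy `model` and the simple binning estimator are assumptions for illustration only.

```python
import random
from statistics import mean, variance

def model(a, b):
    """Toy stand-in for an ABM outcome: parameter a dominates the output."""
    return 3.0 * a + 0.1 * b

def first_order_index(which, n=20000, n_bins=20, seed=1):
    """Estimate the first-order index Var(E[Y|Xi]) / Var(Y) for parameter
    `which` (0 or 1) by Monte Carlo sampling over the full unit hypercube
    and binning on the conditioning parameter."""
    rng = random.Random(seed)
    xs = [(rng.random(), rng.random()) for _ in range(n)]
    ys = [model(a, b) for a, b in xs]
    total = variance(ys)
    bins = [[] for _ in range(n_bins)]
    for (a, b), y in zip(xs, ys):
        key = a if which == 0 else b
        bins[min(int(key * n_bins), n_bins - 1)].append(y)
    cond_means = [mean(bin_) for bin_ in bins if bin_]
    return variance(cond_means) / total
```

For this toy model the index for `a` comes out near 1 and the index for `b` near 0, whereas a local, one-at-a-time perturbation around a single point would report only slopes, not these global variance contributions.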
Project description: How various types of focus differ with respect to exhaustivity has been a topic of enduring interest in language studies. However, most of the theoretical work explicating such associations has done so cross-linguistically, and little research has examined how people process and respond to them during language comprehension. This study therefore investigates the associations between the concept of exhaustivity and three focus types in Chinese (wh, cleft, and only foci) using a trichotomous-response design in two experiments with adult native speakers: a forced-choice judgment experiment and a self-paced reading experiment. The results show that, whether engaged in conscious decision-making or an implicit comprehension process, the participants clearly distinguished only-focus and cleft-focus from wh-focus, and also that there are specific differences between only-focus and cleft-focus in conscious decision-making. This implies that, in terms of the relationship between exhaustivity and the focus types under investigation, cleft-focus and only-focus behave very similarly during language comprehension despite some fine distinctions between them. In other words, the linguistic levels at which exhaustivity is encoded in Chinese cleft-focus render it more similar to only-focus than to wh-focus. These results are broadly in line with the semantic account that distinguishes cleft-focus from only-focus, i.e., that cleft-focus encodes exhaustivity in a not-at-issue presupposition while only-focus encodes exhaustivity in an at-issue assertion; both express semantically encoded exhaustivity, triggering robust language-processing patterns that differ from those of wh-focus in Chinese.
Project description: Interactive multi-beam laser machining simulation is crucial in the context of tool path planning and optimization of laser machining parameters. Current simulation approaches for heat transfer analysis (1) rely on numerical Finite Element methods (or any of their variants), which are unsuitable for interactive applications; and (2) require the multiple laser beams to be completely synchronized in trajectories, parameters, and time frames. To overcome these limitations, this manuscript presents an algorithm for interactive simulation of the transient temperature field on the sheet metal. Contrary to standard numerical methods, our algorithm is based on an analytic solution in the frequency domain, allowing arbitrary time/space discretizations without loss of precision and non-monotonic retrieval of the temperature history. In addition, the method allows completely asynchronous laser beams with independent trajectories, parameters, and time frames. Our GPU implementation allows simulations at interactive rates even for a large number of simultaneous laser beams. The presented method is already integrated into an interactive simulation environment for sheet cutting. Ongoing work addresses thermal stress coupling and laser ablation.
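The general idea of analytic superposition (illustrated here in the time domain, not the manuscript's frequency-domain formulation) is that the temperature rise in a thin sheet can be written as a sum of closed-form point-source solutions, one per laser pulse, evaluated at any point and any time independently. This directly permits non-monotonic retrieval of the temperature history and fully asynchronous beams. The material constants and the discrete-pulse representation below are assumptions for illustration only.

```python
import math

# Illustrative material constants (assumed, not from the manuscript)
ALPHA = 1e-5    # thermal diffusivity, m^2/s
RHO_C = 3.6e6   # volumetric heat capacity, J/(m^3 K)
D = 1e-3        # sheet thickness, m

def temp_rise(x, y, t, pulses):
    """Superpose analytic 2D instantaneous point-source solutions for a
    thin sheet.  `pulses` is a list of (x0, y0, t0, energy_J) tuples; each
    beam may contribute its own pulses on its own schedule.  Because every
    evaluation is independent, temperatures can be queried at arbitrary
    (x, y, t) in any order."""
    total = 0.0
    for x0, y0, t0, q in pulses:
        dt = t - t0
        if dt <= 0:
            continue  # pulse has not happened yet at time t
        r2 = (x - x0) ** 2 + (y - y0) ** 2
        total += (q / (RHO_C * D * 4.0 * math.pi * ALPHA * dt)
                  * math.exp(-r2 / (4.0 * ALPHA * dt)))
    return total
```

A moving beam is represented by many small pulses along its trajectory; in a GPU implementation each (query point, pulse) contribution is embarrassingly parallel.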
Project description: A recently discovered tea [Camellia sinensis (L.) O. Kuntze] cultivar can generate tender shoots in winter. We performed comparative proteomics to analyze the differentially accumulated proteins between winter and spring tender shoots of this clonal cultivar to reveal the physiological basis of its evergrowing character during winter. We extracted proteins from the winter and spring tender shoots (newly formed two leaves and a bud) of the evergrowing tea cultivar "Dongcha11". Thirty-three differentially accumulated high-confidence proteins were identified by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF/TOF MS). Among these, 24 proteins showed increased abundance while nine showed decreased abundance in winter tender shoots as compared with the spring tender shoots. We categorized the differentially accumulated proteins into eight critical biological processes based on protein function annotation: photosynthesis, cell structure, protein synthesis & destination, transporters, metabolism of sugars and polysaccharides, secondary metabolism, disease/defense, and proteins with unknown functions. Proteins with increased abundance in winter tender shoots were mainly related to photosynthesis, the cytoskeleton, and protein synthesis, whereas those with decreased abundance were related to metabolism and the secondary metabolism of polyphenolic flavonoids.
Biochemical analysis showed that the total contents of soluble sugars and amino acids were higher in winter tender shoots, while tea polyphenols were lower, as compared with spring tender shoots. Our study suggests that the simultaneous increase in the abundance of the photosynthesis-related proteins rubisco, plastocyanin, and ATP synthase delta chain, the metabolism-related proteins eIF4 and protease subunits, and the cytoskeleton-associated proteins phosphatidylinositol transfer protein and profilin may reflect the adaptation of the evergrowing tea cultivar "Dongcha11" to low temperature and light conditions. Histone H4, Histone H2A.1, putative In2.1 protein, and protein lin-28 homologs may also regulate the development of winter shoots and their response to adverse conditions.
Project description: Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed on a massive scale in clusters, web servers, distributed computing, or cloud resources. Here, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids, and lipids, and comes with all commonly used force fields for these molecules built in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including Windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations. GROMACS is free, open-source software available from http://www.gromacs.org. Supplementary data are available at Bioinformatics online.
Project description: Genetic correlation is a key population parameter that describes the shared genetic architecture of complex traits and diseases. It can be estimated by current state-of-the-art methods, i.e., linkage disequilibrium score regression (LDSC) and genomic restricted maximum likelihood (GREML). The massively reduced computing burden of LDSC compared to GREML makes it an attractive tool, although the accuracy (i.e., magnitude of standard errors) of LDSC estimates has not been thoroughly studied. In simulation, we show that the accuracy of GREML is generally higher than that of LDSC. When there is genetic heterogeneity between the actual sample and the reference data from which LD scores are estimated, the accuracy of LDSC decreases further. In real data analyses estimating the genetic correlation between schizophrenia (SCZ) and body mass index, we show that GREML estimates based on ~150,000 individuals give a higher accuracy than LDSC estimates based on ~400,000 individuals (from combined meta-data). A GREML genomic partitioning analysis reveals that the genetic correlation between SCZ and height is significantly negative for regulatory regions, which the whole-genome or LDSC approaches have less power to detect. We conclude that LDSC estimates should be interpreted carefully, as there can be uncertainty about homogeneity among combined meta-datasets. We suggest that any interesting findings from massive LDSC analyses of a large number of complex traits should be followed up, where possible, with more detailed GREML analyses, even if sample sizes are smaller.
Project description: Next-generation sequencing is widely used to link genetic variants to diseases, and it has massively accelerated the diagnosis and characterization of rare genetic diseases. After initial bioinformatic data processing, the interactive analysis of genome, exome, and panel sequencing data typically starts from lists of genetic variants in VCF format. Medical geneticists filter and annotate these lists to identify variants that may be relevant for the disease under investigation, or to select variants that are reported in a clinical diagnostics setting. We developed VCF.Filter to facilitate the search for disease-linked variants, providing a standalone Java program with a user-friendly interface for interactive variant filtering and annotation. VCF.Filter allows the user to define a broad range of filtering criteria through a graphical interface. Common workflows such as trio analysis and cohort-based filtering are pre-configured, and more complex analyses can be performed using VCF.Filter's support for custom annotations and filtering criteria. All filtering is documented in the results file, thus providing traceability of the interactive variant prioritization. VCF.Filter is an open-source tool that is freely and openly available at http://vcffilter.rarediseases.at.
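The kind of per-record filtering such tools perform can be shown with a minimal, generic sketch. VCF.Filter itself is a Java program with a graphical interface; this Python fragment only mirrors the underlying idea of applying criteria such as a QUAL threshold and a FILTER-column check to each variant record while passing header lines through unchanged.

```python
def filter_vcf(lines, min_qual=30.0, required_filter="PASS"):
    """Yield VCF header lines unchanged; keep only variant records whose
    QUAL meets a threshold and whose FILTER column equals `required_filter`.
    Thresholds here are illustrative defaults, not VCF.Filter's."""
    for line in lines:
        if line.startswith("#"):
            yield line          # meta-information and column header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, vid, ref, alt, qual, filt = fields[:7]
        if filt == required_filter and float(qual) >= min_qual:
            yield line
```

A real workflow layers many such criteria (inheritance model, population frequency, predicted impact) and, as in VCF.Filter, records which criteria were applied for traceability.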
Project description: BACKGROUND: The tremendous output of massively parallel sequencing technologies requires automated, robust, and scalable sample preparation methods to fully exploit the new sequence capacity. METHODOLOGY: In this study, a method for automated library preparation of RNA prior to massively parallel sequencing is presented. The automated protocol uses precipitation onto carboxylic acid paramagnetic beads for purification and size selection of both RNA and DNA. The automated sample preparation was compared to the standard manual sample preparation. CONCLUSION/SIGNIFICANCE: The automated procedure was used to generate libraries for gene expression profiling on the Illumina HiSeq 2000 platform, with a capacity of 12 samples per preparation and a significantly improved throughput compared to the standard manual preparation. The data analysis shows consistent gene expression profiles, in terms of sensitivity and quantification of gene expression, between the two library preparation methods.
Project description: Personal-genomics endeavors, such as the 1000 Genomes Project, are generating maps of genomic structural variants by analyzing the ends of massively sequenced genome fragments. To process these data we developed Paired-End Mapper (PEMer; http://sv.gersteinlab.org/pemer). This comprises an analysis pipeline compatible with several next-generation sequencing platforms; simulation-based error models, yielding confidence values for each structural variant; and a back-end database. The simulations demonstrated high structural-variant reconstruction efficiency for PEMer's coverage-adjusted multi-cutoff scoring strategy and showed its relative insensitivity to base-calling errors.
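The principle behind paired-end structural-variant detection, flagging read pairs whose mapped span is inconsistent with the library's insert-size distribution and then requiring clustered support from several pairs, can be sketched as follows. This is a simplified stand-in for, not a reproduction of, PEMer's coverage-adjusted multi-cutoff scoring; the deviation cutoff `k`, the `min_support` requirement, and the clustering window are illustrative assumptions.

```python
from statistics import mean, stdev

def discordant_clusters(spans, positions, k=3.0, min_support=2, window=500):
    """Flag read pairs whose mapped span deviates from the library insert
    size by more than k standard deviations, cluster nearby outliers along
    the genome, and keep clusters supported by at least `min_support`
    pairs.  `spans[i]` is the mapped distance between the two ends of pair
    i, `positions[i]` its genomic coordinate."""
    mu, sd = mean(spans), stdev(spans)
    outliers = sorted(p for p, s in zip(positions, spans)
                      if abs(s - mu) > k * sd)
    clusters, current = [], []
    for p in outliers:
        if current and p - current[-1] > window:
            if len(current) >= min_support:
                clusters.append(current)
            current = []
        current.append(p)
    if len(current) >= min_support:
        clusters.append(current)
    return clusters
```

Requiring multiple supporting pairs per cluster is what suppresses isolated mapping artifacts; PEMer additionally calibrates such cutoffs against coverage via simulation.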