Project description:PyMOL is often used to generate images of biomolecular structures. Hundreds of parameters in PyMOL provide precise control over the appearance of structures. We developed 241 Python functions, called "shortcuts", that extend and ease the use of PyMOL. A user runs a shortcut by entering its name at the PyMOL prompt. We clustered the shortcuts by functionality into 25 groups for faster look-up. One set of shortcuts generates new styles of molecular representation. Another group saves files with time stamps in the file names; the unique filenames avoid overwriting previously saved files. A third group submits search terms to the user's web browser. The help function prints a shortcut's documentation to the command history window; this documentation includes the PyMOL commands, which the user can reuse by copying and pasting them onto the command line or into a script file. The shortcuts should save the average PyMOL user many hours per year otherwise spent searching for code fragments on their computer or online. STATEMENT FOR LAY PUBLIC: Computer-generated images of protein structures are vital to the interpretation of and communication about the molecular structure of proteins. PyMOL is a popular computer program for generating such images. We made a large collection of macros, or shortcuts, that save time by executing complex operations with a few keystrokes.
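For illustration, a minimal sketch of how a shortcut of this kind could be defined. The name `save_timestamped` and its behavior are hypothetical examples, not one of the 241 published shortcuts; `cmd.extend` and `cmd.save` are standard PyMOL API calls.

```python
# Minimal sketch of a PyMOL "shortcut": a Python function registered as a
# command so the user can run it by typing its name at the PyMOL prompt.
# The name `save_timestamped` is hypothetical, not one of the 241 shortcuts.
import datetime
from pymol import cmd

def save_timestamped(stem="structure"):
    """DESCRIPTION: Save the current structure to a PDB file whose name
    carries a time stamp, so earlier files are never overwritten.

    USAGE: save_timestamped [stem]
    """
    stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = f"{stem}_{stamp}.pdb"
    cmd.save(filename)  # PyMOL API call: write the current coordinates
    print(f"Saved {filename}")

# Register the function; `help save_timestamped` then prints the docstring
# to the command history window, as described above.
cmd.extend("save_timestamped", save_timestamped)
```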
Project description:During training, models can exploit spurious correlations as shortcuts, resulting in poor generalization when the shortcuts do not persist at test time. In this work, assuming access to a representation based on domain knowledge (i.e., known concepts) that is invariant to shortcuts, we aim to learn robust and accurate models from biased training data. In contrast to previous work, we do not rely solely on known concepts but allow the model to also learn unknown concepts. We propose two approaches for mitigating shortcuts that incorporate domain knowledge while accounting for potentially important yet unknown concepts. The first approach is two-stage: after fitting a model using known concepts, it accounts for the residual using unknown concepts. While flexible, we show that this approach is vulnerable when shortcuts are correlated with the unknown concepts. This limitation is addressed by our second approach, which extends a recently proposed regularization penalty. Applied to two real-world datasets, we demonstrate that both approaches can successfully mitigate shortcut learning.
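As a rough illustration of the two-stage idea only: linear models on synthetic tabular data, with variable names chosen for exposition. The paper's actual estimators, data, and concept representations are not reproduced here.

```python
# Sketch of the two-stage approach: fit on known concepts first, then model
# the residual with unknown-concept features. Purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1000
known = rng.normal(size=(n, 3))    # shortcut-invariant, domain-knowledge concepts
unknown = rng.normal(size=(n, 5))  # candidate features for unknown concepts
y = known @ np.array([1.0, -2.0, 0.5]) + unknown[:, 0] + rng.normal(scale=0.1, size=n)

# Stage 1: explain y using known concepts only.
stage1 = LinearRegression().fit(known, y)
residual = y - stage1.predict(known)

# Stage 2: explain the residual with unknown concepts. If a shortcut is
# correlated with these features, stage 2 can absorb it; this is the failure
# mode the paper's second (regularized) approach is designed to avoid.
stage2 = LinearRegression().fit(unknown, residual)

y_hat = stage1.predict(known) + stage2.predict(unknown)
```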
Project description:Human language contains regular syntactic structures and grammatical patterns that should be detectable in its co-occurrence networks. However, most standard complex-network measures can hardly differentiate between co-occurrence networks built from an empirical corpus and those built from scrambled text. In this work, we employ a motif extraction procedure to show that empirical networks have much greater motif densities. We demonstrate that motifs function as efficient and effective shortcuts in language networks, potentially explaining why we are able to generate and decipher language expressions so rapidly. Finally, we suggest a link between motifs and constructions in Construction Grammar, and speculate on the mechanisms behind the emergence of constructions in the early stages of language acquisition.
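A toy sketch of the comparison, using `networkx`: the window size, corpus, and triangle counting are illustrative stand-ins for the paper's motif-extraction procedure on large corpora.

```python
# Build word co-occurrence networks from real vs. scrambled text and compare
# a simple motif count (triangles). Toy example only.
import random
import networkx as nx

def cooccurrence_graph(words, window=3):
    """Link each word to the others inside a sliding window."""
    g = nx.Graph()
    for i, w in enumerate(words):
        for v in words[i + 1 : i + window]:
            if w != v:
                g.add_edge(w, v)
    return g

text = "the cat sat on the mat and the dog sat on the rug".split()
scrambled = text[:]
random.shuffle(scrambled)

real = cooccurrence_graph(text)
null = cooccurrence_graph(scrambled)

# Triangle count as a crude stand-in for motif density.
print(sum(nx.triangles(real).values()) // 3,
      sum(nx.triangles(null).values()) // 3)
```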
Project description:The quantum perceptron is a fundamental building block for quantum machine learning, a multidisciplinary field that brings capabilities of quantum computing, such as state superposition and entanglement, into classical machine learning schemes. Motivated by the techniques of shortcuts to adiabaticity, we propose a sped-up quantum perceptron in which a control field acting on the perceptron is inversely engineered, leading to a rapid nonlinear response with a sigmoid activation function. This results in faster overall perceptron performance compared to quasi-adiabatic protocols, as well as in enhanced robustness against imperfections in the controls.
Project description:Based on research on expertise, a person can be said to possess integrated conceptual knowledge when he or she is able to spontaneously identify task-relevant information in order to solve a problem efficiently. Despite the lack of instruction or explicit cueing, the person should be able to recognize which shortcut strategy can be applied, even when the task context differs from the one in which procedural knowledge about the shortcut was originally acquired. For mental arithmetic, the first signs of such adaptive flexibility should develop as early as primary school. The current study introduces a paper-and-pencil-based as well as an eye-tracking-based approach to unobtrusively measure how students spot and apply (known) shortcut options in mental arithmetic. We investigated the development and the relation of the spontaneous use of two strategies derived from the mathematical concept of commutativity. Children from grade 2 to grade 7 and university students solved three-addend addition problems, which are rarely used in class. Some problems allowed the use of either of two commutativity-based shortcut strategies. Results suggest that from grade 3 onwards both shortcuts were used spontaneously, and application of one shortcut correlated positively with application of the other. The rate of spontaneous usage was substantial but smaller than in an instructed variant. Eye-tracking data suggested similar fixation patterns for spontaneous and instructed shortcut application. The data are consistent with the development of an integrated concept of the mathematical principle, such that it can be spontaneously applied in different contexts and strategies.
Project description:Shortcuts to adiabaticity are powerful quantum control methods that allow quick evolution into target states of otherwise slow adiabatic dynamics. Such methods have widespread applications in quantum technologies, and various shortcut-to-adiabaticity protocols have been demonstrated in closed systems. However, realizing shortcuts to adiabaticity in open quantum systems has remained a challenge because of the complex controls required by existing proposals. Here, we present an experimental demonstration of shortcuts to adiabaticity for open quantum systems, using a superconducting circuit quantum electrodynamics system. By applying a counterdiabatic driving pulse, we reduce the adiabatic evolution time of a single lossy mode from 800 ns to 100 ns. In addition, we propose and implement an optimal control protocol to achieve fast, qubit-unconditional equilibration of multiple lossy modes. Our results pave the way for precise time-domain control of open quantum systems and have potential applications in designing fast open-system protocols of physical and interdisciplinary interest, such as accelerating bioengineering and chemical reaction dynamics.
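For orientation, the textbook counterdiabatic (transitionless-driving) correction for a closed two-level system illustrates what such a driving pulse adds to the bare Hamiltonian; the experiment's actual open-system pulse is engineered differently, so this is a generic sketch only.

```latex
% Bare two-level Hamiltonian and the counterdiabatic term that suppresses
% diabatic transitions exactly (Demirplak-Rice / Berry construction).
H_0(t) = \frac{\hbar}{2}\bigl(\Delta(t)\,\sigma_x + \epsilon(t)\,\sigma_z\bigr),
\qquad
H_{\mathrm{CD}}(t) = \frac{\hbar}{2}\,
\frac{\dot{\Delta}(t)\,\epsilon(t) - \Delta(t)\,\dot{\epsilon}(t)}
     {\Delta(t)^2 + \epsilon(t)^2}\;\sigma_y .
```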
Project description:Artificial intelligence (AI) researchers and radiologists have recently reported AI systems that accurately detect COVID-19 in chest radiographs. However, the robustness of these systems remains unclear. Using state-of-the-art techniques in explainable AI, we demonstrate that recent deep learning systems for detecting COVID-19 from chest radiographs rely on confounding factors rather than medical pathology, creating an alarming situation in which the systems appear accurate but fail when tested in new hospitals. We observe that the approach used to obtain training data for these AI systems introduces a nearly ideal scenario for AI to learn such spurious "shortcuts." Because this approach to data collection has also been used to obtain training data for the detection of COVID-19 in computed tomography scans and for medical imaging tasks related to other diseases, our study reveals a far-reaching problem in medical imaging AI. In addition, we show that evaluation of a model on external data is insufficient to ensure that AI systems rely on medically relevant pathology, since the undesired "shortcuts" learned by AI systems may not impair performance in new hospitals. These findings demonstrate that explainable AI should be seen as a prerequisite to the clinical deployment of machine learning healthcare models.
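One minimal example of such an explainability technique is plain gradient saliency, sketched below with an untrained torchvision classifier as a placeholder; the study's actual models and attribution methods are not reproduced here.

```python
# Gradient saliency: which input pixels most influence the predicted score.
# Untrained ResNet-18 and random input as placeholders; illustrative only.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in for a radiograph
score = model(x)[0].max()                   # top class score
score.backward()                            # backpropagate to the input
saliency = x.grad.abs().max(dim=1).values   # per-pixel attribution map
print(saliency.shape)                       # torch.Size([1, 224, 224])
```

A map concentrated on laterality markers or text annotations rather than lung fields is the kind of "shortcut" signature the study describes.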
Project description:Inference and interpretation of evolutionary processes, in particular of the types and targets of natural selection affecting coding sequences, are critically influenced by the assumptions built into statistical models and tests. If certain aspects of the substitution process (even those not of direct interest) are presumed absent or are modeled with too crude a simplification, estimates of key model parameters can become biased, often systematically, leading to poor statistical performance. Previous work established that failing to accommodate multinucleotide (or multihit, MH) substitutions strongly biases dN/dS-based inference towards false-positive detection of episodic diversifying selection, as does failing to model variation in the rate of synonymous substitution (SRV) among sites. Here, we develop an integrated analytical framework and software tools to simultaneously incorporate these sources of evolutionary complexity into selection analyses. We find that both MH and SRV are ubiquitous in empirical alignments, and that incorporating them has a strong effect on whether or not positive selection is detected (a 1.4-fold reduction) and on the distributions of inferred evolutionary rates. With simulation studies, we show that this effect is not attributable to reduced statistical power caused by using a more complex model. After a detailed examination of 21 benchmark alignments and a new high-resolution analysis showing which parts of an alignment provide support for positive selection, we show that MH substitutions occurring along shorter branches in the tree explain a significant fraction of discrepant results in selection detection. Our results add to the growing body of literature that examines decades-old modeling assumptions (including the absence of MH) and finds them problematic for comparative genomic data analysis. Because multinucleotide substitutions have a significant impact on the detection of natural selection even at the level of an entire gene, we recommend that selection analyses of this type include them as a matter of routine. To facilitate this, we developed, implemented, and benchmarked a simple and well-performing model-testing framework for selection detection, able to screen an alignment for positive selection while accounting for two biologically important confounding processes: site-to-site synonymous rate variation and multinucleotide instantaneous substitutions.
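One common way to fold both effects into a codon model, sketched here in a generic MG94-style form (the exact parameterization in the paper's framework may differ), is to let a site-specific synonymous rate scale all rates at a site and to weight instantaneous changes at two or three codon positions with dedicated multipliers:

```latex
% Rate from codon i to codon j at site s, differing at h nucleotide positions.
% alpha_s = site-specific synonymous rate (SRV), omega = dN/dS,
% delta, psi = multiple-hit (MH) multipliers for h = 2 and h = 3,
% theta_ij = nucleotide-level exchangeability terms, pi_j = target frequency.
q_{ij}^{(s)} \;\propto\; \alpha_s\,\theta_{ij}\,\pi_j\,
\omega^{\,\mathbb{1}[i \to j\ \text{nonsynonymous}]}\,
\delta^{\,\mathbb{1}[h=2]}\,\psi^{\,\mathbb{1}[h=3]} .
```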
Project description:A method is proposed to drive ultrafast non-adiabatic dynamics of an ultracold gas trapped in a time-dependent box potential. The resulting state is free from the spurious excitations associated with the breakdown of adiabaticity and preserves the quantum correlations of the initial state up to a scaling factor. The process relies on the existence of an adiabatic invariant and the inversion of the dynamical self-similar scaling law dictated by it. Its physical implementation generally requires the use of an auxiliary expulsive potential. The method is extended to a broad family of interacting many-body systems. As illustrative examples, we consider the ultrafast expansion of a Tonks-Girardeau gas and of Bose-Einstein condensates in different dimensions, where the method exhibits excellent robustness against different regimes of interactions and against the features of an experimentally realizable box potential.
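For a single-particle eigenstate of the initial box, the self-similar scaling law the method inverts takes the standard form below, with scaling factor b(t) = L(t)/L(0); the auxiliary term is the expulsive potential mentioned in the abstract. This is a sketch of the generic single-particle case, not the paper's full many-body treatment.

```latex
% Self-similar evolution in a box of width L(t) = b(t) L(0); tau(t) rescales
% time, and U_aux enforces the ansatz (expulsive while \ddot{b} > 0).
\psi_n(x,t) = \frac{1}{\sqrt{b(t)}}
\exp\!\left[\frac{i m\,\dot{b}(t)\,x^{2}}{2\hbar\,b(t)}\right]
e^{-i E_n \tau(t)/\hbar}\,
\psi_n\!\left(\frac{x}{b(t)},\,0\right),
\qquad
\tau(t) = \int_0^{t}\frac{dt'}{b(t')^{2}},
\qquad
U_{\mathrm{aux}}(x,t) = -\frac{m\,\ddot{b}(t)}{2\,b(t)}\,x^{2}.
```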
Project description:Systems Toxicology is the integration of classical toxicology with quantitative analysis of large networks of molecular and functional changes occurring across multiple levels of biological organization. Society demands increasingly close scrutiny of the potential health risks associated with exposure to chemicals present in our everyday life, leading to a growing need for more predictive and accurate risk-assessment approaches. Developing such approaches requires a detailed mechanistic understanding of the ways in which xenobiotic substances perturb biological systems and lead to adverse outcomes. Systems Toxicology approaches thus offer modern strategies for gaining such mechanistic knowledge by combining advanced analytical and computational tools. Furthermore, Systems Toxicology is a means for the identification and application of biomarkers for improved safety assessments. In Systems Toxicology, quantitative systems-wide molecular changes in the context of an exposure are measured, and a causal chain of molecular events linking exposures with adverse outcomes (i.e., functional and apical end points) is deciphered. Mathematical models are then built to describe these processes in a quantitative manner. The integrated data analysis leads to the identification of how biological networks are perturbed by the exposure and enables the development of predictive mathematical models of toxicological processes. This perspective integrates current knowledge regarding bioanalytical approaches, computational analysis, and the potential for improved risk assessment.