Project description:Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of "uniform" genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also defined four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain "housekeeping" genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of "ubiquitous" genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a "core" of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers.
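The first metric described above, the number of "uniform" genes, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the coefficient-of-variation cutoff of 0.25, the scaling target, and the toy count matrix are all assumptions made for illustration. The normalization shown is the simple scale-to-a-fixed-total method that the abstract mentions as a common baseline.

```python
from statistics import mean, pstdev


def scale_to_total(counts, target=1000):
    """Scale each sample (column) so that its counts sum to `target`.

    `counts` is a list of genes, each a list of per-sample counts.
    `target` is an arbitrary illustrative value.
    """
    n_samples = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] * target / totals[j] for j in range(n_samples)]
            for row in counts]


def count_uniform_genes(values, cv_cutoff=0.25):
    """Count genes whose coefficient of variation is at most `cv_cutoff`.

    The cutoff is an assumption; the abstract only says "sufficiently low".
    Genes with zero mean are skipped because their CV is undefined.
    """
    uniform = 0
    for row in values:
        m = mean(row)
        if m > 0 and pstdev(row) / m <= cv_cutoff:
            uniform += 1
    return uniform


# Toy data (invented): three genes that track sequencing depth and one
# gene expressed only in the third sample.
counts = [
    [10, 20, 40],
    [30, 60, 120],
    [60, 120, 240],
    [0, 0, 200],
]

uniform_raw = count_uniform_genes(counts)
uniform_scaled = count_uniform_genes(scale_to_total(counts))
```

In this toy matrix no gene is uniform before scaling, while the three depth-tracking genes become uniform afterwards. The sample-specific fourth gene still distorts their scaled values, which hints at why total-count scaling can perform poorly.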
Project description:Density functional theory offers accurate structure prediction at acceptable computational cost, but commonly used approximations suffer from delocalization error; this results in inaccurate predictions of quantities such as energy band gaps of finite and bulk systems, energy level alignments, and electron distributions at interfaces. The localized orbital scaling correction (LOSC) was developed to correct delocalization error by using orbitals localized in space and energy. These localized orbitals span both the occupied and unoccupied spaces and can have fractional occupations in order to correct both the total energy and the one-electron energy eigenvalues. We extend the LOSC method to periodic systems, in which the localized orbitals employed are dually localized Wannier functions. In light of the effect of the bulk environment on the electrostatic interaction between localized orbitals, we modify the LOSC energy correction to include a screened Coulomb kernel. For a test set of semiconductors and large-gap insulators, we show that the screened LOSC method consistently improves the band gap compared to the parent density functional approximation.
Project description:The wide diversity of dendritic trees is one of the most striking features of neural circuits. Here we develop a general quantitative theory relating the total length of dendritic wiring to the number of branch points and synapses. We show that optimal wiring predicts a 2/3 power law between these measures. We demonstrate that the theory is consistent with data from a wide variety of neurons across many different species and helps define the computational compartments in dendritic trees. Our results imply fundamentally distinct design principles for dendritic arbors compared with vascular, bronchial, and botanical trees.
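A power-law relation such as the one predicted above is usually checked by fitting a straight line in log-log space. The sketch below assumes the law takes the form L ∝ n^(2/3), with L the total dendritic length and n the number of branch points or synapses; the abstract does not spell out this form, so it is an assumption here. The prefactor of 25.0 and the data points are synthetic and noise-free, and serve only to show the fitting step.

```python
import math


def fit_power_law_exponent(n_values, lengths):
    """Return the least-squares slope of log(length) against log(n)."""
    xs = [math.log(n) for n in n_values]
    ys = [math.log(length) for length in lengths]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Synthetic data generated from the assumed law, not measurements.
n_values = [10, 50, 100, 500, 1000]
lengths = [25.0 * n ** (2 / 3) for n in n_values]

exponent = fit_power_law_exponent(n_values, lengths)
```

With real reconstructions the fitted slope would carry scatter, and the comparison of interest would be whether 2/3 lies within its confidence interval.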
Project description:Gene expression microarray data is notoriously subject to high signal variability. Moreover, unavoidable variation in the concentration of transcripts applied to microarrays may result in poor scaling of the summarized data which can hamper analytical interpretations. This is especially relevant in a systems biology context, where systematic biases in the signals of particular genes can have severe effects on subsequent analyses. Conventionally it would be necessary to replace the mismatched arrays, but individual time points cannot be rerun and inserted because of experimental variability. It would therefore be necessary to repeat the whole time series experiment, which is both impractical and expensive. We explain how scaling mismatches occur in data summarized by the popular MAS5 (GCOS; Affymetrix) algorithm, and propose a simple recursive algorithm to correct them. Its principle is to identify a set of constant genes and to use this set to rescale the microarray signals. We study the properties of the algorithm using artificially generated data and apply it to experimental data. We show that the set of constant genes it generates can be used to rescale data from other experiments, provided that the underlying system is similar to the original. We also demonstrate, using a simple example, that the method can successfully correct existing imbalances in the data. The set of constant genes obtained for a given experiment can be applied to other experiments, provided the systems studied are sufficiently similar. This type of rescaling is especially relevant in systems biology applications using microarray data.
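The principle stated above (identify a set of constant genes, then use it to rescale the arrays) can be sketched as a simple iteration. This is one possible reading, not the authors' published algorithm: the selection rule (keep the genes with the lowest coefficient of variation), the `keep_fraction` parameter, the median-based scale factor, and the toy data are all assumptions.

```python
from statistics import mean, median, pstdev


def rescale_with_constant_genes(signals, keep_fraction=0.75, max_iter=20):
    """Iteratively pick low-variation genes and rescale arrays with them.

    `signals` is a list of genes, each a list of positive per-array signals.
    Returns (scaled signals, per-array factors, indices of the gene set
    that produced those factors).
    """
    n_genes, n_arrays = len(signals), len(signals[0])
    n_keep = max(1, int(n_genes * keep_fraction))
    constant = list(range(n_genes))  # start from all genes
    for _ in range(max_iter):
        used = constant
        # Array factor: median over the current gene set of the ratio
        # between the array's signal and that gene's mean signal.
        factors = [
            median(signals[g][j] / mean(signals[g]) for g in used)
            for j in range(n_arrays)
        ]
        scaled = [[row[j] / factors[j] for j in range(n_arrays)]
                  for row in signals]
        # Re-select the genes that vary least after rescaling.
        cv = [pstdev(row) / mean(row) for row in scaled]
        ranked = sorted(range(n_genes), key=lambda g: cv[g])
        constant = sorted(ranked[:n_keep])
        if constant == used:
            break  # the gene set no longer changes
    return scaled, factors, used


# Toy data (invented): the second array is over-scaled by a factor of 2,
# and the last gene genuinely changes between arrays.
signals = [
    [100, 200, 100],
    [50, 100, 50],
    [80, 160, 80],
    [10, 200, 40],
]

scaled, factors, constant_genes = rescale_with_constant_genes(signals)
```

On this toy input the iteration settles on the first three genes and assigns the second array a factor twice that of the others, so the scaling mismatch is removed while the changing gene keeps its variation.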
Project description:We introduce a repeater scheme to efficiently distribute multipartite entangled states in a quantum network with optimal scaling. The scheme allows the generation of graph states such as 2D and 3D cluster states of growing size or GHZ states over arbitrary distances, with a constant overhead per node/channel that is independent of the distance. The approach is genuinely multipartite, and is based on the measurement-based implementation of multipartite hashing, an entanglement purification protocol that operates on a large ensemble together with local merging/connection of elementary building blocks. We analyze the performance of the scheme in a setting where local or global storage is limited, and compare it to bipartite and hybrid approaches that are based on the distribution of entangled pairs. We find that the multipartite approach offers a storage advantage, which results in higher efficiency and better performance in certain parameter regimes. We generalize our approach to arbitrary network topologies and different target graph states.
Project description:Recently it has been shown that the control energy required to control a dynamical complex network is prohibitively large when there are only a few control inputs. Most methods to reduce the control energy have focused on where, in the network, to place additional control inputs. Here, in contrast, we show that by controlling the states of a subset of the nodes of a network, rather than the state of every node, while holding the number of control signals constant, the required energy to control a portion of the network can be reduced substantially. The energy requirements exponentially decay with the number of target nodes, suggesting that large networks can be controlled by a relatively small number of inputs as long as the target set is appropriately sized. We validate our conclusions in model and real networks to arrive at an energy scaling law to better design control objectives regardless of system size, energy restrictions, state restrictions, input node choices and target node choices.
Project description:A reliable method for the determination of bulk-solvent model parameters and an overall anisotropic scale factor is of increasing importance as structure determination becomes more automated. Current protocols require the manual inspection of refinement results in order to detect errors in the calculation of these parameters. Here, a robust method for determining bulk-solvent and anisotropic scaling parameters in macromolecular refinement is described. The implementation of a maximum-likelihood target function for determining the same parameters is also discussed. The formulas and corresponding derivatives of the likelihood function with respect to the solvent parameters and the components of the anisotropic scale matrix are presented. These algorithms are implemented in the CCTBX bulk-solvent correction and scaling module.