Community structure detection for overlapping modules through mathematical programming in protein interaction networks.
ABSTRACT: Community structure detection has proven to be important in revealing the underlying properties of complex networks. The standard problem, where a partition of disjoint communities is sought, has been continually adapted to offer more realistic models of interactions in these systems. Here, a two-step procedure is outlined for exploring the concept of overlapping communities. First, a hard partition is detected by employing existing methodologies. We then propose a novel mixed integer non linear programming (MINLP) model, known as OverMod, which transforms disjoint communities to overlapping. The procedure is evaluated through its application to protein-protein interaction (PPI) networks of the rat, E. coli, yeast and human organisms. Connector nodes of hard partitions exhibit topological and functional properties indicative of their suitability as candidates for multiple module membership. OverMod identifies two types of connector nodes, inter and intra-connector, each with their own particular characteristics pertaining to their topological and functional role in the organisation of the network. Inter-connector proteins are shown to be highly conserved proteins participating in pathways that control essential cellular processes, such as proliferation, differentiation and apoptosis and their differences with intra-connectors is highlighted. Many of these proteins are shown to possess multiple roles of distinct nature through their participation in different network modules, setting them apart from proteins that are simply 'hubs', i.e. proteins with many interaction partners but with a more specific biochemical role.
Project description:Finding communities in multilayer networks is a vital step in understanding the structure and dynamics of these layers, where each layer represents a particular type of relationship between nodes in the natural world. However, most community discovery methods for multilayer networks may ignore the interplay between layers or the unique topological structure in a layer. Moreover, most of them can only detect non-overlapping communities. In this paper, we propose a new community discovery method for multilayer networks, which leverages the interplay between layers and the unique topology in a layer to reveal overlapping communities. Through a comprehensive analysis of edge behaviors within and across layers, we first calculate the similarities for edges from the same layer and the cross layers. Then, by leveraging these similarities, we can construct a dendrogram for the multilayer networks that takes both the unique topological structure and the important interplay into consideration. Finally, by introducing a new community density metric for multilayer networks, we can cut the dendrogram to get the overlapping communities for these layers. By applying our method on both synthetic and real-world datasets, we demonstrate that our method has an accurate performance in discovering overlapping communities in multilayer networks.
Project description:Real-world complex networks are composed of non-random quantitative interactions. Identifying communities of nodes that tend to interact more with each other than the network as a whole is a key research focus across multiple disciplines, yet many community detection algorithms only use information about the presence or absence of interactions between nodes. Weighted modularity is a potential method for evaluating the quality of community partitions in quantitative networks. In this framework, the optimal community partition of a network can be found by searching for the partition that maximizes modularity. Attempting to find the partition that maximizes modularity is a computationally hard problem requiring the use of algorithms. QuanBiMo is an algorithm that has been proposed to maximize weighted modularity in bipartite networks. This paper introduces two new algorithms, LPAwb+ and DIRTLPAwb+, for maximizing weighted modularity in bipartite networks. LPAwb+ and DIRTLPAwb+ robustly identify partitions with high modularity scores. DIRTLPAwb+ consistently matched or outperformed QuanBiMo, while the speed of LPAwb+ makes it an attractive choice for detecting the modularity of larger networks. Searching for modules using weighted data (rather than binary data) provides a different and potentially insightful method for evaluating network partitions.
Project description:Large-scale analysis of functional MRI data has revealed that brain regions can be grouped into stable "networks" or communities. In many instances, the communities are characterized as relatively disjoint. Although recent work indicates that brain regions may participate in multiple communities (for example, hub regions), the extent of community overlap is poorly understood. To address these issues, here we investigated large-scale brain networks based on "rest" and task human functional MRI data by employing a mixed-membership Bayesian model that allows each brain region to belong to all communities simultaneously with varying membership strengths. The approach allowed us to 1) compare the structure of disjoint and overlapping communities; 2) determine the relationship between functional diversity (how diverse is a region's functional activation repertoire) and membership diversity (how diverse is a region's affiliation to communities); 3) characterize overlapping community structure; 4) characterize the degree of non-modularity in brain networks; 5) study the distribution of "bridges", including bottleneck and hub bridges. Our findings revealed the existence of dense community overlap that was not limited to "special" hubs. Furthermore, the findings revealed important differences between community organization during rest and during specific task states. Overall, we suggest that dense overlapping communities are well suited to capture the flexible and task dependent mapping between brain regions and their functions.
Project description:Minimal hepatic encephalopathy (MHE) is associated with changes in functional connectivity. To investigate the patterns of modular changes of the functional connectivity in the progression of MHE, resting-state functional magnetic resonance imaging was acquired in 24 MHE patients, 31 cirrhotic patients without minimal hepatic encephalopathy (non-HE), and 38 healthy controls. Newman's metric, the modularity Q value, was maximized and compared in three groups. Topological roles with the progression of MHE were illustrated by intra- and intermodular connectivity changes. Results showed that the Q value of MHE patients was significantly lower than that of controls (P < 0.01) rather than that of non-HE patients (P > 0.05), which was correlated with neuropsychological test scores rather than the ammonia level and Child-Pugh score. Less intrasubcortical connections and more isolated subcortical modules were found with the progression of MHE. The non-HE patients had the same numbers of connect nodes as controls and had more hubs compared with MHE patients and healthy controls. Our findings supported that both intra- and intermodular connectivity, especially those related to subcortical regions, were continuously impaired in cirrhotic patients. The adjustments of hubs and connector nodes in non-HE patients could be a compensation for the decreased modularity in their functional connectivity networks.
Project description:BACKGROUND:Networks whose nodes have labels can seem complex. Fortunately, many have substructures that occur often ("motifs"). A societal example of a motif might be a household. Replacing such motifs by named supernodes reduces the complexity of the network and can bring out insightful features. Doing so repeatedly may give hints about higher level structures of the network. We call this recursive process Recursive Supernode Extraction. RESULTS:This paper describes algorithms and a tool to discover disjoint (i.e. non-overlapping) motifs in a network, replacing those motifs by new nodes, and then recursing. We show applications in food-web and protein-protein interaction (PPI) networks where our methods reduce the complexity of the network and yield insights. CONCLUSIONS:SuperNoder is a web-based and standalone tool which enables the simplification of big graphs based on the reduction of high frequency motifs. It applies various strategies for identifying disjoint motifs with the goal of enhancing the understandability of networks.
Project description:Community detection algorithms have been widely used to study the organization of complex networks like the brain. These techniques provide a partition of brain regions (or nodes) into clusters (or communities), where nodes within a community are densely interconnected with one another. In their simplest application, community detection algorithms are agnostic to the presence of community hierarchies: clusters embedded within clusters of other clusters. To address this limitation, we exercise a multi-scale extension of a common community detection technique, and we apply the tool to synthetic graphs and to graphs derived from human neuroimaging data, including structural and functional imaging data. Our multi-scale community detection algorithm links a graph to copies of itself across neighboring topological scales, thereby becoming sensitive to conserved community organization across levels of the hierarchy. We demonstrate that this method is sensitive to topological inhomogeneities of the graph's hierarchy by providing a local measure of community stability and inter-scale reliability across topological scales. We compare the brain's structural and functional network architectures, and we demonstrate that structural graphs display a more prominent hierarchical community organization than functional graphs. Finally, we build an explicitly multimodal multiplex graph that combines both structural and functional connectivity in a single model, and we identify the topological scales where resting state functional connectivity and underlying structural connectivity show similar versus unique hierarchical community architecture. Together, our results demonstrate the advantages of the multi-scale community detection algorithm in studying hierarchical community structure in brain graphs, and they illustrate its utility in modeling multimodal neuroimaging data.
Project description:It has been a long-standing goal in systems biology to find relations between the topological properties and functional features of protein networks. However, most of the focus in network studies has been on highly connected proteins ("hubs"). As a complementary notion, it is possible to define bottlenecks as proteins with a high betweenness centrality (i.e., network nodes that have many "shortest paths" going through them, analogous to major bridges and tunnels on a highway map). Bottlenecks are, in fact, key connector proteins with surprising functional and dynamic properties. In particular, they are more likely to be essential proteins. In fact, in regulatory and other directed networks, betweenness (i.e., "bottleneck-ness") is a much more significant indicator of essentiality than degree (i.e., "hub-ness"). Furthermore, bottlenecks correspond to the dynamic components of the interaction network-they are significantly less well coexpressed with their neighbors than non-bottlenecks, implying that expression dynamics is wired into the network topology.
Project description:Community detection is important for understanding networks. Previous studies observed that communities are not necessarily disjoint and might overlap. It is also agreed that some outlier vertices participate in no community, and some hubs in a community might take more important roles than others. Each of these facts has been independently addressed in previous work. But there is no algorithm, to our knowledge, that can identify these three structures altogether. To overcome this limitation, we propose a novel model where vertices are measured by their centrality in communities, and define the identification of overlapping communities, hubs, and outliers as an optimization problem, calculated by nonnegative matrix factorization. We test this method on various real networks, and compare it with several competing algorithms. The experimental results not only demonstrate its ability of identifying overlapping communities, hubs, and outliers, but also validate its superior performance in terms of clustering quality.
Project description:Network-based analyses of brain imaging data consistently reveal distinct modules and connector nodes with diverse global connectivity across the modules. How discrete the functions of modules are, how dependent the computational load of each module is to the other modules' processing, and what the precise role of connector nodes is for between-module communication remains underspecified. Here, we use a network model of the brain derived from resting-state functional MRI (rs-fMRI) data and investigate the modular functional architecture of the human brain by analyzing activity at different types of nodes in the network across 9,208 experiments of 77 cognitive tasks in the BrainMap database. Using an author-topic model of cognitive functions, we find a strong spatial correspondence between the cognitive functions and the network's modules, suggesting that each module performs a discrete cognitive function. Crucially, activity at local nodes within the modules does not increase in tasks that require more cognitive functions, demonstrating the autonomy of modules' functions. However, connector nodes do exhibit increased activity when more cognitive functions are engaged in a task. Moreover, connector nodes are located where brain activity is associated with many different cognitive functions. Connector nodes potentially play a role in between-module communication that maintains the modular function of the brain. Together, these findings provide a network account of the brain's modular yet integrated implementation of cognitive functions.
Project description:Clustering and community detection provide a concise way of extracting meaningful information from large datasets. An ever growing plethora of data clustering and community detection algorithms have been proposed. In this paper, we address the question of ranking the performance of clustering algorithms for a given dataset. We show that, for hard clustering and community detection, Linsker's Infomax principle can be used to rank clustering algorithms. In brief, the algorithm that yields the highest value of the entropy of the partition, for a given number of clusters, is the best one. We show indeed, on a wide range of datasets of various sizes and topological structures, that the ranking provided by the entropy of the partition over a variety of partitioning algorithms is strongly correlated with the overlap with a ground truth partition The codes related to the project are available in https://github.com/Sandipan99/Ranking_cluster_algorithms.