Project description: Background: Gene and protein interaction data are often represented as interaction networks, where nodes stand for genes or gene products and each edge stands for a relationship between a pair of gene nodes. Commonly, that relationship within a pair is specified by high similarity between profiles (vectors) of experimentally defined interactions of each of the two genes with all other genes in the genome; only gene pairs that interact with similar sets of genes are linked by an edge in the network. The tight groups of genes/gene products that work together in a cell can be discovered by the analysis of those complex networks. Results: We show that the choice of the similarity measure between pairs of gene vectors impacts the properties of networks and of gene modules detected within them. We re-analyzed well-studied data on yeast genetic interactions, constructed four genetic networks using four different similarity measures, and detected gene modules in each network using the same algorithm. The four networks induced different numbers of putative functional gene modules, and each similarity measure induced some unique modules. In an example of a putative functional connection suggested by comparing genetic interaction vectors, we predict a link between SUN-domain proteins and protein glycosylation in the endoplasmic reticulum. Conclusions: The discovery of molecular modules in genetic networks is sensitive to the way of measuring similarity between profiles of gene interactions in a cell. In the absence of a formal way to choose the "best" measure, it is advisable to explore the measures with different mathematical properties, which may identify different sets of connections between genes.
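The dependence of network structure on the similarity measure can be illustrated with a minimal sketch. The abstract does not name its four measures, so the three used here (Pearson correlation, cosine similarity, Jaccard index) are assumptions chosen for illustration. The gene names, interaction scores, threshold, and the helper `build_edges` are also hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation between two interaction profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0

def cosine(u, v):
    """Cosine of the angle between two interaction profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def jaccard(u, v):
    """Jaccard index on binarized profiles (any nonzero score counts as an interaction)."""
    a = {i for i, x in enumerate(u) if x != 0}
    b = {i for i, x in enumerate(v) if x != 0}
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_edges(profiles, measure, threshold):
    """Link every gene pair whose profile similarity reaches the threshold."""
    genes = sorted(profiles)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if measure(profiles[g], profiles[h]) >= threshold:
                edges.add((g, h))
    return edges

# Hypothetical interaction scores of four genes against five partner genes.
profiles = {
    "geneA": [-1.0, 0.0, -0.8, 0.0, 0.5],
    "geneB": [-0.9, 0.0, -0.7, 0.1, 0.4],
    "geneC": [0.0, -0.6, 0.0, -0.9, 0.0],
    "geneD": [-0.1, -0.1, -0.1, -0.1, 0.9],
}

for name, measure in [("pearson", pearson), ("cosine", cosine), ("jaccard", jaccard)]:
    print(name, sorted(build_edges(profiles, measure, threshold=0.6)))
```

With the same data and the same threshold, the edge sets differ between measures, so any module detection run on these networks would start from different graphs.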
Project description: We present a simple and general approach, termed REsemble, for quantifying population overlap and structural similarity between ensembles. This approach captures improvements in the quality of ensembles determined using increasing amounts of input experimental data (improvements that go undetected when conventional methods for comparing ensembles are used) and reveals unexpected similarities between RNA ensembles determined using NMR and molecular dynamics simulations.
Project description: The comparison of protein sequences according to similarity is a fundamental aspect of today's biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is increasing exponentially. The best-known sequence comparison methods are alignment-based. They generally give excellent results when the sequences under study are closely related, but they are time-consuming. Herein, a new alignment-free method is introduced. Our technique depends on a new graphical representation and descriptor. The graphical representation is a simple way to visualize protein sequences. The descriptor compresses the primary sequence into a single vector composed of only two values. Our approach gives good results with both short and long sequences in little computation time. It is applied to nine beta globin, nine ND5 (NADH dehydrogenase subunit 5), and 24 spike protein sequences. Correlation and significance analyses are also introduced to compare our similarity/dissimilarity results with the results of other approaches and with sequence homology.
Project description: Background: In medical training, statistics is considered a very difficult course to learn and teach. Current studies have found that students' attitudes toward statistics can influence their learning process. Measuring, evaluating and monitoring changes in students' attitudes toward statistics are important. Few studies have focused on the attitudes of postgraduates, especially medical postgraduates. Our purpose was to understand current attitudes regarding statistics held by medical postgraduates and explore their effects on students' achievement. We also wanted to explore the influencing factors and the sources of these attitudes and monitor their changes after a systematic statistics course. Methods: A total of 539 medical postgraduates enrolled in a systematic statistics course completed the pre-form of the Survey of Attitudes Toward Statistics-28 scale, and 83 postgraduates were selected randomly from among them to complete the post-form scale after the course. Results: Most medical postgraduates held positive attitudes toward statistics, but they thought statistics was a very difficult subject. The attitudes mainly came from experiences in a former statistical or mathematical class. Age, level of statistical education, research experience, specialty and mathematical background may influence postgraduate attitudes toward statistics. There were significant positive correlations between course achievement and attitudes toward statistics. In general, student attitudes showed negative changes after completing a statistics course. Conclusions: The importance of student attitudes toward statistics must be recognized in medical postgraduate training. To make sure all students have a positive learning environment, statistics teachers should measure their students' attitudes and monitor how they change during a course. Necessary assistance should be offered to students who develop negative attitudes.
Project description: Sequence comparison is one of the foundations of bioinformatics and can be used to study evolutionary relationships among sequences. In this study, a 2D spectrum-like graphical representation of protein sequences is presented based on the hydrophobicity scale of amino acids. The frequencies of amplitudes of 4-subsequences are adopted to characterize a spectrum-like graph, and a 17D vector is used as the descriptor of a protein sequence. The χ² value of the compatibility test is computed. The new similarity analysis approach is illustrated on all protein sequences encoded by the mitochondrial genomes of 20 different species. Finally, comparison with the ClustalW method shows the utility of our method.
Project description: Objective: RAId is a software package that has been actively developed for the past 10 years for computationally and visually analyzing MS/MS data. Founded on rigorous statistical methods, RAId's core program computes accurate E-values for peptides and proteins identified during database searches. Our main goal here is to make this robust tool readily accessible to the proteomics community by developing a graphical user interface (GUI). Results: We have constructed a graphical user interface to facilitate the use of RAId on users' local machines. Written in Java, RAId_GUI not only makes RAId easy to run but also provides tools for data/spectra visualization, MS-product analysis, molecular isotopic distribution analysis, and graphing the retrieval versus the proportion of false discoveries. The results viewer displays the analysis results and allows users to download them. Both the knowledge-integrated organismal databases and the code package (containing source code, the graphical user interface, and a user manual) are available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads/raid.html .
Project description: This study introduces two new statistics for measuring the score comparability of computerized adaptive tests (CATs) based on comparing conditional standard errors of measurement (CSEMs) for examinees who achieved the same scale scores. One statistic is designed to evaluate score comparability of alternate CAT forms for individual scale scores, while the other statistic is designed to evaluate the overall score comparability of alternate CAT forms. The effectiveness of the new statistics is illustrated using data from grade 3 through 8 reading and math CATs. Results suggest that both CATs demonstrated reasonably high levels of score comparability, that score comparability was lower at very high or low scores where few students score, and that using random samples with fewer students per grade had little impact on score comparability. Results also suggest that score comparability was sometimes higher when only the bottom 20% of scorers, rather than all students, were used to calculate overall score comparability. Additional discussion related to applying the statistics in different contexts is provided.
Project description: We provide a comprehensive review of simple and advanced statistical analyses using an intuitive visual approach that explicitly models Latent Variables (LV). This method can better illuminate what is assumed in each analytical method and what is actually estimated, by translating the causal relationships embedded in the graphical models into equation form. We recommend a graphical display, rooted in century-old path analysis, that details all parameters of each statistical model, and suggest labeling that clarifies what is given vs. what is estimated. In the process, we link classical and modern analyses under the broader umbrella of Generalized Latent Variable Modeling, and demonstrate that LVs are present in all statistical approaches, yet are unnecessarily overlooked until they are directly 'seen' in graphical displays. The advantages of directly modeling LVs are shown with examples of analyses from the ActiveS intervention designed to increase physical activity.
Project description: Motivation: Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. Results: We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared with other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset sizes. Availability and implementation: The K2 and K2* approaches are implemented in the R language as a package that is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Contact: yueljiang@163.com. Supplementary information: Supplementary data are available at Bioinformatics online.
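The general idea of a rank-based, alignment-free comparison can be sketched as follows. This is not the K2 statistic itself, whose definition the abstract does not give. It is a minimal illustration under assumptions: each sequence is reduced to a vector of k-mer counts, and Kendall's tau-b between the two count vectors is used as the similarity score. The function names, the DNA alphabet, the choice k = 2, and the example sequences are all hypothetical.

```python
import math
from itertools import product

def kmer_counts(seq, k, alphabet="ACGT"):
    """Count every possible k-mer over the alphabet, in a fixed (sorted) order."""
    counts = {"".join(p): 0 for p in product(alphabet, repeat=k)}
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing unexpected symbols
            counts[word] += 1
    return [counts[w] for w in sorted(counts)]

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation (handles ties), O(n^2) pair enumeration."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue          # tied in both vectors: ignored by tau-b
            elif dx == 0:
                ties_x += 1       # tied only in x
            elif dy == 0:
                ties_y += 1       # tied only in y
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x) *
                      (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

def kmer_rank_similarity(seq_a, seq_b, k=2):
    """Similarity of two sequences as tau-b between their k-mer count vectors."""
    return kendall_tau_b(kmer_counts(seq_a, k), kmer_counts(seq_b, k))

s1 = "ACGTACGTGACG"
s2 = "ACGTACGTGACC"   # differs from s1 in the last base only
s3 = "TTTTGGGGTTTT"   # very different composition

print(kmer_rank_similarity(s1, s2))
print(kmer_rank_similarity(s1, s3))
```

No alignment is computed at any point, which is where the speed advantage of this family of methods comes from. The quadratic pair loop is kept for readability; an O(n log n) tau computation would be preferable for long count vectors.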
Project description: Experiencing music often entails the perception of a periodic beat. Despite being a widespread phenomenon across cultures, the nature and neural underpinnings of beat perception remain largely unknown. In the last decade, there has been a growing interest in developing methods to probe these processes, particularly to measure the extent to which beat-related information is contained in behavioral and neural responses. Here, we propose a theoretical framework and practical implementation of an analytic approach to capture beat-related periodicity in empirical signals using frequency-tagging. We highlight its sensitivity in measuring the extent to which the periodicity of a perceived beat is represented in a range of continuous time-varying signals with minimal assumptions. We also discuss a limitation of this approach with respect to its specificity when restricted to measuring beat-related periodicity only from the magnitude spectrum of a signal and introduce a novel extension of the approach based on autocorrelation to overcome this issue. We test the new autocorrelation-based method using simulated signals and by re-analyzing previously published data and show how it can be used to process measurements of brain activity as captured with surface EEG in adults and infants in response to rhythmic inputs. Taken together, the theoretical framework and related methodological advances confirm and elaborate the frequency-tagging approach as a promising window into the processes underlying beat perception and, more generally, temporally coordinated behaviors.
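How autocorrelation can expose beat-related periodicity may be illustrated with a small sketch. This is not the authors' method, whose details the abstract does not give. It assumes a simple contrast: the mean normalized autocorrelation at lags matching the beat period minus the mean at lags unrelated to the beat. The function names, the simulated pulse train, and the chosen lags are hypothetical.

```python
def autocorrelation(signal, lag):
    """Normalized autocorrelation of a mean-centered signal at one lag (in samples)."""
    n = len(signal)
    if lag >= n:
        return 0.0
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return 0.0
    return sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy

def beat_periodicity_contrast(signal, beat_lags, other_lags):
    """Mean autocorrelation at beat-related lags minus mean at beat-unrelated lags."""
    beat = sum(autocorrelation(signal, lag) for lag in beat_lags) / len(beat_lags)
    other = sum(autocorrelation(signal, lag) for lag in other_lags) / len(other_lags)
    return beat - other

# Simulated response: one unit pulse every 8 samples (the assumed beat period).
beat_period = 8
signal = [1.0 if i % beat_period == 0 else 0.0 for i in range(200)]

contrast = beat_periodicity_contrast(
    signal,
    beat_lags=[8, 16],      # multiples of the beat period
    other_lags=[3, 5, 11],  # lags that do not align with the beat
)
print(contrast)
```

A clearly positive contrast indicates that the signal repeats at the beat period. Working on lags rather than on the magnitude spectrum alone is the kind of extension the abstract describes, though the exact statistic used there may differ.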