Complex networks with causal relationships among variables are pervasive in biology. Their study, however, requires special modeling approaches. Structural equation models (SEM) allow the representation of causal mechanisms among phenotypic traits and inferring the magn ...
In this chapter we describe the Association Weight Matrix (AWM), a novel procedure to exploit the results from genome-wide association studies (GWAS) and, in combination with network inference algorithms, generate gene networks with regulatory and functional significance. In sim ...
In recent years R has become the de facto statistical programming language of choice for statisticians and it is also arguably the most widely used generic environment for analysis of high-throughput genomic data. In this chapter we discuss some approaches to improve the performance of R when worki ...
Validation of the results of genome-wide association studies or genomic selection studies is an essential component of the experimental program. Validation allows users to quantify the benefit of applying gene tests or genomic prediction, relative to the costs of implementing the pr ...
Genotype imputation is a cost-effective way to increase the power of genomic selection or genome-wide association studies. While several genotype imputation algorithms are available, this chapter focuses on a heuristic algorithm, as implemented in the AlphaImpute software. This ...
Knowledge of phase has many potential applications for empowering genomic information. For example, phase can facilitate the identification of identical-by-descent sharing between pairs of individuals, as part of the process of genotype imputation, or to facilitate parent of origin of ...
We herein present a haplotype-based method to perform genome-wide association studies. The method relies on hidden Markov models to describe haplotypes from a population as a mosaic of a set of ancestral haplotypes. For a given position in the genome, haplotypes deriving from the same ancest ...
Genomic best linear unbiased prediction (gBLUP) is a method that utilizes genomic relationships to estimate the genetic merit of an individual. For this purpose, a genomic relationship matrix is used, estimated from DNA marker information. The matrix defines the covariance between in ...
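As an illustration of how a genomic relationship matrix can be estimated from marker data, the following is a minimal sketch. It assumes genotypes coded as allele counts (0, 1, 2) and uses a commonly cited construction in which genotypes are centred by twice the allele frequency and scaled by 2Σp(1−p). The abstract does not state which estimator the chapter uses, so the formula and the function name `genomic_relationship_matrix` are assumptions made for illustration only.

```python
def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix from allele counts.

    genotypes: list of rows, one per individual, each a list of
    allele counts (0, 1 or 2) per marker. Hypothetical helper;
    the centring and scaling used here are an assumed estimator.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Centre each genotype by its expected value, 2p.
    centred = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    # Scale so that the matrix is comparable to a pedigree relationship matrix.
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic; matrix is undefined")
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]  # made-up genotypes
    for row in genomic_relationship_matrix(toy):
        print([round(x, 3) for x in row])
```

The off-diagonal entries play the role of the covariance between individuals; in practice the matrix is built with a linear algebra library rather than nested lists.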
Homozygosity is a component of genetic patterning that can be used to search for the cause of genetic disease. In this chapter, methods are presented to analyze SNP data for the presence of homozygosity. Two exercises demonstrate methods to define runs of homozygosity, to identify shared homoz ...
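To make the idea of a run of homozygosity concrete, here is a deliberately simple sketch. It assumes one individual's SNP calls are given in genomic order as allele counts (0 or 2 homozygous, 1 heterozygous) and reports maximal stretches of consecutive homozygous calls. The function name and the `min_snps` threshold are hypothetical, and this is not the procedure used in the chapter's exercises; practical definitions usually also consider physical length, missing calls, and a tolerance for occasional heterozygous calls.

```python
def runs_of_homozygosity(calls, min_snps=3):
    """Return (start, end) index pairs of homozygous runs.

    calls: allele counts along a chromosome for one individual
    (0 or 2 = homozygous, 1 = heterozygous). Indices are inclusive.
    Simplified illustration: any heterozygous call ends a run.
    """
    runs = []
    start = None  # index where the current run began, if any
    for i, g in enumerate(calls):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(calls) - start >= min_snps:
        runs.append((start, len(calls) - 1))
    return runs


if __name__ == "__main__":
    calls = [0, 0, 2, 1, 2, 2, 0, 0, 0, 1, 0, 0]  # made-up data
    print(runs_of_homozygosity(calls))
```

Comparing the intervals returned for several individuals is one simple way to look for shared homozygous segments.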
The BLR (Bayesian linear regression) package of R implements several Bayesian regression models for continuous traits. The package was originally developed for implementing the Bayesian LASSO (BL) of Park and Casella (J Am Stat Assoc 103(482):681–686, 2008), extended to accommodate fi ...
Genomic prediction exploits historical genotypic and phenotypic data to predict performance on selection candidates based only on their genotypes. It achieves this by a process known as training that derives the values of all the chromosome fragments that can be characterized by regr ...
Bayesian multiple-regression methods are being successfully used for genomic prediction and selection. These regression models simultaneously fit many more markers than the number of observations available for the analysis. Thus, Bayes’ theorem is used to combine prior belie ...
Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially ...
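A small sketch may help show how bit string fingerprints are used in a similarity search. It assumes each fingerprint is stored as the set of positions of its "on" bits and compares fingerprints with the Tanimoto coefficient, a widely used similarity measure that the abstract itself does not name. The molecule names, bit positions, and function names below are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is an iterable of on-bit positions.
    Returns shared bits divided by bits set in either fingerprint.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(query, library):
    """Rank library entries by decreasing similarity to the query.

    library: dict mapping a molecule name to its on-bit positions.
    """
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = {1, 4, 7, 9}                   # hypothetical query fingerprint
    library = {
        "mol_a": {1, 4, 7, 9},
        "mol_b": {1, 4, 8},
        "mol_c": {2, 3},
    }
    for name, score in rank_by_similarity(query, library):
        print(name, round(score, 2))
```

In a virtual screen, the top of such a ranking is taken as the candidate list, so search performance depends on how well the fingerprint design places active compounds near the query.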
The Naïve Bayesian Classifier, as well as related classification and regression approaches based on Bayes’ theorem, has experienced increased attention in the cheminformatics world in recent years. In this contribution, we first review the mathematical framework on which Bayes’ me ...
This chapter reviews the application of fragment descriptors at different stages of virtual screening: filtering, similarity search, and direct activity assessment using QSAR/QSPR models. Several case studies are considered. It is demonstrated that the power of fragment descri ...
For chemical genetics and chemical biology, an important task is the identification of small molecules that are selective against individual targets and can be used as molecular probes for specific biological functions. To aid in the development of computational methods for selectiv ...
This introductory chapter gives a brief overview of the history of cheminformatics, and then summarizes some recent trends in computing, cultures, open systems, chemical structure representation, docking, de novo design, fragment-based drug design, molecular similarity, quan ...
The development of computational methods that can estimate the various pharmacodynamic and pharmacokinetic parameters that characterise the interaction of drugs with biological systems has been a highly pursued objective over the last 50 years. Among all, methods based on ligand i ...
There is a critical need for improving the level of chemistry awareness in systems biology. The data and information related to modulation of genes and proteins by small molecules continue to accumulate at the same time as simulation tools in systems biology and whole body physiologically ba ...
The aim of this chapter is to describe the stages of early drug discovery that can be assisted by techniques commonly used in the field of cheminformatics. In fact, cheminformatics tools can be applied all the way from the design of compound libraries and the analysis of HTS results, to the discovery of fu ...