S of DNA should lead to higher levels of RNA expression. It is therefore of great interest to study the strength of this interaction, if any, between aCGH and RNA expression measurements on different genes.
Gene expression and copy number variation data have been studied extensively, both to assess differential expression of genes [6] and to find segments along the DNA that show CNAs [7], [8]. Statistical and computational models for integrating different types of data are becoming a popular topic in the recent literature, even though only a few studies have considered fully model-based approaches. The study in [9] was among the earliest to investigate the direct association between the two types of data in breast cancer cell lines and tissue samples, using an approach based mainly on descriptive statistics. Van Wieringen and Van de Wiel [10], attempting to mitigate the high noise in the raw expression measurements of the DNA and RNA, proposed a sampling model for RNA expression that incorporates estimated probabilities of the corresponding CNAs. They subsequently developed nonparametric adaptive tests to study whether the estimated copy number variations at the DNA level induce differential gene expression at the RNA level. More recently, [11] presented a double-layered mixture model (DLMM) that directly models segmental patterns in the copy number data to produce CNA profiles and simultaneously scores the association between copy number and gene expression data. The DLMM assigns high scores to elevated or reduced expression measurements only if the expression changes are observed consistently across samples with copy number aberration. An important biological premise of the model is that by integrating DNA copy number and RNA expression data, we will gain more knowledge about the underlying biological process.
For example, a high or low correlation between a copy number aberration (CNA) for a gene marker and its abnormal RNA expression would indicate different carcinogenic mechanisms and therefore different treatment selections [12], [13]. We describe a Bayesian mixture model that converts the noisy raw intensity measurements of the DNA and RNA into probabilities of expression, which are subsequently modeled as latent parameters. The integration of the two platforms is then realized by jointly modeling the probabilities of expression through a probit regression. Our aim, however, is not only to evaluate the relative contribution of large genetic variants, such as CNAs, to gene expression, but also to make inference using both differential expression of the genes and differential copy number variation of the same set of genes. Moreover, our fully model-based approach allows us, once new information on the patients in the study is acquired, to exploit the latent integrated structure of the model and achieve better predictive performance for the clinical outcome of new patients entering the study. In the next paragraph we present a motivating example with matched arrayCGH and microarray samples from breast cancer patients. In the materials and methods section we introduce the probability models, with a particular focus on the probit regression that allows for integration of both platforms, along with some simulation studies. In the results section, the focus is on posterior inference of the interaction between the two platforms, differential behaviour, which takes into account both differential gene.