Washington University School of Medicine

Division of Biostatistics
Seminar Series Spring 2008

Statistical Methods for Network-Based Analysis of Genomic Data

Zhi Wei
Genomics and Computational Biology
University of Pennsylvania, School of Medicine

Wednesday, February 6, 2008, 12:30–1:30 pm

GEMS classroom, 3rd Floor in Shriner's Building
Coffee, tea, and cookies will be provided


Abstract

A central problem in genomic research is the identification of genes and pathways that are involved in diseases or perturbed during a biological process. Many methods have been developed for identifying genes in regression frameworks. The genes identified are often linked to known biological pathways through gene set enrichment analysis in order to identify the pathways involved. However, most procedures for identifying biologically relevant genes do not use the known pathway information. In this talk, I present hidden Markov random field (HMRF)-based methods for identifying genes and subnetworks that are activated or perturbed by diseases or biological processes, where the latent differential expression states of the genes are modeled by a discrete Markov random field. Simulation studies indicate that the methods are effective in identifying genes and subnetworks that are related to disease, and that they have higher sensitivity and lower false discovery rates than commonly used procedures that do not use the pathway structure information. I will demonstrate these methods by analyzing three problems: 1) a breast cancer gene expression dataset, to identify the modules related to cancer metastasis; 2) a short time course neuroblastoma gene expression dataset, to elucidate the molecular events underlying the different biological and clinical behavior of TrkA- and TrkB-expressing neuroblastomas; and 3) a systemic immune response time course gene expression dataset, to identify the subnetworks involved in the human immune response to endotoxin.
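
For readers unfamiliar with the general idea, the following is a minimal sketch of how a discrete Markov random field over a pathway graph can inform per-gene calls. It is not the speaker's method: the abstract does not specify the model details, so everything here is an assumption made for illustration. The sketch assumes binary latent states (0 = not differentially expressed, 1 = differentially expressed), unit-variance Gaussian emissions for a per-gene score, an Ising-style neighbor-agreement prior with a hypothetical weight `beta`, and iterated conditional modes (ICM) as a simple stand-in for whatever estimation procedure the talk uses. The function names, gene labels, and scores are made up.

```python
import math


def normal_logpdf(x, mean, sd=1.0):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def icm_hmrf(scores, edges, beta=1.0, mu_de=2.0, max_sweeps=50):
    """Label genes 0/1 by iterated conditional modes on a pathway graph.

    scores: dict gene -> observed score (assumed N(0,1) if state 0,
            N(mu_de,1) if state 1; both are illustrative assumptions).
    edges:  iterable of (gene_a, gene_b) pathway links.
    beta:   hypothetical weight rewarding agreement with neighbors.
    """
    genes = list(scores)
    neighbors = {g: set() for g in genes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Start from the independent per-gene call that ignores the network.
    states = {
        g: int(normal_logpdf(scores[g], mu_de) > normal_logpdf(scores[g], 0.0))
        for g in genes
    }

    for _ in range(max_sweeps):
        changed = False
        for g in genes:
            best_state, best_val = states[g], -math.inf
            for s in (0, 1):
                emission = normal_logpdf(scores[g], mu_de if s == 1 else 0.0)
                # Count neighbors currently sharing the candidate state.
                agree = sum(1 for n in neighbors[g] if states[n] == s)
                val = emission + beta * agree
                if val > best_val:
                    best_state, best_val = s, val
            if best_state != states[g]:
                states[g] = best_state
                changed = True
        if not changed:  # converged to a local mode
            break
    return states


# Toy example: g2 has a borderline score but sits between two strong genes.
toy_scores = {"g1": 2.5, "g2": 0.9, "g3": 2.2, "g4": -0.1, "g5": 0.2}
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"), ("g4", "g5")]

with_network = icm_hmrf(toy_scores, toy_edges, beta=1.0)
without_network = icm_hmrf(toy_scores, toy_edges, beta=0.0)
```

In this toy example the borderline gene `g2` is labeled 0 when the network is ignored (`beta=0.0`) and 1 when agreement with its two strongly scoring neighbors is rewarded (`beta=1.0`), which is the kind of sensitivity gain from pathway structure that the abstract describes.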