Statistical Methods for
Network-Based Analysis of Genomic Data
Zhi Wei
Genomics and Computational Biology
University of Pennsylvania, School of Medicine
Wednesday, February 6, 2008, 12:30–1:30 pm
GEMS classroom, 3rd Floor, Shriner's Building
Coffee, tea, and cookies will be provided
Abstract
A central problem in genomic research is the identification of genes and
pathways that are involved in diseases or perturbed during a biological
process. Many methods have been developed for identifying genes in
regression frameworks. The genes identified are often linked to known
biological pathways through gene set enrichment analysis in order to
identify the pathways involved. However, most procedures for identifying
biologically relevant genes do not use the known pathway information.
In this talk, I will present hidden Markov random field
(HMRF)-based methods for identifying genes and subnetworks that are
activated or perturbed by diseases or biological processes, where the
latent gene differential expression states are modeled by a discrete
Markov random field. Simulation studies indicated that the methods are
effective in identifying genes and subnetworks that are related to
disease and have higher sensitivity and lower false discovery rates than
the commonly used procedures that do not use the pathway structure
information. I will demonstrate these methods by analyzing three datasets:
1) a breast cancer gene expression dataset to identify the modules
related to cancer metastasis, 2) a short time course neuroblastoma gene
expression dataset
to elucidate molecular events underlying the different biological and
clinical behavior of TrkA- and TrkB-expressing neuroblastomas, and 3)
a systemic immune response time course gene expression dataset to identify
the subnetworks involved in human immune response to endotoxin.
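
As a rough illustration of the idea of modeling latent differential expression states with a discrete Markov random field on a gene network, the following minimal sketch may help. It is not the speaker's method: it assumes a two-state model with a Gaussian score per gene (mean 0 if unchanged, mean mu1 if differentially expressed), an auto-logistic prior in which neighboring genes favor the same state, fixed rather than estimated parameters, and iterated conditional modes (ICM) as a simple stand-in for whatever inference the talk uses. The function name icm_states, the parameters beta, gamma, and mu1, and the toy network are all hypothetical.

```python
def icm_states(z, edges, beta=0.5, gamma=0.0, mu1=2.0, max_iter=50):
    """Estimate 0/1 differential expression states by iterated conditional modes.

    z      : dict mapping gene -> observed score (assumed N(0,1) in state 0,
             N(mu1,1) in state 1; an illustrative assumption)
    edges  : iterable of (gene, gene) pairs from a known pathway network;
             every gene in an edge is assumed to appear in z
    beta   : strength of the neighbor agreement term (assumed fixed)
    gamma  : baseline log-odds of a gene being in state 1 (assumed fixed)
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {g: set() for g in z}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Log-likelihood ratio of state 1 versus state 0 for a unit-variance Gaussian.
    def llr(score):
        return mu1 * score - mu1 ** 2 / 2

    # Start from the network-free decision: each gene judged on its own score.
    state = {g: int(llr(z[g]) > 0) for g in z}

    for _ in range(max_iter):
        changed = False
        for g in sorted(z):
            n1 = sum(state[h] for h in neighbors[g])  # neighbors in state 1
            n0 = len(neighbors[g]) - n1               # neighbors in state 0
            # Conditional log-odds of state 1 given the neighbors and the score.
            log_odds = gamma + beta * (n1 - n0) + llr(z[g])
            new = int(log_odds > 0)
            if new != state[g]:
                state[g] = new
                changed = True
        if not changed:  # no gene changed state in a full sweep
            break
    return state


# Toy example: B and D have the same weak score, but B sits between two
# strongly scoring neighbors while D is isolated.
scores = {"A": 3.0, "B": 0.9, "C": 3.0, "D": 0.9}
network = [("A", "B"), ("B", "C")]
print(icm_states(scores, network))
```

In this toy run, gene B is called differentially expressed because of its neighbors, while the isolated gene D with the same score is not. This is the kind of sensitivity gain that using pathway structure is meant to provide.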