Gene set enrichment analysis (GSEA) is a method for identifying groups of genes that are significantly more differentially expressed than the remaining genes across treatments in a microarray study. Most existing approaches rely on nonparametric methods and require repeated permutation or resampling of the data to assess the significance of a gene set. In this dissertation, we study parametric approaches to GSEA by formulating the enrichment analysis as a simple model comparison problem. The resulting methods offer flexibility in tailoring the statistical model to the biological problem and are computationally efficient.
First, we propose a likelihood-based approach that assumes a finite mixture model for the two-class comparison problem; the analysis is carried out with a likelihood-ratio-based test. Second, we extend the parametric methods to flexible two-component mixture models for one-sided enrichment analysis, which tests for enrichment of up-regulation (or down-regulation) only. Third, we develop chi-square mixture models that carry the idea of two-class comparison studies over to microarray experiments with multiple categories. Applications to gene expression data, together with simulations, demonstrate the computational efficiency and competitive performance of the proposed methods.