Microarray data contain information about the level of expression of genes that can be
informative of changes taking place in cells. This information has been widely used to
study the changes in gene expression between normal and cancer cells. Gene expression
has been used as a biomarker predictive of the progression of a disease and to identify
drug targets specifically expressed in cancer tissue. Although the expression level of a
gene can vary several-fold across tissues, the simplest information about a gene is its
on/off status: whether or not it is expressed in a given tissue.
In this study we propose to take advantage of a large set of tissue-specific gene
expression data to study the profile of gene expression in pathways and tissues. To
perform this analysis, we leverage and improve a method that computes the on/off status
of a gene from its level of expression.
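
As an illustration of what an on/off call means, the sketch below labels a gene as on when
its expression level meets a fixed cutoff. This is not the method used in this study, whose
details are not given here. The cutoff value, the function name call_states, and the sample
expression values are all hypothetical and chosen only for the example.

```python
from typing import Dict

# Hypothetical cutoff on a log2 intensity scale; the source does not state one.
ON_THRESHOLD = 7.0


def call_states(expression: Dict[str, float],
                threshold: float = ON_THRESHOLD) -> Dict[str, bool]:
    """Map each gene to True (on) if its level meets the cutoff, else False (off)."""
    return {gene: level >= threshold for gene, level in expression.items()}


# Made-up expression levels for one tissue sample, for illustration only.
sample = {"TP53": 9.2, "MYC": 6.1, "GAPDH": 12.4}
states = call_states(sample)
```

A fixed cutoff is the simplest possible rule. A real analysis would need a call that
accounts for probe-level and platform-level differences in signal.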
The percentage of genes in the on state in a given tissue was chosen as a summary across
biospecimens. The gene state method was applied to sets of tissue-specific expression
microarrays extracted from the GEO database. We then studied the profile of on/off state
of genes in KEGG pathways across several tissues. The data were then used to calculate a
distance between gene sets. Using all genes, a distance could be calculated between
normal and cancer tissue, as well as pairwise between tissue types. The
gene sets were then narrowed and selected based on pathway annotation from KEGG.
This demonstrated an ability to identify known cancer pathways based on their gene
signature distances. The results affirm known cancer pathways by calculating relative