Browsing by Author "Lim, Kelvin"
Item A pattern mining based integrative framework for biomarker discovery (2012-02-10)
Dey, Sanjoy; Atluri, Gowtham; Steinbach, Michael; MacDonald, Angus; Lim, Kelvin; Kumar, Vipin
Recent advances in high-throughput data collection technologies have made available diverse biomedical datasets that capture complementary information about the biological processes in an organism. Biomarkers discovered by integrating such datasets from case-control studies have the potential to elucidate the biological mechanisms behind complex human diseases. In this paper we define an interaction-type integrative biomarker as one whose features together can explain the disease, but not individually. We propose a pattern mining based integrative framework (PAMIN) to discover interaction-type integrative biomarkers from diverse case-control datasets. PAMIN first finds patterns from individual datasets to capture the available information separately and then combines these patterns to find integrated patterns (IPs) consisting of variables from multiple datasets. We further use several interestingness measures to characterize the IPs into specific categories. Using synthetic data we compare the IPs found using our approach with those of CCA and discriminative-CCA (dCCA). Our results indicate that PAMIN can discover interaction-type patterns that competing approaches like CCA and dCCA cannot find. Using real datasets we also show that PAMIN discovers a larger number of statistically significant IPs than the competing approaches.

Item Discovering Groups of Time Series with Similar Behavior in Multiple Small Intervals of Time (2014-01-22)
Atluri, Gowtham; Steinbach, Michael; Lim, Kelvin; MacDonald, Angus; Kumar, Vipin
This paper addresses the problem of discovering groups of time series that share similar behavior in multiple small intervals of time. This problem has two characteristics: i) there are exponentially many combinations of time series that need to be explored to find these groups; ii) the groups of time series of interest need to have similar behavior only in some subsets of the time dimension. We present an Apriori-based approach to address this problem. We evaluate it on a synthetic dataset and demonstrate that our approach can directly find all the short-lived trends without finding spurious trends, unlike alternative approaches that find many spurious trends. We also demonstrate, using a neuroimaging dataset, that our approach can be used to discover significantly reproducible groups of shared trends when applied to independent sets of time series data. In addition, we demonstrate the utility of our approach on an S&P 500 stocks dataset.

Item Discovering the Longest Set of Distinct Maximal Correlated Intervals in Time Series Data (2014-10-01)
Atluri, Gowtham; Steinbach, Michael; Lim, Kelvin; MacDonald, Angus; Kumar, Vipin
In this paper we focus on finding all maximal correlated intervals, where a given pair of time series has correlation above a user-provided threshold for all of the interval's subintervals and for none of its immediate subsuming intervals. Our objective then is to find a longest set of such maximal correlated intervals. We propose a two-step solution to achieve this objective. In the first step, an efficient bottom-up approach is proposed to discover maximal correlated intervals. In the second step, we use a dynamic programming approach to select the longest non-overlapping set. We evaluate the efficiency of our approach on synthetic datasets and compare it with that of a brute-force approach. Using neuroimaging data that contains activity time series from brain regions, we show the utility of our approach in studying the transient nature of relationships between different brain regions.
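
As an illustration of the second step described in the last abstract, the selection of a longest non-overlapping set can be sketched as a weighted interval scheduling problem. This is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes the maximal correlated intervals have already been found, that each is given as an inclusive `(start, end)` pair of integer time indices, and that "longest" means the largest total number of time points covered. The function name `longest_nonoverlapping` is hypothetical.

```python
from bisect import bisect_left


def longest_nonoverlapping(intervals):
    """Pick non-overlapping intervals with maximum total length.

    Assumption: each interval is an inclusive (start, end) pair of
    integer time indices with start <= end.
    Returns (total_length, chosen_intervals).
    """
    # Sort by end point so every prefix is a valid subproblem.
    ivs = sorted(set(intervals), key=lambda iv: (iv[1], iv[0]))
    ends = [end for _, end in ivs]
    n = len(ivs)

    best = [0] * (n + 1)      # best[i]: optimum using the first i intervals
    take = [False] * (n + 1)  # take[i]: whether interval i is in that optimum
    prev = [0] * (n + 1)      # prev[i]: number of intervals ending before i starts

    for i, (start, end) in enumerate(ivs, start=1):
        # Count the intervals that end strictly before this one starts.
        p = bisect_left(ends, start)
        prev[i] = p
        with_i = best[p] + (end - start + 1)
        if with_i > best[i - 1]:
            best[i] = with_i
            take[i] = True
        else:
            best[i] = best[i - 1]

    # Walk back through the table to recover the chosen intervals.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(ivs[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return best[n], chosen


if __name__ == "__main__":
    total, chosen = longest_nonoverlapping([(0, 4), (3, 9), (10, 12), (5, 8)])
    print(total, chosen)
```

Sorting by end point and using binary search keeps the sketch at O(n log n) time for n candidate intervals. If the intended notion of "longest" differs from total covered length, only the weight `end - start + 1` would need to change.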