Recent years have seen explosive growth in the availability of various kinds of data. This growth has created an unprecedented opportunity to develop automated, data-driven techniques for extracting useful knowledge. Data mining, an important step in this process of knowledge discovery, consists of methods that discover interesting, non-trivial, and useful patterns hidden in the data. To date, the primary driving force behind research in data mining has been the development of algorithms for data-sets arising in various business, information retrieval, and financial applications. Due to the latest technological advances, very large data-sets are becoming available in many scientific disciplines as well. The rate of production of such data-sets far outstrips the ability to analyze them manually. Data mining techniques hold great promise for developing new sets of tools that can automatically analyze the massive data-sets resulting from scientific simulations, and thus help engineers and scientists unravel the causal relationships in the underlying mechanisms of dynamic physical processes. The huge size of the available data-sets and their high dimensionality make large-scale data mining applications computationally very demanding, to the extent that high-performance parallel computing is fast becoming an essential component of the solution. Moreover, the quality of the data mining results often depends directly on the amount of computing resources available. In fact, data mining applications are poised to become the dominant consumers of supercomputing in the near future. There is therefore a need to develop effective parallel algorithms for various data mining techniques. However, designing such algorithms is challenging. In this paper, we describe parallel formulations of two important data mining algorithms: discovery of association rules, and induction of decision trees for classification.
Joshi, Mahesh; Han, Euihong; Karypis, George; Kumar, Vipin.
Parallel Algorithms in Data Mining.
Retrieved from the University of Minnesota Digital Conservancy,