Extracting the Textual and Temporal Structure of Supercomputing Logs

Loading...
Thumbnail Image

View/Download File

Persistent link to this item

Statistics
View Statistics

Journal Title

Journal ISSN

Volume Title

Title

Extracting the Textual and Temporal Structure of Supercomputing Logs

Published Date

2009-06-01

Publisher

Type

Report

Abstract

Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format makes it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.

Keywords

Description

Related to

Replaces

License

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Suggested citation

Jain, Sourabh; Singh, Inderpreet; Chandra, Abhishek; Zhang, Zhi-Li; Bronevetsky, Greg. (2009). Extracting the Textual and Temporal Structure of Supercomputing Logs. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/215805.

Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.