Understanding Anomaly Detection Techniques for Symbolic Sequences
Abstract
We present a comparative evaluation of a large number of anomaly detection techniques on a variety of publicly available as well as artificially generated data sets. Many of these are existing techniques while some are slight variants and/or adaptations of traditional anomaly detection techniques to sequence data.
The specific contributions of this paper are as follows:
(i). This evaluation facilitates understanding of the relative strengths and weaknesses of different techniques. Through careful experimentation, we illustrate that the performance of each technique depends on the nature of the sequences and on the nature of the anomalies in them. No one technique outperforms all others. For most techniques we also identify some data sets on which they perform very well, and some on which they perform poorly.
(ii). We investigate variants that have not been tried before. For example, we evaluate a k-nearest neighbor based technique that performs better than a clustering based technique that was proposed for sequences. Also, we propose FSA-z, a variant of an existing Finite State Automaton (FSA) based technique, which consistently outperforms the original FSA based technique.
(iii). We propose a novel way of generating artificial sequence data sets to evaluate anomaly detection techniques.
(iv). We characterize the nature of normal and anomalous test sequences, and associate the performance of each technique with one or more of these characteristics.
Series/Report Number
Technical Report; 09-001
Suggested citation
Chandola, Varun; Mithal, Varun; Kumar, Vipin. (2009). Understanding Anomaly Detection Techniques for Symbolic Sequences. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/215788.