
Providing network profiling and tracking utility in large distributed systems.





Published Date

2011-05


Type

Thesis or Dissertation

Abstract

Within the past few years, the Internet has, to a great extent, impacted every aspect of our daily life. This impact has played a major role in influencing the design, deployment and functionality of enterprise, campus and even home computer networks. As we increasingly depend on computer networks for communication, information access and storage, entertainment and other activities, managing and securing such networks is critical. Because of their scale and complexity, today’s large campus or enterprise networks are challenging to manage and secure. The scale and complexity come not only from the number of heterogeneous hosts and devices on the network (e.g., various servers, desktop office client machines, laptops, lab machines, wireless access points, routers and so forth), but also from the wide range of diverse applications running on these machines.

In this thesis, we develop methodologies to profile and track activities within networks by addressing two key problems: capturing the dynamic interaction, represented by Internet traffic, between inside and outside hosts at the block level; and synthesizing a static knowledge base on hosts and networks to map dynamic interaction to interpretable profiles. We develop methodologies that use machine learning techniques to capture, characterize and profile activities within the network. We then take these techniques one step further by proposing tools and systems that provide profiling and tracking as a utility in a large-scale distributed system.

More specifically, we propose a Hierarchical Extraction of Activity Patterns (HEAPs) methodology to characterize and profile activity patterns within the subnet. We express activities in a host-port association matrix and apply Probabilistic Latent Semantic Analysis (pLSA) to co-cluster dominant and significant activities within the subnet. We also propose a Block-wise (host) Port Activity Matrix (BPAM) to describe the traffic within a block. We then apply Singular Value Decomposition (SVD) low-rank approximation techniques to obtain a low-dimensional subspace representation that captures the typical activities within the block, and consequently assign a high-level descriptive label summarizing those activities. We also develop methods to track and quantify changes in the activity within the subnet (or block) over time and demonstrate how to use these methods to identify major changes and anomalies within the network. We demonstrate the utility of a lightweight, self-contained tool for multi-level analysis of activities within the network. While the tool does not solve a specific security problem, it helps users and operators localize problems within a small network or individual host.

While our methodologies capture the dynamic interaction within the network, they lack additional information that helps validate the profiling results. Towards that end, we develop a methodology to differentiate dynamic from static IP address blocks. More specifically, we propose a scanning-based technique for identifying dynamic IP address blocks within the network. We also include other static information by building a system that maps dynamic interaction to static information in a datacenter-like environment. Our system addresses key design issues for providing network management and profiling services in a collaborative system with interpretable characterization and profiling utility.

The thesis serves 1) to propose various novel methodologies utilizing machine learning techniques to extract and profile the behavior of hosts and blocks within the network; 2) to pinpoint design principles for building lightweight as well as large-scale systems for profiling and tracking activities in the network; 3) to propose how to incorporate static information readily available within online tools to provide interpretation and mapping for network dynamic interaction.
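As a rough illustration of the SVD step described in the abstract, the sketch below builds a toy host-by-port activity matrix, takes a rank-k approximation, and compares the resulting port subspaces across time windows. It is not the thesis's implementation: the matrix values, the port list, the function names (`low_rank_profile`, `subspace_change`) and the principal-angle change score are all assumptions chosen for the example.

```python
import numpy as np

# Hypothetical port columns for the toy matrices below (not from the thesis).
PORTS = [80, 443, 25, 22]


def low_rank_profile(activity, rank):
    """Rank-`rank` SVD summary of a host-by-port activity matrix.

    Returns (basis, singular_values, approximation), where `basis` holds the
    top `rank` right singular vectors as rows (directions in port space).
    """
    u, s, vt = np.linalg.svd(activity, full_matrices=False)
    approximation = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return vt[:rank, :], s, approximation


def subspace_change(basis_a, basis_b):
    """Assumed change score in [0, 1] between two port subspaces.

    The singular values of basis_a @ basis_b.T are the cosines of the
    principal angles between the subspaces; 0 means identical subspaces.
    """
    cosines = np.linalg.svd(basis_a @ basis_b.T, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)
    return float(1.0 - np.mean(cosines ** 2))


# Window 1: hosts 0-2 generate web traffic (ports 80/443), hosts 3-5 mail (25).
window1 = np.array([
    [10, 5, 0, 0],
    [8, 4, 0, 0],
    [6, 3, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 6, 0],
    [0, 0, 4, 0],
], dtype=float)

# Window 2: same behavior, twice the volume.
window2 = 2.0 * window1

# Window 3: hosts 0-2 switch to port 22, hosts 3-5 keep using port 25.
window3 = np.array([
    [0, 0, 0, 7],
    [0, 0, 0, 9],
    [0, 0, 0, 8],
    [0, 0, 5, 0],
    [0, 0, 6, 0],
    [0, 0, 4, 0],
], dtype=float)

basis1, s1, approx1 = low_rank_profile(window1, rank=2)
basis2, _, _ = low_rank_profile(window2, rank=2)
basis3, _, _ = low_rank_profile(window3, rank=2)

print("change 1 -> 2:", subspace_change(basis1, basis2))
print("change 1 -> 3:", subspace_change(basis1, basis3))
```

In this toy data a volume change alone leaves the score near zero, while the shift from web ports to port 22 moves one of the two dominant directions and gives a score of about 0.5. A real deployment would need a principled choice of rank and of the threshold that counts as a major change.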

Description

University of Minnesota Ph.D. dissertation. May 2011. Major: Computer science. Advisor: Zhi-Li Zhang. 1 computer file (PDF); xi, 154 pages.


Suggested citation

Sharafuddin, Esam Ahmed. (2011). Providing network profiling and tracking utility in large distributed systems. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/108282.
