InfoScope: Continuous Information Monitoring for Large-Scale Distributed Systems


Background


Large-scale distributed computing infrastructures have become important platforms for many real-world applications, such as corporate data centers, multi-tier web servers, massive data analysis, and service-oriented cloud computing. An information management service is one of the fundamental building blocks of automatic system management: it collects dynamic system information (e.g., resource availability, service response time, virtual machine (VM) states, application component states) and resolves information queries issued by system administrators and other system controllers. However, providing scalable and efficient information management for large-scale distributed systems is challenging. Existing system monitoring solutions lack scalability, adaptability, and query support, which makes them insufficient for managing large-scale computing infrastructures. The goal of this project is to develop a scalable information management service that performs adaptive information collection to resolve various information queries (e.g., multi-attribute range queries, aggregation queries, top-k queries) with minimum monitoring overhead. InfoScope aims at achieving the following goals:

Live Demo

People


Faculty

Students

Collaborators

Publications

Related Projects

Data Release


The following data sets can be downloaded from our project website. They were collected at different times using our InfoScope software. We have a PlanetLab node pool of about 400 nodes. We deploy one rdaemon sensor on each node and collect up to 66 system-level metrics, with a sampling interval of 10 seconds. All information is sent through UDP to a central management node, which is a dedicated server of our research group. We would be glad if anyone finds our data helpful for their research work and publications. Please cite the following paper if you use our data:

Ying Zhao, Yongmin Tan, Zhenhuan Gong, Xiaohui Gu, Mike Wamboldt, "Self-Correlating Predictive Information Tracking for Large-Scale Production Systems", IEEE International Conference on Autonomic Computing and Communications (ICAC), Barcelona, Spain, June, 2009. 

The format of each log file is as follows:

Timestamp MetricName1 MetricValue1 .... MetricNameN MetricValueN
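
As a rough illustration of how a record in this format could be read, the following Python sketch splits one line into a timestamp and a name-to-value mapping. It is not part of the InfoScope software. It assumes that fields are separated by whitespace, that the timestamp is a single token, and that every metric value is numeric; the function name `parse_log_line` and the metric names in the usage example are hypothetical.

```python
from typing import Dict, Tuple


def parse_log_line(line: str) -> Tuple[str, Dict[str, float]]:
    """Parse one 'Timestamp Name1 Value1 ... NameN ValueN' record.

    Assumptions (not confirmed by the data release notes):
    whitespace-separated fields, a single-token timestamp,
    and numeric metric values.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty log line")

    # The first field is the timestamp; it is kept as a string
    # because its exact encoding is not specified.
    timestamp, rest = fields[0], fields[1:]

    # The remaining fields must form (name, value) pairs.
    if len(rest) % 2 != 0:
        raise ValueError("metric names and values must come in pairs")

    metrics: Dict[str, float] = {}
    for name, value in zip(rest[0::2], rest[1::2]):
        metrics[name] = float(value)
    return timestamp, metrics


if __name__ == "__main__":
    # Hypothetical sample record; real metric names may differ.
    sample = "1233187200 LOAD1 0.5 FREEMEM 2048"
    ts, metrics = parse_log_line(sample)
    print(ts, metrics)
```

If the actual traces use a multi-token timestamp or contain non-numeric values, the field splitting and the `float` conversion would need to be adjusted accordingly.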

PlanetLab1:   PlanetLab traces collected from 01/29/2009 - 02/06/2009
PlanetLab2:   PlanetLab traces collected from 09/20/2008 - 10/02/2008
PlanetLab3:   PlanetLab traces collected from 10/02/2008 - 10/15/2008