Cloud systems have become increasingly popular by
obviating the need for users to maintain complex computing
infrastructures themselves. Users can dynamically reserve
resources in a pay-as-you-go fashion. Many cloud systems,
such as Amazon Elastic Compute Cloud (EC2)
and the NCSU Virtual Computing Lab (VCL), provision
resources in the form of virtual machines (VMs) that are
preinstalled with the desired application software and operating
systems. Cloud systems can encounter different runtime
problems, such as hardware failures, software misconfigurations,
and corrupted VM images. The management nodes of a cloud system typically
produce console logs continuously to record important runtime operations
and their status. We address the challenge of troubleshooting cloud
systems using these management console logs. Our research aims to
devise automated techniques and tools that perform this troubleshooting.
Figure: Cloud system operation