Leveraging many simple statistical models to adaptively monitor software systems

Author: Munawar Mohammad A.  

Publisher: Inderscience Publishers

ISSN: 1740-0562

Source: International Journal of High Performance Computing and Networking, Vol. 7, Iss. 1, 2011-02, pp. 29-39



Abstract

Ensuring that a software system meets its objectives requires continuous monitoring. In practice, monitoring is either insufficient to effectively detect and diagnose failures, or is too costly to use in production. An alternative is adaptive monitoring, where the system is monitored at a minimal level to determine system health, and if a problem is suspected, the monitoring level is automatically increased to determine faults. To model the system at different monitoring levels, we employ statistical techniques to identify stable relationships in the monitored data. These relationships characterise normal operation and can help detect anomalies. We describe our approach in the context of a J2EE-based system. We show that adaptive monitoring is a cost-effective alternative to continuous detailed monitoring. We inject 29 different faults, and show that we detect the faults in 80% of cases and shortlist the faulty component in 65% of the detected cases.
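
The idea of learning stable relationships between metrics and escalating monitoring when they break can be illustrated with a minimal sketch. The abstract does not specify the models used, so everything here is an assumption made for illustration: pairwise least-squares lines as the "simple models", an R² cutoff of 0.9 for stability, a tolerance of 1.5 times the largest training residual, three monitoring levels, and the function names themselves.

```python
from itertools import combinations


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b. Returns (a, b, r2), or None if a series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None
    a = sxy / sxx
    return a, my - a * mx, (sxy * sxy) / (sxx * syy)


def learn_models(training, min_r2=0.9):
    """Keep one linear model per metric pair whose fit is stable on fault-free data.

    training maps a metric name to its list of samples (all lists equally long).
    min_r2 and the 1.5x tolerance factor are illustrative assumptions.
    """
    models = {}
    for x, y in combinations(sorted(training), 2):
        fit = fit_line(training[x], training[y])
        if fit is None:
            continue
        a, b, r2 = fit
        if r2 >= min_r2:
            worst = max(abs(yv - (a * xv + b))
                        for xv, yv in zip(training[x], training[y]))
            # Small epsilon avoids a zero tolerance on a perfect fit.
            models[(x, y)] = (a, b, 1.5 * worst + 1e-9)
    return models


def violations(models, sample):
    """Return the metric pairs whose learned relationship the sample breaks."""
    broken = []
    for (x, y), (a, b, tol) in models.items():
        if x in sample and y in sample:
            if abs(sample[y] - (a * sample[x] + b)) > tol:
                broken.append((x, y))
    return broken


def next_level(level, n_violations, max_level=2):
    """Raise the monitoring level while violations persist; drop back to minimal otherwise."""
    return min(level + 1, max_level) if n_violations else 0
```

In use, `learn_models` would be run once per monitoring level on data collected during normal operation. At run time, each new sample is passed to `violations`, and the result drives `next_level`. The metrics that appear most often in broken pairs give a rough shortlist of suspect components.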