New Strategy Helps Spot Critical Software ‘Hangs’

Of the many problems that have haunted computer users since the dawn of the Information Age, one that never seems to go away is the occasional frozen screen that renders a software application completely inoperative and often requires a reboot. The rebooted system, in turn, may not have saved the work that was going on when the “hang” occurred.

While most of us have learned to live with this unpleasant fact of life, we can hardly be complacent when such a hang wipes out critical work on our most important systems. Of course, we may have backups available on- or off-site that could save us, but if the need is immediate—as in the case of air traffic controllers, for example—backup may be a moot point. Even in the everyday business of insurance and financial services, a delay of just a few minutes could result in a disastrously wrong decision and a consequent loss of time and money, especially if the hang is not immediately noticeable.

A recent posting on ScienceDaily delved into this topic, reporting that researchers at the Università degli Studi di Napoli Federico II and at Naples-based company SESM SCARL have developed a software tool that works at the operating system (OS) level. The tool can detect when a computer program hangs, allowing a safe exit from the affected system without crashing the computer as a whole or requiring a reboot of important systems.

The software “framework” allows non-intrusive monitoring of complex systems based on multiple sources of data gathered at the OS level; the collected data are then combined to reveal hang failures automatically, says the article. Once a software system is in place, any number of problems can occur, possibly revealing bugs that were missed in initial testing of the application. Such problems can cause just one critical component of the system to hang without crashing the whole system, so those using the system may not be aware of a problem until it is too late.

The new approach taken by the Italian team relies on several simple monitors that exploit the OS support to trigger alarms when the behavior of the system differs from the norm, the article notes. Experimental results reportedly show that this framework increases the overall capacity of detecting hang failures while keeping the number of false positives to less than six percent.
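
For readers who want a feel for the idea, here is a minimal Python sketch of how several simple monitors might be combined. It is not the researchers' tool, and nothing in it comes from the article beyond the general concept: the metric names, the baseline numbers, the three-standard-deviation tolerance and the "at least two alarms" voting rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List


@dataclass
class Monitor:
    """One simple monitor: compares a metric against its observed baseline."""
    name: str
    baseline: List[float]   # readings taken while the system was healthy
    tolerance: float = 3.0  # hypothetical threshold, in standard deviations

    def alarm(self, value: float) -> bool:
        mu = mean(self.baseline)
        sigma = pstdev(self.baseline) or 1e-9  # guard against a flat baseline
        return abs(value - mu) > self.tolerance * sigma


def detect_hang(monitors: List[Monitor], sample: Dict[str, float],
                min_alarms: int = 2) -> bool:
    """Report a suspected hang only when enough monitors agree.

    Requiring agreement is one assumed way to keep false positives down.
    """
    alarms = sum(m.alarm(sample[m.name]) for m in monitors)
    return alarms >= min_alarms


# Hypothetical OS-level metrics and made-up baseline readings.
monitors = [
    Monitor("syscalls_per_sec", [980, 1010, 1005, 995, 1020]),
    Monitor("context_switches_per_sec", [200, 210, 190, 205, 195]),
    Monitor("io_bytes_per_sec", [5000, 5200, 4900, 5100, 4800]),
]

healthy = {"syscalls_per_sec": 1000, "context_switches_per_sec": 202,
           "io_bytes_per_sec": 5050}
hung = {"syscalls_per_sec": 0, "context_switches_per_sec": 0,
        "io_bytes_per_sec": 5000}

print(detect_hang(monitors, healthy))  # no monitor is outside its band
print(detect_hang(monitors, hung))     # two monitors raise alarms
```

In this sketch, a single odd reading from one monitor is not enough to raise the flag; only when two or more disagree with their baselines at once is a hang suspected. A production system would need real OS instrumentation and carefully tuned thresholds.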

As our insurance and financial systems are pushed to faster and faster speeds, the need to immediately catch and deal with even the smallest hang in our data processing chain becomes increasingly critical. When we consider data analytics and decision support systems, the need is especially acute. It seems clear that the once-sleepy IT environment of the insurance enterprise cannot long endure.

More importantly, the more we rely on our systems not only to provide information but also to make decisions about risks and coverages, the more we must ensure that they function flawlessly. IT shops throughout our industries would do well to look into this technology strategy now and to look for ways to incorporate it into their enterprises.

Ara C. Trembly (www.aratremblytechnology.com) is the founder of Ara Trembly, The Tech Consultant, and a longtime observer of technology in insurance and financial services.

Readers are encouraged to respond to Ara using the “Add Your Comments” box below. He can also be reached at ara@aratremblytechnology.com.

This blog was exclusively written for Insurance Networking News. It may not be reposted or reused without permission from Insurance Networking News.

The opinions of bloggers on www.insurancenetworking.com do not necessarily reflect those of Insurance Networking News.
