While recently examining data stored within one of its auto policy databases, executives with New York City-based Metropolitan Insurance Co. uncovered a troubling glitch. It was an irregularity that any seasoned IT troubleshooter would certainly have appreciated.
MetLife's systems analysts discovered that due to a systems default, most customers on a particular source database somehow had either "0" or "9" automobiles covered under their MetLife auto insurance plan.
Erroneous data like this could send an operation reeling-particularly if the information were to be used as the basis for a marketing campaign. Fortunately, the applications and business functions using the data weren't interested in automobile ownership trends, so the accuracy-or lack thereof-wasn't relevant.
"This example illustrates that downstream processes cannot always rely on the quality of upstream data collection," says Carol Stewart, MetLife's vice president for data administration. "It helps if the upstream department has a business reason of its own for ensuring data is correct."
Reconciling data in the insurance industry is important not only to ensure operational efficiency but to foster business opportunities. With corporate profits lagging, carriers have identified customer relationship management (CRM) as a key strategy to reverse their fortunes.
Data quality is a key driver of CRM because with quality data, a carrier can proceed to build a complete profile of a customer-stitching together pieces of data spread across multiple platforms. With a profile in tow, carriers can proceed to cross-sell products and services.
"We have a segment of customers with very sophisticated needs," Stewart notes. "Clean data enables us to identify these needs. It's all part of our emphasis on data stewardship."
Raising the Stakes
As MetLife places an emphasis on corporate growth, it also raises the stakes of data quality assurance. For instance, MetLife's upper management has a goal to increase its number of institutional (33 million), individual (9 million) and international (3 million) customers from a current 45 million to 100 million by 2010, Stewart explains.
The projected growth reflects not only insurance products such as life, annuities, auto, boat, home and investment services, but also bank products, which MetLife has now begun marketing. Technology can ensure the accuracy of MetLife's data systems as it attempts to accomplish this goal.
"The influx of dirty data becomes exponentially larger as companies grow," Stewart declares. "Without technology, detecting these kinds of problems is left to individuals, and that would take months. People don't know what to look for-an intuitive technology solution does."
While carriers such as MetLife can only hope that an enterprisewide commitment to data quality assurance comes to fruition, they can at least rely on software applications that detect and correct flawed data.
Using a data analysis tool known as Axio, MetLife has been able to examine its data stores to uncover patterns and anomalies within and between data elements and across files. "It has significantly reduced human effort and the time associated with iterative data cleanup," Stewart says.
Developed by San Francisco-based Evoke Software Corp., Axio uncovers these patterns and creates accurate data from inconsistent, redundant or corrupted data commonly found in a corporate data system.
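The kind of profiling a tool like Axio performs can be illustrated with a minimal sketch. The function below is purely illustrative (it is not Evoke's API): it flags a column whose value distribution is dominated by a single value, the signature of a systems default like the "0" and "9" vehicle counts MetLife uncovered.

```python
from collections import Counter

def profile_column(values, dominance_threshold=0.9):
    """Flag a column whose value distribution suggests a systems
    default rather than real data. Threshold is illustrative."""
    counts = Counter(values)
    top_value, top_count = counts.most_common(1)[0]
    share = top_count / len(values)
    return {"top_value": top_value,
            "share": round(share, 2),
            "suspicious": share >= dominance_threshold}

# Hypothetical sample: a vehicle-count field dominated by a default of "0".
vehicle_counts = ["0"] * 95 + ["1", "2", "1", "3", "2"]
print(profile_column(vehicle_counts))
# {'top_value': '0', 'share': 0.95, 'suspicious': True}
```

A human reviewing raw records would need weeks to spot such a skew; a frequency profile surfaces it in one pass, which is the time savings Stewart describes.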
Data quality becomes ever more essential as a carrier's operation grows. However, red tape and corporate bureaucracy can often undermine data management efforts.
That said, even the most fluid organization still cannot escape one chronic challenge-managing multiple and disparate databases built across the enterprise.
"It's no secret that the information that carriers have on their customers resides in data silo repositories," says Kimberly Harris, senior analyst with Durham, N.C.-based Gartner Financial Services. "Each business unit has been accustomed to building its own databases, each with its own individual procedures. Sometimes there's really no rhyme or reason to the way data fields are determined. So standardization across the enterprise has been a big challenge."
Dirty data can lead carriers to overstate the number of customers they have, understate profitability and misstate expenses, Harris explains. Data structures are built in such a way that the same client record may exist in more than one database.
"The records may indicate that John Doe has an auto policy with a carrier, while Jack Doe has a homeowner's policy," Harris explains. "In actuality, they are the same individual. Maybe the individual moved his residence, and then bought a homeowner's policy. The auto database, meanwhile, still uses the original address. The variation in the first name also adds confusion. A good technology solution can sift this out."
As the aforementioned example attests, data cleansing is often discussed in the context of the nonstandardization that exists across multiple databases. But even within a single database, confusion can reign supreme.
"It's not uncommon for a carrier embarking on a data conversion initiative to discover data fields that were created years ago that nobody in the organization can identify," Harris says.
With all these scenarios, it's no surprise that many carriers admit they lack confidence in the quality of their data-whether that data is internally produced or supplied by third parties (see chart).
Aside from the gridlocking effects of corporate bureaucracy and red tape, some carriers have other reasons to eschew best-practices data management: Time and cost constraints have discouraged many from embarking on ambitious data conversion or data warehouse projects.
Data warehousing is perceived as a synergistic partner to data cleansing. If a carrier plans to build a data warehouse, cleansing data before it's stored in the warehouse is imperative, or the effort is performed in vain.
However, data warehousing has proved to be a hit-or-miss proposition in the insurance industry (see "Building A Better Data Warehouse," March 2001). Many projects were discontinued due to lack of vision, if not poor execution.
"As data accumulates, companies find ways to skirt the issue by writing new layers of code onto existing data systems," says Dan O'Hara, president of Addison, Texas-based Universal Conversion Technologies (UCT), which provides technology solutions for data conversion projects in the insurance industry.
So when the time comes to convert data, carriers must endure what O'Hara calls "pain for the sins of the past."
But carriers can no longer afford to employ this tactic with so much at stake. Data management is a prerequisite for effective CRM. It's also essential for a carrier engaged in aggressive merger and acquisition activity, where the disparate data of a seller has to be combined with the disparate data of a buyer.
Mum's the Word
If carriers are following through on implementing data management programs, it's difficult to provide concrete evidence.
Few carriers are eager to discuss specific aspects of their data-cleansing initiatives, such as costs to launch a data management program, estimated annual savings, return on investment, and just about anything else that involves numbers.
That's because anything pertaining to customer data-including warehousing, cleansing and mining-is a sensitive subject to carriers.
Several insurance companies contacted for this article claimed that discussing raw numbers would tip their hand to competitors. Requests to discuss specific details about data management with providers such as Columbus, Ohio-based Nationwide Financial and Minneapolis-based American Express Financial were flatly denied.
Although costs for a data conversion program can range widely, O'Hara of UCT says that in his company's experience a data conversion initiative can range from $500,000 to $2 million.
That expense reflects the design and installation of a new database, plus costs to connect with third parties such as agency management systems that will serve as the main repository for business data.
O'Hara adds that 40% of the expense incurred to implement a data conversion project should be devoted to post-conversion testing and balancing. "I've seen train wrecks where a company decided to do a data conversion project by taking shortcuts and almost lost an entire agent field force because of it," O'Hara says.
Nip in the Bud
Some carriers, such as State Farm Mutual Automobile Insurance Co., approach data cleansing as an ongoing process, rather than a one-time initiative that can prove to be costly.
Roberta Park, director of data and information strategies for Bloomington, Ill.-based State Farm, says data cleansing is never done as part of a standalone initiative. "It's all a function of our ongoing data warehouse initiative," she says.
Launching a data warehouse program about four years ago, State Farm continues to refine and expand the warehouse on an ongoing basis. The data warehousing project enables State Farm to cross-sell insurance products to customers and even bank products from State Farm Bank.
"We clean or correct errors in data at the source-rarely will we fix or clean data when it gets downstream," Park says.
"We grew up in a silo world. We collected customer information from multiple sources and those sources didn't talk to one another," she adds. "The data on an auto customer policy had been separate from the insurance data on a fire insurance policy. It was a policy view rather than a customer view. We've made strides to change that mentality because for a long time the right hand didn't know what the left hand was doing."
Using a product developed by IBM Corp. called Insurance Application Architecture (IAA), a State Farm analyst can examine the combination of city, state and ZIP code to scan for inconsistencies. "For instance, this ZIP code doesn't match with this city. The software identifies this mistake so a programmer can correct it," Park says.
This is often a crucial correction. "Address correction and verification software can increase direct marketing effectiveness by helping provide on-time delivery of mailings, goods and services," Park explains.
By verifying addresses, carriers can qualify for postal discounts and eliminate operational costs associated with improper delivery.
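A consistency check of the kind Park describes can be sketched in a few lines. The lookup table below is a stand-in assumption; commercial address-verification software works against the full postal reference database, not a hand-built dictionary.

```python
# Tiny illustrative ZIP-to-city lookup; real address-verification
# software ships with the complete postal reference data.
ZIP_TO_CITY = {
    "10001": "New York",
    "61701": "Bloomington",
    "75001": "Addison",
}

def zip_city_mismatches(records):
    """Return (record, expected_city) pairs where the recorded city
    disagrees with the city on file for that ZIP code, so a
    programmer can correct them."""
    bad = []
    for rec in records:
        expected = ZIP_TO_CITY.get(rec["zip"])
        if expected and expected.lower() != rec["city"].lower():
            bad.append((rec, expected))
    return bad

policies = [
    {"zip": "61701", "city": "Bloomington"},
    {"zip": "10001", "city": "Albany"},  # ZIP says New York
]
for rec, expected in zip_city_mismatches(policies):
    print(f"{rec['zip']}: recorded '{rec['city']}', expected '{expected}'")
# 10001: recorded 'Albany', expected 'New York'
```

Catching these mismatches before a mailing goes out is what ties the check to the postal discounts and delivery savings noted above.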
Epitomizing the importance of data quality at State Farm was the creation earlier this year of a new business unit known as Data and Information Services, which Park leads. The group, which has a team of about 120 individuals, operates within State Farm's information systems units, which consist of about 5,500 individuals-mostly business and systems analysts.
But in keeping with the reticent ways of carriers and their data projects, Park declined to provide details on data cleansing, such as annual budget allocation or ROI.
All the factors that hinge on data management are adding up to a slow shift in attitudes. Overall, global corporations are coming to grips with the relevance of data management within their operations, a sign that data cleansing is being taken more seriously.
A recent Global Data Management survey by New York City-based PricewaterhouseCoopers indicated that investments in data management have enabled 60% of firms to cut processing costs, and 40% to boost sales through better analysis of customer data. The poll was based on interviews with 600 executives in the United States, the United Kingdom and Australia.
"If I'm in charge of a business unit, then I'm responsible for making data available as it moves downstream," says Stewart of MetLife.
"As it moves downstream, the quality of the data must be guaranteed. You have to examine all the business processes that can touch data, and execute changes to the data because those processes are very long-who touches it, what they did to modify it. Quality has to be inherently shared across the enterprise, but you have to appoint one person to be in charge of it."
Those appointed to spearhead data management efforts will have to grapple with a huge challenge. Just as a lack of focus aborted many data warehouse programs in midstream, best-practices data cleansing starts with a proper approach.
"At State Farm, we use an analogy of the 'polluted lake and polluted stream' to keep this in perspective," Park explains. "If a polluted stream is feeding a lake, the proper approach is to clean up the stream first. With data cleansing, it's the same-fix and clean data at the source, and so far it has worked well."