Track 6: Big data – big risks: How to use AI in preventing bias

While major insurance carriers have vast access to big data to project risk and define customer financial landscapes more accurately, this data doesn't come without risks. Not only is data privacy a concern, but so are unintended bias and structural racism.

In this session, the speakers share lessons learned and best practices in leveraging technology platforms and AI/ML to offer higher-risk auto insurance customers competitive rates based solely on their driving records.

Key Takeaways: 
  • How to effectively consolidate large sets of insurance data to get ready for AI/ML predictions.
  • Best practices for identifying if an insurance AI solution is biased.
  • Tips for how to avoid pitfalls when designing ML models to make unbiased and accurate predictions that can elevate the quality of insurance products.
Transcript:

Doug Benalan (00:10):

Good afternoon everyone. We are excited to have this opportunity to discuss AI risk and big data. I'm Doug Benalan. I've held various roles in the technology space and am currently working as CIO and Head of Digital Transformation for CURE Auto Insurance.

Robert Clark (00:31):

And I'm Robert Clark. I'm the CEO and Founder of Cloverleaf Analytics. I have over 25 years of experience within the insurance industry, working on everything from managing general agents to program business carriers to direct writers and reinsurers, and I even did a stint running global BI for the AXA Group, working with 32 countries and 62 currencies. Then I ultimately went to the vendor side and have successfully built three different software companies, all in the analytics space.

Doug Benalan (01:10):

Let me start with a story. A few of my friends went to a wine tasting event, and the host had two brands of wine: one the cheapest, one the most expensive. The goal was to evaluate these wines blindfolded, meaning the tasters wouldn't know the brand names or any details, so they could give unbiased ratings. That was session one, and they completed their ratings. In session two, the host revealed the brand names and the details of each wine, and they evaluated the wines again with that information. I guess you can see where I'm going with this. In session one, the cheapest wine actually got the higher rating, around 55%, versus the most expensive one. In session two, once they knew the brand names and the details, the rating for the expensive wine went up to 90%. Some of the guests didn't even know the wine details or the brands; they voted for the premium brand because they were influenced by the other guests at the event.

(02:32)

So how is this relevant to AI models? AI is programmed by humans, and as humans, our bias is shaped by how we perceive our environments and experiences. AI perceives experience in the form of data selected or provided by humans, and it inherits those biases. Looking back at the early stages of AI, many of us thought that because AI is driven by hardcore mathematical logic, the results would be unbiased or neutral. But that's not true in the real world. And Rob, as you know, in some cases the bias is even amplified, with more and more big players in the industry getting sued for AI discrimination. So it's very important to look at ethical AI. Would you agree?

Robert Clark (03:27):

Yeah, I agree. One of the things you'll see within the AI space is that we all assume, like Doug's saying, that because it's computer based it just makes decisions based on inputs and outputs. But as we'll illustrate, there's a bias when the only information you have is a small subset. In the case of the wine tasting, when the tasters are blindfolded they choose one wine or the other on its merits. But once they know which wine is the expensive one, suddenly there's a bias toward that wine, even though they may have preferred the other wine before. It's the same with AI: when it's learning, it's learning on a subset of data, and if that data is skewed in one direction or the other, it's going to make decisions based on what it knows. We did have slides, but unfortunately I don't think they're up. So Doug, if you want to continue. Yes.

Doug Benalan (04:31):

So in general there are three areas where bias can be introduced: the data, the algorithm, and the business. Data bias can have multiple dimensions. One is sampling bias: when you take a sample out of your larger data set, it may not be a true representation of that set. Another is selection bias, where some of the data is systematically eliminated or omitted, so the results will be skewed or incorrect. Algorithmic bias can be introduced when there's a logic issue in the programming, and we can develop and deploy fairness tests to check for it. An example in the P&C world would be an algorithm that decides properties in a certain neighborhood or ZIP code are more prone to theft or damage, and so charges heavy premiums or denies coverage to some of those customers. That's a serious logic problem, because it isn't able to account for the good risk profiles in that neighborhood.
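As a minimal illustration of the sampling-bias check Doug describes, the sketch below (Python/pandas, using an illustrative "territory" column rather than any real CURE or Cloverleaf field) compares how a category is represented in a modeling sample versus the full book of business; a large gap is an early warning that the sample is not a true representation of the set.

```python
import pandas as pd

def representation_gap(population: pd.DataFrame, sample: pd.DataFrame, column: str) -> pd.DataFrame:
    """Compare each category's share in the training sample against the full book.

    A large gap suggests sampling/selection bias before any model is even trained.
    """
    pop_share = population[column].value_counts(normalize=True).rename("population_share")
    sample_share = sample[column].value_counts(normalize=True).rename("sample_share")
    report = pd.concat([pop_share, sample_share], axis=1).fillna(0.0)
    report["gap"] = report["sample_share"] - report["population_share"]
    return report.sort_values("gap")

# Hypothetical usage; 'territory' and the file name are illustrative only.
# full_book = pd.read_csv("policies.csv")
# training_sample = full_book.sample(frac=0.2, random_state=42)
# print(representation_gap(full_book, training_sample, "territory"))
```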

Robert Clark (05:56):

Yeah, let's catch up on the slides here. It'll make more sense when you can actually see the slides.

Doug Benalan (06:06):

Yeah, so finally, business bias is basically how the business reacts to or acts upon the data for its own benefit. An example would be an organization that considers factors not directly related to driving, such as occupation, education, or credit score, and bases its rating on them, charging higher rates to certain customers even though they are safe drivers, or offering lower premiums to certain customers even though they are high-risk drivers. So how the business interprets this data and feeds it into the business model for rating is also very critical.

Robert Clark (06:58):

And here's an example, or a couple of examples, you may have seen in the news. One company, a very large one I'm not going to name, dealt with a class action suit recently where its rating engine was using some AI, and it was actually biased toward charging more for minority groups in certain regions. They did settle that suit; you can search for it online, but I don't want to mention the name since some people here may be from that company. In another case, New Jersey did find insurers that were actually discriminating, not intentionally. As Doug was alluding to, it happens when you start bringing in other factors. One thing we looked at at one point in time was delays in making payments. If you're looking at late payments, you're looking at criteria you would think would factor into the propensity of having an accident, in the case of an auto insurer. But sometimes, when you're looking at a specific geographic region, you're really looking at a race, a specific demographic, or a socioeconomic class. And then suddenly the AI, by taking in these factors, is discriminating, not intentionally, but through cultural or other factors that pull it in that direction. So there's reason to reel back, take a look at it, and start doing a litmus test to make sure this isn't happening.

Doug Benalan (08:31):

So recently we migrated to the Guidewire platform for our underwriting, claims, and billing, and there are reasons why we chose Cloverleaf Analytics as our partner on the analytics side. First, our business wants to understand the entire data set, including our legacy data, so they can analyze it for reporting and operational efficiency purposes. That's one of the key roles we're looking to Cloverleaf for. There are also things like self-service: through Cloverleaf, our business users don't need to write complex queries or go into technical details to slice and dice the data and see the reports. And there are easy methods built into Cloverleaf to send reports to our customers in an automated fashion through their inbox. Another big area for us is customer service and the customer experience. We developed some models through Cloverleaf, like days since last activity. Through that model we can see the last time a claims file was touched by our adjusters, so the claims team can spot any gap in the claims handling process and resolve it in an automated fashion.
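A minimal sketch of the "days since last activity" idea, assuming hypothetical column names ("claim_id", "status", "last_activity_date") rather than actual Guidewire or Cloverleaf fields, and an arbitrary 14-day threshold:

```python
import pandas as pd

def flag_stale_claims(claims: pd.DataFrame, threshold_days: int = 14) -> pd.DataFrame:
    """Compute days since the last adjuster touch for each open claim and flag handling gaps."""
    open_claims = claims[claims["status"] == "open"].copy()
    today = pd.Timestamp.today().normalize()
    open_claims["days_since_last_activity"] = (
        today - pd.to_datetime(open_claims["last_activity_date"])
    ).dt.days
    open_claims["handling_gap"] = open_claims["days_since_last_activity"] > threshold_days
    return open_claims.sort_values("days_since_last_activity", ascending=False)

# claims = pd.read_csv("claims_extract.csv")   # hypothetical extract
# print(flag_stale_claims(claims)[["claim_id", "days_since_last_activity", "handling_gap"]])
```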

(09:55)

Also, one key area we have to focus on here, and I think we are all interested in it, is reducing cost and improving efficiency. Through Cloverleaf, as I mentioned, we are bringing in all of our legacy data from the mainframe and from Oracle. The mainframe data is eventually going to come over; currently the Oracle data is flowing through Cloverleaf. The idea behind this is that we'll have one system of analytics, and through that most of the IT computing and some of the manual work is eliminated. In the future it will be a seamless process for cost saving purposes.

(10:32)

I would also like to share how we prepared our data and what key factors you should consider when preparing data from an AI bias or big data perspective. The most important thing is going back to basics: look at the use case and the problem statement you have. Those are the main driving factors for what data you need to collect and how you are going to use it. The second is the basic technical side: making sure your data set doesn't have nulls or duplicates and that your data types match your expectations. Also make sure you're not looking too narrowly, in the sense of relying on a single source versus multiple sources; that's very critical too. Before turning to Rob for his input, I'd like to share one more thing: when you do claims fraud analysis, for example, your data may change frequently, so looking only at historical data may not be the right fit. When you develop AI models, you should look at the ongoing trend of the data. Make sure you have both historical and day-to-day operational data so that your claims fraud model stays up to date; otherwise it will pick up biases and won't be an effective solution.

Robert Clark (11:58):

Yeah, exactly. As Doug's alluding to, the old adage garbage in, garbage out applies when you're doing AI too. There's a lot of cleanup to be done, like you said: eliminating nulls, making sure data is normalized and cleaned up as you go through the process. If your AI is going to learn from that information, it needs to be pertinent; it needs to be not just quantitative but qualitative as well, so that your results are something predictable.
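As one way to picture the cleanup both speakers describe, here is a minimal pandas hygiene sketch (deduplication, null handling, light normalization of text fields). The required fields, imputation choices, and file names are assumptions that would depend on the actual use case, not a prescription:

```python
import pandas as pd

def prepare_for_modeling(df: pd.DataFrame, required: list[str]) -> pd.DataFrame:
    """Basic hygiene before any AI/ML work: drop duplicates, handle nulls,
    and normalize text so downstream models learn from clean, consistent data."""
    out = df.drop_duplicates()

    # Rows missing fields the use case truly requires are dropped;
    # remaining numeric gaps are imputed rather than silently discarded.
    out = out.dropna(subset=required)
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())

    # Normalize obvious inconsistencies in categorical text fields.
    text_cols = out.select_dtypes(include="object").columns
    out[text_cols] = out[text_cols].apply(lambda s: s.str.strip().str.lower())
    return out

# policies = pd.read_csv("combined_sources.csv")                       # hypothetical consolidated extract
# model_ready = prepare_for_modeling(policies, required=["policy_id", "loss_amount"])
```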

Doug Benalan (12:37):

Okay, there are some best practices when you develop AI. Like any other project, AI starts with planning, but it doesn't end at deployment. It goes through an ongoing process of monitoring and auditing, and any feedback coming from audits has to be reevaluated, fed back into the planning phase, and deployed again. So it's a continuous reevaluation, a cyclic iteration; it's not static. Some of the other best practices, holistically, again go back to the data: you should have well represented data, that's very important. The second would be to have a diverse team. When I say diverse, it doesn't need to be geographically distributed; it means you have your business-focused teams, your end users, your technology team, your data engineers, your data scientists, a good collective team to review the data and make sure the fairness testing has been done.

(13:38)

All these things have to be accounted for when you're applying best practices from the AI perspective. Finally, one other area where we burned our hands is traceability: some of the algorithms we generated had no visibility, and some of the transparency was missing, so only the technology folks could understand them, not the business folks. That's not a best practice when designing AI. When you have an AI solution, all of the teams you've identified in the project plan have to understand the entire project and the entire data set from start to finish, so the full review process can be completed, the feedback can be transferred to your user groups, and they can incorporate that feedback.

Robert Clark (14:30):

And I'll add to that. Like Doug is saying, it's an iterative process. As you're deploying AI and bringing it into your organization, whether it's on the rating side, claim determinations, things like that (we recently did something with reserving, to make sure reserves are adequate), the idea is that you're preparing the data, creating your model, deploying your model, and then monitoring and auditing it. What we mean by auditing is that you can take very simple attributes like gender, race, and age, the things you would commonly be concerned about carrying some bias, take a look at your starting data set, and see what your percentages are. If it's 50% female and 50% male, after you run it through your engine and do an audit on it, are you still within those bounds, or is it suddenly 90% to 10%?

(15:27)

Well, now you've got a bias somehow introduced into the equation. The same goes for socioeconomic status and all the other factors. So you want to go in and do a litmus test on a regular basis. And when we say regular, it's not set it and forget it, because the attorneys won't forget it; when they find something, they'll be lining up as many people as they can for their suit. So it's an iterative process of continually refining it, and when you do find data elements that create a bias, even though they aren't biased in themselves, they need to be culled out and removed, or they need to be normalized so that they are no longer creating a bias.
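The 50/50 versus 90/10 audit Rob describes can be expressed as a simple before/after comparison. The sketch below is illustrative only; the "approved" flag and the group column are hypothetical names, not fields from any carrier's system:

```python
import pandas as pd

def audit_outcome_mix(scored: pd.DataFrame, group_col: str, approved_col: str = "approved") -> pd.DataFrame:
    """Compare each group's share of the starting population with its share of
    favorable outcomes after the model runs (the 50/50 vs. 90/10 check)."""
    base_share = scored[group_col].value_counts(normalize=True).rename("input_share")
    favorable = scored[scored[approved_col].astype(bool)]
    favorable_share = favorable[group_col].value_counts(normalize=True).rename("favorable_share")
    report = pd.concat([base_share, favorable_share], axis=1).fillna(0.0)
    report["drift"] = report["favorable_share"] - report["input_share"]
    return report

# scored = pd.read_csv("model_decisions.csv")           # hypothetical audit extract
# print(audit_outcome_mix(scored, group_col="gender"))  # drift far from zero warrants investigation
```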

Doug Benalan (16:14):

That's a very good point. To add to it: while we collected the data for all these different models, we also built predictive analytics models. We did a bunch of models through Cloverleaf, for example, as mentioned, days since last activity, workload and reserve reporting, and open claim severity reports, and we are in the process of developing a claims fraud model as well. Across all these different models, the key takeaway is that we are not considering any socioeconomic data, personally identifiable data, or even geographic data. It's very important to eliminate those, because sometimes they can creep in. That's why you should have a very strong team. When I say strong team, it's not going to happen on day one; you build the team step by step, making it stronger and stronger so they can review manually. There are a bunch of automation tools in the market too, but they are still fairly immature at this point. So keeping your resources, understanding the business process, and understanding your data, that's critical for your success.

(17:14)

Yeah, let's zoom in a little bit on auditing. This is an interesting topic, because most of the organizations I've seen and talked to may not have full auditing procedures in place yet. So it's crucial to understand why we need auditing and what we need to focus on in audits. It should always start with the plan and the scope for the audit: what model you are evaluating and what concern you have about it, meaning are you checking for racial bias or gender bias. Those things have to be documented in the audit plan. Then you have to define your measures and metrics. For example, as Rob mentioned before, say you are looking at claims handling metrics and checking whether the claims handling has bias or not.

(18:06)

That's your end goal. What you can do is take the claims denial data from the deployed model and cross-check it against geographic or demographic attributes to make sure the outcome is what was intended and matches the original baseline. If it is shifting here and there, that means a bias has been introduced. That's the fairness test we have to do before approving the model for production. You should also always document your model results. And one other lesson learned I would pass along: we didn't allocate enough bandwidth or capacity in our auditing process. We went ahead with the deployment, and afterward we didn't have a good allocation in our project to handle the audit feedback or any issues coming out of the audit.
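One possible form of the denial-rate cross-check Doug describes, sketched with hypothetical columns ("region", "denied"); the 1.25 cutoff is an illustrative threshold for review, not a regulatory standard:

```python
import pandas as pd

def denial_rate_report(decisions: pd.DataFrame, group_col: str, denied_col: str = "denied") -> pd.DataFrame:
    """Cross-check claim denial rates by demographic or geographic group and compute a
    simple disparate-impact style ratio against the best-treated group
    (assumes every group has at least some denials)."""
    rates = decisions.groupby(group_col)[denied_col].mean().rename("denial_rate").to_frame()
    rates["ratio_vs_lowest"] = rates["denial_rate"] / rates["denial_rate"].min()
    return rates.sort_values("ratio_vs_lowest", ascending=False)

# decisions = pd.read_csv("claim_decisions.csv")   # hypothetical columns: claim_id, region, denied (0/1)
# report = denial_rate_report(decisions, group_col="region")
# print(report[report["ratio_vs_lowest"] > 1.25])  # groups denied noticeably more often than the baseline
```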

(19:03)

Those are the areas I think you should consider when you develop models. Finally, and very importantly, there is transparency. When you build a model or algorithm, it should be transparent; all of your teams should have visibility into it. For example, we are building a claims fraud model right now, and what we are developing is logic coded behind the scenes so that whenever a fraud alert is triggered, the algorithm gives us more visibility into the claim patterns and the data behind the alert. That way the business groups as well as the technology groups can be on the same page to understand and react to the fraud alerts. So yes, I strongly think we need ethical AI standards. That not only provides transparency and fairness to our customers, but most importantly it also provides compliance and risk management for our industry.
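For a simple, transparent model, the "logic behind the fraud alert" can be surfaced directly from the model itself. The sketch below assumes a linear (logistic regression) fraud model and hypothetical engineered claim features; a more complex model would need a dedicated explainability tool instead:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def explain_alert(model: LogisticRegression, features: pd.Series) -> pd.DataFrame:
    """For a linear fraud model, show how much each input pushed one claim
    toward (positive) or away from (negative) the fraud alert."""
    contributions = model.coef_[0] * features.values
    return (
        pd.DataFrame({"feature": features.index, "contribution": contributions})
        .sort_values("contribution", ascending=False)
    )

# Hypothetical training data: X holds engineered claim features, y marks confirmed fraud.
# model = LogisticRegression(max_iter=1000).fit(X, y)
# flagged = X.iloc[0]                      # a claim that triggered an alert
# print(explain_alert(model, flagged))     # which claim patterns drove the alert
```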

Robert Clark (20:15):

Sorry Doug, I was going to add one thought as we get into the ethical aspect of it. With auditing, there never seems to be enough time to do it right the first time, but there always seems to be time to go back and fix it. Auditing is not pretty, no one likes to do it, but it's a necessary evil to make sure up front that you're doing the legwork, so that on the back end you're not scrambling to undo something with a bias built in after you've already deployed it into the marketplace. Maybe your rating is off, and now you're having to go back, issue refunds, and try to cover your tail before the insurance department comes in and slaps you with fines. So this is one of those cases where you shouldn't be penny wise and pound foolish: put the effort in on the front end to make sure the back end is correct.

Doug Benalan (21:20):

Yeah, exactly. To Rob's point, one more thing is that ethical AI will strengthen our organization, because customers will have visibility into some of the logic behind the algorithm; they can see what you are trying to do. And if an issue is raised by a customer, or comes up through an attorney, we can always show what went into the fraud alert or the claim alert. That traceability and auditing capability is already there for us to show those results to our customers, our end users, or whoever needs to see them.

Robert Clark (22:06):

Now, before we close out, one thing I wanted to talk a little bit about: as you've probably seen in the news recently with the military's simulated drills that they've since walked back, and other stories where professors found out that a lot of master's theses were written by ChatGPT, one thing I think many of us never consider is that AI doesn't have morals. It doesn't have a conscience, there is no guilt. It has an objective, and if you give it an objective, like write my master's thesis, it's going to do that, whether or not it's morally or ethically correct. So one of the things Doug and I were talking about, because AI has only a goal and a purpose, is that in the insurance industry it's probably time we start considering creating a consortium or something similar where we develop standards for ethics around implementing AI within insurance companies, so that there's a concerted effort going into deploying it, making sure it's unbiased and ethical within the insurance company as it's deployed.

Doug Benalan (23:23):

Exactly. Another way to look at it is that this is not going to happen overnight. All this AI bias work, your organization building the data, having a process, auditing, and all these different pieces will take years. One thing to consider is that you have to have a collective approach across your organization, meaning all the stakeholders who are part of this AI development should understand that it is critical. Then have a collaborative approach to data collection; data collection cannot be a silo. Initially, when we started these AI models and deployments, we started only with the data engineers and the data team, which is not the correct approach at all. You should have your user groups, your business team, your end users who are going to use the AI, all these different groups, and in some cases customers also.

(24:15)

All these groups should be brought together so that your data effort can be successful. Also, constantly hold review sessions; stakeholder reviews are very critical. Always review, reevaluate, and feed that feedback back into the planning phase. And finally, if you are not able to agree on a certain model, or not able to decide on a model, don't deploy it, because you need to go through full, thorough testing, make sure all your fairness tests have been done, and have enough capacity to take care of all the reviews before moving forward with the deployment.

Robert Clark (25:01):

So with that, we'd like to thank everyone for attending and answer any questions.

Audience 1 (25:07):

Hi. Great presentation, thank you. In a study that I shared today at lunch, we did some analysis on senior leaders and how they're viewing data, and only 31% of senior leaders across insurance organizations are confident in the cleanliness and accuracy of their data. Do you see this as the time, now, where brands across the spectrum, whether you're insurance or retail, make sure that data is clean and accurate before you start building out the AI? I kind of see this as stepping back before we go up the curve with it.

Doug Benalan (25:51):

It's a great question. In our own experience, what happens is the data is segmented in multiple places: for example, we had it on the mainframe, we had it in Oracle-based systems, and now we have it in Guidewire, so that's three different systems. When you're trying to bring the data into one platform there are mapping issues, so you should have an analyst, a data analyst, or even business team members who understand all these connection points. That is crucial, and there are multiple steps; it's not something you can do overnight. But I agree with your point: C-level suites are still not fully satisfied, even today. Just this morning I got a question from my CEO, so there is constant demand on this front. I totally agree.
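One minimal way to picture the mapping step Doug mentions is to rename each source system's fields into a shared analytics schema and tag where each row came from. The field names below are purely illustrative, not the real mainframe, Oracle, or Guidewire layouts:

```python
import pandas as pd

# Illustrative field mappings; actual source layouts will differ.
ORACLE_MAP = {"POL_NUM": "policy_id", "PREM_AMT": "written_premium", "EFF_DT": "effective_date"}
GUIDEWIRE_MAP = {"PolicyNumber": "policy_id", "TotalPremium": "written_premium", "PeriodStart": "effective_date"}

def to_common_schema(df: pd.DataFrame, mapping: dict[str, str], source: str) -> pd.DataFrame:
    """Rename one source's columns into the shared analytics schema and tag its origin."""
    missing = [col for col in mapping if col not in df.columns]
    if missing:
        raise ValueError(f"{source}: unmapped source columns {missing}")
    out = df.rename(columns=mapping)[list(mapping.values())].copy()
    out["source_system"] = source
    return out

# unified = pd.concat([
#     to_common_schema(oracle_extract, ORACLE_MAP, "oracle"),
#     to_common_schema(guidewire_extract, GUIDEWIRE_MAP, "guidewire"),
# ], ignore_index=True)
```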

Robert Clark (26:40):

And to quickly add to that, I agree a hundred percent. With only 31% of the C-suite trusting the data, it's very difficult to trust AI to make a decision for you. If the other 69% of the data is potentially wrong, then the results could be just as wrong, and you can be setting yourself up for failure, right?

Audience 2 (27:08):

Yeah, mine's kind of related to that as well. I recently read that there was some legislation proposed, I think in California, where for any AI decision made in underwriting or claims, the customer could appeal it or request that a human review it. And that got me thinking that perhaps no model can be completely unbiased. So in that regard, how do you see AI and preference management, in terms of using AI for the customer, going forward?

Robert Clark (27:51):

Do you want to take that one? Yeah.

Doug Benalan (27:52):

I can take it. That's a great question too. There are tools out there in the market that can do some statistical analysis on the data set and quite a bit from the modeling perspective, but they're still not quite there, as I mentioned. So it comes back to building a team and having guidelines and checklists. Every process, and the major ones are data, modeling, auditing, and feeding the feedback back into planning, should have its own checklist. And with those checklists we should make sure, going back to basics, that before we run, we crawl and walk. That means before trusting the AI or an automated way of producing results, we should go and cross-check.

(28:48)

Check that your endpoints, your data sets, your algorithm, and your final outputs match everywhere and are connected, and that all the user groups you've identified are comfortable with the approach before deployment. My suggestion would be, before even using an AI model to make a decision, for an organization that's trying to get its feet wet to start with AI that produces information. For example, if an adjuster wants details on certain claims, the AI can be developed to surface those claim details to the adjuster, and the person looking at the data makes the determination. Then once that has matured over some months or years and everybody is comfortable, deploy a model into production to make decisions. Does that make sense?

Robert Clark (29:46):

And I'll add some caveats to that as well. AI is new, and, not to do a plug for Axiom, one of the topics at lunch today was about personalizing the experience of a prospect or an insured. In that case, I think AI is great, right? Because it's not making a determination; it's personalizing their experience based on their characteristics, what they're doing on the web, things like that, where it's safe and you're not going to end up with a bias that injures someone financially or in some other way. But then at the same time, there's accountability, like you're saying with California: if AI is making a decision, insurers are going to be a lot more careful about how they let AI make decisions if they know the insured can come back and say, hey, I want a human review. And if they start getting more and more human reviews, with underwriters or claims adjusters having to review cases and finding that there is an issue, I'm sure the insurance department is going to come down on them. So I think having that option will actually force insurers to be very careful about how they deploy it, and, as Doug was saying, about that whole audit process, making sure those decisions are in line with what the insurance company is expecting and not rogue or causing issues.

Audience 3 (31:28):

You talked a lot about the audit process, which to me is more after the fact, but I'm trying to understand: is there something we can do to prevent the bias from creeping in in the first place? To that end, I was looking at the three types of bias you described, taking bias in the data as an example. We all know that correlation is not causality, but humans are very good at jumping to causality when they see a correlation, whereas I would expect machine learning algorithms to be fairly good at separating causality from correlation. So my question is, if I take gender as an example, gender as a parameter versus adverse risk, why would these learning algorithms jump to that conclusion in the first place, associating gender with, or treating it as a cause of, adverse risk? And what do you do to prevent that? If a machine is doing that, do you mask the gender altogether in the data set? What do you really do? So there are two parts to my question: why would machine learning algorithms jump to those wrong causality conclusions, and if you see the algorithm doing that, what do you do? Do you mask it, or something else?

Robert Clark (33:19):

Right, I can take it, and Doug, you can jump in as well. On the first part, where the AI may make a wrong determination based on gender: in those cases, a lot of times what we've seen is that it's the characteristics. Say the spending behavior of a woman versus a man when it comes to purchasing insurance, whether they do payment plans or not, all these different characteristics, sometimes even cultural ones, come in, and the model makes a determination from them. It has nothing to do with the gender itself, but the characteristics that make up the behavior of that gender can lead the engine to give you a result that's gender biased. Does that make sense? Getting away from that is the hard part: determining which characteristics led it to that determination, whether it's spending habits or the type of limits or deductibles chosen by one gender versus the other.

(34:30)

Determining that is the difficult piece. That's really where you go in and dig into your algorithm and ask, okay, what were all the pieces that made up this decision? You say, okay, here's the set of data where it's biased; what's common in that data? That's where you can often find the attributes, and sometimes taking just one or two of those attributes out is enough to get rid of the bias, because it's that attribute combined with other characteristics that makes up the bias, if that makes sense.
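One hedged way to hunt for the proxy attributes Rob describes is to score how well each candidate feature, on its own, predicts the protected attribute; anything close to a perfect predictor is likely a proxy worth removing or normalizing. The column names here are hypothetical, and this simple one-variable screen would miss proxies that only emerge from combinations of features:

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def proxy_strength(features: pd.DataFrame, protected: pd.Series) -> pd.Series:
    """Rank numeric features by how well each one alone separates the protected groups.
    0.5 means no signal; values near 1.0 mark likely proxies."""
    target = (protected == protected.unique()[0]).astype(int)
    scores = {}
    for col in features.select_dtypes(include="number").columns:
        values = features[col].fillna(features[col].median())
        auc = roc_auc_score(target, values)
        scores[col] = max(auc, 1 - auc)   # direction-agnostic strength
    return pd.Series(scores).sort_values(ascending=False)

# features = policies.drop(columns=["gender"])             # hypothetical frame of rating inputs
# print(proxy_strength(features, policies["gender"]).head(10))
```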

Doug Benalan (35:02):

And there's no single definitive proactive measure, right? There's both proactive and reactive. When I say proactive, you should be engaging your resources, having the checklists and all the basic steps, and understanding your use case; that's the proactive part. But there's also the reactive part: auditing, testing, fairness tests, and so on.

Robert Clark (35:20):

Well, and I would say that a proactive step would be taking your existing book of business, running it through your algorithm, having your AI make the recommendations, and then doing your audit on that before you've even unleashed it on new prospects. That way you can check, before you've actually released it into the wild, that there's no inherent bias in it. All right, thank you everyone.