Data is the lifeblood of nearly all organizations, enabling them to better understand and serve partners and customers, meet compliance regulations, get products to market faster and more efficiently, improve business processes and increase profitability.
However, success at all of these goals depends on having the right data for the right tasks available to the right individuals. That makes the issue of data quality a top concern. Information Management recently hosted a roundtable of data experts in the financial services and insurance industries to discuss their best practices for establishing a data quality strategy, selecting the best data collection and governance processes, and creating an organizational culture that focuses on preserving data quality.
The event, which was underwritten by Trillium Software, was moderated by Information Management Editor-in-Chief David Weldon. Panelists included:
Louis DiModugno, chief data and analytics officer, AXA
SR Ramakrishnan, data management expert and former chief data officer, Citigroup
Dennis Klemenz, chief information officer and vice president, Connex Credit Union
John Ross, director of analytics and business intelligence, Hartford Funds
David Gleason, head of data strategy and chief data officer, JPMorganChase
Jed Maczuba, vice president and director of enterprise architecture, MFS Investment Management
Marc Alvarez, chief data officer, Mizuho Securities USA
Sumedh Mehta, chief technology officer, Putnam Investments
Eric Meltzer, executive vice president and chief technology officer, OppenheimerFunds
Keith Kohl, vice president of product management, Trillium Software
What follows is an edited version of the roundtable discussion:
DAVID WELDON: I thought the best place to start is to discuss what good data really means; why does it matter; what is at stake with it; and what are the causes of poor data.
JED MACZUBA: Data exists to enable folks in our business to make decisions. Therefore, I think one of the greatest risks to the organization is poor data quality. The worst thing that can happen will be workers making bad business decisions based on data they may perceive as trustworthy, as complete, as accurate, as consistent. That can lead to them making decisions that are not in line with the business.
In our case, we are reporting this information to our clients and to the regulators, so I think poor data quality could have a significant impact on our businesses. In addition, if you have gone through a situation where you have actually given out bad data, you have instilled this lack of confidence in the information. It is very hard to get that confidence back. You only get one chance to make a first impression, and if you put bad information out there, it is hard to get everyone back on board with confidence in the information you are providing them.
ERIC MELTZER: I agree. The point is trust in the data. Once you lose that trust, it is very, very difficult to get it back.
DAVID GLEASON: To me, the trust is very often driven by transparency into what the data contains and what it is. Looking at an individual piece of data, it is difficult to say if it is correct or not. But when you start dealing with data sets—with streams of data coming in to a function or to a report—now it has a whole set of new aspects. For example: Is it complete enough? Is everything I would expect to be included present here? There is no easy answer about whether it is correct or not. Clearly, the level of completeness, accuracy and inclusion you would need for external financial reporting is very different than you would use for an internal cost metric. So I think we are really working hard to make sure we’re providing transparency around what exactly is contained in a given data set, how we measured its completeness, its quality, etc.
SUMEDH MEHTA: I think it starts even before then, when the data first gets into your world. In the sea of data, it is very hard to spot bad data. However, when data is coming into a system you are used to looking at it in a certain way and you can spot the odd data. If you do not do a good job right at the outset and you let bad data in, then you have to live with it for a very long time, and that is where people like us pay a big price.
I think there’s another trend, which is cloud computing and data living outside your organization that you are now relying on. In addition, all of the big data analytics proliferating in our world is giving us an order-of-magnitude change, and people have to think differently if they are going to manage the quality of that data versus internal data.
MARC ALVAREZ: For many people, the first issue is delivering data. But I cannot think of a more highly globalized industry than capital markets. They are intensely globalized now. To compete, you arm yourself with better weaponry, and you put in more (not necessarily better) data to support your decisions and to do things your competitors can’t or won’t.
MEHTA: I agree. Let me give you an example. McDonald’s introduces the all-day breakfast menu. The Twitter sentiment on it was nuts. Months later, earnings reports start to reflect an increase, and the stock price goes up. However, this data was available well ahead of time. So talk about business models and giving people these insights and having access to information that does not live within our four walls. We have been used to managing data internally and now, all of a sudden, you’ve got this large amount of data being created and accessed externally.
MACZUBA: In addition to any sort of transparency on the regulatory front that we all live in, what I have seen is the growing recognition internally of the vast amounts of data that are available to make business decisions. The challenge is how we marry up all that data with the people who most need it.
GLEASON: It is interesting, the onslaught of new types of data that just were not available a few years ago. We have data coming in from social media platforms, data from the Internet of Things, and a variety of other sources. It is feeding this insatiable demand for better analytics.
The problem that you often see, though, is the person who first gets that data knows exactly what they are getting, hopefully. They are aware of what filters they are putting on that Twitter feed, how they are acquiring data, and they have a good idea of what it means.
The thing that keeps me up at night is what happens to that second or tertiary consumer that perceives the tremendous potential of this data and wants to use it, but they do it without knowing what it means, where it came from, how it was filtered or grouped, or what has been done to that data. That can be bad data to them. It is bad if it is not there when they want it to be, if it is not as complete as they want it to be, if it does not mean what they thought it meant. Any form of disconnect between what they think it is and what it really represents creates bad data, whether the data is accurate or not.
MEHTA: I think that goes back to who you are as a company. Putnam is an innovator, with CEO Robert Reynolds running the firm. He is looking for change and he is looking to have advantage in the marketplace. We have technology that did not exist before, so let us utilize it. So if you are an innovator, then you want to input that data, but your typical data governance processes want order, not innovation. So if I put in the rules that say, “Here is what must be true before I will let a consumer utilize it,” then you cannot innovate, and then your data strategy isn’t working for you.
SR RAMAKRISHNAN: I think that point is extraordinarily well made. I think there is a continuum here. I think that financial reporting and the disclosures that you want to do formally are one thing, and the dependency on high-quality data and the rules governing it are another. However, at the extreme other end, we really want to be able to exploit that next new piece of data for our competitive advantage and, obviously, we cannot use the very same rules and logic at this extreme end of the spectrum. Now, unfortunately, institutions do not understand this continuum very well, which means they throw the rulebook at the innovator end of the spectrum and they do a disservice to the company as a whole. We have actually seen this happen in large corporations, where extreme prejudice is used in terms of data quality. That flexibility is extraordinarily important to the culture of the company.
MEHTA: With regard to flexibility, I think the data groups in our organizations deserve some credit. They do not get enough love. Those groups have done extraordinarily well, and in many organizations where it is about portfolio data, financial data, I think it’s often run like a utility-like service. The issue now is that with more data coming in and businesses expanding, there is a multidimensional expansion of that data. However, budgets are not going up. So how are they going to look at information and provide the same level of scrutiny so that the output increases and the productivity increases? I think that is what everyone’s grappling with.
WELDON: How do you prioritize your data quality initiatives, based on what you are saying? How does the CDO go about setting priorities?
MEHTA: At the one end, is the regulatory stuff—there is no excuse there, right? You just have to take care of that stuff. There is confidential information you have to deal with, and there are certain policies you are going to put in place.
But it is a sea of data. So you have got to manage information at the source. The prioritization really is focused on keeping the data secure, keeping the business running, taking care of all the regulatory measures. However, the same teams have to find time to help enable the business to go where it is trying to go, which means using all of the information available. There is value to be had in all that data, and that value is critical to our success. Besides, if we do not do it, someone else will.
KEITH KOHL: Who owns the quality of the data?
MEHTA: You know, it is a cliché, but quality is everyone’s concern, and you cannot have it any other way.
ALVAREZ: This concept of ownership that the consultants have us all enjoying is meaningless. I have had knockdown, drag-out arguments with them on this to the point where I banned them from using the word in my presence. It does not mean a thing. If the question is asked, “Who owns it,” it indicates a lack of process. It indicates a lack of standards, and we all know the process is standards, right? We come in every day. My traders go sit on the trade desk, take orders, put orders out, and they know their process.
It has to be in the DNA of the organization. We are quantitative and statistically driven in what we do. It is really the bottom line of what many people are coming to grips with. We all have to get to know our inner statistician.
GLEASON: I will never say the word again! However, at the same time, I think there is some very specific accountability that you need to be able to assign to data. You need to be able to say that the trader who comes in each morning is accountable for certain decisions they make around the way they are capturing data, and they are accountable for following processes and requirements to accurately capture the facts.
People who consume data are accountable for making sure there are clear requirements and definitions for how that data is captured to assume its primary purpose. I think the point was very well made earlier that we have to move beyond the world of “all data is good.” We have gold, silver, bronze and even maybe some tin data, and as consumers of data, we are accountable for understanding what we are consuming, and making appropriate decisions about how to use different types of data. There is a lot of stuff I would do with rapidly acquired streaming data coming in off the Web for analytic purposes that you would never ever want to do for a management report. Part of the job is making sure everyone understands what data has which varying levels of quality or certification attached to it.
ALVAREZ: That is really interesting. I would be really interested in hearing how you have communicated to the developers of those capabilities that they also have this dimension of data awareness they need to be accountable for.
RAMAKRISHNAN: I think that analytics professionals—statisticians, data scientists, as they seem to be called nowadays—are foundationally aware of data types and data quality, because 70 percent or 80 percent of their time is spent just doing that. Model builders spend 80 percent of their time just grappling with data, so to say that somehow they are disconnected from quality concern is not true. In fact, the reverse is true. They hardly spend any time actually doing so-called model building. They are actually spending most of their time either acquiring the data or making sense of it, or putting their heads together and deriving common meaning and understanding.
Our job, essentially, is to make them more productive by building up the capability so individual pools of people are not wasting their time and so it becomes a collective enterprise. Therefore, we are forcing them to “up their game,” and that is a big point here. You have to somehow wean them from this whole data management concept.
MEHTA: I think the field of advanced data analytics will do more for data quality in the next two years than data governance processes have done in the last 20 years, and we can see the writing on the wall. Data analytics will help companies make decisions about the businesses they are going to invest in and the direction they are going to go in and create differentiation.
Therefore, as we look at the marketplace and ask, “What’s going on out there, and how does a company like Putnam create the best advantage for its shareholders?” we want to deploy data analytics to create that differentiation. As people like data scientists get to work and start looking at that information differently, we are now seeing that exposure again, but it is multiplied tenfold.
Today, data analytics has the CEO’s attention, and it is really a business enabler. It enables business goals to be met. The same is going to happen with data and data quality, where decisions are going to be made based on the analytics. The analytics are going to drive the future, and if that information is not correct, someone at the CEO’s table is going to care deeply.
GLEASON: I think we are only seeing the tip of the iceberg in terms of that potential value add to the corporation and to the shareholders. To SR’s [SR Ramakrishnan’s] point, if you look at data scientists today, the average data scientist is a better programmer than most statisticians, and a better statistician than most programmers are. This is to say that they are spending a lot of time wrangling, and finding and manipulating data. As we get better at giving them the right information, managing the level of available data, and building the quality of the data so they spend more of their day actually doing the analytics and less of their day hunting for and finding and cleansing the data to get it suitable for their purpose, the value just keeps growing. I think we are still early in the capacity of their ability to add value.
RAMAKRISHNAN: You know, there’s a profound point being made here, which is that if the actual consumers of data or the people who interact with the data are most familiar with it and are in the best position to affect or to determine its quality or, in fact, improve its quality, it has a profound bearing on what kind of analytics tools are most appropriate to use.
That whole paradigm needs to be reinvented, and some of the new firms are doing that, which is a self-service approach to looking at data and looking at very large data sets. Now, this has the potential to cut short the entire cycle and empower the people who are most familiar with data—who are most responsible for it—to actually act on it. This works where we are trying to do analytics and trying to get immediate business value, and that shift is something we can see happening.
WELDON: I wonder where data governance sits within everybody’s organization, and what the business participation role is in all of that.
ALVAREZ: We are really benefiting from what many other bigger firms have done—JPMorgan, Citigroup and others—in establishing the framework. What I am doing is to set off a number of small initiatives that have very visible outcomes, one of which is measuring data quality.
I have a background in econometrics. It totally makes sense to me that the first analytics we should be writing are the analytics that tell us the health of the data. It seems obvious to me.
Within the next four to six months, I am guessing we will have an official governance function and track all the things that the regulators want us to. However, to my mind, that is just the starting point. That is just the ticket to doing business. I think everybody has an expectation of the level you need to be at as a global investment bank.
As I mentioned, we have a couple of small initiatives coming out over the next few months. We will start to assemble some stakeholder groups and, frankly, do some internal marketing. We are going to build a community, a constituency, and interesting stuff is just going to fall right on top of that.
WELDON: How do you build that community?
ALVAREZ: Demonstrate value.
RAMAKRISHNAN: To demonstrate value you have to first be able to measure data quality. And if you cannot measure it, you cannot really manage it that well. Has anybody grappled with a large-scale metric around data quality? Because we all have some instinctive sense of what data quality is, but when it comes to actually measuring it, what are you measuring?
ALVAREZ: I am dealing with that right now. I am falling back on the old ISO total quality management model. We start with conditions. We write scripts. The outcome is a set of indices, and we used this for software development projects in the past. It works really well.
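[Editor's sketch: the sequence Alvarez describes (conditions, then scripts, then indices) might look something like the following in Python. This is an illustration under stated assumptions, not any panelist's implementation. The `Condition` type, the `quality_indices` function, the sample trade records and the three rules are all hypothetical, and each index is simply the share of records that pass one condition.]

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Condition:
    """A named rule that a single record either passes or fails (hypothetical type)."""
    name: str
    check: Callable[[dict], bool]


def quality_indices(records: list[dict], conditions: list[Condition]) -> dict[str, float]:
    """Return, for each condition, the fraction of records that pass it (0.0 to 1.0)."""
    if not records:
        raise ValueError("cannot score an empty data set")
    return {
        c.name: sum(1 for r in records if c.check(r)) / len(records)
        for c in conditions
    }


# Made-up sample records, for illustration only.
TRADES = [
    {"trade_id": "T1", "price": 101.5, "currency": "USD"},
    {"trade_id": "T2", "price": None, "currency": "USD"},
    {"trade_id": "T3", "price": 99.0, "currency": ""},
    {"trade_id": "T4", "price": -5.0, "currency": "JPY"},
]

# Assumed example conditions; a real rule set would come from the business.
CONDITIONS = [
    Condition("price_present", lambda r: r.get("price") is not None),
    Condition("price_positive", lambda r: r.get("price") is not None and r["price"] > 0),
    Condition("currency_present", lambda r: bool(r.get("currency"))),
]

if __name__ == "__main__":
    for name, score in quality_indices(TRADES, CONDITIONS).items():
        print(f"{name}: {score:.0%}")
```

[On this made-up sample the script reports 75% for price_present, 50% for price_positive and 75% for currency_present. The list of conditions is kept deliberately short so that each index stays tied to a question someone would actually ask of the data.]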
RAMAKRISHNAN: This is actually enormously interesting, because development of that type of an index of quality is something I have attempted to do and I am certain many of us have attempted to do. Are there any experiences here about whether it works or does not work, and about issues surrounding measurability and effectiveness?
ALVAREZ: My experience is these things can very easily become interesting IT projects and database management projects. But if that happens you are doomed, because suddenly you will get a whole shelf full of related values, coefficients, and indices that nobody ever looks at and are completely meaningless.
RAMAKRISHNAN: My own experience with this measurement journey is that you start with defect and issue tracking, where people are observing, “I’ve got this problem, I’ve got that problem.” Very often it is difficult to move from this defect-tracking methodology and mindset into quality tracking and measurement. Depending on the size of the problem you are dealing with, you can get overwhelmed by the defect management methodology. You are firefighting all the time.
MEHTA: It is also a matter of maturity. If I look at Putnam Investments and what Putnam has gone through over the past 20 years, the level of maturity is really strong around data quality and data governance. There are certain classes of problems that just would not occur because of the processes that are in place, people are aware of them, and you spot things faster.
That does not mean that issues do not happen. Obviously, you react to them. However, what is happening is that you move from taking care of the day-to-day business and making sure that the business is running, the engine is running and the trading systems are up, and you have the ability to go trade. I do not know that we deploy data science for maximizing profit. I would say we deploy data science to maximize shareholder value. There is value to be had, and you want to go get that.
LOUIS DIMODUGNO: One of the things we were really successful with was when we moved from spreadsheets and reports into visualization capability with our data. That is where we got our business lines to start to participate. They could see where the impacts were, and they were then able to really start to dive into how we can make fixes.
MEHTA: This is why I say analytics will do more for data quality in the next two years than we have in the last 20.
JOHN ROSS: So, just a question to the group. We talked about data scientists. Where do they sit in the organization? Are they in IT, or are they out in the individual departments across the organization? In our group, we are actually seeing them move out of IT into the organization, where now you have someone in tune with the data, who is in touch with the business side of an issue, and working with that data so they can communicate back the needs to the IT side to provide that information.
MELTZER: But it sometimes helps to have both. Most of the data scientists are sitting with the business units. But within IT, there’s one or two—especially a liaison looking at things and helping to facilitate access to data, self-service or whatever the case may be, or data acquisition, so that’s really important.
ROSS: That is where a lot of policing goes on, because those people are actually out in those departments and know the requirements are ever changing.
MEHTA: Right. I think there is a new breed of people that are embedded within IT—you know the more math-type graduate versus the computer science graduate. They can interpret the results, so I actually see IT taking a lead in this area.
RAMAKRISHNAN: From my experience that is quite the exception, what you are attempting to do, because the role of IT tends to be tremendously more passive than what you are doing. What tends to happen is that they are usually waiting for the business to communicate what they are thinking: “Tell me what you want” versus “Tell me what you got.”
WELDON: What is your role in establishing a culture, and how much buy-in do you get?
MEHTA: Culture is top-down. We are aligning to a culture that says, “Be disruptive in the market.” We do not want a fintech to come in and take our market share. Our CEO is a leader. He was, for example, the first person in the financial services industry, in asset management, to use social media.
It is really mindboggling. He is credited with building the 401(k) business for a really large financial services company and now for Putnam and Great-West combined. It is the second largest 401(k) business as measured by participants. So someone with that ability to gain market share and drive business requires that the technology department keep up with them, align with their vision and bring as many business partners along as possible, because that’s what everyone’s trying to do.
MACZUBA: It is a shift for most IT organizations.
RAMAKRISHNAN: So the point here is that IT people are not the natural owners of their own data, or are not naturally provided access to data. They are prevented from doing so, because the developer community works only in the developer domain, not in broad data or even in acceptance testing. The business community needs to be empowered so they have access to the data they need and the tools they require to act on that data.
The deep issue here is that the specification process is done sight unseen. Nobody sees the data before they specify what they want. The developers do not look at the data, because they are walled off from it. The acceptance testing process happens too late in the cycle; you have not profiled it and reworked your specification. Therefore, I think there is a deeply broken methodology here.
ALVAREZ: But the methodologies do exist, right? Like basic ISO 9000 quality management models?
GLEASON: I think we are starting to see the beginning of a sea change in the role of technology and the business owner of technology. I look back at how over the last decade or so, all of the people in our business were aware that we all had a role in managing risk. Now we are all aware we have a role in managing information security. I am sure all of us have mandatory training, and we reinforce the message to everybody at every level in the organization that we all have a role in the security of the data.
What we are finally starting to see is the message cascading out to the business as a whole that we also all have a role in managing the usage and the quality of that data. As people start to take more responsibility for understanding what the data actually looks like and are able to communicate requirements back in a meaningful way to the organization, I hope we are starting to close that gap.
When I look at most firms today, there’s a little group of specialized people—the advanced analytics group, the big data group, scientists, whatever you want to call them—and they are sort of bridging that chasm, and are working with the actual data. They are seeing the data, and they are sitting side by side with the developers.
We have to find a way to bridge that gap even more, so we have business consumers of data who are hands-on with the data and working side by side with developers, as other industries have figured out how to do. To do that in a very large organization with regulatory commitments, with data privacy concerns, etc., does pose a little bit of a challenge.
MEHTA: I think the paradigm has shifted, and while there is a real need for control and access, you have to redefine the roles and the need. The way that need is defined is that a developer does not need data. Of course, the role is changing and evolving, as you bring in a new set of talent. They are going to demand to be more productive. They need to be able to do certain things.
As the analytics conversation becomes prevalent at the boardroom and at the CEO table, they are going to demand to know where that data came from. That is going to start up a whole new focus on quality, and the people who we said do not get enough love will figure out what happens when they are loved. They will come to the forefront. They will have to, just as technology did from being just data processing into a business enabler. This will happen with data and analytics.
WELDON: Let’s each talk more about your own individual challenges. If you could ask one question to the other roundtable members, what would it be?
GLEASON: I would actually like to ask Dennis Klemenz a question. You are also a professor. What do you see happening in the educational space with training and teaching people about data? Is it changing as much as people like to think it is, or what is really happening?
DENNIS KLEMENZ: I think from an academic perspective, there is a very clear gap in skills between what companies need and what academia today provides. What we are seeing is students sliding away from the traditional degree and going toward these eight-week, quick development programs. It is more hands-on.
WELDON: Like a boot camp type of thing?
KLEMENZ: Yes. They are producing very production-oriented and functional employees. The issue is the depth of the knowledge from a theoretical perspective. They are very targeted and very skilled in that particular type of problem. The issue is that many of those students have not learned how to evolve—so that as the technology changes, the problems change, and you see them needing to go back to boot camp.
In academia, you are seeing a slow evolution. They move at the speed of many very large corporations. However, we are seeing that many of the younger faculty members are bringing in their corporate relationships. Therefore, you are starting to see many relationships between companies. United Technologies—a big corporation in Connecticut—has many relationships with universities. I think it is going to require businesses reaching out to the universities. In the meantime, many of the big universities are starting to make that shift toward those boot camps. They are seeing the demand.
WELDON: Well, Dennis, I have been reading and writing about the skills gap in IT for 25 years. What is the role of industry in addressing that?
KLEMENZ: You have to lean on the universities. You cannot just passively wait for people to graduate and say, “Hey, this person graduated from Yale with a computer science degree. Great.” Be active with academia, create partnerships, and really create relationships with the universities. You have to remember you are dealing with academics, so unless the university knows that this is a problem they are not going to address it. They are going to keep doing their research, keep doing their teaching. They have to know it is a problem.
ROSS: Do you see student applications starting to rise in these areas for these boot camp-type classes?
KLEMENZ: Oh, yeah. Oh, yeah. A lot of it is career shifters—people who have had a career in a different industry—and they say, “Oh, I really need this skill.” They are not going to spend four years going to undergraduate school. They will spend eight weeks; go to a boot camp to try to bridge that gap. So yeah, it is a booming industry right now.
ROSS: I think a lot of that has to do with the fact that you have people who are now seeing the value in the analytics and recognizing that, “Hey, I had better go back and train on this, because this is where we’re evolving.”
MEHTA: I wonder what it will take for academia as a group to wake up to a new reality and see that the way to do business has changed. I do not know if it is we as parents not willing to pay the exorbitant costs anymore, or some other external influence. It would really be nice if it happened internally and we recognized that if America is to maintain the lead that it has enjoyed up until now, then we have to do something about the talent.
WELDON: SR, your question?
RAMAKRISHNAN: Just kind of a wrinkle to the point everybody is making on academia. My daughter is in an MBA class at Georgetown. I find in broad conversations about the marketing discipline, academia is 10 steps behind the practitioners. The practitioner is fundamentally shifting from the classic branding methodologies to real data and working with actual behaviors. You do not have to guess things and segment stuff. You can look at the data. This is a cultural shift.
WELDON: Dennis, how about you?
KLEMENZ: On the larger scale, there is this tug and pull between centralized and decentralized having control over what data goes in, versus allowing people the flexibility to do their own analysis. So how do you manage that process in giving your analyst the ability to bring in data of which the quality may be suspect, and they may draw wrong conclusions, versus maintaining control? I just worry it is almost like “Jurassic Park”—the T. Rex will stay in the pen for only so long; it is going to break loose. So how do you control the damage that T. Rex does, especially if it is built off poor data?
ALVAREZ: The answer to that, for the moment, for most of the industry, is badly. What you find are points of innovation. It exists in what is happening in our industry, and I am sure it exists in the asset management side. Innovation is very, very central to the success of these businesses.
People are innovating all the time. Whoever shouts loudest gets what they want. Whoever makes the most money for the firm gets what they want. Subservice groups (operations and, to a great extent, technology) are kept on a very tight leash.
Now the irony is, in the world of trading securities, we have had very precise rules to follow for 30 years now, right? How you use data sourced from exchanges and stuff like that has a very tight playbook of rules that you have to use. If you go into a trading room, there is real-time data distribution going on. Everyone knows about information in control room accounting. It is very ad hoc, and really, it would come down to the culture of the individual firm as to what is and is not allowed.
MACZUBA: I think it’s recognizing the different use cases. We talked about regulatory versus exploratory use of the data and being comfortable having that conversation with the business. I know when we talk about data quality, it is not binary; it is not simply good or bad data. We talk in terms of our level of confidence; in terms of the different dimensions that we check; the tests we have in place; and that we add in certain premiums around complexity.
It is tough going to the business and saying, “Hey, if you want to do exploratory work, bear in mind we have an 85 percent confidence that this data is what it says it is.” However, if you are going into regulatory or client reporting, the confidence has to be higher than that. It comes down to the culture of the firm—is that something you are coming to believe?
MELTZER: It could also be about creating a sandbox. It needs to be segregated and walled off so you can provide that capability or flexibility to those analysts, portfolio managers, or whoever wants access or wants to play with something, so to speak.
MACZUBA: But how do you build the area where they can feel comfortable doing that, playing around?
RAMAKRISHNAN: I think Eric is onto something really, really important. While historical efforts have been in this direction, usually people have acquired the data themselves. What we are really saying here is there is a common acquisition process, and then you are providing something to someone for an established bit of time. You are saying, “You’ve got this for three months, and it will be killed.”
WELDON: Jed, would you like to pose a question?
MACZUBA: Something that came up earlier: did anyone get over the adoption hurdle with their data governance organization, or are there any words of wisdom they can share with the group?
ROSS: Let me add to that question. How many people feel they have gotten to the point that their data governance program is fully established, that it is fully functional and it is what it was meant to be? Do you ever really feel you have accomplished your original goal?
MELTZER: At Oppenheimer there has been a data governance council and data policy around for probably two, three years. I do not think the job is ever done, because it is always evolving. Have they gotten the most out of it that they can? At best, I think there has been a lot of really robust discussion.
I think what is problematic, especially in the asset management industry, is that you have a single data set for investment decision making. Then, as it goes through the cycle, it could be used for part of the sales process, but it is definitely used as part of the marketing operation. So the challenge becomes how do you resolve or solve those types of issues or problems that satisfy your entire constituency without creating so many derivations or replications of the same types of information?
ROSS: How do you standardize, and yet give everyone the flexibility to answer what his or her particular need is?
ALVAREZ: Different parts of the organization may have different uses of the data, right? It is that simple. Sadly, it is sometimes your risk management group or treasury group throwing spears at one another.
ROSS: I think the goal of these programs should be education. How do you educate the individual users? For example, if you are using this data set, and you want to look at it for the marketing aspect of it, you want to apply these filters.
RAMAKRISHNAN: One of the things I have seen most recently surrounding this is organizations trying to get crowdsourcing or collaborative methodology to define data sets. Those who use it most intensively probably call it these things. They give it names, like metadata. However, what tends to happen is that it is only they who know about it. Other people who wish to know about that data are left to their own devices. Therefore, what is happening now is there are tools available that allow that data to be revealed to everyone and for everybody to contribute to that whole process, so that there is an ecosystem of providers and users who have knowledge of the data.
And that opens up a great possibility, because in times past, in the old warehouse definition process, there used to be this ivory tower approach of somebody going off, building a data model, and it never seeing the light of day. However, what we are seeing right now is data is coming in because somebody wanted it and that person at least minimally knows what they want and how to define it. The questions are, can others consume it, and which are the best tools available to help them act on it?
WELDON: Louis, your question for the group?
DIMODUGNO: So I have the fortune and misfortune of being the chief data officer as well as the chief analytics officer. A lot of my data initiatives are driven by my analytics initiatives, kind of a symbiotic supply and demand area. For you folks who focus just on the data component of it, how do you go about really attributing the value of that data to your organization? Even I am starting to read issues around how some organizations are actually looking at this as an asset to put on their books. How do we go ahead and take that data as truly a capitalized item?
RAMAKRISHNAN: I have heard the question asked before, but never answered.
ALVAREZ: My view is data is a service to the organization. This is something I am introducing to my firm right now. We are a global firm, so when I talk to my peers in London or in Hong Kong, they may have a different approach, but as the bank in North America, my view is data management is a service we provide to the firm, principally to the investment bank. Within that, I have user communities, one of which is the analytics community.
The analytics community typically has actual business sponsorship behind them. The business rationale tends to be quite clear, to the point where many of our new initiatives will not move forward unless there is a business plan there. From a qualitative perspective, it is a compelling case as to why the analytics groups are high priority.
DIMODUGNO: I can essentially drive my data budget based on the output of my analytics organization. So without the true connectivity of your analytics group, how do you evaluate what your budget is going to be for the next year?
ALVAREZ: By looking at data as an asset of the organization. In fact, I think some organizations I have done some work with are very early in the game, but they are starting to monetize those assets, whether it is coming out with pricing evaluations or indexes.
WELDON: Marc, what is your question for the group?
ALVAREZ: My question is: does anyone foresee increased adoption of self-servicing analytics as part of day-to-day activities?
The reason I ask: I foresee a day where the data scientists are basically an internal consulting group working on projects, and I would like to equip them with the right tools to be efficient. I will have another group who will worry about the data management, cleansing, all that stuff. The question in my mind is, what is the role of self-service to perform these functions, and how good can we get, and how fast?
GLEASON: So I do think there is definitely a movement toward self-servicing, increasing self-servicing, but within a managed ecosystem, as SR described, where there is a central function to acquire, secure and in some cases classify data with regard to reliability, availability, quality, whatever, of that data. Therefore, some base-level utility needs to exist that lets individual consuming organizations choose from the menu of available data or data sets, leveraging data-blending technologies to create the particular data set they need.
MACZUBA: I think the tools are out there. We all see them, right? The tools out there make it a lot easier for end users to do self-service.
MELTZER: I think it’s really important—this trend toward self-servicing—because you want the business to be able to answer the questions without, in the one sense, having IT get involved with a lot of this.
ALVAREZ: That’s my entire goal—trying to optimize our use of IT. Right now, businesses are running to IT with a problem far too early. They do not specify what they want. They do not specify how fast it has to be. All they say is, “I need it, and I need it yesterday.”
KOHL: Don’t the tools also need the data quality component you guys have all been talking about?
ALVAREZ: Absolutely, because if you want to sell to us, if you touch on any data we do not give you, you will find yourself in my world of audit. That is it. That is going to be a purchase criterion going forward for us. I just cannot take the risk. There are documented penalties from our friends at the Federal Reserve and others if we do not exert that due diligence.
WELDON: Sumedh, your question for the group?
MEHTA: I would love to hear everyone’s thoughts on collaboration for data management. We face similar issues. How can we better collaborate? How do we handle data issues, and what happens as we move to the cloud and we define the scope to be a little bit different? How do things change?
GLEASON: I think the form of collaboration is changing. What we will see are some intermediaries removed from that equation, and we will be doing near to real-time direct party-to-party collaboration.
ALVAREZ: Well, I think the cloud is going to be the answer for many different reasons. It is an environment where collaboration is feasible.
Today, collaboration requires quite a large base before it really makes sense. For example, we collaborate in payments, right? We are all members of the Swift network. We all own shares of Swift. We collaborate in clearing and settlement, right? Canada has something similar, and Europe has similar things, so we do collaborate with the organization.
When you talk about data, it’s about a community. I mentioned crowdsourcing earlier. If I find something, I can notify you that there has been a payment announcement from Sony that is wrong, or from Bloomberg, or whatever. I think that is of value to you, and I think the cloud reduces the barrier to entry at some level. I think there is a real positive there.
Whether we all intentionally want to collaborate, that is another issue.