Put Building Data Culture Ahead Of Buying Data Analytics
December 11, 2017 Jeffrey Burt
In his keynote at the recent AWS re:Invent conference, Amazon vice president and chief technology officer Werner Vogels said that the cloud had created an “egalitarian” computing environment where everyone has access to the same compute, storage, and analytics, and that the real differentiator for enterprises will be the data they generate, and more importantly, the value the enterprises derive from that data.
For Rob Thomas, general manager of IBM Analytics, data is the focus. The company is putting considerable muscle behind data analytics, machine learning, and what it calls more generally cognitive computing, much of it based on its Watson technology. That includes the Watson Data Platform and its Data Catalog, Data Refinery and Analytics Engine.
But when it comes to data analytics, Thomas takes what’s been called an “attitude before aptitude” approach, with the idea being that enterprises need to create a “culture of data” before they can take full advantage of analytics. They need to have in place a belief that data and facts are what’s important when making business decisions rather than instinct, beliefs and what’s been done in the past. And it’s an approach that’s got to come from the top and become part of how the business operates. Here Thomas talks with The Next Platform about the importance of a data culture in enterprises.
Jeffrey Burt: Culture is something we all participate in, but it can be difficult to nail down in terms of what it means. I think it is safe to say that the hyperscalers, who measure everything about their infrastructure and their applications, have pushed the state of the art in creating what could be called a data culture, but what do you mean by a data culture?
Rob Thomas: The fact that there is not a single recipe is what makes it difficult for most companies. It brings to mind a crude sports analogy: Every football team runs a different offense, and just because one team can run that offense doesn’t mean another team can run that offense because it depends on what players you’ve got. That’s true for a data culture.
For some it’s about, ‘Do we have agile methods for how we manage our data? Are we making it available to everyone in our organization to drive better decision making?’ Pretty simple in concept, but harder to do. For others it’s about, ‘We know how to do all that.’ This is about applying the next level of data science – using data science to drive automation, to drive predictions and to really change how we interact with customers. For a third company it could be, ‘We’ve got our arms around both of those, and this is about, how do we create celebrity experiences where everything is personalized?’ It’s about using artificial intelligence applications.
So it is different for every company. When I’ve been talking with clients, they’ll ask me, ‘OK, what should we do?’ and my response is, ‘It’s dangerous for me to tell you that because I don’t know exactly where you are, so talk to me about where you are because the answer for you is probably different from what it is for somebody else.’ That’s why people struggle with this concept.
Jeffrey Burt: We understand that something called a culture of data might be important to all enterprises. How can enterprises who are not hyperscalers, who have not been steeped in analytics for a decade or almost two, develop it?
Rob Thomas: Given how long many companies have been in the business, we would probably be shocked if we knew how much decision making is still based on instinct and feel. For a long time that was because that was the only option available to anybody.
Where we are now, the cost of computing has plummeted, the cost of acquiring [and] building algorithms has plummeted, the cost of storage has plummeted, so all of this is available and any company can really start to use facts and data to make decisions, but that doesn’t mean most are. In most cases, many are not even today. I think it’s this first principle idea of, all that matters is the facts, and the facts are in the data. Companies have to decide that that’s important, and their culture has to stand for that being important, which is data trumps opinions. Even that starting point has to come from the top to really change the culture.
I didn’t see that even in our own organization for a long time, and everything was gut instinct and opinion. It’s amazing when the leader of an organization starts saying, ‘Don’t bring me your opinions, show me the data first,’ that has a big impact on the culture.
Jeffrey Burt: What do you see happening in enterprises now as they develop what you are calling a data culture?
Rob Thomas: There are three different stages that I see right now. Stage one is, ‘We’re doing what we’ve always done. We have disparate IT systems, we have disparate data sources, we’re doing the best we absolutely can to optimize the costs of this. We’re bringing in new tools. We might be bringing in Hadoop or something else. We’re trying to do what we do today and do it a little better.’ Better means faster and at a lower cost. That’s probably 40 percent of the organizations I interact with.
Another 40 percent are saying that business as usual is not good enough. ‘We are going to fundamentally change what we are doing, which means we are going to be data-first, so we’re building a data science practice, we’re building governance into our data so we can make it available to everybody in the organization.’ Those are the ones that are really in the early stages of forming a data culture.
A phase three organization says, ‘We are going to be AI-led or machine learning-led or whatever you want to call it. Everything we do is going to be based on data analytics and [that’s] how we’re going to make our decisions. If our legacy systems aren’t working out for us, we’re just going to start fresh.’ That’s much more on the aggressive end.
Jeffrey Burt: When looking at these three types, what is IBM’s role in helping them get to this culture of data?
Rob Thomas: I’ve got a slide I use with clients. We call it a ‘maturity curve’ for analytics and data. If I put that slide up in a meeting, that leads to an hour of discussion. Number one, our role is to put awareness on it. What’s important is that organizations are at different stages. Can you even see where you are? When you look at that slide, there are immediate debates where there’s ten different views from that one client of where they sit. Part of our role is the awareness and facilitating that dialogue.
Second is, inevitably, when someone plots where they sit, it’s, ‘How do we get started?’ Like anything else, there’s a bunch of places to get started. We try to drive that point of view to where that right answer is. I’ll use an analog example: Think of a library. Your job is to store a bunch of books. If you take over a library, and there’s a million books in that library, is your first step extracting knowledge out of the books and then go act on it? Probably not. You probably don’t even know what books you have. Step one is, you have to understand what books you have. Libraries do that by building a card catalog. They organize it, they know every book, they know where it is, they know the purpose of that book. That’s what I encourage organizations to do. You’ve got to start with a catalog. We build data catalogs.
Most organizations do not have one. Financial services, specifically the insurance companies, are more advanced than others on the topic. We’ve worked with almost every financial institution over the past few years doing that.
So, first, you have to create awareness. And second, you have to have an approach that’s proven. If a client starts by building a catalog, everything else will flourish on top of that, because it gives them the basis. The third step is to come up with the use cases. You know how to optimize the customer interactions, you know how to optimize risk management. All those use cases typically [require] the catalog, which is why I like to start there. Many people jump right to the use cases, but you do get into a garbage-in, garbage-out scenario a good percentage of the time.
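To make the card-catalog analogy concrete, here is a minimal sketch in Python of what a data catalog records: what each data set is, where it lives, and who owns it. It is an illustration only, not IBM's Data Catalog; the class names, fields, locations, and sample entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """One 'card' in the catalog: what the data set is and where it lives."""
    name: str
    location: str      # hypothetical path or connection string
    owner: str
    description: str
    tags: frozenset = field(default_factory=frozenset)


class DataCatalog:
    """Toy in-memory catalog; every name here is an illustrative assumption."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse duplicates so each data set has exactly one card.
        if entry.name in self._entries:
            raise ValueError(f"duplicate data set name: {entry.name}")
        self._entries[entry.name] = entry

    def find(self, keyword):
        """Return sorted names of data sets whose name, description,
        or tags mention the keyword (case-insensitive)."""
        kw = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if kw in e.name.lower()
            or kw in e.description.lower()
            or kw in {t.lower() for t in e.tags}
        )


# Made-up sample entries, for illustration only.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="claims_2017",
    location="warehouse://finance/claims_2017",
    owner="risk-team",
    description="Insurance claims filed in 2017",
    tags=frozenset({"insurance", "claims"}),
))
catalog.register(CatalogEntry(
    name="call_center_transcripts",
    location="files://support/transcripts",
    owner="support-team",
    description="Text of customer support calls",
    tags=frozenset({"customer", "text"}),
))

print(catalog.find("customer"))
```

Even a toy like this shows why use cases depend on the catalog: a question such as "which data sets mention customers?" can only be answered once every data set has been described in one place.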
Jeffrey Burt: You have talked about success begetting culture. What do you mean by that, and how does this work?
Rob Thomas: This is such an insurmountable thing for a lot of organizations, where, even if they’re buying into everything I say and everything I’ve just told you here, it’s kind of so overwhelming, they can’t even imagine not only how to get started, but how to sustain it. That’s why we need to find a way to get them early successes. You’re talking about a one-year, two-year, multi-year project, and I say, ‘Look, I will fundamentally change something in your business if you give me 60 days.’
That can be simply building a catalog for one source of data and giving access to 50 people in the organization who never had access to it, and it will make their jobs better. Or, in that same 60 days, if you’re really hung up on that use case, we can analyze all the texts in one of your call centers and we can [improve] resolution times for customer problems that are calling in. If we just present it as this big thing the way I described it, it’s so overwhelming that the culture never happens, that it never develops, so I’ve got to create successes quickly so that will reinforce that culture.
Jeffrey Burt: Does the culture need to be in place before an enterprise can decide what technology or what vendor to go with, or whether to stay on-premises or go into the cloud?
Rob Thomas: I don’t think so. To some extent, anything is better than doing nothing. That’s a little dangerous to say because I’ve seen a lot of people do the wrong things, but I don’t think it has to be in place to go down the path you describe.
What absolutely has to be in place is you’ve got to have someone senior enough who really cares and who really wants this to happen, because there is going to be countless pitfalls, there’s going to be countless people who are against it or are deterrents or who are going to fight it. If you don’t have someone pounding the tables saying, ‘We’re going to do this,’ it’s going to falter, inevitably. That is a more critical element of success than, how quickly do I choose a platform or a deployment option.
I know it sounds basic – ‘Yeah, Rob, I know you need corporate sponsorship’ – but what it means to me is, a lot of CEOs will look at this and say, ‘No, this is the CIO’s job. That’s just data, that’s infrastructure, that’s the CIO’s job.’ That’s no longer the case. Every company is a tech company, so it’s your job if you’re the CEO.
Jeffrey Burt: Machine learning plays a critical role in data analytics. What is its role in helping to create this data culture in an enterprise?
Rob Thomas: It’s certainly the more advanced end of what we’ve talked about. What’s amazing about the whole deal of machine learning is it’s become dramatically more accessible over the last five years, partly because of cheaper storage, partly because of cheaper compute, but the biggest factor is, how much is available in open source and in open communities. The number of libraries that you can access anywhere for no charge is obviously unsurpassed at this point. I encourage clients to pick up something that already exists. What we did with our data science platform is we based it on open source. Our point was, we might as well take advantage of everything that’s out there. So pick up one of those libraries that’s there, add in some of your data, use some of your data to train that to build a model. Your access to success is so much faster because you’re not trying to create your own algorithm or create your own model. You’re picking up something that’s already there, and then you’re training it with your own data. That’s why ML is finally taking off.
I’ll give you an example. Here is one of the most mundane IT tasks of all time. It’s a little deep in the weeds, but it will illustrate the point. We’ve had a master data business forever. You go back to the library analogy, and if you’ve got ten copies of one book, how are you going to know which copy is which? That’s what master data does – it helps you understand what your different entities are. There are people in every organization whose job is as a data steward, whether it’s matching data or ‘this one goes here, this one goes here.’ It’s extremely manual. We’ve built machine learning into our metadata product [and] into our master data product to automate that process of data matching. We can scan your data [and] we can do the matching automatically. It totally changes the roles of those data stewards. No one has to do that, so they can go do something that’s more valuable up the stack. You won’t read about that in the newspaper because it’s not an exciting use case, but it fundamentally improves people’s jobs and what they’re working on in their organization.
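The data-matching chore Thomas describes can be sketched in a few lines of Python. This is not IBM's method: the interview says its products use machine learning, whereas the sketch below substitutes a plain string-similarity rule from Python's standard `difflib` module to flag records that probably refer to the same entity. The function names, the 0.85 threshold, and the sample records are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity of two names after normalization, between 0.0 and 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_matches(records, threshold=0.85):
    """Return pairs of record ids whose names look like the same entity.

    `records` maps a record id to a name; `threshold` is an assumed
    cutoff that a real system would tune or learn from labeled data.
    """
    pairs = []
    for (id_a, name_a), (id_b, name_b) in combinations(records.items(), 2):
        if similarity(name_a, name_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Made-up customer records, for illustration only.
records = {1: "Acme Corp.", 2: "ACME Corp", 3: "Globex Inc"}
print(candidate_matches(records))
```

A fixed threshold like this still leaves borderline pairs for a data steward to review; a learned model's job is to shrink that manual pile, which is the shift in the steward's role described above.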
Jeffrey Burt: You had mentioned earlier that you had seen the creation of a data culture take hold in IBM itself. Can you describe what happened inside of Big Blue and the effects that it has had on the business?
Rob Thomas: I’ll just take my division. Obviously, we should be using a lot of analytics. Most of our data was sitting in databases, in data warehouses, and spreadsheets, the normal stuff, and we launched a big project about a year ago that said, ‘Look, we need to know it to run our business.’ The first question was, ‘How do we interact with that,’ because people aren’t going to use that if you require them to use something that they don’t want to use, so we actually built it into Slack, because we use Slack for communication, sort of as a collaboration platform. Now if you go into our Slack, I can just go into an intelligent bot that we built and I can say, ‘Hey, do we have any big customer service issues growing right now?’ And it can respond instantly and say, ‘We’ve got service issues with these five clients.’
Or I can go by product and say, ‘How are we doing in DB2?’ and we get an immediate response. We’ve taken what would have required me calling up the IT team and somebody cobbling together the data and pasting it into a PowerPoint or whatever, and it’s been integrated into my daily work process. All I have to do is ask a question in Slack and get back the answer.
It’s all based on the ML that we’ve applied in our own business. What used to be a meeting is now just a query, which is literally just a natural-language question that I ask in Slack.
Jeffrey Burt: Given that enterprises are at different stages on the path to building a data culture, how fast do you expect this culture of data to take hold in these companies? What do you expect to see in the next two, three or four years?
Rob Thomas: This is the first year that I’ve seen real progress and strides. The real onset of big data – if you want to use that term – just started four years ago [or] five years ago, and that was all about the first category. That was about, ‘Oh, we’re going to bring in Hadoop,’ or, ‘Oh, we’re going to bring in another NoSQL platform.’ That was just a different way of doing what you’re already doing today. That wasn’t really evolving or adopting a real serious data culture.
This is really the first year that I’ve seen evidence of the latter, which is applied data science, which is applied ML. It’s one of those things that took a long time to get started, but now it’s finally started. If I go back to January or February of this year, that’s where I would plant the flag and say it really started. It takes longer than you think to get started and you have a little bit of a germination period and then things go incredibly fast.
The second half of next year and into the following year is where we probably get warp speed.
I talked more about ML than AI and the reason I did that is because I know AI gets a lot of hype because it should; there’s a lot of exotic stuff with AI. My point on ML is that I think the greatest value in this next period of time is automating the mundane tasks. It’s not exciting, but that is where the value is, kind of in every iteration of technology through the years. Obviously, anything that’s built in ML can eventually be evolved to be AI APIs, natural language, image recognition, that type of stuff, but there’s a lot of value in optimization and automation with ML.
Jeffrey Burt: In the past, you have also talked about doing data science faster. What do you mean?
Rob Thomas: There’s been a traditional way of data science for the last 40 years. Probably the most classic data scientists that nobody talks about, because they go by a different name, are actuaries. Their classical way of working is: Proprietary language, proprietary tools, hoard everything, don’t share anything. That’s obviously being totally unbundled by the likes of open source, with data science becoming much more collaborative. That’s a big culture change for organizations, where it’s typically done in a silo.
The whole point of doing data science faster is that when it becomes a team sport and you have real collaboration and you have sharing, obviously your ability to go faster increases dramatically, by orders of magnitude. While it seems like a matter of just choosing an open tool in an open collaboration platform and all that, it really all comes back to culture. You’ve got to have an organization willing to choose to change the way that they’ve done things.