big data

  • June 19, 2015
    If you read a lot like me, you might notice that almost daily there’s a new study contradicting some earlier research. Something causes cancer, then it’s good for you. You know the drill. What’s going on here? Do we simply not know what our research is saying? Can nobody correctly interpret the data? None of this would mean much to CRM except that, with the advance of big data and analytics, the front office, i.e. the relationship between vendors and customers, is coming to resemble many other endeavors that rely on data analysis. Here is my take on all of this.


    Very often the research we get in the popular press and in business interactions represents the findings of correlation studies. Simply put, correlation tells how strongly two events are related to one another, and it takes some sophistication to understand.

    We can think of correlation in terms of probability, but the two are not identical and we need to understand what each means. A correlation coefficient runs from -1 (two things move in opposite directions) through 0 (no relationship at all) to +1 (they move in lockstep). Probability works on a different scale. A coin toss has a 50/50 chance of coming up heads or tails (.5 probability), so for a two-way outcome 50 percent is exactly neutral. A 30 or 40 percent chance of something happening is still not zero, which is why we still get rain on days when there’s less than a 50 percent chance of it.

    So, a probability of greater than 50 percent is usually what we’re looking for, and the higher the number the stronger the signal. A 90 percent probability is interesting; 60 or 70 percent, not so much, for reasons that should be obvious by now. Still, even a 90 percent probability is not a sure thing and, to use the weather analogy, we sometimes see sunny days when rain had a 90 percent chance of occurring.

    In business, we’re beginning to use correlation a lot, but that disappoints many because correlation alone won’t tell us the other important part of the story: causation.


    Causation is the reason behind the correlation. It’s the data that, added to the correlation data, will provide the necessary information on which to make a decision. So, for example, a sales person evaluating prospects might look for high correlation between a prospect’s need profile and the vendor’s solution. That’s a good start but it’s missing something very important. It says nothing about the prospect’s motivation which might only be found through more traditional means like making a sales call.

    What? Correlation isn’t enough? Consider this: at the correlation level, a prospect in need of a solution looks just the same as one that just bought something from your competitor. Causation in this case is another word for a buy signal, and if you look at buy signals and not just correlation, a customer that just bought will look very different in this one dimension from one still looking.
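    The point can be illustrated with a small Python sketch; the field names and the 0.8 threshold are invented for illustration, not a real scoring model:

```python
# Two prospects with identical fit scores (the correlation-style signal).
# Only the buy signal (the causation-style signal) tells them apart.
prospects = [
    {"name": "Acme",   "fit_score": 0.92, "recently_bought_elsewhere": False},
    {"name": "Globex", "fit_score": 0.92, "recently_bought_elsewhere": True},
]

def worth_a_call(p):
    # High fit is necessary but not sufficient; the buy signal decides.
    return p["fit_score"] > 0.8 and not p["recently_bought_elsewhere"]

for p in prospects:
    print(p["name"], "call" if worth_a_call(p) else "skip")
```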

    In sales and marketing analytics we’re mostly focused on correlation and that means we’re far from foolproof in our predictions. I am not trying to get on anyone’s case but the fact that we’re so vested in correlation simply tells us where we are in the lifecycle of analytics as applied to CRM—there’s more work to do.

    Another way to look at the situation is through the lens of qualitative vs. quantitative data. So far I’ve been focused on quantitative analysis, like getting those 90 percent signals. Very often when we’re dealing with quantitative findings we’re looking at correlation data. Finding causation requires more sophistication, and it is often qualitative findings that tip the balance. Interestingly, you can develop quantitative findings from qualitative findings, but it takes a little more work. You need to ask questions differently and you might need to score the answers to get a quantifiable result.
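    As a toy illustration of scoring qualitative answers to get a quantifiable result (the scale and the responses below are made up):

```python
# Map free-form answer categories onto a simple numeric scale.
SCALE = {"love it": 2, "like it": 1, "neutral": 0, "dislike it": -1}

answers = ["love it", "like it", "neutral", "love it", "dislike it"]

scores = [SCALE[a] for a in answers]
avg = sum(scores) / len(scores)
print(f"average sentiment score: {avg:.2f}")
```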

    Finding causation starts with asking open-ended questions. In my book, Solve for the Customer, I use the example of creating a new candy bar. The quantitative approach might ask about preferences like do you like coconut, prefer milk chocolate or dark, peanuts, almonds, pistachios, nougat—the possibilities are almost limitless. At the end of your research you might have a very detailed understanding of how much your target audience likes various components of a candy bar but you wouldn’t be any closer to making something that would sell.

    The qualitative approach is less sexy in many minds because it implies that you won’t get enough information to work with, but consider this. In designing a candy bar, it would benefit you a lot if you also asked open-ended questions about what people like most about them or their favorite memories involving candy bars, or how they fit into a person’s day. Those questions are almost limitless too and the answers would surprise you and possibly tell you a lot about unmet needs in a crowded market.

    If you don’t believe that’s useful, consider the story of Howard Moskowitz. Back in the day there were two competing makers of jarred spaghetti sauce, Ragu and Prego. Prego was the perennial number two in the market and wanted to take the lead, so it hired market researcher Moskowitz to figure out how. At the time there were also only two kinds of sauce on the market, plain and spicy. That’s it, just two. Moskowitz hired chefs to make what was ultimately 45 kinds of sauce, many with chunks of things in them like tomato, meat, and other veggies.

    Moskowitz discovered that about one third of the American public wanted chunky sauce but incredibly, there was none on the market. Previous research was concentrated on getting quantitative answers to questions about existing choices, which can be boiled down to how do you like our sauce? There were no open-ended questions about what caused people to like spaghetti or Italian food. The Moskowitz taste tests provided the open-ended questioning leading to discovery of a new market that’s been worth billions ever since.

    My point in all this is that you need both quantitative and qualitative information to arrive at correlation and causation if you hope to understand customers. If you’ve embarked on an analytics journey that’s great but keep looking and formulating your strategy. Buying a single product is definitely not the end of the journey but a beginning. If you’re a vendor, don’t make the mistake of thinking that your single product is the final answer to market need. It’s a stepping stone and you need to position yourself accordingly.


    Published: 9 years ago

    Are we collecting enough data?  It seems like a weird question given the glut of it at most companies, and perhaps it is.  Perhaps a better question is: are we collecting the right data?  We could get into a long philosophical discussion of just what the right data is, but that would only happen if we made the mistake of thinking all data is the same.

    True, all data ultimately gets represented as 0’s and 1’s in the database, so at least at that level all data is remarkably similar.  But the comparison stops there.  All protons are the same too, but how many are associated in a nucleus (along with the companion neutrons) is the difference between lead and gold.

    Of course the way through this dilemma is to first ask about the objective of the data collection.  Most business leaders will tell you that data collection is a prerequisite of knowledge creation and that knowledge is the objective because employees make business decisions based on knowledge and not data or even information.  This is especially true in marketing and sales.

    Cultivating knowledge requires us to combine data with other data and information to ultimately result in the thing we need in business, actionable knowledge.  To do that it’s necessary to capture a wide diversity of data that includes the stuff that comes from our social media outposts, CRM, and ERP systems.  But too often we stop there only to discover it’s not enough.

    Sufficiency comes from completeness and relevance, and these ideas should be explained.  Knowledge has a context.  If you want to develop sales knowledge, for example, your goal should be developing leads, which is another way of saying knowledge about who has a business problem to be solved and the authority and budget to cause a solution to be created.  Some would also say that the person represented as a lead should also have awareness of the problem, but the counter to this is: that’s what sales people do.

    Relevance means understanding all the things related to completeness plus having a suitable solution.  It does little good to know that someone has great credit and is approved for a car loan if you don’t sell cars.

    So sales knowledge relates to all the data and information that we cultivate into knowledge, plus completeness and relevance.  But the data and information that get you all the way to knowledge come from different sources, which brings me to the point of my questions — are we collecting enough of the right data?

    We collect a lot of our own data, of course, and we may supplement it from data providers both to cleanse our collections and to add new data profiles and flesh out existing ones.  That’s only part of the story, though.  It’s like having a very specialized and up-to-date version of the phone book.  Filtering can tell us about attitudes, needs, and other valuable things, but by itself this data has missing pieces.

    Getting all the way to knowledge requires different approaches than simply filtering the social stream or completing profiles.  After all, buyers often don’t simply announce a desire to buy something.  Instead they may do things like make pronouncements, issue press releases, or introduce reports.  Third parties might also supply information that, when married to conventional data, produces the knowledge that, under the circumstances, a person or business will need to act in a certain way.  These are the things that ultimately drive completeness and relevance.

    So, a good question to ask about the whole lead generation process in any company is this:  How complete and relevant are the leads that marketing gives to sales?  An even better question is, is that intentional?

    Sales has a role to play in qualifying deals, especially when large sums or novel products and processes are involved.  Frankly, marketing can only go so far in developing a lead.  But too often we treat all “leads” alike, as if they were protons or data.  A company’s attitude toward the need for completeness ultimately drives the process of lead generation, but gauging completeness is too often considered to be part of the sales process, and here’s the rub.

    Why spend relatively expensive sales rep time getting to completeness if there are better, faster, and cheaper ways to do this?  You might never be able to get every lead to 100 percent completeness, and making that attempt might cost you some business.  Nonetheless, being conscious of completeness as a goal will alter some of your marketing process and possibly even cut down the number of leads marketing hands to sales.  So what?  Better qualified leads are more worthy of your sales people’s time and resources; so a reduction in quantity, as long as it is accompanied by better quality, would be a good thing.
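    One simple way to make being conscious of completeness operational is a score over required fields. A minimal Python sketch; the required fields below are hypothetical examples, not a standard:

```python
# Fields a lead should carry before marketing hands it to sales (invented).
REQUIRED = ["contact", "business_problem", "budget", "authority", "timeline"]

def completeness(lead):
    """Fraction of required fields that are filled in (0.0 to 1.0)."""
    filled = sum(1 for f in REQUIRED if lead.get(f))
    return filled / len(REQUIRED)

lead = {"contact": "pat@example.com", "business_problem": "churn", "budget": "50k"}
score = completeness(lead)
print(f"completeness: {score:.0%}")  # 3 of 5 fields filled
```

    A threshold on a score like this is one way to cut the quantity of leads while raising their quality.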

    As always, the devil is in the details — how do you get there?  My suggestion is to capture more data or at least some different data.  There are wonderful tools on the market that spider the Web looking for the reports, press releases, and news stories, and ferret out the information that provides the completeness we seek.  At least some of this data and information might need to be scored and fed through an analytics engine so once again simply collecting this data won’t get you to Nirvana.

    But we should all be aware that the bar is being raised for this next level of data collection, and we must understand the importance of completeness and relevance.  It’s a competitive world and getting to completeness before your competition might be the new black.

    Published: 10 years ago

    OK, the title’s a cheap paraphrase of the T.S. Eliot book that inspired “Cats.”  You have to start somewhere and that’s as good a place as any.  But stay with me, this goes places.  A big group of cats is called a clowder.  What if we could access a clowder of big data?

    Big Data has been taking up a big part of my conscious life lately what with all the analytics vendors out there and so many companies trying to figure it all out.  There are at least three issues that converge when you discuss big data, two that we know and one that we don’t pay a lot of attention to just yet.

    The two devils we know are physical storage and analytics, and many people stop there.  Storage has largely been taken care of with dirt-cheap spindles, the cloud, and other advances.  Analytics is an old story that’s gotten better year in and year out.  Huskier processors, bigger spindles, and in-memory databases have made real-time slicing and dicing easy.  The other day I spoke with Alan Trefler, CEO of Pegasystems, who told me that software and hardware have advanced in equal measure over the years, to the point that today we can do quite a bit of analysis in very little time.

    Batch pattern analysis gave us the input we needed to get predictive, to assign probabilities to situations based on prior experience, and that has given us the ability to stack rank ideas and offers, and generally to be able to say what the next best thing to do or offer is in a particular situation — not always, but often enough to matter.  It gives us the ability to (sort of) be in the moment with customers.
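    At its core, a next-best-action recommendation of this kind is just a stack ranking of offers by predicted probability. A minimal Python sketch (the offers and the probabilities are invented):

```python
# Candidate offers with predicted acceptance probabilities from some model.
offers = [
    ("upgrade plan",     0.42),
    ("renewal discount", 0.71),
    ("add-on module",    0.18),
]

# Highest predicted probability first: the "next best action".
ranked = sorted(offers, key=lambda o: o[1], reverse=True)
next_best = ranked[0][0]
print("next best action:", next_best)
```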

    Predictive modeling has been a great way to enable companies to better understand customers and their needs.  Based on company-gathered and maintained big data we can confidently deploy systems that suggest to employees what to do next.  In case after case that I listened to in Orlando during Pega’s user meeting, Pegaworld, I heard of huge improvements in business process results, in part due to leveraging analytics and big data, so good for them.

    But please pay attention!  In these last few paragraphs we’ve traversed the long path from data to information.  Did you catch it?  Maybe not.  Analytics turns data into information, and people (usually) turn information into knowledge.  Our systems serve up information, but our people, our employees, apply that information in customer-facing situations to make decisions that achieve desired outcomes.  There’s nothing more important than good decisions in business today, which is why we are fixated on big data and analytics.  But we can’t stop there.

    It’s time to think bigger.  What would it be like if we could amass more data than a single company typically captures for its analysis?  Naturally, this assumes all the data is relevant to a set of business processes.  It’s very hard to do something like this.  Some vendors I’ve spoken with about this say that the data lives in many places and consolidating it is not necessarily quick or easy.

    One place where it might be easier to accomplish this kind of Major Big Data consolidation — can we start calling it Major Knowledge? — might be in SaaS applications.  Of course not just any software as a service but those systems that operate on multitenant storage might have an important leg up.  A company like Salesforce, NetSuite, or Xactly might be a good place to look.  In a recent conversation I had with Chris Cabrera, CEO of Xactly, I heard they were thinking about what that would look like.

    You may recall that Xactly focuses on incentive compensation management.  Xactly collects data that focuses on sales people, deals, credit (for partial deals), compensation, and much more.  If properly scrubbed so that all identifying data is removed, this database would be capable of revealing all the best practices information in sales by examining the way that people are compensated, no small accomplishment.  In the right hands, that information can become powerful knowledge.  I know some of you are saying why not look at other data like revenue or stack ranking the reps or any of a thousand things.

    The answer is simple.  Other data won’t give you the answer.  Incentive compensation is an art at the moment (unless you already use a system like Xactly) and relying on a single data set might only reinforce bad practice.  Capturing data from a wide body of knowledgeable sources is, after all, one of the hallmarks of crowd sourcing.

    People do their jobs and they get compensated and someone is the top rep and someone else is at the bottom.  But these rankings don’t say anything about whether the incentive compensation was really an incentive, whether or not it caused people to modify behavior.

    So what, you might add.  But wouldn’t it be nice to know if the incentive is both effective and in the right proportion?  What do others do?  Effectively, what’s the best practice?  Those are questions you can’t answer if all you’re looking at is your own data.

    Sales comp is a big hairy issue.  It’s estimated that in the United States alone companies spend $800 billion per year supporting sales based incentive compensation.

    It’s messy and full of 25% credit for this deal and 65% credit for that; it also has accelerators and clawbacks, so how do you get it right?  For example, Xactly found that 75% of its customers credit five or fewer individuals on a deal, but some outliers credited up to 161 individuals on a single deal.  Splitting a deal 161 ways can hardly be motivating.  Many are the stories of sales people who left a job because they felt under-appreciated (i.e., under-compensated) or just thought they could get a better deal elsewhere.  Having access to information based on such a huge volume of data might give every sales manager and HR or finance department the concrete information they need to do a job they are largely guessing at right now.
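    To show how quickly split credits and accelerators complicate the math, here is a deliberately simplified Python sketch; the rates and the 1.5x-above-quota accelerator rule are invented, and real comp plans are far messier:

```python
def commission(deal_value, credit_pct, base_rate=0.05, quota_attainment=1.0):
    """Commission on a shared deal; the rate is accelerated 1.5x above quota."""
    rate = base_rate * (1.5 if quota_attainment > 1.0 else 1.0)
    return deal_value * credit_pct * rate

# Two reps share a $100,000 deal with 65% and 25% credit splits.
on_quota    = commission(100_000, 0.65)
accelerated = commission(100_000, 0.25, quota_attainment=1.2)
print(f"on-quota rep:    ${on_quota:,.2f}")
print(f"accelerated rep: ${accelerated:,.2f}")
```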

    Sales compensation is not the only area worth exploring with this approach, either.  The same techniques can be applied to every job category, and if you’ve looked lately you will have seen that many jobs are coming to include incentive compensation as part of the package.  In fact, an estimated 84% of companies today are using some kind of incentive compensation outside the sales function.  If you ask me, there’s never been a greater need for the kind of data resource pooling and analysis I am proposing here, and it’s easily within our reach.  We just have to think a bit differently about the challenge at hand — how to attract and retain the best talent — because business today is less about your widgets and more about the quality of the people you have representing them.

    Published: 11 years ago

    It is very hard to pinpoint a disruptive innovation and the moment it hits the market and I have said this many times before.  It’s easy to know that you use this or that technology today and couldn’t imagine living or working without it but, really, can’t you imagine the time before you started using this stuff?  Social media might be a good example since most of today’s users still kind of remember what life was like before.  Today is one of those days, I think.

    Today Ayasdi came out of start-up stealth-mode and announced itself and you can see an article about it here in the New York Times.

    So what is it in a nutshell?  It’s nothing less than the first really new and different way to analyze big data since we started collecting it.  Ayasdi uses something called topological data analysis and here’s one place where it’s different.  Rather than type a query or ask for a report from a big data set, Ayasdi just looks at the whole data set and tells you where the interesting clusters of data are — clusters in places you may not have thought about.

    So that means you no longer have to more or less know what you are looking for to use analytics, you simply need to know that you want to understand the interesting clusters.  That’s a disruptive moment if you ask me — presuming it works as well as the early hype says it does.  To me this sounds more like advanced data mining than business intelligence but I am not an analytics guru so this is simply speculation.
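    Ayasdi’s topological data analysis is proprietary and far more sophisticated, but the spirit of it, surfacing clusters without first posing a query, can be loosely illustrated with a naive one-dimensional grouping by gaps (this is only an analogy, not Ayasdi’s method):

```python
def find_clusters(values, max_gap=1.0):
    """Group sorted 1-D values into clusters separated by gaps > max_gap."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)   # close enough: same cluster
        else:
            clusters.append([v])     # big gap: start a new cluster
    return clusters

data = [1.0, 1.2, 1.1, 5.0, 5.3, 9.8, 10.1]
print(find_clusters(data))  # three clusters emerge without any query
```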

    So what’s it good for?  Well, if you’ve collected a lot of data about a molecule you think might have beneficial pharmaceutical properties, rather than performing a lot of screening tests, you might first examine the data topology and then investigate where the data says there are interesting relationships.  And, yes, substitute customer for molecule in the above and more interesting things happen.

    As with any disruption, it’s hard to think of what the world will be like in the aftermath, but if this works as advertised a few years hence we might all be scratching our heads trying to recall what life was like before.

    Published: 11 years ago

    Everybody has a year-end synopsis these days and it’s fun to see what each person deemed important.  Sometimes you wonder if you lived through the same experiences but it’s a good thing to recall everything one more time and maybe reconsider how you’ll remember each.  Here’s my synopsis which is no more or less valid than anyone else’s.

    Marketing’s resurgence might be the most interesting development of ’12 for several reasons.  First, the switch to marketing from other areas of emphasis (like service) shows that many people in the CRM universe feel that the economy is not only healing but returning to form.  In the last few years social and service, and often the two together, were the CRM market drivers, but marketing’s new vigor suggests to me that next year will see business accelerate.  Maybe that won’t take us all the way back to 2008, but it will be an improvement.

    Also, marketing’s renaissance comes via a social salient, especially in using analytics to better understand and segment markets.  The analytics push tells me that vendors need ultra-low-cost ways to get to their customers because the economy is still weak and no one wants to hire people, so they’re going for automation and software.  That’s just the new reality and I hate to be the one bringing the news.  Many markets are price driven — as opposed to quality or service driven — and companies are trying to give customers what they want.

    And speaking of price driven and automation, there’s been a nice uptick in the number of vendors offering software robots that can at least triage a service call.  That includes VirtuOz, a CRM Idol finalist and personal favorite.

    Big Data hit CRM through the link to analytics and marketing and companies like Dun and Bradstreet, Lattice Engines and InsideView are all taking a cut at this important space.  Another one worth checking out is Awareness, another CRM Idol finalist.  They do cool stuff in applying analytics to the big data pile captured with social media.  In all, social and analytics have shown us that there’s more to social marketing than sentiment analysis which can only be good for the future of the market.

    If marketing is becoming automated and socialized, a similar thing is happening in human resources.  Many an HR software vendor has made the leap to the cloud and also to social.  The two will radically transform HR from a back office preserve to something much more front office in its orientation.  HR is rapidly becoming a specialized case of the social front office application, with important contributions from Jobscience, Vana, and lots more.

    Also, despite what Gartner said in its recent gamification report, I think the future is largely positive in that market.  The major analyst firms put out reports that spell doom when it becomes clear that an early market has gotten frothy and no one in their right minds can reasonably expect the new thing to live up to all the, well, hype.

    But the good news about gamification is that it is reaching its adolescence, a time when some of its early adopters will harness it and make it successful.  So the good news I see is that the vendors and customers who do it right will be fine and it will be clear who has the goods next year.

    Then there’s mobile, mobile, mobile or browser apps, native apps and always connected native apps.  Making mobile work this year was the result of a collaboration of infrastructure vendors and people who make the applications.  I have noticed recently that wireless vendors are getting aggressive about offering tablet packages for only ten bucks a month to users who subscribe for other devices.  Ten bucks is important as it represents a manageable fee so I look for mobile adoption to accelerate now that all the pieces seem to be in place.

    Mobile infrastructure comes at the right time also because numerous vendors have put significant development resources into moving their applications to the tablet.  HTML5 is robust and popular, but so are new CRM applications that run natively, adopting all of the pinches and swipes that people like about tablets.  Salesforce has a decent solution in Touch and I think we’ll see more vendors produce “develop once, deploy on many devices” solution sets in the year ahead.  Over the last couple of years we’ve watched the early stages of PC and laptop sales tanking and the hockey-stickomatic rise of the tablet and the handheld, and next year will be the time when mobile puts its foot down on the accelerator.

    With mobile’s arrival as a more or less equal in the platform wars we will be witnessing the first true global platform that I have been talking about.  A global platform means adding millions of new users and customers to the ranks all at once—ok not ALL, all at once but enough to make you notice.  I have a feeling that while a significant portion of those new users will have a good grasp of English, companies that offer bi-lingual interfaces will be the early leaders.  The first step will, of course, be to analyze where your traffic comes from and then maybe to pilot a few pages.  All this may suggest an opportunity for translation services short term.

    But that’s next year.  For now, thanks for continuing to read this space and please come back in 2013.

    Published: 11 years ago