Ensuring ROI on Predictive Analytics Projects

George Firican
Feb 27, 2023


With the rapid growth in the number of data professionals (data scientists, analysts, and engineers) and the lines between data roles blurring, measuring and communicating the ROI of data teams is no easy feat. But given the large investment in this area, understanding that value is an existential question for the data industry. To understand the ROI of the data organization as a whole, we have to examine it through the lens of each function in the company.

Our guest today is Keith McCormick, an independent consultant, trainer, speaker and author. He has been designing and conducting analytics projects for over 25 years, and today he will be talking all about the ROI of predictive analytics projects.

You will want to hear this episode if you are interested in:

  • [00:12] Introduction to the guest speaker, Keith McCormick
  • [01:25] A fun fact or hobby about Keith
  • [03:28] Challenges and benefits of overestimating the ROI for analytics projects
  • [09:57] A sneak preview of the confusion matrix
  • [10:40] Keith’s confidence in his ROI for analytics projects
  • [13:03] Stages in a company when Keith is brought in
  • [17:47] Costs to be considered from the start
  • [20:37] When are teams calculating the benefits of a project?
  • [22:13] How easy is it to get data scientists, analytics leaders, and executives on the same page?
  • [25:47] Keith’s upcoming courses

Notable Quotes

  • The risk around the surprise as we get more into the conversation about the challenges and benefits of overestimating the ROI for analytics projects is so that you can just prioritize the projects available to you.
  • Maybe we put too much pressure on data scientists that they’ve always got to come up with some insight that’s worth money every time they’re looking at data, and we know that it doesn’t happen that way.
  • There has to be a prediction going on because otherwise, machine learning isn’t the right tool, and we all know that folks will haul out machine learning algorithms when they don’t necessarily need them.
  • If you go through the discipline of the confusion matrix and work your way down it, and you just sit down with the appropriate team members, you’d be surprised how much of this you can do in an hour or less than an hour.
  • If you have all these different codes, now you’ve got something you can tackle with a confusion matrix, which means you can think through the problem. Everybody on the team, including the non-data scientists, can follow what you’re talking about.
  • I always recommend that people do a partial rollout, and that’s the best way to do the kind of estimate for the costs that you do.
  • You can’t whiteboard a confusion matrix and know that a human will choose to ignore the prediction, but the reality is that will happen.
  • What I suggest that folks do is they take those experienced data engineers, data stewards, what have you, and put them part-time on the project, try to guesstimate fairly early on what that role is going to be, and then get somebody possibly even temporarily to cover some of their other duties that are easier to delegate than this.
  • With management, you have to be very judicious with their time, but you have to get on their calendars. So part of what comes with experience is knowing where in the lifecycle we will likely need them.
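
For readers who want to try the confusion-matrix exercise mentioned in the quotes above, here is a minimal Python sketch of one way to turn the four cells into a rough ROI figure. It is not Keith's exact method. The class and function names, the counts, the dollar values per cell, and the project cost are all hypothetical placeholders, and cell values are assumed to be measured relative to taking no action.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """How many cases fall into each cell of a binary confusion matrix."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative


@dataclass
class CellValues:
    """Assumed dollar value of one case in each cell (negative = cost)."""
    tp: float
    fp: float
    fn: float
    tn: float


def net_benefit(counts: ConfusionCounts, values: CellValues) -> float:
    """Multiply each cell's count by its per-case value and sum."""
    return (
        counts.tp * values.tp
        + counts.fp * values.fp
        + counts.fn * values.fn
        + counts.tn * values.tn
    )


def roi(benefit: float, project_cost: float) -> float:
    """Simple ROI: (benefit - cost) / cost."""
    return (benefit - project_cost) / project_cost


# Hypothetical numbers, for illustration only.
counts = ConfusionCounts(tp=80, fp=120, fn=20, tn=780)
values = CellValues(tp=400.0, fp=-50.0, fn=0.0, tn=0.0)
project_cost = 20_000.0

benefit = net_benefit(counts, values)
print(f"Net benefit: {benefit:,.0f}")
print(f"ROI: {roi(benefit, project_cost):.0%}")
```

The per-cell values are the part to work out with the team. As the quotes note, a whiteboard estimate cannot capture things like people ignoring the prediction, so a partial rollout is the better way to check these numbers.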

About Keith McCormick

Keith McCormick is an independent consultant, trainer, speaker and author. His consulting specializes in helping analytics leaders from all industries build and manage their data science teams. His training has reached thousands of individuals trying to learn statistics, machine learning and data science. He specializes in predictive models and segmentation analysis, including classification trees, neural nets, general linear models, cluster analysis, and association rules. He has been designing and conducting analytics projects for over 25 years.



George Firican

Data governance & BI professional, ranked among Top 5 Global Thought Leaders on Big Data, founder of LightsOnData.com and Co-Host of the Lights On Data Show.