The Dashboard Effect

Don't Drown in a Waterfall!

October 13, 2022 Brick Thompson, Jon Thompson, Caleb Ochs Episode 46


In this episode, Brick and Caleb discuss the risks of trying to design and build everything you think you might ever need in a data warehouse in one single project. A faster, less expensive approach with a higher success rate is to start with a smaller, high-value scope and then add to it over time as reporting needs present themselves. This is the classic "waterfall" versus "agile" problem.

Blue Margin helps private equity-owned and mid-market companies organize their data into dashboards to execute on strategy and create a culture of accountability. We call it The Dashboard Effect, the title of our book and podcast.

Visit Blue Margin's library of additional BI resources here.

For a free, downloadable copy of our book, The Dashboard Effect, click here, or buy a hardcopy or Kindle version on Amazon.

#BI #businessintelligence #reporting #dashboard #datawarehouse

Brick Thompson:

Welcome to The Dashboard Effect Podcast. I'm Brick Thompson.

Caleb Ochs:

I'm Caleb Ochs.

Brick Thompson:

Hey, Caleb, how's it going?

Caleb Ochs:

Pretty good, Brick.

Brick Thompson:

Very good. All right, so we're gonna talk about something interesting today. This is the argument that sometimes comes up about whether a data warehouse should be built in a waterfall fashion. Basically, "Let's figure out everything we might ever want to do, spend months, 18-24 months and lots of money building that out all at once." Versus more of an agile approach, where you're going by various subject areas and building it by priority of what's going to impact the business most. I think you coined a great term for this, Report Driven Data Warehouse Development, which I really like. So yeah, that's our topic.

Caleb Ochs:

All right. Cool. Yeah, the older way, or the waterfall approach, that's where, you know, people like Ralph Kimball, that was really what they set out to do. Like, "Understand everything in the business, and then you can go about building your data warehouse." And, to give him credit, there's a lot of good stuff that came out of Kimball, and we still use a lot of his techniques and thinking. But that's one of the things that we definitely do not follow, because it just takes too long, and business changes, and it's fast. And you got to be faster, you just got to be faster.

Brick Thompson:

Yeah. Well, you make a really important point. I think one of the most important reasons not to take that approach is just what you said: businesses change, and over 18 to 24 months, things that you thought were important to start with may not be the most important now, or requirements that you thought were important actually need to be different, or data sources changed. Now you could probably make an argument to each of those, "Well, if you build it right, your common data model, and you've thought it through, and it's generic enough in the right places, that doesn't matter." But we haven't seen that to be the case. And in fact, I think the waterfall approach to data warehousing frequently ends in failure. Or at least that's what we've run into. And then we get brought in to do an agile approach, so they can get the reporting they need now.

Caleb Ochs:

Right, yeah. I think it almost always doesn't. And a lot of times, it's just people just getting impatient. Maybe the project would have been successful, but it's like "We can't wait that long to understand what's happening in our business." So we don't do that.

Brick Thompson:

Yeah. So, you know, it comes back to things that you and I have been preaching about on this podcast for the last 90 days. You need to really identify what the business goals are that you want to be able to impact with your reporting and your data warehouse, and really understand, "What decisions do we need to make? What actions do we want to be driven by our data warehouse?" And then you can look at those various things and put a judgment on, "What's the highest priority, what's going to get the biggest bang for the buck, where's the ROI?" And embark there. And that doesn't mean you have to do one thing at a time. You can do many things at a time if you have the right resources, both on your service provider or your IT department side, and on the stakeholder side, because everybody has to come together to make that happen. So you can do that. But what you don't want to do is say, "Let's take a generic approach, let's come up with this common data model that will be good forever." It's an old style, and an old approach, and some people cling to it, but it just does not seem to deliver the high efficiency, high impact data warehouses that we see now.

Caleb Ochs:

Right. And that term that you mentioned at the beginning, Report Driven Data Warehouse Development, if you take that just at face value, you might think, "Oh, that's gonna pigeonhole us." But really, coming up with the model first, and then trying to build reports on it, that pigeonholes you more than anything, because you're stuck with what you've just built and spent so much time building kind of in theory, and then in practice, it actually doesn't work out. So really it's practical data warehouse development. You could almost go at it that way, like, "We have to support something that's going to impact the business." Data warehouse is an interesting spectrum, right? Some people might think, "Oh, okay, well, let's just take my table and put it into a database." And that's a data warehouse, and it's report driven data warehouse development, but it's not really the case. You're accumulating tons of technical debt, it's not going to scale, you're not going to be able to add other data sources, intermingle different data. That's not going to work. And then on the other huge end of the spectrum is that waterfall approach of, "We have to have all of our data in the data warehouse," and that's not a good approach, either. So it's somewhere in between, and that's what we recommend going after.

Brick Thompson:

Yeah, that's right. I think when you end up doing the more waterfall approach, you end up asking your report writers to sort of contort and have to figure out how to make this architecture that was maybe built for a more generic viewpoint, and make that work for whatever reporting they're doing, when in fact, if you were architecting at the same time, if it was report driven, if you knew the subject areas you are going after and what the key drivers for the business are, you can make it much easier for them.

Caleb Ochs:

Yeah, and a good story of that is, we had a client where we were building, in our typical fashion, this report driven development. And a third party came in and said, "No, no, you got to have this canonical data model." And that was just his term. And, yeah, I hate that term now.

Brick Thompson:

A little PTSD.

Caleb Ochs:

Yeah, right, "You've got to have this generic data model that everything can plug into." Luckily, we were able to talk the client out of doing that, because had we done that, you look at where they're at now, they have so many specific reporting needs and reports that were really impactful for the business that, quite frankly, would not have been possible by taking that other approach. You just wouldn't be able to do that.

Brick Thompson:

Right, that was a huge win going the right way there. So I do think there are some cases where a business needs to get lots of data available to analysts right away. And there's a solution for that. And we do that with some clients. And that's to use a data lake. And to bring data in from multiple systems, maybe even already consolidated and summarized in some ways to make it easier to deal with, but maybe not. Maybe you're just pulling tables in there and raw CSVs, and so on. And then the analysts can use that to do some ad hoc reporting. And as we realize, or as the stakeholders realize exactly what they need to answer those questions and drive those behaviors, then you can easily go from data lake to data warehouse. So you can get a little bit of a hybrid, best of both worlds situation there.

Caleb Ochs:

Yeah, and you hit on a couple of key points if that's the route you're gonna take. One is, you don't really know what you want yet. And you didn't mention this, but it's worth saying: the other is if your data is in different systems. If you've got one system, there's really no point in putting it into a data lake. Right, just build a data warehouse on it.

Brick Thompson:

Exactly.

Caleb Ochs:

Right. But if you've got, like, 10, and they have the same type of data in them, maybe they're just different business units, get that stuff into a data lake, start figuring out the logic, because it's gonna be different across all these systems, and then start building your report driven data warehouse after that.
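
As a rough illustration of the point above, here is a minimal Python sketch of going from raw data lake extracts to a first report-driven fact table. Everything in it is an assumption made for illustration: the system names (`erp_east`, `erp_west`), the column names, and the helper functions are hypothetical and do not come from the episode or from any particular tool.

```python
from collections import defaultdict

# Hypothetical raw extracts in a data lake: two source systems hold the
# same type of data (sales orders) with different columns and units.
RAW_LAKE = {
    "erp_east": [
        {"OrderNo": "E-1", "Amt": "120.50", "Region": "East"},
        {"OrderNo": "E-2", "Amt": "79.50", "Region": "East"},
    ],
    "erp_west": [
        {"order_id": "W-1", "total_cents": 30000, "bu": "West"},
    ],
}


def from_erp_east(row):
    # This system stores the amount as a decimal string in dollars.
    return {
        "order_id": row["OrderNo"],
        "amount": float(row["Amt"]),
        "business_unit": row["Region"],
    }


def from_erp_west(row):
    # This system stores the amount as an integer number of cents.
    return {
        "order_id": row["order_id"],
        "amount": row["total_cents"] / 100,
        "business_unit": row["bu"],
    }


# The per-system logic lives in one place, so adding a source system
# later means adding one mapper rather than redesigning the model.
MAPPERS = {"erp_east": from_erp_east, "erp_west": from_erp_west}


def build_sales_fact(raw_lake):
    """Map every raw row into one common sales fact shape."""
    fact = []
    for system, rows in raw_lake.items():
        mapper = MAPPERS[system]
        for row in rows:
            record = mapper(row)
            record["source_system"] = system  # keep lineage for debugging
            fact.append(record)
    return fact


def sales_by_business_unit(fact):
    """The first report that drives the model: total sales per unit."""
    totals = defaultdict(float)
    for record in fact:
        totals[record["business_unit"]] += record["amount"]
    return dict(totals)


sales_fact = build_sales_fact(RAW_LAKE)
report = sales_by_business_unit(sales_fact)
print(report)
```

The fact table here contains only the columns the first sales report needs. Other subject areas, such as inventory or financials, would get their own mappers and fact tables in later phases.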

Brick Thompson:

Yeah, that's a really good point, I should have made that distinction at the beginning. If it's one data source, okay, you're building a data warehouse. Even then you may not do every fact area within that data source, although you probably will, because you're in there, so you're gonna do it. So yeah, good clarification.

Caleb Ochs:

Yeah, and you probably don't start with every fact area, because you shouldn't go and try and build a sales report and inventory reporting, whatever else you're going to need to report on, financials, you know, all that stuff. You shouldn't go do all that at once. You should start at sales, or wherever it's most important to start at for your particular business, and then continue building out the rest. That's kind of what we're saying.

Brick Thompson:

That's right. Yeah, get a high impact area, get good ROI, get adoption, get it to start working, and then you can start pulling other things in. And this is not to say it all has to be serial, "Okay, you're only allowed to do this area, and then you're allowed to do this area." You can parallelize these. But don't just sort of generally say, "Okay, give me everything." Have a strategy.

Caleb Ochs:

Right, yeah. That's what we see, where people have a misconception that you can just point a team like ours at a data source and say, "Build a data warehouse." That's not how we do it, and that's not how you should do it. You should say, "Why?" And then you can build your data warehouse.

Brick Thompson:

Yeah, exactly. Good point. Okay, well, that's what I wanted to cover today. Anything else you want to say here?

Caleb Ochs:

I do think that it's worth saying, when we talk about a data warehouse with our clients, sometimes it can be misconstrued. When we say we're going to build a data warehouse, they think of it as this big, large, "All my data is here, and it's all reportable, and it's all fantastic." And then we say, "Oh, we're going to do it in six weeks," and they're like, "Oh, my God, there's no way you can do that." But really, what we're meaning is, "We're building version one of your data warehouse." It's going to have one reporting area driven off of it. And then in next phases, we're going to start pulling stuff in. And so really, I think, for people that are listening, that's the way you want to develop those things. For all the reasons we mentioned. You don't want to keep the business waiting. You don't want to find out you missed six or 12 or 24 months down the road. You just want to get started. And you're gonna have a data warehouse after the first phase, but it's just not going to be all encompassing. And that's a good thing.

Brick Thompson:

Yeah, that's a really good distinction. So you have a data warehouse after six or eight weeks, maybe less, depending on how small the area is. But that's not it. It's sort of like, you've got your first starter home, and then you're doing additions, as it makes sense and as you need.

Caleb Ochs:

Exactly, that's exactly right.

Brick Thompson:

All right. That's good. I'm glad you covered that.

Caleb Ochs:

Cool.

Brick Thompson:

All right. I think that's it. Thanks, Caleb.

Caleb Ochs:

Thank you.