The Dashboard Effect

Is Snowflake Your Data Warehouse Nirvana?

October 26, 2022 Brick Thompson, Jon Thompson, Caleb Ochs Episode 49
The Dashboard Effect
Is Snowflake Your Data Warehouse Nirvana?
Show Notes Transcript

Click here to watch this episode on our YouTube channel.

In this episode, Brick and Caleb discuss the Snowflake data warehouse service.  They describe what it is and some of the pros and cons to consider when deciding if it might be the right solution for you.

Blue Margin helps private equity owned and mid-market companies organize their data into dashboards to execute on strategy and create a culture of accountability. We call it The Dashboard Effect, the title of our book and podcast

 Visit Blue Margin's library of additional BI resources here.

For a free, downloadable copy of our book, The Dashboard Effect, click here, or buy a hardcopy or Kindle version on Amazon.

#BI #businessintelligence #reporting #dashboard #datawarehouse #Azure #AzureSynapse #Synapse #AWS #RedShift

Brick Thompson:

Welcome to the Dashboard Effect Podcast. I'm Brick Thompson.

Caleb Ochs:

I'm Caleb Ochs.

Brick Thompson:

How's it going, Caleb?

Caleb Ochs:

Pretty good.

Brick Thompson:

Good. All right, so today we're gonna talk about Snowflake. This is a tool that launched back in 2014, but it's really been gaining momentum. We're seeing it more and more out in the market. It's interesting. It's a cloud based data warehouse, for data analytics and reporting, a tool that sits on top of other cloud services, Azure, AWS, and Google. And it's getting a lot of traction, and making a lot of noise out in the marketplace. And it gets generally really good reviews from its users and has some things going for it for sure. But there's also some downsides, and we want to kind of just talk through the good and bad of that.

Caleb Ochs:

Yeah, right. I mean, a lot of our clients are bringing this up as we talk to them. And you know, we've seen some people not have the knowledge they need to make a good decision here. So hopefully, we can shed some light on what the pros and cons are and help somebody out.

Brick Thompson:

Yeah. All right. So let's talk through the the pros. I mean, the first one that comes up a lot you see it on their marketing materials, and people tell you about it is that you don't need an SME, an expert, to run Snowflake, to run your data warehouse, because they've put this simplified layer, this software as a service, on top of the platform as a service, the Azure and AWS and so on. That is kind of an advantage. For sure.

Caleb Ochs:

Yeah, it can be right, depending on your situation. And the talent that you have available, it definitely can be.

Brick Thompson:

Yeah, I think where that can kind of backfire on you is you definitely do need someone that has some technical ability, and they've got to learn how to use that Snowflake system. But you may also because of the simplicity, that simple layer, lose some capabilities to do various configurations and things that you might want to do on the underlying platforms, but you can't.

Caleb Ochs:

Yeah, whenever you think about simplification, it usually means limitation. Those two things usually go hand in hand. I saw this funny graphic the other day, "You wondering why your app is not getting the usage it needs." And it's Google's as a search bar, right. And it says, "Search." And that is Apple, you know, it's just some very simple thing. And it's like "Your app..." and it's got all these 1000 different things on it. So I thought that was interesting. And it kind of plays in here too, because that's essentially what Snowflake has done. It's taken away a lot of the things that you might need, maybe not, but you might need, and they just kind of get rid of them. It hides them from you, really doesn't give you the control over them, but makes the UI simpler and easier to use.

Brick Thompson:

Okay. So there's definitely advantages, especially for some users for that. So I get that some other pluses, at least as Snowflake sells them, that they're serverless. So you're not spinning up SQL server VMs and that type of thing. There are handling all of that for you Behind the scenes. This cross cloud deployment so you don't have to be married to Azure, or AWS, or Google, you can be on whatever systems you want. They separate workload clusters, so you can get some performance advantages there, although there's a downside of that. And various things that they do to to make performance good. You can quickly scale it up and down, automatic performance tuning, we'll talk about that in a second too. And then again, as I said at the beginning, you know, the good user reviews. So it's not just smoke and mirrors. I mean, there are things people really like about it.

Caleb Ochs:

Yeah. Right. And a lot of those things you mentioned are not necessarily specific to Snowflake. You know, their scaling up and down, that's really more cloud native than anything. And some automatic performance tuning, you can get those and other tools, but those are all things that I think are pros to using to using something like Snowflake.

Brick Thompson:

Yeah. So let's talk about some of the cons. And some of these go hand in hand with the pro the pros. And one of the cons and it actually might be sold as a pro, is you don't have parameters, you don't have key management you don't have tuning, as we just said. All that stuff are pros unless you want to do some tuning.

Caleb Ochs:

Typically what we find is that people are super excited about Snowflake at first, and because it's easy, it kind of gets you up and going and you make some progress fast. There might actually be some genius there in how they get that going for you. You get up to speed quick, but then you start hitting the, "Man, I really wish I had this available to me, I wish I could do that or I wish I didn't have to copy this data," which I'll talk more about in a second. But there's some drawbacks once you really get in there and you get your answer and you start doing more advanced stuff. It's honestly a lot like Power BI. Power BI kind of sells itself as "You can connect to anything and do whatever you want." And while that's true, there's definitely limitations on what power queries are going to do for you, and so you can take it to a certain point, and then it just gets frustrating and slow, because that's not the right tool for the job.

Brick Thompson:

You're gonna use a different tool, you're gonna do some custom programming, that type of thing.

Caleb Ochs:

Right, exactly.

Brick Thompson:

So as I think about the disadvantages, one thing I keep coming back to is cost. It sort of gets woven into a lot of the other things that that we talked about as advantages and disadvantages. And this really came up for us as we were talking to a former colleague, who's now at a company that uses Snowflake and, in addition to some technical quibbles that he had, and some real issues, cost was the thing that was really the concern for him.

Caleb Ochs:

Yeah, he told me a story where some of the developers scaled way up to run some queries, and then those queries were just taking forever. And it ultimately cost him like $17,000, just in a few days, to run this stuff. So it's not like he's getting a credit back on that. Its used, it's done. You know, he had to go to the CFO and be like, "Hey, can we just grab some more money?"

Brick Thompson:

Yeah. And this is one of those issues that is sort of the flip side of the Pro, which is you don't have data constraints. "Go ahead, do what you need to do." Whereas on the other platforms you have to explicitly go and spin up additional resource before you're going to spend it. It is pay as you go, but you're not going to end up with those kind of surprise $17,000 bills.

Caleb Ochs:

Yeah, something like that. Don't quote me on it, but it was something like that, some outrageous amount.

Brick Thompson:

Yeah, I saw a study, as I was reading some articles getting ready for this podcast that some people did. I actually saw quoted across a bunch of different sites that Redshift on AWS, is 1.3 times less expensive than Snowflake for just on demand. But if you buy the Redshift, if you know what you're going to use and you buy it a year in advance, it's 1.9 times less expensive, I'm assuming that's half, that's kind of weird way to state it. And if you do three years, it's 3.7 times less. So it can be a lot more expensive, even for the same usage, let alone if you have someone, by mistake, leave some hairy query running for a long time.

Caleb Ochs:

Right. And that type of reserved instance thing is what we're seeing more and more of with AWS and Azure. One of the reasons with Azure synapse, before it was Azure synapse it was Azure SQL Data Warehouse, and one of the reasons why we didn't use it very often was, it's really expensive. The cheapest one you could get was like $900 a month. And a lot of our clients just didn't need that they needed $150 a month database to run their stuff. But as that technology has progressed, and as these cloud providers are becoming more and more advanced, they're doing things like this. Like this reserved instance, it's like saving a bunch of money. So Azure synapse is so much more within reach for a lot more of our clients than it used to be. So you can do that three year reserved and it's, you know,$300 bucks a month, for a really hyperscale, big time enterprise class database and data warehousing solution. It's pretty cool.

Brick Thompson:

Right, and you can still scale it up and down, and will. But you'll know what it's gonna cost you when you do that. You're not getting surprised by it.

Caleb Ochs:

Right, a little bit better controls there.

Brick Thompson:

I think one of the other big cons is something we touched on before, which is that you've got automatic tuning. But the flip side of that is, and I understand this from our developer colleague, that if the automatic tuning doesn't work, you're sort of out of luck. That's it.

Caleb Ochs:

Yeah. Right. The way to get around that is to create a copy of the table and index it differently. Like, let's say you got a table with 10 million rows. and the query you want to run is just slow. You can't add an index to that table. You can only have one way to index the tables, their clustering key. So what you have to do is you create a copy of all 10 million rows into another table that has a different clustering key in order to run that query. That might not sound like that big of a deal and storage costs in Snowflake is extremely cheap. The problem is, is that that is going to lead to so much confusion for people. It's that"Oh, no, if you want to run that query, you have to point at this table, not that table." And that's only if you got two versions of it. You're gonna run many queries against tables, so you might end up with 10 copies of the same data.

Brick Thompson:

Yeah, that's interesting, that's a good point. Another thing that I've read about and actually that this fellow mentioned, was that the user interface is not fully developed for everything that's underneath it on the platform. So for the stuff that it's been developed for, it's pretty darn slick. But then there's stuff that is just not developed for. So you're either running queries to change parameters, that type of thing, or not changing them.

Caleb Ochs:

Yeah. I think one of the cool things about Snowflake is that they have this data marketplace, so you can just go out to a bunch of different data providers and pull their data into your Snowflake instance. Kind of cool. You do have to pay for it, but you know, that's kind of an interesting perk.

Brick Thompson:

Yeah, I like that. Another one was RBAC, Role Based Access Control. There's good security in Snowflake, don't get me wrong, you can really protect stuff. But there were some things about how RBAC works that may not be intuitive. And you may have inheritance and stuff that's going to lead to exposing objects down the road that you didn't mean to. That may just be something that, you know, they need to develop further, or do you need training to be careful on, but the opinion of the fellow I talked to sounded like it was easier to make a mistake on that than, say, on Synapse.

Caleb Ochs:

Right, that's another con of Snowflake, that finding someone who's done everything in Snowflake is gonna be a lot harder than finding a SQL DBA. And someting like Azure SQL, or Azure Synapse, all that stuff's gonna be very familiar to a DBA that's done SQL before.

Brick Thompson:

Right, they've worked on SQL Server, and especially worked on Azure.

Caleb Ochs:

Right. It's gonna be very familiar, so a lot less of a learning curve. So, you know, its one of one of the drawbacks.

Brick Thompson:

I think one of the other potential drawbacks is maybe not perfect synergy with the underlying platforms. So if you're on Azure synapse, it's very easy to use all that connected stuff. They've got their machine learning stuff, their AI stuff, I'm sure you can use that same stuff with your Snowflake data that's sitting on Azure, or on synapse, I would assume. But it's just not quite as transparent how that integration works.

Caleb Ochs:

Yeah, you can even think of even just Power BI. If you have a Snowflake database that you're connecting to, you have to go turn on a setting inside of the Power BI admin portal to say "Allow Snowflake." You would probably run the connection anyway, but you want to turn on SSO for Snowflake so that you can use your Azure AD credentials. It's just not quite as smooth with everything else that you might be doing, because snowflakes kind of its own deal. Yeah. So it sounds like there probably are cases where Snowflake could be a benefit. But it seems like if you're operating at scale, if you have technical staff, you want to probably think hard about it. Yeah, you do. I mean, if you're worried about long term recurring costs, you should definitely think about it. Because Snowflake is easier to get out of control. You can keep it under under control, but it's going to be easier to get yourself into a tough spot there. And it's just going to be, straight up, more expensive. And it's because you're paying for that UI and the ease of doing things, that stuff that they've developed to sit on top of your database and the raw stuff that you would do if you're just dealing with the database itself.

Brick Thompson:

Yeah. There's no special magic in Snowflake for how it gets you access to data, you're still writing SQL queries or connecting with Tableau or Power BI or some other tool, right?

Caleb Ochs:

Right. Yeah, you are. That's a really good point. You still have to query right. With all of these tools, I mean, there's lots more than just Snowflake out there that might look really pretty and shiny, but at the end of the day, you're querying data. Right, it's the same thing. But, you know, hopefully, that is helpful for people to hear that because, you know, you're right. There's no magic there. They might show you some pretty graphics of all this data going into Snowflake. And its like, "Yeah, you can put all that data anywhere in any database technology." Well, thats probably overstated, but in most of them nowadays, in all the new technology, you can do that. So you know, don't get wooed.

Brick Thompson:

Yeah, they've, they've got great marketing, and I don't mean that as a slight, I mean, it's really good. I attended one of their webinars, and it was it was slick. But during the hands on portion, I was reading SQL queries.

Caleb Ochs:

Exactly right.

Brick Thompson:

All right, anything else we should cover on this?

Caleb Ochs:

I don't think so.

Brick Thompson:

Well, thanks Caleb.

Caleb Ochs:

Thanks Brick.

Brick Thompson:

See you soon.