14 Comments

I think you may have a fundamental misunderstanding of the value Snowflake brings to the Data Lake use case. There is a reason why Snowflake no longer calls itself a "Cloud Data Warehouse": that term is overloaded and can confuse people about the workloads Snowflake can take on. Some of the earliest and largest Snowflake wins were really Data Lake use cases. Companies with massive amounts of semi-structured files struggled to query them at scale using Hadoop, and Snowflake came along and made it easy. If you have a million JSON or Parquet files and want to query them performantly with SQL, name a better technology than Snowflake to do that today. I'd frame the difference as Snowflake allowing you to meet Data Lake and Data Warehouse use cases in the SAME TECHNOLOGY. Using SQL. Without having to create indexes or do performance tuning. The only difference is that Snowflake gleans statistics about the files as it ingests them so you can query the files as is. Describing this as "two tiers" is fundamentally inaccurate: there is no logical difference between a company "loading" a semi-structured file into a "Data Lake" and loading that same exact file into a Snowflake managed data lake. None. In the US, pricing is $20-$23/compressed TB on contract depending on your Cloud provider, and compression is excellent, so you get a reasonable storage price while still being able to query ANY of your data at speed.

Beyond that, even if you DIDN'T want to load the data into Snowflake for some reason, and you DID want to maintain a "two-tier" architecture, Snowflake offers a host of features (external tables, streams on external tables, materialized views on top of external tables, etc.) that can provide usability and performance even in that case. Now that Snowflake can load unstructured data (the feature is in preview, but announced), there aren't many data lake use cases that Snowflake can't handle in a world-class way.

Love the coverage on the data space! Curious what you meant by "I'd love to see Snowflake find a way to better separate / charge for cold storage," since Snowflake stores data on S3 and the cost is the same. The bulk of the cost ends up coming from compute, which is the key piece to scaling out large data infrastructures with many analytical transformations. Where I've seen storage become a little more expensive is if you choose to store your data in a data lake before moving it to Snowflake. You then double your storage cost with the redundancy, but S3 storage is relatively cheap. Either way, it is awesome to see where both companies are moving. It would be awesome to get to where both data science and BI/analytics can be powered by a single (ware/lake)house rather than needing to have one for each.

Databricks wins the ML war. Snowflake wins the "we used to have SQL Server, now we want something else" war.

Snowflake is for SQL companies; more advanced companies and use cases require Databricks.

Hi, sorry, I have a silly question to ask.

I have an economics background and know nothing about IT. I have no clue about the vast terminology in IT.

But I am interested in learning, since I have seen compounded growth in this business (stocks) in recent decades.

Do you have a recommendation for where I should start learning?

Thanks for writing!

Jamin, great article. I liked the way you dissected this market! Thanks.

Very interesting, thank you for writing!

Might In-Memory Computing threaten Snowflake's ability to scale compute and storage separately? I'm more interested in the kind of IMC where compute is close to memory on the same chip than in the software kind. I just know a little about that from a video from The Linley Group on YouTube. I don't have a background in tech, so 1) I hope the question makes sense, and 2) I appreciate the good free info from Clouded Judgement.

That proved to be a timely post, Jamin. Thank you. I would love to see an update of that post incorporating all the announcements from the Snowflake and Databricks conferences 1-2 weeks ago.
