Table of contents
- History of Snowflake
- Snowflake Data Cloud Architecture
- Advantages of Snowflake
- Snowflake clients
- Other Cloud Data service alternatives
History of Snowflake
Snowflake is a cloud-based data warehousing company that provides storage and analytics services as SaaS (Software as a Service). It was founded in July 2012 by Benoit Dageville, Thierry Cruanes and Marcin Żukowski. Its platform lets businesses and organizations put the vast amounts of data they generate to work, for example by making well-informed decisions and reducing product wastage. In 2019 the company ranked near the top of the Forbes Cloud 100.
Snowflake Data Cloud Architecture
The Snowflake data cloud architecture has three layers:
- Cloud Services
- Query Processing (also called the Compute layer)
- Database Storage (also called the Storage layer)
The Cloud Services layer ties all the Snowflake components together and handles user requests from login through to query dispatch.
The following are the services managed by this layer:
- Infrastructure management
- Metadata management
- Query parsing and optimization
- Access control
The Query Processing layer uses virtual warehouses to execute user queries. When Snowflake is hosted on AWS, each virtual warehouse is made up of one or more EC2 instances (virtual servers), and it is responsible for retrieving data and returning a result set.
When hosted on AWS, Snowflake relies on Amazon S3 for the Database Storage layer. Decoupling the Compute layer from the Storage layer, so that each can be scaled independently, is one of the design choices that sets Snowflake apart from its competitors.
To further improve performance and speed, data is micro-partitioned, compressed and encrypted when it is loaded into Snowflake. Within Snowflake, data is organized in columns.
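As a rough illustration of why micro-partitioning and columnar organization help, the following Python sketch models a table split into small column-oriented partitions and skips any partition whose value range cannot contain the value being searched for. It is a toy model under stated assumptions, not Snowflake's implementation: the `MicroPartition` class, the fixed rows-per-partition split and the helper names are all hypothetical, and compression and encryption are left out.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """Toy micro-partition (hypothetical): each column is stored as its own list."""
    columns: dict  # column name -> list of values, all lists the same length

    def min_max(self, column):
        # Per-column value range, used to decide whether a scan can skip this partition.
        values = self.columns[column]
        return min(values), max(values)


def build_partitions(rows, rows_per_partition):
    """Split row dicts into fixed-size chunks and pivot each chunk into columns."""
    partitions = []
    for start in range(0, len(rows), rows_per_partition):
        chunk = rows[start:start + rows_per_partition]
        columns = {name: [row[name] for row in chunk] for name in chunk[0]}
        partitions.append(MicroPartition(columns))
    return partitions


def scan_equal(partitions, column, target, select):
    """Return `select` values where `column == target`, plus the partitions scanned."""
    results, scanned = [], 0
    for part in partitions:
        low, high = part.min_max(column)
        if target < low or target > high:
            continue  # pruned: the target cannot be in this partition
        scanned += 1
        # Only the two columns involved are touched, not whole rows.
        for i, value in enumerate(part.columns[column]):
            if value == target:
                results.append(part.columns[select][i])
    return results, scanned


rows = [{"order_id": i, "amount": i * 10} for i in range(1, 13)]
partitions = build_partitions(rows, rows_per_partition=4)
amounts, scanned = scan_equal(partitions, "order_id", 7, "amount")
print(amounts, scanned)  # [70] 1
```

Only one of the three partitions is scanned here, and within it only the `order_id` and `amount` columns are read, which is the intuition behind combining partition-level metadata with column-oriented storage.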
Advantages of Snowflake
- Consumption-based billing: Snowflake bills by consumption rather than by subscription. This is both fair and cost-effective, since the customer only pays for what they use.
- Shared-nothing compute: Each virtual warehouse has its own CPU, memory and local disk rather than sharing them with other warehouses, which improves performance and speed. Storage is shared centrally, so the overall architecture is a hybrid of the shared-disk and shared-nothing designs.
- Strong data security: User data is encrypted with AES-256. Snowflake also supports IP blacklists.
- User access control: Snowflake provides fine-grained control over who has access to what data. With this built into the design, users can also securely share their internal data with third parties.
- Easy to scale up and down: Virtual Warehouses come in different sizes and can be easily configured on the fly to suit application demands.
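The billing and scaling points above can be made concrete with a small Python sketch. It estimates the cost of running a warehouse for a given time and builds the SQL statement that resizes one. Several things here are assumptions rather than facts from this article: the credits-per-hour table and the 60-second minimum follow Snowflake's published rates for standard warehouses as understood at the time of writing and should be checked against current pricing, the price per credit is made up, and the warehouse name `reporting_wh` is hypothetical.

```python
# Assumed credits per hour for standard warehouses; each size step doubles the rate.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

# Assumed minimum charge each time a warehouse starts; billing is per second after that.
MINIMUM_BILLED_SECONDS = 60


def estimate_cost(size, seconds_running, price_per_credit):
    """Estimate the cost of one warehouse run under the assumptions above."""
    billed_seconds = max(seconds_running, MINIMUM_BILLED_SECONDS)
    credits = CREDITS_PER_HOUR[size] * billed_seconds / 3600
    return credits * price_per_credit


def resize_statement(warehouse, size):
    """Build the SQL that changes a warehouse's size (string only, nothing is executed)."""
    if size not in CREDITS_PER_HOUR:
        raise ValueError(f"unknown warehouse size: {size}")
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}'"


# Half an hour on a MEDIUM warehouse at a hypothetical 3.0 currency units per credit.
print(estimate_cost("MEDIUM", 1800, 3.0))  # 6.0
print(resize_statement("reporting_wh", "LARGE"))
```

Because cost scales with both size and running time, a larger warehouse that finishes a job sooner is not necessarily more expensive than a smaller one that runs longer.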
Snowflake clients
University of Notre Dame
The University of Notre Dame, a pre-eminent research university known for its undergraduate education, faced failures in its financial reporting system. The system suffered from I/O waits until the server eventually gave out, rendering the service unavailable.
The University built a new access layer on top of Snowflake and reported positive results, with particular appreciation for Snowflake's performance, speed and security model.
TI Media
TI Media is a UK-based consumer magazine and digital publisher. Before adopting Snowflake, it kept its data in silos, and one of the problems with these silos was that data was not readily available when it was needed. Being in the advertising business, the company relies heavily on data to inform its decisions and editorial work.
TI Media also points out that it uses Snowflake to share data securely and effectively with third-party businesses.
Other Cloud Data service alternatives
Despite Snowflake's success in cloud data services, several alternative cloud data services are worth mentioning:
- Google BigQuery
- Amazon Redshift
- Microsoft Azure SQL Data Warehouse (now Azure Synapse Analytics)