The Top 5 Popular Data Warehouse Tools

Introduction

Being “data-driven” is a worthy goal for companies of all sizes and industries. According to a study by McKinsey & Company, organizations that heavily use customer analytics are 23 times more likely to beat their competitors in terms of new customer acquisition, and 19 times more likely to be highly profitable.

Yet becoming a data-driven organization is easier said than done. The constantly expanding volume and variety of big data, along with the increasing number of data sources, make this goal more challenging than ever before. The average company now manages 163 terabytes (163,000 gigabytes) of information, according to technology research firm IDG.

In order to deal with this complexity, many organizations are choosing to adopt a cloud-based data warehouse that can simplify and streamline their data processing workflows. By moving your data warehouse from on-premises into the cloud, you can improve the solution’s scalability, availability, and security, all while cutting IT costs.

Not all data warehouses are created equal, and some are better suited to particular use cases and customers than others. If you’re wondering “Which cloud data warehouse is best for me?”, this article is for you. We’ll discuss 5 of the top data warehouse tools so that you can make the right choice for your business.

1. Google BigQuery

What is Google BigQuery?

Google BigQuery is Google Cloud Platform’s data warehousing solution in the cloud. To enable lightning-fast SQL queries, BigQuery leverages the power of Google’s Dremel tool for interactively searching through massive datasets.

As a serverless data warehouse, BigQuery frees users from the obligation to purchase, install, or manage IT infrastructure. BigQuery is especially well-suited for real-time data analytics, and the tool includes a variety of ETL data pipelines from Google and non-Google sources.

Google BigQuery Pricing

Google BigQuery uses a flexible, pay-as-you-go pricing model. The service includes a free tier: the first 10 gigabytes of storage and the first terabyte of query data are free of charge every month.

Beyond this limit, Google BigQuery charges $0.02/gigabyte for active storage and $0.01/gigabyte for long-term storage.

Google BigQuery offers both on-demand and flat-rate pricing for running SQL queries:

  • On-demand: $5/terabyte of query data
  • Flat-rate: $10,000/month for 500 BigQuery slots (or $8,500/month if paid annually)
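To make the on-demand figures above concrete, here is a minimal Python sketch that estimates a monthly bill from the rates quoted in this article. The function name and the sample workload are hypothetical, and the sketch assumes the free 10 GB of storage is applied to active storage before long-term storage, which the article does not specify.

```python
# Rates quoted in this article (USD); actual prices may differ.
FREE_STORAGE_GB = 10
FREE_QUERY_TB = 1
ACTIVE_STORAGE_PER_GB = 0.02
LONG_TERM_STORAGE_PER_GB = 0.01
ON_DEMAND_QUERY_PER_TB = 5.00


def bigquery_on_demand_monthly_cost(active_gb, long_term_gb, query_tb):
    """Estimate a monthly on-demand bill (hypothetical helper)."""
    # Assumption: the free storage allowance is used up by active
    # storage first, and any remainder goes to long-term storage.
    billable_active = max(active_gb - FREE_STORAGE_GB, 0)
    free_left = max(FREE_STORAGE_GB - active_gb, 0)
    billable_long_term = max(long_term_gb - free_left, 0)

    # The first terabyte of query data each month is free.
    billable_query_tb = max(query_tb - FREE_QUERY_TB, 0)

    storage = (billable_active * ACTIVE_STORAGE_PER_GB
               + billable_long_term * LONG_TERM_STORAGE_PER_GB)
    queries = billable_query_tb * ON_DEMAND_QUERY_PER_TB
    return round(storage + queries, 2)


# Hypothetical workload: 500 GB active, 1,000 GB long-term, 21 TB queried.
print(bigquery_on_demand_monthly_cost(500, 1000, 21))
```

At $5/terabyte, on-demand query spend only reaches the $10,000 flat-rate price at about 2,000 billable terabytes per month, so flat-rate pricing is mainly of interest to very heavy query workloads.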

Google BigQuery Reviews

Reviews of Google BigQuery on business software review sites like G2 are largely positive. Google BigQuery has an average rating of 4.4 out of 5 stars on G2, based on 254 reviews. Given this warm reception, G2 has named Google BigQuery one of its “Top 100 Software Products of 2019.”

G2 reviewer Shivam S. calls Google BigQuery “among the best services on Google Cloud Platform” in a five-star review, praising the tool’s scalability, performance, and ease of integration. He writes: “BigQuery’s main strength is its ability to process huge volumes of data with lightning speed.”

Google BigQuery Pros and Cons

Among the most commonly mentioned advantages of Google BigQuery are the tool’s scalability, performance, speed, and ease of use. The fully-managed nature of BigQuery means that users are largely freed of the responsibilities of maintenance and monitoring. BigQuery particularly excels when it comes to real-time data analytics and crunching massive data sets.

However, BigQuery doesn’t come without its drawbacks. The tool might be considered overkill for smaller organizations with less demanding analytics workflows. In addition, G2 reviewers cite several usability issues with BigQuery, such as problems with auto-schema detection and the fact that not all SQL queries can be executed in BigQuery.

2. Amazon Redshift

What is Amazon Redshift?

Amazon Redshift is a cloud data warehouse solution from public cloud giant Amazon. Like Google BigQuery, Redshift is a fully managed data warehouse in the cloud, capable of efficiently processing massive quantities of information. Each Redshift data warehouse consists of computing resources called nodes that are organized in clusters.

Amazon Redshift Pricing

Like Google BigQuery, Amazon Redshift offers multiple pricing options, as well as a two-month free trial for new Redshift customers.

Redshift’s on-demand, pay-as-you-go pricing model involves no upfront costs. Customers have the choice of four different node types:

  • dc2.large (2 vCPU, 15 GiB memory, 0.16 TB storage): $0.25/hour
  • ds2.xlarge (4 vCPU, 31 GiB memory, 2 TB storage): $0.85/hour
  • dc2.8xlarge (32 vCPU, 244 GiB memory, 2.56 TB storage): $4.80/hour
  • ds2.8xlarge (36 vCPU, 244 GiB memory, 16 TB storage): $6.80/hour

Redshift also offers three other pricing models:

  • Redshift Spectrum pricing: Customers pay for the number of bytes scanned during their SQL queries.
  • Concurrency Scaling pricing: Redshift adds more query processing power on an as-needed basis, and customers pay only for what they use.
  • Reserved Instance pricing: Customers commit to a 1-year or 3-year term with Amazon Redshift, saving up to 75 percent off the on-demand Redshift prices.
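Because on-demand rates are quoted per node per hour, it helps to translate them into a monthly figure. The Python sketch below does that for a hypothetical four-node dc2.large cluster and shows the lowest price the quoted "up to 75 percent" reserved discount could produce. The function names and the 730-hour month are assumptions made for illustration, not part of Amazon's pricing.

```python
HOURS_PER_MONTH = 730         # assumption: 365 days * 24 hours / 12 months
MAX_RESERVED_DISCOUNT = 0.75  # "up to 75 percent" as quoted in this article


def on_demand_monthly_cost(hourly_rate, node_count, hours=HOURS_PER_MONTH):
    """Monthly on-demand cost for a cluster that runs continuously."""
    return hourly_rate * node_count * hours


def reserved_monthly_floor(on_demand_cost, discount=MAX_RESERVED_DISCOUNT):
    """Lowest possible monthly cost if the maximum discount applied."""
    return on_demand_cost * (1 - discount)


# Hypothetical cluster: four dc2.large nodes at $0.25/hour each.
on_demand = on_demand_monthly_cost(0.25, node_count=4)
print(f"On-demand:      ${on_demand:,.2f}/month")
print(f"Reserved floor: ${reserved_monthly_floor(on_demand):,.2f}/month")
```

The reserved figure is a lower bound only: the actual discount depends on the term length and payment option chosen.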

Amazon Redshift Reviews

As a mature, feature-rich offering from Amazon Web Services, Redshift has been well-received on business software review platforms. Redshift currently has an average rating of 4.2 out of 5 stars on G2, based on 112 reviews.

G2 reviewer Keith G. writes: “Redshift can query huge data sets quickly with little optimization on the developer’s side. It integrates well with other systems, like S3 or DynamoDB… Unlike most other data warehouses that we evaluated, Redshift is pretty simple to set up and use, and you don’t really need to be a business intelligence/data warehouse expert to do so.”

Amazon Redshift Pros and Cons

Amazon Redshift is relatively easy to get up and running, especially given the tool’s on-demand pricing option. Security is a standout feature of Redshift: SSL encryption protects data in transit, and Redshift includes other security features such as access management and Virtual Private Clouds (VPCs). Redshift easily integrates with other Amazon and non-Amazon services, and the tool is extremely scalable to fit your business needs.

The disadvantages of Amazon Redshift include a few technical limitations. For one, Redshift does not natively support parallel loading for sources other than S3, DynamoDB, and Amazon EMR. As a result, loading from other sources will be significantly slower. In addition, Redshift is not an optimal solution for live web applications, because it performs poorly when running many small transactions in a short period of time.

3. Azure SQL Data Warehouse

What is Azure SQL Data Warehouse?

Azure SQL Data Warehouse is a cloud data warehouse solution from the last of the three public cloud giants, Microsoft. Like BigQuery and Redshift, Azure SQL Data Warehouse is a fully managed data warehouse in the cloud that uses massively parallel processing (MPP) to efficiently query very large datasets.

Azure SQL Data Warehouse Pricing

Azure SQL Data Warehouse pricing is more complex than that of some of its competitors. Users pay both storage and compute charges, with compute based on a Microsoft metric called the Data Warehouse Unit (DWU).

For example, the compute charge in the East US region for 100 DWUs comes to $1,125/month. Pricing is linear, so 1,000 DWUs would cost $11,250/month.

Storage in the East US region costs $135/month per terabyte, or $13,517/month per 100 terabytes. Azure SQL Data Warehouse customers pay only for data stored at rest, not for storage transactions. Inbound data transfers (data entering Azure) are free, while outbound data transfers (data leaving Azure) are charged according to Azure’s standard bandwidth pricing.
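Since the bill combines a compute charge and a storage charge, a small calculation can help with budgeting. The following Python sketch is a rough estimate only: the function name and sample workload are hypothetical, compute is treated as linear in DWUs as described above, the per-terabyte storage rate of $135.17 is derived from the 100-terabyte figure quoted in this article, and outbound bandwidth charges are ignored.

```python
# East US rates as quoted in this article (USD/month).
COMPUTE_PER_100_DWU = 1125.00
STORAGE_PER_TB = 135.17  # derived from $13,517 per 100 TB


def azure_sql_dw_monthly_cost(dwus, storage_tb):
    """Rough monthly estimate: linear compute plus storage at rest."""
    compute = dwus / 100 * COMPUTE_PER_100_DWU
    storage = storage_tb * STORAGE_PER_TB
    # Outbound data transfer is billed separately and is not included.
    return round(compute + storage, 2)


# Hypothetical workload: 500 DWUs of compute and 10 TB of stored data.
print(azure_sql_dw_monthly_cost(500, 10))
```

Keeping compute and storage as separate terms makes it easy to see which one dominates the bill as a workload grows.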

Azure SQL Data Warehouse Reviews

Azure SQL Data Warehouse has received high marks on business software review websites such as Capterra, where it currently has 4.5 out of 5 stars based on 923 reviews.

Capterra reviewer and chief product officer Ashish T. writes that he “can’t imagine running our business without Azure,” adding: “Being a startup, we’re able to host and manage a complex enterprise-level business solution without any server administrators.”

Azure SQL Data Warehouse Pros and Cons

Azure SQL Data Warehouse is a highly optimized solution for data analytics workflows. Like its main competitors BigQuery and Redshift, Azure SQL Data Warehouse is a scalable, high-performance, and lightning-quick data warehouse solution that can meet the demands of large enterprises. The Azure platform is especially strong for machine learning and artificial intelligence applications: it provides APIs for voice and facial recognition, automated language translation, and more.

More negatively, Azure SQL Data Warehouse users frequently complain about the service’s opaque pricing model. Capterra reviewer Alfredo R. writes: “Billing is confusing, especially at the beginning, and it is impossible to estimate costs without seeing the use of an entire application or site.” Other points of dissatisfaction include an unclear and slow user interface, a steep learning curve, and a surfeit of options that leave users confused.

4. Snowflake

What is Snowflake?

Snowflake is a purpose-built SQL cloud data warehouse tool. Snowflake is a SaaS (software as a service) solution that can be hosted either on Amazon Web Services or Microsoft Azure. One noteworthy feature of Snowflake is the ability to “time travel”: accessing historical data (data that has been changed or deleted) from within a defined time period. 

Snowflake Pricing

Snowflake uses an on-demand pricing model that the company claims makes it the “most affordable data warehouse in the world.”

Users pay $23/month per terabyte for capacity (up-front) storage in Snowflake or $40/month per terabyte for on-demand (monthly) storage.

Compute costs in Snowflake depend on the user’s choice of tier, from “standard” to “business-critical.” The cost is calculated per credit, and the number of credits consumed depends on the size and runtime of the user’s data warehouse. The per-credit prices are as follows:

  • Standard: $2.00/credit
  • Premier: $2.25/credit
  • Enterprise: $3.00/credit
  • Business-critical: $4.00/credit
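As a rough illustration of how the storage and credit prices above combine, here is a minimal Python sketch. The function name, dictionary keys, and sample workload are hypothetical, and the number of credits consumed is taken as an input because it depends on warehouse size and runtime.

```python
# Prices as quoted in this article (USD).
STORAGE_PER_TB = {"capacity": 23.00, "on_demand": 40.00}
CREDIT_PRICE = {
    "standard": 2.00,
    "premier": 2.25,
    "enterprise": 3.00,
    "business_critical": 4.00,
}


def snowflake_monthly_cost(storage_tb, credits_used, tier,
                           storage_plan="on_demand"):
    """Monthly estimate: storage plus compute credits for one tier."""
    storage = storage_tb * STORAGE_PER_TB[storage_plan]
    compute = credits_used * CREDIT_PRICE[tier]
    return storage + compute


# Hypothetical workload: 5 TB stored, 200 credits on the Enterprise tier.
print(snowflake_monthly_cost(5, 200, "enterprise"))              # on-demand storage
print(snowflake_monthly_cost(5, 200, "enterprise", "capacity"))  # up-front storage
```

In this example compute dominates the bill, so the choice of tier matters more than the choice of storage plan.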

Snowflake Reviews

Users have given Snowflake reviews that are just as effusive as those of its larger competitors like BigQuery and Redshift. Snowflake currently has an average rating of 4.6 out of 5 stars on G2, with 251 reviews.

Thanks to this strong performance, G2 has named Snowflake a “Leader for Fall 2019.” IT research and advisory firm Gartner also named Snowflake a Leader in its January 2019 report on data management solutions for analytics.

G2 reviewer and chief architect Eric A. writes: “Snowflake completely changes the data warehousing paradigm. All of the implicit weaknesses that are baked into the core architecture of every major DW vendor in the last 30 years are gone. Endless CPU, endless storage, endless RAM. You will never get a spool space or tempdb error. You can store XML, JSON, and flat files natively and join them into your relational table queries.”

Snowflake Pros and Cons

Snowflake excels when it comes to speed and agility. Thanks to its multi-cluster shared data architecture, Snowflake easily supports concurrent execution of multiple compute clusters, providing a massive boost to performance. In addition to its “time travel” feature, another noteworthy Snowflake feature is secure data sharing, which makes it dramatically simpler to share databases and tables between users.

The cons of Snowflake mainly focus on minor technical issues, such as difficulties with particular integrations or suboptimal query times. As an upstart in the cloud data warehouse field, Snowflake is not as mature as offerings like BigQuery and Redshift, although this is rapidly changing as more customers come on board. In addition, some users complain that their Snowflake customer support experience was slow or unhelpful.

5. Panoply

What is Panoply?

Panoply is an AI-driven cloud data management platform for ingesting and integrating data from multiple sources. The company claims that it offers the “world’s first end-to-end automated data warehouse tools.” Panoply is built on top of Amazon Redshift, and also makes use of other AWS offerings such as Amazon Elasticsearch Service and Amazon S3 storage.

Panoply Pricing

Users can start with a 14-day free trial. After the trial expires, users have four tiers of options:

  • Starter: $325/month
  • Pro: $665/month (includes automated backups, user permissions, and live chat support)
  • Business: $995/month (includes platform analytics, screen share support, and a guaranteed 1-hour support response time)
  • Enterprise: custom pricing (includes 19 global regions and advanced security)

Panoply Reviews

Panoply reviews are generally positive across the board. The platform has received 4.5 out of 5 stars on G2 based on 30 reviews, and 4 out of 5 stars on Capterra based on 15 reviews.

Thanks to Panoply’s high marks, G2 has named the platform a “High Performer” for fall 2019. In addition, customers have given Panoply 5 out of 5 stars, based on 9 reviews, on Gartner’s Peer Insights review platform.

A Gartner reviewer and project development manager calls Panoply “a turnkey business intelligence solution for any small business to create dashboards and make data driven decisions at a fraction of the off-the-shelf solutions cost. No data engineer, no data analyst, no business analyst required.”

Panoply Pros and Cons

Panoply is best suited for organizations without a deep familiarity with data warehouses and the cloud. By automating the ELT process, Panoply makes it easy to get a cloud data warehouse up and running in minutes. Many users also praise Panoply’s customer support team for their helpfulness and responsiveness.

However, Panoply is not quite as high-powered as some of the other cloud data warehouse alternatives. As a newer cloud data warehouse tool, Panoply can be subject to performance issues. One Capterra reviewer writes: “The web UI/UX needs more work, it can be slow, and there are many features that are only available on request from the backend (history tables, SSH tunnel for connections, etc.). The replication frequency is rather limited (down to 1 hour only), and the lack of change log replication is a big con.”

Conclusion

The 5 tools we’ve discussed are some of the top cloud data warehouse tools on the market. Each one has a variety of features and benefits that have earned it rave reviews from its loyal customers.

Picking the right cloud data warehouse is an essential component of your data analytics workflows, but it’s only one part of the puzzle. Extracting, transforming, and loading information into your cloud data warehouse will be of little use if you aren’t able to intelligently analyze this data and mine it for valuable insights.

ironFocus’ team of data experts will help make your data work for you. From identifying the most promising prospects to hunting down new growth opportunities, ironFocus offers a full suite of services that will take your business to the next level. Want to learn more? Get in touch with us today for a chat about your needs and objectives.
