How to cut DWH and data lake costs on Amazon Web Services?
AWS currently provides 175 services.
By splitting its platform into many narrowly specialized services, Amazon has produced two effects:
- each individual service solves the customer's problem with maximum efficiency and at a low price;
- the total cost of ownership of the infrastructure is high, because a large number of services are used. When dozens of different services are involved, it is impossible to calculate accurately what the project will end up costing.
Based on 10 years of experience with DWH and BI projects, we have put together an architecture that, in our opinion, is both the most efficient and inexpensive.
The system should perform the following functions:
- collection of data from various sources
- data cleansing
- data enrichment
- upload to DataLake
- loading and building DWH
- data mapping
- machine learning and predictive analytics
All infrastructure must run in AWS.
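The cleansing and enrichment steps listed above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not production code: the record fields (`customer_id`, `amount`, `country`) and the `COUNTRY_REGION` lookup table are hypothetical and stand in for whatever your sources actually deliver.

```python
from typing import Iterable, Iterator

# Hypothetical reference data used for enrichment.
COUNTRY_REGION = {"BY": "EMEA", "US": "AMER", "JP": "APAC"}


def cleanse(rows: Iterable[dict]) -> Iterator[dict]:
    """Drop rows without a customer id and normalize types and casing."""
    for row in rows:
        customer_id = str(row.get("customer_id") or "").strip()
        if not customer_id:
            continue  # unusable record, skip it
        try:
            amount = float(row.get("amount", 0))
        except (TypeError, ValueError):
            continue  # non-numeric amount, skip it
        yield {
            "customer_id": customer_id,
            "amount": round(amount, 2),
            "country": str(row.get("country") or "").strip().upper(),
        }


def enrich(rows: Iterable[dict]) -> Iterator[dict]:
    """Add a region column looked up from the country code."""
    for row in rows:
        yield {**row, "region": COUNTRY_REGION.get(row["country"], "UNKNOWN")}


raw = [
    {"customer_id": " 42 ", "amount": "19.990", "country": "by"},
    {"customer_id": "", "amount": "5", "country": "US"},      # missing id
    {"customer_id": "7", "amount": "n/a", "country": "US"},   # bad amount
]
clean = list(enrich(cleanse(raw)))
print(clean)
```

Writing the steps as generators keeps memory use flat, so the same functions can run on a small EC2 instance or in an ECS task regardless of file size.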
To build an analytical pipeline, Amazon suggests using about 30 services. Experience shows that you can get by with five. Unless your task is to build a spaceship and surprise Elon Musk, the following will be enough:
Amazon side: EC2, ECS, S3, RDS
open source solutions: Python, PostgreSQL, Hive, Presto, Apache Superset
To deploy all open source solutions, we use Amazon's EC2 and ECS services.
ETL - Python, SQL.
ML - Python.
- All company data is uploaded to an S3-based data lake.
- From the data lake, we transform the data and load it into the DWH (PostgreSQL) running on AWS RDS.
- We work with the data lake through Hive.
- We combine the data lake and the DWH through Presto.
- BI - Apache Superset, Power BI.
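As a rough sketch of the first step, files can be written to S3 under Hive-style partitioned keys so that Hive and Presto can later prune by date. The `raw/<source>/<table>/dt=<date>/` layout, the function names, and the sample values are assumptions made for illustration; `upload_file` is the standard boto3 S3 client call and needs boto3 plus AWS credentials to actually run.

```python
from datetime import date


def lake_key(source: str, table: str, day: date, filename: str) -> str:
    """Build a Hive-style partitioned object key for the data lake."""
    return f"raw/{source}/{table}/dt={day.isoformat()}/{filename}"


def upload_to_lake(bucket: str, key: str, local_path: str) -> None:
    """Upload one local file to S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the key logic works without boto3

    boto3.client("s3").upload_file(local_path, bucket, key)


key = lake_key("crm", "orders", date(2021, 3, 1), "orders.csv")
print(key)  # raw/crm/orders/dt=2021-03-01/orders.csv
```

Keeping one partition directory per day means a reload of a single day only touches the objects under that prefix.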
Presto (PrestoSQL) is a distributed SQL query engine that aggregates tables from different sources through multiple connectors. With Presto, you can combine different data sources in a single query, from classic databases to HDFS-based storage. Its query optimizer automatically rewrites queries to reduce load and processing time. Presto can also be queried from Python applications, which removes the need to connect to PostgreSQL databases directly.
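A federated query of this kind might look roughly like the sketch below. The catalog names `hive` and `postgresql`, the table and column names, and port 8080 are assumptions: real names depend on how the Presto catalogs are configured. The connection part uses `prestodb.dbapi.connect` from the presto-python-client package and is only needed when the query is actually executed.

```python
def federated_query(lake_table: str, dwh_table: str) -> str:
    """Join a Hive (S3) table with a PostgreSQL table in one Presto statement."""
    return (
        "SELECT c.segment, sum(o.amount) AS revenue\n"
        f"FROM hive.{lake_table} AS o\n"
        f"JOIN postgresql.{dwh_table} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.segment"
    )


def run(sql: str, host: str, user: str) -> list:
    """Execute SQL on a Presto coordinator (requires presto-python-client)."""
    import prestodb  # lazy import: only needed when actually querying

    conn = prestodb.dbapi.connect(
        host=host, port=8080, user=user, catalog="hive", schema="default"
    )
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()


sql = federated_query("raw.orders", "public.customers")
print(sql)
```

The Python application talks only to Presto here; the data lake and the PostgreSQL DWH are addressed as two catalogs inside the same statement.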
Hive Metastore is a technology for defining databases whose data is stored on a file system. In particular, it lets you build a DWH and a data lake on top of S3, which in turn provides virtually unlimited storage space and quick access to the data.
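To make this concrete, the sketch below generates a HiveQL statement that registers files already stored in S3 as an external table. The database, table, column names, and the `s3a://example-datalake/...` location are hypothetical, and Parquet with a `dt` partition column is just one reasonable choice, not a requirement.

```python
from typing import Sequence, Tuple


def external_table_ddl(
    database: str,
    table: str,
    columns: Sequence[Tuple[str, str]],
    location: str,
) -> str:
    """Return HiveQL that registers files under `location` as a partitioned table."""
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {cols}\n"
        ")\n"
        "PARTITIONED BY (dt STRING)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


ddl = external_table_ddl(
    "lake",
    "orders",
    [("customer_id", "STRING"), ("amount", "DOUBLE")],
    "s3a://example-datalake/curated/orders/",
)
print(ddl)
```

Because the table is external, dropping it in Hive removes only the metadata and leaves the objects in S3 untouched. New partitions still have to be registered in the metastore, for example with `MSCK REPAIR TABLE` or `ALTER TABLE ... ADD PARTITION`.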
To ensure security, all communication with the outside world can go through AWS IAM and KMS. As a result, the most expensive operations run on open source solutions, while Amazon services are responsible for speed and security. This architecture covers most of the tasks of an average customer.
So, by using no more than five Amazon services together with proven open source solutions, you can cut the cost of an analytical pipeline several times over without losing performance or security.
According to average estimates, the cost of owning and using the AWS infrastructure should not exceed $1,000-3,000 per month for an average company. Maintenance and modernization cost $2,000-3,500 per month. Migrating the entire infrastructure takes from 2 to 6 months.
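For budgeting, the two monthly ranges above can be combined as follows. This small calculation assumes the infrastructure and maintenance figures are simply additive and leaves out the one-off migration effort, for which no price is given here.

```python
# Monthly estimates in USD, taken from the ranges quoted above.
infra = (1000, 3000)        # AWS infrastructure
maintenance = (2000, 3500)  # maintenance and modernization

monthly = (infra[0] + maintenance[0], infra[1] + maintenance[1])
yearly = (monthly[0] * 12, monthly[1] * 12)

print(monthly)  # (3000, 6500)
print(yearly)   # (36000, 78000)
```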
We can help you pay less for your DWH on Amazon Web Services. Find out more now!
Please feel free to contact us by e-mail at firstname.lastname@example.org or fill in the contact form below.
2009-2021 | 3Alica | A2 Consulting Group | All rights reserved