About Porter
Porter aims to be the best end-to-end logistics platform and to revolutionize the global transport logistics sector. The company is committed to delivering a better quality of life for its drivers. Serving over 8 million customers, Porter enables numerous businesses to transport anything on demand, one delivery at a time.
AWS Managed Transactional Databases Add to Time and Cost
Porter’s AWS managed transactional (OLTP) database could not keep up with demand spikes or support the analytics team’s critical daily business reporting. Scaling compute resources was challenging for Porter’s in-house team, and refreshing the materialized views used for daily reporting and metrics generation was time-intensive, complex, and costly. As the logistics company began looking for a more scalable solution, it evaluated Snowflake’s capabilities and found the platform promising.
Set on finding a scalable data warehouse solution, with Snowflake as its preferred platform, Porter turned to Blazeclan to create a Proof of Concept (POC). The POC involved adopting Snowflake for Porter’s data analytics workloads and migrating a Python-based report to the Data Cloud, a distinctive engagement among Snowflake adoptions.
Inefficiencies in Porter’s Existing Environment
In Porter’s cloud environment, OLTP data was stored in PostgreSQL and used directly for analytics workloads. Running analytics against this production setup caused several performance bottlenecks and delayed the daily metrics reports that are crucial for sound business decision-making.
How Blazeclan’s Approach Will Benefit Porter
Data and cloud experts from Blazeclan engaged closely with Porter’s team to understand the core challenges and the surrounding requirements of the data ecosystem. The Blazeclan team created a proof of concept that would:
- Use an automated incremental load pipeline to refresh data from Postgres to Snowflake every 6 hours, making it significantly more cost-effective to scale with demand (a minimal sketch follows this list).
- Decouple storage from compute, enabling each to scale independently.
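A minimal sketch of what the 6-hourly refresh could look like on the Snowflake side is shown below, using Snowpark Python to merge a freshly landed staging increment into an analytics table. The table names, the ORDER_ID key, and the UPDATED_AT watermark column are illustrative assumptions, not Porter’s actual schema.

```python
# Hypothetical sketch: upsert a 6-hourly increment (landed from Postgres)
# into a Snowflake analytics table with Snowpark's MERGE support.
# All object names are placeholders for illustration only.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import when_matched, when_not_matched

connection_parameters = {
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "ANALYTICS_WH",   # compute scales independently of storage
    "database": "PORTER_POC", "schema": "ANALYTICS",
}
session = Session.builder.configs(connection_parameters).create()

# Rows extracted since the last run (assumed staging table).
source = session.table("STAGING.ORDERS_INCREMENT")
target = session.table("ANALYTICS.ORDERS")

# Merge keeps the refresh idempotent: update changed rows, insert new ones.
target.merge(
    source,
    target["ORDER_ID"] == source["ORDER_ID"],
    [
        when_matched().update({"STATUS": source["STATUS"],
                               "UPDATED_AT": source["UPDATED_AT"]}),
        when_not_matched().insert({"ORDER_ID": source["ORDER_ID"],
                                   "STATUS": source["STATUS"],
                                   "UPDATED_AT": source["UPDATED_AT"]}),
    ],
)
```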
Solution Proposed to the Customer in the POC
After conducting a detailed assessment of Porter’s existing environment, Blazeclan created a proof of concept that involved:
- Migrating OLTP data from PostgreSQL to the Snowflake Cloud Data Platform
- Converting the daily reports from PostgreSQL materialized views to Snowflake views and loading the data into Snowflake tables
- Creating an efficient data pipeline for faster processing of daily metrics reports by converting the existing Python scripts into Snowpark scripts (see the sketch after this list)
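As an example of the Python-to-Snowpark conversion mentioned above, the sketch below shows how a daily metrics report that previously ran as a standalone Python script might be expressed with the Snowpark DataFrame API so the aggregation executes inside Snowflake. The TRIPS table and its columns are hypothetical placeholders, not Porter’s actual report.

```python
# Illustrative Snowpark rewrite of a daily metrics report.
# Table and column names (TRIPS, TRIP_ID, TRIP_DATE, FARE_AMOUNT, CITY)
# are assumptions made for this sketch.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import avg, col, count, sum as sum_

connection_parameters = {
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "ANALYTICS_WH", "database": "PORTER_POC", "schema": "ANALYTICS",
}
session = Session.builder.configs(connection_parameters).create()

daily_metrics = (
    session.table("ANALYTICS.TRIPS")
    .filter(col("TRIP_DATE") == "2023-01-01")          # report date parameter
    .group_by("CITY")
    .agg(
        count(col("TRIP_ID")).alias("TRIP_COUNT"),
        sum_(col("FARE_AMOUNT")).alias("TOTAL_FARE"),
        avg(col("FARE_AMOUNT")).alias("AVG_FARE"),
    )
)

# Persist the result so downstream consumers read a plain table instead of
# waiting on a long-running materialized view refresh.
daily_metrics.write.save_as_table("REPORTING.DAILY_TRIP_METRICS", mode="overwrite")
```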
"Keeping Porter’s rapid YoY growth in mind, we were looking to upgrade our data infrastructure. As part of this, we identified Postgres RDS to Snowflake migration as a critical component. While evaluating, we were looking for an IT support partner to help us with use case POC. BlazeClan came recommended, bringing extensive expertise and professionalism. We decided to partner with them and never looked back. The team was prompt, approachable, collaborative, and delivered high quality continually. They made efforts to know and understand our pain points, present and future, and accordingly designed the most suitable architecture, using the most current and enhanced Snowflake features to leverage data best practices. Their team effort made our Snowflake decision easier. BlazeClan met every expectations during our engagement and we have decided to extend our collaboration to actual migration exercise as well. Thanks for making our data journey easier, BlazeClan!"
The Approach to Implementing the Proposed Solution
- Historical data migration from PostgreSQL to Snowflake Cloud Data Platform
- Data ingestion from OLTP to an Amazon S3 bucket with Blazeclan’s file ingestion framework, developed using AWS Glue, Python, and Apache Spark
- Reading the Amazon S3 files from the ingestion layer with Blazeclan’s database ingestion framework and loading the data into Snowflake tables for analytics
- Completing a one-time historical load for all in-scope databases, followed by incremental loads every 6 hours
- Creating the incremental load data pipeline and orchestrating it with Amazon Managed Workflows for Apache Airflow (MWAA); illustrative sketches of the extract job and the orchestration DAG follow this list
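To make the ingestion step concrete, here is a hedged sketch of what a Glue/PySpark incremental extract from PostgreSQL to S3 could look like. The JDBC options, table, watermark parameter, and bucket path are assumptions and do not represent Blazeclan’s actual file ingestion framework.

```python
# Hypothetical Glue job: read rows changed since the last run from PostgreSQL
# via JDBC and land them in S3 as Parquet for the ingestion layer.
import sys

from awsglue.context import GlueContext
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME", "watermark"])

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Incremental extract: only rows updated after the supplied watermark.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://<host>:5432/porter")
    .option("dbtable",
            f"(SELECT * FROM orders WHERE updated_at > '{args['watermark']}') AS src")
    .option("user", "<user>")
    .option("password", "<password>")
    .load()
)

# Write the increment to the S3 ingestion layer as Parquet files.
orders.write.mode("append").parquet("s3://porter-ingestion/orders/")
```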
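And one possible shape for the MWAA orchestration: a DAG scheduled every 6 hours that runs the Glue extract and then loads the new S3 files into Snowflake with COPY INTO. The job name, connection IDs, stage, and SQL are illustrative assumptions rather than the delivered pipeline.

```python
# Hypothetical MWAA DAG for the 6-hourly incremental pipeline:
# Glue extracts changed rows from PostgreSQL to S3, then COPY INTO loads
# the new files into a Snowflake staging table.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="porter_incremental_load",
    start_date=datetime(2023, 1, 1),
    schedule_interval="0 */6 * * *",   # every 6 hours
    catchup=False,
) as dag:

    extract_to_s3 = GlueJobOperator(
        task_id="extract_postgres_to_s3",
        job_name="porter-postgres-incremental-extract",  # assumed Glue job name
    )

    load_to_snowflake = SnowflakeOperator(
        task_id="copy_s3_to_snowflake",
        snowflake_conn_id="snowflake_default",
        sql="""
            COPY INTO STAGING.ORDERS_INCREMENT
            FROM @PORTER_S3_STAGE/orders/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
        """,
    )

    extract_to_s3 >> load_to_snowflake
```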
Tech Stack
AWS Glue | AWS Secrets Manager | Amazon S3 | Amazon Managed Workflows for Apache Airflow (MWAA) | MySQL | Snowflake