About Company
The customer is a prominent media and entertainment company in the Philippines. Committed to pioneering, innovating, and connecting all Filipinos with their community, they strive to provide entertainment, news, and information through their diverse offerings.
The Challenge
The customer faced a critical challenge regarding one of their application modules. They required an exceptionally low Recovery Point Objective (RPO), down to seconds, to minimise data loss in the event of a primary infrastructure failure. Additionally, they aimed to bring the application back up within a reasonable amount of time, ideally within an hour, to ensure minimal service disruption. While addressing these objectives, they also sought to keep costs low.
Key Requirements of the Customer
- Low RPO requirement
- Timely application recovery in case of primary infrastructure failure
- Cost optimisation without compromising resilience
The Solution
To meet the customer’s requirements, the Blazeclan team proposed a comprehensive solution, prioritising resilience, scalability, and cost-effectiveness.
- Database Selection: Since the application was developed from scratch, we were free to choose the database. We looked for an open-source-compatible engine that was developer-friendly, easy to administer, and capable of replicating to another site. After evaluating options, we recommended Amazon Aurora Global Database, which met these criteria and offered the added advantage of a headless secondary cluster. In this configuration only the data layer is replicated to the target region, which minimises costs by avoiding database compute resources in the secondary location during normal operations.
- Application Layer High Availability: To ensure high availability at the application layer, we leveraged Elastic Compute Cloud (EC2) instances within an Auto Scaling Group (ASG). This allowed for automatic scaling based on demand, ensuring fault tolerance and scalability. In addition, AWS Lambda was utilised to handle S3 event-based jobs, while S3-based hosting served static pages. This combination improved application performance and availability. We utilised Amazon Route 53 for efficient DNS mapping of internal AWS resources, facilitating seamless routing and failover in the event of a disaster. API Gateway was employed for REST-based services, enhancing the overall functionality and accessibility of the application.
- Low Recovery Time Objective (RTO): To minimise the RTO and ensure the latest code was readily available in the Disaster Recovery (DR) region, we integrated AWS CodeDeploy, CodeBuild, and CodePipeline. CodeDeploy handled deployment to EC2 instances, while CodeBuild handled deployment to S3. The ASG in the DR region was kept at zero instances during normal operations; when it was scaled out during a failover or DR situation, CodeDeploy would automatically deploy the latest code revision to the new instances, ensuring a swift recovery.
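The failover sequence implied by the points above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the runbook actually used in the project: the `DrConfig` fields, the ordering of steps, and the helper names `build_failover_plan` and `run_plan` are all hypothetical. The three API operations named (`create_db_instance`, `remove_from_global_cluster`, `update_auto_scaling_group`) are real boto3 client methods, but a production runbook would also wait for the new DB instance to become available and handle errors.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

# One step = (client name, API method name, keyword arguments).
Step = Tuple[str, str, Dict[str, Any]]


@dataclass(frozen=True)
class DrConfig:
    """DR-region settings. Every value supplied here is a hypothetical placeholder."""
    global_cluster_id: str
    secondary_cluster_id: str
    secondary_cluster_arn: str
    db_instance_id: str
    db_instance_class: str
    db_engine: str          # e.g. an Aurora engine name; the source does not say which
    asg_name: str
    asg_min: int
    asg_desired: int
    asg_max: int


def build_failover_plan(cfg: DrConfig) -> List[Step]:
    """Return the ordered API calls for an unplanned failover (assumed order)."""
    return [
        # 1. Give the headless secondary cluster a DB instance (compute).
        ("rds", "create_db_instance", {
            "DBInstanceIdentifier": cfg.db_instance_id,
            "DBInstanceClass": cfg.db_instance_class,
            "Engine": cfg.db_engine,
            "DBClusterIdentifier": cfg.secondary_cluster_id,
        }),
        # 2. Detach the secondary from the global cluster so it accepts writes.
        ("rds", "remove_from_global_cluster", {
            "GlobalClusterIdentifier": cfg.global_cluster_id,
            "DbClusterIdentifier": cfg.secondary_cluster_arn,
        }),
        # 3. Scale the DR-region ASG up from zero; new instances then
        #    receive the latest revision from the deployment tooling.
        ("autoscaling", "update_auto_scaling_group", {
            "AutoScalingGroupName": cfg.asg_name,
            "MinSize": cfg.asg_min,
            "DesiredCapacity": cfg.asg_desired,
            "MaxSize": cfg.asg_max,
        }),
    ]


def run_plan(plan: List[Step], clients: Dict[str, Any]) -> None:
    """Execute each step against the matching client, in order."""
    for client_name, method, kwargs in plan:
        getattr(clients[client_name], method)(**kwargs)
```

With boto3 installed, `clients` would be something like `{"rds": boto3.client("rds", region_name=...), "autoscaling": boto3.client("autoscaling", region_name=...)}` pointed at the DR region. Keeping the plan as plain data makes it easy to review or dry-run before anything is executed.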
Benefits achieved by the company
- Very low database RPO, down to seconds, ensuring minimal data loss during failures.
- Cost optimisation through the headless mode of the Aurora Global database, which replicates only the data layer and so reduces infrastructure expenses.
- A highly available and scalable application, achieved through a multi-AZ setup using the EC2 ASG and RDS.
- Reduced Recovery Time Objective (RTO) through the integration of AWS CodeDeploy, CodeBuild, and CodePipeline, ensuring the latest code was readily available in the DR region during failover scenarios.
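One way to check the "down to seconds" RPO claim is to compare observed replication lag against the target. The sketch below assumes the lag samples are in milliseconds, which is how Aurora's `AuroraGlobalDBReplicationLag` CloudWatch metric is reported; the function name and the sample values in the usage line are hypothetical, and fetching the metric is left out.

```python
from typing import Iterable, Tuple


def rpo_within_target(lag_ms_samples: Iterable[float],
                      target_seconds: float) -> Tuple[bool, float]:
    """Return (ok, worst_lag_seconds) for a set of replication-lag samples.

    The worst observed lag approximates how much recent data could be lost
    if the primary region failed at that moment.
    """
    samples = list(lag_ms_samples)
    if not samples:
        raise ValueError("at least one lag sample is required")
    worst_seconds = max(samples) / 1000.0  # milliseconds -> seconds
    return worst_seconds <= target_seconds, worst_seconds
```

For example, `rpo_within_target([120, 450, 900], target_seconds=5)` reports that the target is met, with a worst-case lag of 0.9 seconds.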