The customer is Asia’s largest global sports media company, with a broadcast reach of over 2.6 billion potential viewers across 150+ countries. It is one of the top 10 global sports media properties for engagement and viewership.
Business Objective
The customer’s main objective was to analyze data from its various social media platforms to track and optimize ongoing strategic initiatives and campaigns. It wanted to leverage all metrics from its social handles, for both organic and paid campaigns, to derive actionable insights. Improving decision-making around campaign cost optimization for better user engagement was a key priority.
Azure-based Data Lake Solution for Social Media Analytics
After understanding the customer’s requirements, Blazeclan proposed an Azure-based, end-to-end data lake solution. The solution was implemented in two phases:
1) Phase 1 – Bringing the data from all social media platforms, for both organic and paid campaigns, onto a single platform and developing reports on it. This was carried out for the customer’s multiple accounts on Facebook, Instagram, Twitter, and YouTube.
2) Phase 2 – Extracting additional metrics for the social media handles, such as backward and forward taps, reach, and impressions for the multiple Instagram accounts. Data generated from Google Analytics for the customer’s YouTube channel was also covered in this phase.
Solution Architecture
Approach Followed
Data Extraction and Ingestion
The data generated by the social media platforms was extracted based on the key metrics available for both organic and paid campaigns. The raw data to be sourced had to be identified and extracted following the hierarchy Accounts > Campaigns > Ad Sets > Ads. Python scripts extracted the raw historical and incremental data, which was then ingested into Azure Data Lake Storage.
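Below is a minimal sketch of what such an extraction script could look like, assuming a Facebook-style Marketing API with cursor-based pagination and the azure-storage-file-datalake SDK. The API version, account IDs, field lists, and folder layout are illustrative, not the customer’s actual configuration.

```python
import json
from datetime import date

import requests
from azure.storage.filedatalake import DataLakeServiceClient

API_BASE = "https://graph.facebook.com/v12.0"  # illustrative API version
ACCESS_TOKEN = "<access-token>"                # placeholder credential

def fetch(path, params=None):
    """Fetch one level of the Accounts > Campaigns > Ad Sets > Ads
    hierarchy, following the API's cursor-based pagination."""
    params = dict(params or {}, access_token=ACCESS_TOKEN)
    url, rows = f"{API_BASE}/{path}", []
    while url:
        resp = requests.get(url, params=params)
        resp.raise_for_status()
        payload = resp.json()
        rows.extend(payload.get("data", []))
        url = payload.get("paging", {}).get("next")  # next page, if any
        params = None  # the cursor URL already carries the query string
    return rows

def ingest_account(adls: DataLakeServiceClient, account_id: str):
    """Walk the hierarchy for one ad account and land the raw JSON
    in the data lake's raw zone, partitioned by entity and date."""
    fs = adls.get_file_system_client(file_system="raw")
    for campaign in fetch(f"act_{account_id}/campaigns", {"fields": "id,name"}):
        for adset in fetch(f"{campaign['id']}/adsets", {"fields": "id,name"}):
            ads = fetch(f"{adset['id']}/ads", {"fields": "id,name"})
            path = (f"facebook/{account_id}/{campaign['id']}/{adset['id']}/"
                    f"{date.today():%Y-%m-%d}.json")
            fs.get_file_client(path).upload_data(json.dumps(ads), overwrite=True)
```

The same pattern applies per platform; historical backfills and daily incremental pulls differ only in the date window requested from the API.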
Data Cleansing and Transformation
A robust framework of Python scripts was created to clean and transform the raw data. Attribute-level columns were derived from raw data strings and mapped to dedicated fields, viz. Campaign Market Strategy, Campaign Market Platform, Campaign Market Target, Campaign Audience Target, Campaign Event Code, Campaign Channel, Campaign Channel Country, and Campaign Content Type.
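As a minimal sketch of this derivation, assuming campaign names follow an underscore-delimited naming convention; the customer’s actual convention and field order are not public, so the layout below is hypothetical:

```python
import pandas as pd

# Target fields named in the case study; their order within the raw
# campaign-name string is assumed, not confirmed.
ATTRIBUTE_FIELDS = [
    "Campaign Market Strategy", "Campaign Market Platform",
    "Campaign Market Target", "Campaign Audience Target",
    "Campaign Event Code", "Campaign Channel",
    "Campaign Channel Country", "Campaign Content Type",
]

def derive_attributes(raw: pd.DataFrame) -> pd.DataFrame:
    """Split the raw campaign-name string into one column per attribute."""
    parts = raw["campaign_name"].str.strip().str.split("_", expand=True)
    # Pad names with fewer segments (and drop extras) to the expected width.
    parts = parts.reindex(columns=range(len(ATTRIBUTE_FIELDS)))
    parts.columns = ATTRIBUTE_FIELDS
    return pd.concat([raw, parts], axis=1)

if __name__ == "__main__":
    df = pd.DataFrame({"campaign_name": [
        "AWARENESS_FB_IN_YOUTH_EVT01_MAIN_IN_VIDEO",  # made-up example
    ]})
    print(derive_attributes(df).iloc[0])
```

Deriving these attributes once, during transformation, keeps the downstream warehouse queries and reports free of string parsing.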
Data Warehouse Loading and Reporting
The cleansed and transformed data was loaded into Azure SQL Data Warehouse using Python scripts and Azure Data Factory, enabling online analytical processing and reporting for the customer. Reports were then developed against every customer requirement using Power BI.
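A sketch of such a load script using pyodbc follows; the connection string, staging table, and column names are illustrative, and in the actual solution scripts like this were orchestrated by Azure Data Factory:

```python
import pyodbc

# Placeholder connection string; real credentials would come from
# configuration or a secret store, not source code.
CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<server>.database.windows.net;DATABASE=<dw>;"
    "UID=<user>;PWD=<password>"
)

def load_rows(rows):
    """Bulk-insert cleansed rows into a hypothetical staging table."""
    with pyodbc.connect(CONN_STR) as conn:
        cursor = conn.cursor()
        cursor.fast_executemany = True  # batch the parameterised inserts
        cursor.executemany(
            "INSERT INTO stg.campaign_metrics "
            "(campaign_id, metric_date, impressions, reach, spend) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )
        conn.commit()
```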
The solution gave the customer a well-documented data extraction, ingestion, and warehousing process, and future-proofed the data architecture against upcoming requirements and challenges. Being robust yet configurable, the analytics solution can cater to any future change in the data structure. In addition, the integrated data warehouse and reporting supported both ad-hoc and on-the-fly reporting requirements. It empowered the customer to achieve the following –
1) Social Media Campaign Analytics – This helped the customer understand which campaigns performed best on each media platform and make better decisions for future campaigns.
2) Data-driven Insights for User Engagement – The customer was able to measure the level of interaction a particular account experienced and the number of people, excluding followers, who viewed its posts.
3) 360-Degree View across Multiple Accounts – Getting a complete view of social media performance became easier for the customer, thanks to cross-comparison of multiple accounts and competitive benchmarking.
Key Benefits
- Analytics-Driven Business Decisions: Consolidating all data sources on a single platform enabled data scientists, analysts, and business users to make efficient decisions, and helped marketing managers develop effective strategies.
- Cost Optimization: Daily incremental data processing took approximately 15 minutes in the data lake implemented by the Blazeclan team, so the compute resources did not need to run 24 hours a day, reducing the company’s overall cost (see the sketch below). Moreover, owing to the optimized architectural design, savings remained considerable even as the amount of data ingested and processed in the data lake grew.
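As a hypothetical illustration of that pause-between-loads pattern, the documented ARM pause operation for Azure SQL Data Warehouse can be invoked once the daily load completes, for example from an Azure Functions timer trigger. The subscription, resource group, server, and database names are placeholders, and the api-version shown is only one of several that support this operation:

```python
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
SERVER = "<server-name>"
DATABASE = "<dw-name>"

def pause_warehouse():
    """POST to the ARM pause endpoint so warehouse compute stops
    billing between daily loads. The call returns 202 Accepted and
    the pause completes asynchronously."""
    token = DefaultAzureCredential().get_token(
        "https://management.azure.com/.default").token
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
        f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.Sql"
        f"/servers/{SERVER}/databases/{DATABASE}/pause"
        "?api-version=2021-02-01-preview"  # assumed api-version
    )
    resp = requests.post(url, headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
```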
Tech Stack
Azure Data Factory | Azure Data Lake Storage | Azure SQL Data Warehouse | Azure Functions | Power BI | Python