Insurance Data Warehousing Platform
Medallion Architecture implementation achieving 40% cloud compute cost reduction through legacy code optimization
- Type: Data Warehousing
- Role: Azure Data Engineer
- Service: Data Engineering / Cloud Migration / Data Architecture
- Year: 2023

Project Overview
Architected and implemented a comprehensive insurance data warehousing solution at Everest Re using Azure cloud technologies and Databricks. The project modernized legacy data infrastructure by implementing Medallion Architecture, with Raw/Harmonize/Curate layers corresponding to the standard Bronze/Silver/Gold tiers, to create a multi-layered, governed data structure.
The solution consolidated all enterprise data into a single source of truth and cut cloud compute costs by 40% through strategic optimization of legacy Spark code and improved distributed computing efficiency.
Key Features
- Medallion Architecture: Implemented three-layer architecture (Bronze/Silver/Gold) for progressive data refinement
- Data Lake Integration: Designed and built robust data ingestion pipelines using Azure Data Lake
- Cost Optimization: Achieved 40% compute cost reduction through code optimization and resource management
- Data Reconciliation: Created comprehensive reconciliation solutions across source systems, Data Lake, and end-user platforms
- Unified Catalog: Reduced data silos by cataloging datasets in Unity Catalog
- Error Handling: Implemented robust logging and alerting using try-catch patterns in Databricks
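The try-catch logging pattern mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the actual Databricks notebook code; the function name `run_step` and the use of the standard `logging` module are assumptions, and in production the failure branch would also raise an alert rather than only log.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

def run_step(name, fn, *args):
    """Run one pipeline step, logging success or failure.

    In the real notebooks this wrapper surrounds Spark jobs and pushes
    failures to an alerting channel; here the "alert" is a log record.
    """
    try:
        result = fn(*args)
        logger.info("step %s succeeded", name)
        return result
    except Exception:
        # logger.exception records the full traceback before re-raising,
        # so the orchestrator still sees the failure.
        logger.exception("step %s failed", name)
        raise

# usage: wrap any callable that represents a pipeline step
total = run_step("sum_premiums", sum, [100.0, 250.5])
```

Re-raising after logging keeps the orchestrator (ADF, in this project) aware that the step failed, so retries and downstream skips still work.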
Technologies Used
- Azure Data Factory (ADF v2): Pipeline orchestration and data movement
- Azure Data Lake Storage: Scalable data lake implementation
- Azure Databricks: Data processing and transformation engine
- Azure Synapse Analytics: Enterprise data warehousing
- Azure Logic Apps: Workflow automation and integration
- Azure DevOps: CI/CD pipeline management and version control
- Delta Lake: Reliable data storage with ACID properties
- Unity Catalog: Data governance and cataloging
Technical Implementation
The solution followed a layered approach where raw data from various source systems was ingested into the Bronze layer, cleansed and harmonized in the Silver layer, and finally curated for business consumption in the Gold layer.
Data ingestion pipelines were orchestrated through Azure Data Factory, with Databricks notebooks handling complex transformations using PySpark and Spark SQL. The implementation included automated data quality checks and reconciliation processes to ensure data integrity throughout the pipeline.
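A minimal sketch of the Bronze → Silver → Gold flow described above, using plain Python lists and dictionaries in place of Spark DataFrames. The field names (`policy_id`, `premium`, `region`) are illustrative, not taken from the actual schema; in the real pipeline these steps were PySpark transformations over Delta tables.

```python
# Raw records as landed in Bronze, including a duplicate and a bad row.
bronze = [
    {"policy_id": "P1", "premium": "1200.50", "region": "EU "},
    {"policy_id": "P1", "premium": "1200.50", "region": "EU "},  # duplicate
    {"policy_id": "P2", "premium": None, "region": "US"},        # bad value
    {"policy_id": "P3", "premium": "800.00", "region": "US"},
]

def to_silver(rows):
    """Cleanse and harmonize: drop nulls, cast types, trim, dedupe."""
    seen, out = set(), []
    for r in rows:
        if r["premium"] is None:
            continue  # a data-quality check would also record this drop
        if r["policy_id"] in seen:
            continue
        seen.add(r["policy_id"])
        out.append({
            "policy_id": r["policy_id"],
            "premium": float(r["premium"]),
            "region": r["region"].strip(),
        })
    return out

def to_gold(rows):
    """Curate for business consumption: total premium per region."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["premium"]
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)  # {"EU": 1200.5, "US": 800.0}
```

The same shape scales to Spark: `to_silver` becomes filter/cast/dropDuplicates transformations and `to_gold` a groupBy-aggregate, with each layer persisted as its own Delta table.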
Challenges & Solutions
- Legacy Code Optimization: Refactored inefficient Spark jobs, resulting in 40% cost reduction while maintaining data quality
- Data Consistency: Built reconciliation frameworks that validated data across source systems, Data Lake, and downstream applications
- Performance Issues: Optimized job concurrency and improved distributed compute usage
- Data Silos: Cataloged all datasets in Unity Catalog to create a unified, discoverable data layer
- Production Support: Provided comprehensive support for migration projects and resolved post-deployment issues
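The reconciliation framework's core idea can be illustrated with a simple count-and-checksum comparison between a source extract and its lake copy. This is a hedged sketch: the function name `reconcile`, the `premium` field, and the sum-based checksum are illustrative choices, not the project's actual validation rules.

```python
def reconcile(source_rows, lake_rows, amount_field="premium"):
    """Compare two datasets on row count and a sum-based checksum."""
    src_count, lake_count = len(source_rows), len(lake_rows)
    src_sum = sum(r[amount_field] for r in source_rows)
    lake_sum = sum(r[amount_field] for r in lake_rows)
    return {
        "count_match": src_count == lake_count,
        "sum_match": abs(src_sum - lake_sum) < 1e-6,  # float tolerance
        "source_count": src_count,
        "lake_count": lake_count,
    }

# usage: identical extracts should reconcile cleanly
report = reconcile(
    [{"premium": 100.0}, {"premium": 200.0}],
    [{"premium": 100.0}, {"premium": 200.0}],
)
```

In practice such checks ran at each hop (source → Data Lake → downstream application), so a mismatch pinpointed which stage dropped or altered records.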
Architecture Decisions
Implemented Medallion Architecture to provide:
- Scalability: Independent scaling of storage and compute resources
- Data Quality: Progressive refinement through layered approach
- Governance: Clear data lineage and ownership
- Flexibility: Ability to reprocess data at any layer without affecting downstream consumers
Business Impact
The insurance data warehousing platform delivered significant value:
- 40% reduction in cloud infrastructure costs
- Improved data quality and consistency across the organization
- Faster time-to-insight for business analysts
- Enhanced regulatory compliance through better data governance
- Foundation for advanced analytics and predictive modeling
My Role
As Azure Data Engineer, I led the technical architecture design, implemented the Medallion framework, optimized legacy code for performance, and collaborated with Business Analysts and Solution Architects to translate business requirements into technical solutions. I also authored technical design documents and provided production support throughout the migration.