Data Engineer - Azure & Microsoft Fabric Platform

PGW Auto Glass·Pittsburgh, PA·Full-time

Fabric · Power Platform · On-site

Posted Jun 1, 2026

About the role

PGW Auto Glass (PGWAG) is seeking a highly motivated Data Engineer to help modernize and scale our enterprise analytics platform on Microsoft Azure and Microsoft Fabric. This role will focus on designing, developing, and maintaining cloud-native data engineering pipelines that support enterprise reporting, real-time analytics, AI/ML initiatives, pricing optimization, and operational intelligence across our network of branches and distribution centers throughout the United States and Canada.

The ideal candidate will possess strong experience in Azure-based data engineering, real-time event streaming, batch processing, Lakehouse architectures, and modern analytics platforms. This role sits at the intersection of Data Engineering, Cloud Architecture, Real-Time Analytics, and AI enablement. The candidate will work closely with Pricing, Supply Chain, Operations, IT Infrastructure, and Executive Leadership teams to build scalable and resilient analytics solutions using Microsoft Fabric, Azure Databricks, Event Streaming technologies, and Power BI.

We are seeking a mid-level Data Engineer to design, scale, and maintain our dual-engine enterprise data platform on Microsoft Azure and Microsoft Fabric. This role balances both batch and real-time processing architectures, ensuring seamless data flow from transactional systems into analytics-ready storage and semantic models. This position is critical to PGWAG's Cloud Modernization, Data Warehouse Modernization, and AI enablement initiatives.
Key Responsibilities & Duties:

• Design, develop, and maintain scalable enterprise data pipelines using:
  • Microsoft Fabric
  • Azure Data Factory
  • Fabric Data Factory
  • Azure Databricks
  • Azure Event Hubs
  • OneLake
  • Fabric Lakehouse
  • Fabric Data Warehouse
• Build analytics-ready datasets supporting:
  • Pricing Analytics
  • Supply Chain Analytics
  • POS Sales Analytics
  • Customer Behavior Analytics
  • Executive Dashboards
  • AI/ML workloads
• Dual-Engine Data Pipelines:
  • Build and manage parallel processing architectures using:
    • Azure Data Factory for structured batch processing
    • Azure Event Hubs / Kafka for real-time event ingestion
  • Support ingestion patterns including:
    • Batch ETL/ELT
    • Change Data Capture (CDC) / Database mirroring
    • Streaming ingestion
    • API-based integrations
    • SaaS integrations
  • Develop near real-time analytics solutions using Eventstream and Real-Time Intelligence capabilities in Microsoft Fabric.
• Stream & Batch Processing:
  • Develop and optimize PySpark workloads using:
    • Azure Databricks
    • Fabric Spark
    • Spark Structured Streaming
  • Process:
    • High-volume historical datasets
    • XML/JSON log files
    • Streaming transactional events
    • Operational telemetry data
  • Build scalable transformation logic for both streaming and batch architecture.
• Data Modeling & Transformation:
  • Model and transform enterprise data using:
    • ANSI SQL
    • T-SQL
    • dbt (Data Build Tool)
    • Lakehouse design principles
  • Design:
    • Star schemas
    • Snowflake schemas
    • Semantic models
    • Curated analytical datasets
  • Support enterprise-wide self-service analytics initiatives using governed semantic layers.
• Storage & Lakehouse Architecture:
  • Maintain scalable Azure Data Lake Storage (ADLS Gen2) environments.
  • Implement and optimize:
    • Delta Lake table formats
    • ACID-compliant storage patterns
    • Schema evolution and enforcement
    • Partitioning and performance tuning
  • Support enterprise Lakehouse architecture using Microsoft Fabric OneLake.
• Power BI & Analytics Enablement:
  • Partner with Analytics and Business teams to deliver:
    • Power BI dashboards
    • Executive scorecards
    • KPI reporting
    • Self-service analytics solutions
  • Build and maintain:
    • Semantic models
    • Direct Lake datasets
    • Row-level security
    • Data governance standards
  • Support Copilot-enabled analytics and AI-assisted reporting capabilities.
• Infrastructure, Automation & DevOps:
  • Deploy and maintain cloud infrastructure using:
    • Terraform
    • Azure Resource Manager (ARM)
    • Infrastructure-as-Code principles
  • Automate CI/CD workflows using:
    • Azure DevOps
    • Git
    • Docker
  • Author and orchestrate enterprise workflows using:
    • Azure Data Factory
    • Fabric Pipelines
    • Managed Apache Airflow
    • Control-M integrations where applicable
• Data Observability & Reliability:
  • Implement automated monitoring and alerting for:
    • Batch failures
    • Streaming interruptions
    • Data quality issues
    • Schema drift
    • Pipeline latency
  • Build checksum and reconciliation frameworks between source systems and analytics platforms.
  • Support enterprise data governance and operational resiliency initiatives.

Qualifications & Skills:

Required Technical Skills:

• Cloud & Data Platforms:
  • Microsoft Azure
  • Microsoft Fabric
  • Azure Data Lake Storage Gen2 (ADLS Gen2)
  • Azure Databricks
  • Azure Data Factory
  • Azure Event Hubs
  • Azure Synapse Analytics / Fabric Warehouse
• Programming & Query Languages:
  • Python
  • PySpark
  • ANSI SQL
  • T-SQL
• Streaming & Batch Technologies:
  • Apache Spark Structured Streaming
  • Apache Kafka
  • Azure Stream Analytics
  • Event-driven architectures
• Data Transformation & Storage:
  • dbt (Data Build Tool)
  • Delta Lake
  • Lakehouse architecture
  • Data warehousing concepts
• Data Modeling:
  • Star Schema
  • Snowflake Schema
  • Semantic Layer Design
  • Enterprise Data Modeling

Preferred Qualifications:

• Bachelor's or Master's degree in:
  • Data Science
  • Computer