Fabric Data Engineer - Workplace Engineering
About the role
Vanguard is standing up Microsoft Fabric as the enterprise data and analytics foundation that powers our Workplace AI, Power BI, and cross-cloud analytics estate. We are partnering with Microsoft on a CDAO-led Fabric Enablement engagement and are building this capability on an F256 Reserved capacity, integrated with the broader Vanguard data, identity, and security stack - including OneLake Direct Lake against AWS S3, Entra ID and Okta federation, and Microsoft Purview.

Role Summary
We are hiring a hands-on Fabric Data Engineer to own the data layer of that capability. This is a builder's role, not an architect-only role. The engineer designs and implements scalable data products in OneLake - lakehouses, warehouses, pipelines, notebooks, semantic-model-ready Delta tables - and is accountable for the lifecycle, governance, and operational health of the Fabric platform. The complementary AI Engineer role consumes that foundation to build agents, copilots, and Foundry orchestrations; this engineer makes sure the data underneath is governed, monitored, and ready.
You will partner closely with the AI Engineer on AI-ready data products and semantic-layer handoffs; with our Technical Project Manager on program delivery, enablement, and change management; and with our Cloud Domain Architect on platform alignment. You will work alongside the Microsoft CDAO Fabric Enablement team and Vanguard partners across CDAO and Workplace Engineering. You will be a core member of the emerging Workplace AI Fusion Team. This is a strategic engineering and implementation role, not a support position.

Key Responsibilities

Fabric Build & Data Engineering
Design and implement scalable data storage in OneLake using Lakehouses (Delta) and Warehouses (T-SQL); choose the right item for each workload and configure SQL analytics endpoints, shortcuts, and OneLake security.
Build and maintain Spark notebooks (PySpark), Data Factory pipelines, Dataflows Gen2, Copy Jobs, and mirroring for batch and incremental ingestion at enterprise scale.
Build Real-Time Intelligence solutions: Eventstreams, Eventhouses / KQL databases, Activator reflexes, and Spark structured streaming for low-latency workloads.
Optimize Lakehouse tables (OPTIMIZE, V-Order, Z-Order, partitioning) and Direct Lake semantic-model-ready datasets so downstream Power BI and AI agents perform predictably.

ALM & Lifecycle Engineering
Implement source control, branching, and CI/CD using native Fabric Git integration (Azure DevOps and GitHub), Fabric Deployment Pipelines, and the Microsoft fabric-cicd Python library.
Automate Dev / Test / Prod promotion against the Fabric REST API using service principals and Workload Identity Federation; codify environment-aware bindings via Variable Libraries and parameter.yml.
Operate a Feature → Dev → UAT → Prod branching pattern - native Git on Feature and Dev workspaces, pipeline-pushed promotion to UAT and Prod - with mandatory PR review, cherry-pick promotion, and one repo per team to scope blast radius.
Own the lifecycle of Fabric data components from creation through retirement, ensuring every environment is reproducible from the GitHub pipeline rather than from the Fabric UI.

Platform Operations & Monitoring
Operate the Fabric F256 capacity: monitor CU consumption with the Capacity Metrics App, manage smoothing windows, diagnose interactive and background throttling, and right-size workloads.
Build telemetry using the Monitoring Hub, per-workspace Workspace Monitoring (Eventhouse-based KQL logs), Eventhouse monitoring, and the Admin Monitoring Workspace to surface refresh failures, pipeline errors, and semantic-model health.
Define dashboards and alerts for ingestion, transformation, refresh, and capacity health; drive root-cause analysis on production incidents and feed lessons back into platform standards.
Define and operate the on-call model for production data pipelines and Fabric items in partnership with Tier 3 Engineering.

Standards, Governance & Security
Define and enforce Fabric platform standards through Terraform-based IaC using the official microsoft/fabric provider (workspaces, capacities, domains, items), workspace templates, naming and tagging conventions, and automated CI policy checks against the Fabric REST API.
Manage tenant settings, domains, and capacity allocation in partnership with the Fabric Center of Excellence; align identity with Entra ID and Okta federation; rotate service principals and use PIM for elevated admin roles.
Implement RBAC patterns that separate workspace control-plane roles (Admin / Member / Contributor / Viewer) from OneLake data-plane roles (folder and table level); operate RLS, CLS, OLS, dynamic data masking, and item-level sharing.
Integrate Microsoft Purview for sensitivity labels, DLP, metadata scanning, lineage, and impact analysis; manage endorsement (Promoted / Certified) so AI agents and BI consumers only ground on trusted datasets.

Integration &