Architect and own end-to-end data pipelines from commercial data ingestion (IQVIA, Symphony, DDD, claims, CRM) through raw, conformed, curated, and AI/ML serving layers on cloud lakehouse platforms (Snowflake, Databricks, or equivalent).
Partner with Application, AI/ML, and Commercial Analytics teams to define and deliver purpose-built data products optimized for model training, batch inference, and real-time scoring pipelines.
Lead the technical design of data contracts - codifying schema, SLAs, ownership, and lineage expectations between producers and consumers.
Infrastructure & Architecture
Develop and manage data infrastructure leveraging cloud platforms aligned with Pfizer Digital and Commercial enterprise standards.
Drive adoption of modern engineering patterns including medallion architecture, streaming ingestion, incremental processing, and feature store integration for ML workloads.
Drive maturity in orchestration platforms with a focus on DAG governance, SLA enforcement, and operational observability.
Evaluate and steward the Commercial Data & AI toolchain - including semantic layer tooling, data catalog investments, and real-time vs. batch processing tradeoffs.
Data Quality & Governance
Establish and enforce data product governance frameworks ensuring all commercial data assets meet FDA promotional guidelines, privacy regulations (HIPAA, CCPA, GDPR), and enterprise data policy standards.
Build observability infrastructure (data quality checks, lineage tracking, SLA monitoring, anomaly alerting) to ensure AI and analytics consumers can trust the data powering their decisions.
Own the definition and operationalization of "AI-ready" data standards - including feature engineering pipelines, vector embedding workflows, and structured/unstructured data integration for generative AI use cases.
Collaboration & Cross-Functional Support
Partner with AI/ML engineers, business translators, and product managers to define data requirements.
Translate business needs into technical specifications and design artefacts for ingestion and transformation.
Communicate progress, development status, and delivery timelines across cross-functional product pod teams and AI application owners.
Support rapid prototyping and iterative development of AI solutions.
Team Leadership & Development
Manage and mentor a team of data specialists, fostering a culture of collaboration and excellence.
Define engineering standards, career ladders, and capability-building roadmaps for data professionals.
Serve as the engineering thought leader and primary escalation point for commercial data product delivery, bridging the gap between business stakeholder expectations and technical execution.
Coordinate resource allocation, workload balancing, and succession planning to ensure team efficiency and growth.
Requirements
Bachelor's degree in Computer Science, Data Engineering, or related field.
8+ years of experience in data engineering, including pipeline development and data architecture.
Demonstrated experience architecting and delivering production-grade data pipelines on cloud data platforms (Snowflake, Databricks, BigQuery, or Azure Synapse).
Strong knowledge of data modeling, ETL processes, semantic layers, and context graphs, and of their applicability to agentic frameworks and AI-native data consumption.
Deep expertise in ELT/ETL frameworks and transformation tooling - particularly dbt - including modular design, testing practices, and deployment governance.
Experience in the pharmaceutical, biotech, or life sciences industry, part
Additional Information
ROLE SUMMARY
The AI Acceleration (AIA) function within the Chief Marketing Office (CMO) is the single, business-led engine that owns the design, delivery, and scale-up of priority AI capabilities across Commercial operations. AIA works in tight collaboration with various Pfizer functions to deploy and maintain production-grade AI solutions that simplify how we work and drive measurable value across all processes.
The Director, Data Engineering will play a pivotal role in enabling AI-driven innovation by overseeing the design, development, and maintenance of robust data infrastructure and pipelines. This position ensures that clean, well-structured, and well-described data flows seamlessly into AI tools, powering advanced analytics and intelligent solutions across the organization. The role will also lead the creation and upkeep of a semantic layer for data sources, ensuring consistency and accessibility for downstream applications.
Working closely with Pfizer Digital and Commercial Analytics teams, this leader will collaborate on architecture, data governance, and sourcing strategies while leveraging engineering resources to support production-grade data pipelines. The Director, Data Engineering will serve as a key partner to AI/ML engineers, product managers, and compliance stakeholders, ensuring data integrity, scalability, and adherence to regulatory standards.