Spclst, Data Engineering
About the role
Based in Hyderabad, join a global healthcare biopharma company and be part of a 130-year legacy of success backed by ethical integrity, forward momentum, and an inspiring mission to achieve new milestones in global healthcare. Be part of an organisation driven by digital technology and data-backed approaches that support a diversified portfolio of prescription medicines, vaccines, and animal health products. Drive innovation and execution excellence. Join a team that is passionate about using data, analytics, and insights to drive decision-making and create custom software, allowing us to tackle some of the world's greatest health threats.
Our Technology Centers focus on creating a space where teams can come together to deliver business solutions that save and improve lives. An integral part of our company's IT operating model, Tech Centers are globally distributed locations where each IT division has employees to enable our digital transformation journey and drive business outcomes. These locations, in addition to the other sites, are essential to supporting our business and strategy. A focused group of leaders in each Tech Center helps ensure we can manage and improve each location, from investing in the growth, success, and well-being of our people to making sure colleagues from each IT division feel a sense of belonging, to managing critical emergencies. Together, we must leverage the strength of our team to collaborate globally to optimize connections and share best practices across the Tech Centers.
Role Overview:
This role is part of a broader enterprise initiative to establish a Data Context Layer (DCL), a foundational capability designed to provide consistent, reusable, and scalable context across enterprise data products. The DCL is intended to address challenges related to data fragmentation, lack of shared semantics, and inconsistent interpretation of data across systems and products. It establishes a unified layer for representing context, relationships, and meaning, enabling downstream products to operate with greater consistency, interoperability, and intelligence.
In addition, the DCL plays a critical role in enabling agentic AI capabilities across the enterprise by providing the structured context and semantic grounding required for intelligent agents to operate reliably. This includes ensuring that agent-driven workflows and decisions are based on consistent, governed, and interpretable data context, reducing risks associated with fragmentation, ambiguity, and lack of control.
Within this initiative, we are seeking a Data Engineer with strong experience in AWS cloud technologies to design, build, and maintain scalable data pipelines and data products that support enterprise data context, analytics, and AI-enabled use cases. This role will focus on ingesting, transforming, validating, and delivering high-quality data in cloud-native environments so it can be used reliably across applications, platforms, and downstream consumers. The ideal candidate has hands-on experience with data engineering, cloud data platforms, and operationalizing data at scale. Familiarity with data context, semantic systems, ontologies, agentic AI, prompt engineering, and context engineering is preferred, as the data products may feed context-serving layers and AI workflows.
What will you do:
Responsibilities
- Design, build, and maintain scalable data pipelines in AWS.
- Ingest data from enterprise source systems into cloud-based storage and processing environments.
- Transform and curate data into reliable datasets and data products for downstream consumption.
- Implement data quality checks, validation rules, lineage tracking, and monitoring.
- Work with platform, architecture, and application teams to ensure data products are aligned to enterprise standards.
- Support batch, near-real-time, and event-driven data processing patterns where appropriate.
- Optimize data pipelines for performance, reliability, scalability, and cost efficiency.
- Ensure secure handling of data through access control, encryption, and compliance-aligned practices.
- Document data models, pipeline logic, operational procedures, and data dependencies.
- Support production readiness, incident resolution, and continuous improvement of data services.
- Help ensure data can support both traditional analytics use cases and agentic AI workflows that depend on trusted, structured context.
What should you have:
Required Qualifications
- 5+ years of experience in data engineering or related software engineering roles.
- Strong hands-on experience with AWS cloud technologies.
- Experience building and maintaining data pipelines, ETL/ELT processes, and data transformation workflows.
- Familiarity with data quality, governance, lineage, and monitoring concepts.
- Experience working with structured, semi-structured, and/or large-scale enterprise data.
- Ability to collaborate with architects, platform teams, analysts, and application team