Software Development Engineer, Catalog Diagnostics (Agentic) & Analytics

External
Amazon.com Services LLC · Sunnyvale, CA
Full-time · On-site · 1d ago
Rails
About the role

The Catalog Diagnostics and Analytics (CDA) team owns one of the largest data lakes within Amazon Retail, managing hundreds of petabytes through processing pipelines that push the boundaries of Amazon's technology infrastructure. We maintain foundational catalog datasets relied upon by 2000+ internal customers for day-to-day business decisions. Our team pioneers data management technologies and architecture patterns across the organization, setting standards for others to adopt. We operate daily and hourly batch datasets alongside services processing catalog data at 100K TPS through a cost-effective micro-batch streaming architecture. CDA also owns Catalog Diagnostics services that provide insight into catalog changes, enabling Amazon stakeholders to quickly identify and resolve catalog issues. We are building an agentic solution to automate and simplify catalog diagnostics, delivering transparency through data provenance and change audit trails that explain what c

Requirements

  • Own end-to-end data pipelines from ideation to production deployment, processing petabytes of multimodal data with rigorous evaluation frameworks
  • Define product roadmaps aligned with business priorities, balancing foundational research with incremental product improvements
  • Represent the team in the broader analytics and diagnostics community: influencing tech roadmaps for big data technologies and peer teams, delivering tech talks, and staying at the forefront of GenAI, analytics, and agentic system research

Additional Information

At Amazon Selection and Catalog Systems (ASCS), our mission is to power the online buying experience for customers worldwide so they can find, discover, and buy any product they want. We innovate on behalf of our customers to ensure uniqueness and consistency of product identity and to infer relationships between products in the Amazon Catalog to drive the selection gateway for the search and browse experiences on the website.

We're solving a fundamental AI challenge: establishing product identity and relationships at unprecedented scale. Using Generative AI, Visual Language Models (VLMs), and multimodal reasoning, we determine what makes each product unique and how products relate to one another across Amazon's catalog. The scale is staggering: billions of products, petabytes of multimodal data, millions of sellers, dozens of languages, and infinite product diversity, from electronics to groceries to digital content.

Amazon's Item and Relationship Platform group is looking for an innovative and customer-focused software engineer to help us make the world's best product catalog even better. In this role, you will partner with technology and business leaders to build new state-of-the-art agentic diagnostics and analytics services that provide transparency and explainability into every change made in the catalog and back the reasoning with data traceability. You will pioneer advanced GenAI / Agentic solutions that power next-generation agentic shopping experiences, working in a collaborative environment where you can experiment with massive data from the world's largest product catalog, tackle problems at the frontier of AI research, and rapidly implement and deploy your algorithmic ideas at scale across millions of customers.

We are looking for Software Development Engineers with strong technical foundations and a passion for solving complex problems at scale. You'll take ownership from design through coding, testing, and deployment, driving innovative solutions that grow selection on the Amazon platform and improve customer experience.

Our engineers tackle big data challenges across web-scale data integration, entity and product matching, data quality improvement, natural language processing, and knowledge inferencing. You'll work closely with data and applied scientists to build state-of-the-art systems that automatically aggregate high-quality product data, identify selection gaps, and discover semantic relationships between products globally. This role exposes you to agentic solutions, AI/ML, data traceability, data mining, and distributed cloud-scale systems. We partner across Amazon to deliver meaningful customer impact. If you thrive in fast-paced environments and want to push boundaries, we'd love to hear from you.

Key job responsibilities

  • Formulate novel ways to solve foundational challenges in making an agentic solution attain and maintain production-grade accuracy and resilience, and to solve closed-loop problems at the intersection of GenAI, transactional systems operating at 100K TPS, and large-scale information retrieval, translating ambiguous business challenges into tractable scientific frameworks
  • Design and implement leading agentic frameworks and architectures to solve product identity, relationship inference, and catalog understanding at billion-product scale
  • Pioneer explainable AI methodologies that balance agent performance with scalability requirements for production systems impacting millions of daily customer decisions
  • Own end-to-end data pipelines from ideation to production deployment, processing petabytes of multimodal data with rigorous evaluation frameworks
  • Define product roadmaps aligned with business priorities, balancing foundational research with incremental product improvements
  • Represent the team in the broader analytics and diagnostics community: influencing tech roadmaps for big data technologies and peer teams, delivering tech talks, and staying at the forefront of GenAI, analytics, and agentic system research
