Software Data Engineer, Data Platform

External

Augury · Bengaluru, India

Full-time · On-site · 1w ago

AWS · Azure · BigQuery · Data Modeling · ETL · GCP

Benefits

Health insurance

Additional Information

Our mission is to transform how people and machines work together to push the boundaries of human productivity. A leader in Industrial AI, Augury helps the world's manufacturers leverage real-time production insights to drive new levels of efficiency. Combining predictive and prescriptive AI technology with industry expertise, production teams can proactively address alerts, minimize downtime, reduce asset costs, and maximize yield and capacity. Our customers achieve payback in six months or less, enabling global scale. We're looking for team members excited to partner with the world's manufacturers and build the future of production together.

Our Data Intelligence Hub (DIH) is building the next generation Industrial Data Intelligence platform: a contextual layer that connects machine health, operational, maintenance, engineering, and enterprise data on top of a site Digital Twin backbone. We use this foundation to power agentic, AI-native experiences that help users explore their sites, answer complex questions, and make better decisions in one place.

You will be a core member of DIH, building production-grade data services and pipelines that power our Digital Twin, products, analytics, and AI agents. This is not a traditional ETL or BI-focused Data Engineering role. We are looking for an engineer with experience building data-intensive software systems, with a strong emphasis on clean architecture, reliability, scalability, and testing. Working closely with peers across India, Israel, and other global locations, you will help transform industrial and operational data into trusted, scalable, and actionable context for users, applications, and AI systems.

A Day In Your Life

Production Data Systems & Pipelines

Design and implement end-to-end data flows, from raw event ingestion through durable storage and modeled datasets that power products, Digital Twin experiences, and AI agents.
Build reliable, incremental pipelines that support deduplication, late-arriving data, watermarking, reprocessing, and reproducible aggregations at scale.
Model context and relationships across machines, lines, factories, sensors, work orders, and tenants to support structured queries and AI-driven experiences.
Partner with platform and AI teams to define how datasets are stored, modeled, and exposed through APIs, Digital Twin services, and context graphs.

Software Engineering & Quality

Build clean, maintainable Python services with strong separation of concerns across validation, persistence, aggregation, and orchestration layers.
Apply strong SQL and data modeling practices, including schema design, indexing, constraints, timestamp semantics, and scalable aggregations.
Drive engineering quality through automated testing, including unit, integration, and data-focused validation for correctness and reliability.
Design for observability through metrics, logging, and tracing that support debugging, data quality monitoring, production incidents, and backfills.

Streaming, Lakehouse & Scalability

Design and evolve streaming-first architectures using lakehouse and messaging technologies, including partitioning, watermarking, replay, reprocessing, and cost-aware scaling.
Work with technologies such as Kafka, Pub/Sub, or similar systems to build reliable event-driven services and data pipelines.
Contribute to multi-tenant architectures and data contracts that enable secure, scalable access to data across products, applications, and AI agents.

Collaboration & AI-Native Experiences

Partner closely with DIH, Smart Canvas, AI, and Product teams to design scalable data models, APIs, and context services that power AI-native experiences.
Translate business and product requirements into technical solutions that balance correctness, performance, cost, and long-term maintainability.
Participate in design reviews, code reviews, and technical discussions that raise the engineering bar across the organization.
Collaborate effectively across distributed teams through clear written and verbal communication.

What You Bring

Bachelor's degree in Computer Science, Computer Engineering, Information Technology, or a related field. Advanced degrees or equivalent practical experience are also valued.
4+ years of professional software development experience building backend platforms, distributed systems, or data-intensive applications in production environments.
Strong software engineering experience in Python, SQL, and data modeling, with a track record of building production-grade data systems and reliable, incremental pipelines.
Experience designing systems that handle duplicate, invalid, and late-arriving events while maintaining correctness and reliability for downstream consumers.
Experience with at least one cloud platform (AWS, Azure, or GCP) and modern data technologies such as Databricks, Delta Lake, Spark, BigQuery, or similar lakehouse architectures.
Experience with streaming or messaging systems such as Kafka, Pub/Sub

Interested in this role?

Apply on the company's website.
