Design and develop scalable data pipelines using Apache Spark (batch and streaming)
Build and maintain data platform layers: ingestion, transformation, and serving
Optimize Spark jobs for performance, cost, and reliability (partitioning, skew handling, memory tuning)
Implement data quality, observability, and lineage frameworks
Contribute to data architecture decisions (Lakehouse, data mesh, storage formats, partition strategies)
Define and enforce data contracts and schema evolution practices
Platform APIs & Backend Engineering
Design and build data-driven platform APIs using Java (preferred)
Develop microservices that expose curated datasets for product and partner consumption
Implement RESTful APIs and event-driven services for real-time and near real-time data access
Ensure low-latency, high-availability data serving layers
Integrate with upstream/downstream systems, including legacy APIs where required
Cloud & Platform Integration
Build and deploy solutions on Azure (preferred) / AWS / GCP
Leverage cloud-native services for data storage, compute, and messaging
Work with event streaming systems (Kafka/Event Hubs) for real-time pipelines
Support containerized deployments and orchestration (Kubernetes) where applicable
Quality, Observability & Engineering Excellence
Champion unit tests across both data and service layers
Build automated validation frameworks for data pipelines
Implement end-to-end observability (metrics, logging, tracing) across pipelines and APIs
Drive CI/CD practices for both data and application code
Conduct code reviews and enforce engineering best practices
Product Mindset & Ownership
Engage deeply with product and business stakeholders to understand the "why", not just the "what"
Translate business problems into scalable data and platform solutions
Take end-to-end ownership from design through production and support
Proactively identify performance bottlenecks, data issues, and system gaps
Mentorship & Leadership
Mentor engineers on distributed systems, Spark optimization, and API design
Promote best practices in data engineering, microservices, and software craftsmanship
Contribute to platform vision and long-term architectural evolution
Required qualifications (Hard requirements)
4+ years of software engineering experience with strong focus on data platforms and/or distributed systems
Hands-on expertise in Apache Spark (Scala or PySpark)
Strong programming skills in Java (preferred) / Scala / Python
Experience building large-scale data pipelines (ETL/ELT)
Experience developing backend services or APIs (REST/microservices)
Deep understanding of:
Distributed systems (partitioning, shuffle, fault tolerance)
Data storage formats (Parquet, ORC, Avro)
Data modeling and schema evolution
Experience with cloud platforms (Azure/AWS/GCP)
Familiarity with workflow orchestration tools (Airflow, Dagster, etc.)
Strong system design and performance optimization skills
Preferred qualifications
Experience with Spark Structured Streaming
Exposure to Lakehouse architectures (Delta Lake, Iceberg, Hudi)
Experience with event-driven architectures (Kafka, Event Hubs)
Knowledge of data governance, catalog, and lineage tools
Experience with CI/CD for data
Benefits
Health insurance
Vision insurance
Additional Information
Do you want to shape the future of fintech and healthtech? Energized by challenges and inspired by bold goals? Ready to elevate your career alongside driven and talented colleagues? If that sounds like you, explore a career at Alegeus today. Opportunity Happens Here.
Senior Software Engineer - Data Platform
Location
Bangalore, India
Reports to
Development Manager
About Alegeus
Alegeus is the market leader in consumer-directed healthcare (CDH) solutions, powering millions of consumer benefit accounts including FSAs, HSAs, HRAs, dependent care and wellness programs through a modern SaaS and payments platform.
We are investing aggressively in modernization, API-first integration, real-time data access, and AI-enabled automation to redefine how consumers save and spend on healthcare. At this inflection point, we are transforming our platform, elevating engineering rigor, and building a next-generation product and engineering organization.
Role summary
We are looking for a Senior Software Engineer to design, build, and scale our next-generation Data Platform and Data-Driven APIs. This role combines distributed data processing (Apache Spark) with platform and microservices engineering (Java) to enable reliable, scalable, and real-time data access.
You will operate at the intersection of data engineering and backend platform engineering, building systems that not only process large volumes of data but also expose that data through robust, well-designed APIs and services.
This role goes beyond implementing requirements. We expect engineers to understand business context, challenge assumptions, and take end-to-end ownership of delivering meaningful outcomes.