Unified Pipeline Architecture: Proven ability to design and implement end-to-end observability pipelines using OpenTelemetry, Prometheus, and Grafana on centralized infrastructure.
Cross-Account AWS Observability: Deep expertise in centralizing AWS telemetry, including multi-account CloudTrail organization trails, cross-account CloudWatch metrics/logs, and VPC Flow Logs.
Log Aggregation & Routing: Strong experience designing log aggregation strategies, implementing noise reduction/filtering at the collector level, and configuring Splunk HTTP Event Collector (HEC) integrations.
Advanced Alerting & Dashboarding: Hands-on experience building comprehensive alerting frameworks using Alertmanager and CloudWatch Alarms, coupled with advanced dashboard engineering in Grafana (using PromQL).
Infrastructure as Code (IaC): Advanced proficiency in writing Terraform modules specifically for deploying and managing observability stacks and EC2 infrastructure.
Other Qualifications (OQs):
Enterprise Scale Log Management: Demonstrated experience managing, routing, and optimizing log pipelines at massive scale (TB/day).
Kubernetes/Container Observability: Experience deploying Prometheus and OTel within Kubernetes (EKS) or containerized (ECS) environments.
Cost Optimization: Proven track record of reducing observability spend through strategic metric dropping, log filtering, and efficient storage tiering.
Benefits
Join one of the world's fastest-growing AI-first digital engineering companies and make a real impact at scale.
Lead and collaborate with a high-energy team of talented, driven individuals solving complex, meaningful challenges.
Work with Fortune 500 companies and disruptive innovators in a research-driven environment with 60+ patents.
Stay ahead of the curve by gaining hands-on experience with cutting-edge AI, ML, data, and cloud technologies while continuously upskilling.
If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!
Health insurance
Remote work options
Additional Information
While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in offering them a culture built on transparency, diversity, integrity, learning and growth.
If you want to work in an environment that encourages you to innovate and excel, in both your professional and personal life, you would enjoy your career with Quantiphi!
About Quantiphi:
Quantiphi is an award-winning, AI-First global digital engineering company that helps the world's leading Fortune 1000 organizations transform bold ideas into measurable business impact. We go beyond building innovative AI technologies: we solve the problems that matter most to our clients.
Since our founding in 2013, Quantiphi has built a proven track record of turning complex challenges into meaningful outcomes across industries.
Headquartered in Boston, with more than 4,000 professionals worldwide, we partner with global enterprises to deliver large-scale digital, cloud, and AI-driven transformation. #SolvingWhatMatters.
We are an Elite and Premier partner to Google Cloud, AWS, NVIDIA, Snowflake, and other leading technology platforms, and our work has been recognized across the industry, including:
3 AWS AI/ML Partner of the Year awards
3 NVIDIA Partner of the Year awards
3 Snowflake Partner of the Year awards
Rated Leaders by Gartner, Forrester, IDC, ISG, Everest Group and other leading analyst firms
Quantiphi delivers first-in-class AI solutions across industries including Life Sciences, Healthcare, Banking, Financial Services, CPG, Manufacturing, Energy, High-Tech, and Telecommunications, powered by cutting-edge Generative AI and Agentic AI accelerators.
We are also proud to be certified as a Great Place to Work, reflecting our commitment to our people and our culture.
For more details, visit: Website or LinkedIn Page
Role: DevOps/Observability Engineer
Experience Level: 8+ years
Employment type: Full Time
Location: Remote - USA