Application Reliability Engineer

Graviton Research Capital · Gurugram, India
Full-time · On-site · 2mo ago
Bash · Caching · CI/CD · Grafana · Kafka · Linux

Requirements

  • Possess a degree in a highly analytical field, such as Engineering, Mathematics, or Computer Science
  • Up to 2 years of experience in Python and Shell/Bash scripting
  • Experience with Linux and shell command-line tools
  • Hands-on experience with databases (SQL, NoSQL)
  • Strong problem-solving and analytical skills
  • Excellent communication skills
  • Ability to remain calm and analytical under production pressure

Good to have:

  • Familiarity with monitoring/alerting stacks (Prometheus, Grafana, ELK, etc.)
  • Familiarity with distributed messaging (Kafka) and caching systems (Redis)
  • Experience with CI/CD pipelines and deployment automation
  • Prior experience in a support, SRE, or production engineering role

Benefits

Paid time off

Additional Information

Graviton is a privately funded quantitative trading firm striving for excellence in financial markets research. We are seeking a skilled Application Reliability Engineer to be the first line of defense for the reliability, availability, and performance of our databases, services, and trading support systems. You'll not only monitor, troubleshoot, and resolve production issues, but also build automation, improve observability, and drive long-term stability of mission-critical systems. You'll work closely with developers, traders, and infrastructure teams to triage issues, manage deployments, and continuously improve operational workflows. The ideal candidate will possess a strong background in technical support, with a passion for problem-solving and a commitment to excellence.

As an Application Reliability Engineer, you will:

  • Monitor production services and respond quickly to alerts, incidents, and outages to ensure smooth operation and minimal downtime
  • Monitor trading systems and infrastructure
  • Triage issues across trading support services, databases, and infrastructure; escalate to and coordinate with the right owners
  • Drive root-cause analysis and ensure fixes are implemented for long-term stability
  • Serve as the first line of defense for trading operations
  • Proactively identify and address recurring issues, and build automation to reduce manual intervention
  • Improve observability by enhancing monitoring, logging, and alerting systems
  • Develop and maintain operational runbooks and SLO/SLA metrics
