Datadog Administration and Operations (ServiceNow)

HP · Bengaluru, India
Full-time · On-site · Today
AWS · Azure · Bash · CI/CD · Compliance · Datadog

Responsibilities

  • Observability Architecture & Ownership
      • Design and implement an enterprise‑grade observability strategy spanning Datadog (metrics, logs, traces/APM, synthetics, RUM, network performance, cloud cost) and integrations with ServiceNow.
      • Define monitoring standards, tagging conventions, dashboards, SLOs/SLIs, and alerting policies for infra and apps (on‑prem, cloud, containers).
  • Datadog Implementation & Scale
      • Deploy and manage Datadog agents, integrations (AWS/Azure/GCP, Kubernetes, NGINX, DBs, messaging), and service catalog coverage.
      • Build golden dashboards, standardized monitors, and runbooks for infra components (compute, storage, network), platforms (Kubernetes), and critical apps.
  • ServiceNow Integration & Event Management
      • Implement and optimize Datadog → ServiceNow event routing, correlation rules, deduplication, and Incident/Problem auto‑creation with enriched context.
      • Maintain CI relationships in ServiceNow CMDB, drive discovery mapping, and align alerts with CI ownership and support groups.
      • Enable closed‑loop remediation using IntegrationHub, workflows, and change controls; contribute to Change Advisory Board (CAB) standards.
  • Reliability Engineering & Operational Excellence
      • Maintain SLOs, error budgets, and escalation policies. Reduce alert noise; drive actionable, tiered alerts.
      • Partner with App, Infra, SecOps, and NOC teams to improve MTTR and post‑incident reviews with telemetry‑backed corrective actions.
  • Automation & IaC
      • Automate provisioning of monitors, dashboards, synthetics, tags, and service owner mapping.
      • Build runbooks, remediation scripts, and service workflows; integrate with CI/CD to promote consistent monitoring across environments.
  • Governance, Compliance & Cost Optimization
      • Implement data retention policies, access controls, RBAC, and tagging for chargeback/showback.
      • Optimize Datadog usage (APM sampling, log pipelines/archives, metric volumes) while protecting critical visibility.

Preferred Education & Experience

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or equivalent experience.
  • 5-8+ years in Infrastructure/Platform/SRE/Observability roles in enterprise environments.
  • Expert hands‑on Datadog experience: agents, integrations, log pipelines, APM/tracing (including OpenTelemetry), RUM, synthetics, dashboards, monitors, service catalogs, tagging strategies.
  • ServiceNow: Event Management, Incident/Problem/Change, CMDB design, Discovery, integration patterns (webhooks, APIs, IntegrationHub), event correlation and enrichment.
  • Strong experience across Linux/Windows/Unix (cluster and workload monitoring).
  • Proficiency with scripting (Python/PowerShell/Bash), Datadog/ServiceNow APIs, and Git‑based workflows.
  • Demonstrated capability to design SLOs/SLIs, reduce false positives, and measurably improve MTTR and service reliability.
  • Excellent communication; able to drive standards across multiple engineering teams.

Additional Qualifications

  • Experience across AWS/Azure/GCP, Kubernetes, and Terraform.
  • Prior ownership of enterprise observability programs (>500 nodes/services; multi‑account/multi‑subscription cloud).
  • Network (e.g., NPM/NTA) and database monitoring expertise (e.g., Postgres/SQL Server/Oracle/MySQL).
  • Experience with message brokers (Tibco), API gateways, and distributed tracing for microservices.
  • Basic experience in administering and maintaining relational and/or non-relational databases.
  • Security/Compliance awareness (SOX, HIPAA, PCI), log retention/archival strategies.
  • Experience with cost governance in Datadog (metrics vs. logs vs. traces), custom metrics, and sampling strategies.
  • ITIL v4 Foundation, Datadog Certifications, and ServiceNow Admin/Developer certifications.

Knowledge & Skills

  • Systems thinking, reliability engineering mindset, data‑driven decision making.
  • Strong stakeholder collaboration (Infra, AppDev, SecOps, NOC).
  • Documentation and enablement: clear runbooks, patterns, standards.
  • Bias for automation, consistency, and measurable outcomes.

  • Job: Software
  • Schedule: Full time
  • Shift: No shift premium (India)
  • Travel -
  • Relocation -
  • Equal Opportunity Employer (EEO) -
  • HP, Inc. provides equal employment opportunity to all employees and prospective employees, without regard to race, color, religion, sex, national origin, ancestry, citizenship, sexual orientation, age, disability, or status as a protected veteran, marital status, fam

Benefits

Vision insurance

Additional Information

Description: We're seeking a Datadog administration and operations expert to manage our observability platform and ensure comprehensive monitoring, alerting, and performance analytics across infrastructure and applications. This role is critical for maintaining system reliability, improving incident response, and supporting DevOps and engineering teams with actionable insights. It is an opportunity to get in on the ground floor, implementing and scaling the tools, processes, and governance at HP.

