Lead Site Reliability Engineer, Data - FreeWheel

Comcast · 11951 Freedom Dr Ste 900, Reston, VA
Full-time · On-site · Today
Ansible · Apache · AWS · Azure · Capacity Planning · Cassandra

Responsibilities

  • System Monitoring and Optimization
      • Design and implement monitoring and alerting systems to ensure the stability, reliability, and performance of data platforms.
      • Quickly respond to and resolve issues impacting data pipelines or storage layers.
  • Automation and Tool Development
      • Develop and maintain automation tools and scripts for deployment, monitoring, backup, recovery, and disaster recovery of data systems.
  • Performance Optimization
      • Analyze and optimize the performance of data storage, query performance, and data flows.
      • Ensure efficient processing of large-scale datasets.
      • Reduce latency and improve processing speed.
  • Incident Response and Troubleshooting
      • Respond quickly to data platform failures.
      • Perform troubleshooting and coordinate cross-team efforts to resolve issues.
      • Ensure high availability and reliability of data platforms.
  • Capacity Planning and Scaling
      • Work with data engineering teams to analyze and forecast capacity requirements.
      • Ensure systems can accommodate data growth and scale infrastructure accordingly.
  • Documentation and Knowledge Sharing
      • Document the architecture, configurations, and operational procedures for data platforms.
      • Share operational knowledge across the team and provide relevant training.
  • Security and Compliance
      • Ensure data platforms meet security standards and compliance requirements.
      • Prevent data breaches, unauthorized access, and misuse of data.
  • Cross-Team Collaboration
      • Collaborate with data science, product, and development teams.
      • Support data product design and implementation efforts.
      • Resolve reliability-related issues and improve platform stability.

Requirements

  • At least 10 years of experience as an SRE, DevOps, or Data Operations Engineer.
  • Experience with cloud platforms (AWS, GCP, Azure).
  • Familiarity with modern data architectures and technologies including Kafka, Hadoop, Spark, Cassandra, HDFS, and AWS S3.
  • Extensive experience in database management including NoSQL, MySQL, and PostgreSQL.
  • Proficiency with Ansible, Terraform, Kubernetes, and Docker.
  • Programming skills in Python, Go, Java, or Scala.
  • Experience with Prometheus, Grafana, ELK Stack, or similar tools.
  • Strong troubleshooting and debugging skills.
  • Excellent communication skills with technical and non-technical stakeholders.
  • Education: Bachelor's degree or higher in Computer Science, Software Engineering, or a related field.
  • Additional Preferred Skills
      • Experience with Aerospike, Kafka, Snowflake, and other big data technologies.
      • Familiarity with containerization, microservices architecture, and Kubernetes.
      • Experience designing and maintaining large-scale distributed systems.
      • Experience in data quality management, data governance, or ETL pipelines.
  • Disclaimer: This information has been designed to indicate the general nature and level of work performed by employees in this role. It is not designed to contain or be interpreted as a comprehensive inventory of all duties, responsibilities and qualifications.
  • Amazon Web Services (AWS), Apache Kafka, Python (Programming Language)
  • Please visit the benefits summary on our careers site for more details.
  • Education: Bachelor's Degree
  • While possessing the stated degree is preferred, Comcast also may consider applicants who hold some combination of coursework and experience, or who have extensive related professional experience.
  • Certifications (if applicable)
  • Relevant Work Experience: 10+ years
  • Comcast is an equal opportunity workplace. We will consider all qualified applicants for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status, genetic information, or any other basis prot

Additional Information

FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we're making it easier for buyers and sellers to transact across all screens, data types, and sales channels. As a global company, we have offices in nine countries and can insert advertisements around the world.

Job Summary

FreeWheel is seeking an experienced Data SRE to join the FreeWheel Data SRE team. As a member of the Global Operation team, you will be responsible for ensuring the reliability, scalability, and performance of our data systems. Working closely with data engineers and other operation sub-teams, you will manage our data infrastructure, optimize system reliability, automate daily operations, and resolve technical issues that impact our data pipelines and backend data platforms.

