Staff Scientist
About the role
Black Canyon Consulting is seeking a Staff Scientist to work with a Principal Investigator at the National Library of Medicine, part of the National Institutes of Health, to support the development of high-fidelity artificial intelligence models designed to decode the functional landscape of the human and mouse genomes. This effort will leverage Telomere-to-Telomere (T2T) reference assemblies to advance understanding of gene regulation, particularly within complex and repetitive genomic regions. The position requires a combination of computational genomics expertise, machine learning proficiency, and scalable software engineering skills to support large-scale data integration and model development.
Responsibilities
- Lead the design, development, and implementation of AI-driven models for gene regulation analysis
- Architect and scale a TREDNet-based framework for cloud-native execution
- Optimize models for distributed, multi-GPU training environments
- Integrate and analyze large-scale genomic and epigenomic datasets, including:
  - ENCODE / modENCODE
  - NIH Roadmap Epigenomics
  - UCSC Genome Database
- Apply AI methodologies to functionally annotate repetitive genomic regions, including centromeres and telomeres
- Develop and maintain scalable, containerized pipelines using Docker and/or Singularity
- Implement MLOps best practices, including experiment tracking, model versioning, and reproducibility
- Deploy and manage workflows in cloud environments (AWS, GCP, or Azure)
- Collaborate with interdisciplinary teams across computational and life sciences domains
Required Qualifications
- PhD in Computer Science, Computational Biology, Bioinformatics, or a related field
- Minimum of 5 years of experience developing and deploying machine learning or deep learning models
- Strong experience with cloud platforms (AWS, GCP, or Azure)
- Proficiency in deep learning frameworks (PyTorch preferred; TensorFlow or HuggingFace acceptable)
- Deep understanding of neural network architectures (CNNs, transformers, sequence models)
- Strong programming skills in Python and experience working in Linux-based environments
- Experience with MLOps practices, including experiment tracking and model versioning
- Experience building and deploying containerized workflows (Docker and/or Singularity)
- Experience with distributed training across GPUs or multi-node environments
- Strong knowledge of genomics, gene regulation, and epigenomics
- Experience working with large-scale biological datasets (e.g., ENCODE, Roadmap Epigenomics, UCSC Genome Browser)
- Familiarity with genomics data formats (FASTA, VCF, BAM/CRAM, BED)
Requirements
- Experience with Telomere-to-Telomere (T2T) genome assemblies
- Experience analyzing repetitive genomic regions (e.g., centromeres, telomeres)
- Background in regulatory, functional, or comparative genomics (e.g., human vs. mouse)
- Experience with hyperparameter tuning and large-scale model optimization
- Familiarity with genomic foundation models or sequence-based deep learning approaches
- Experience running ML workloads on GPU-enabled cloud or HPC environments
- Familiarity with workflow orchestration tools (e.g., Nextflow, Snakemake, Airflow)
- Experience transitioning research models into production-grade pipelines
- Familiarity with CI/CD and infrastructure-as-code tools (e.g., Terraform)
- Experience working in interdisciplinary teams
Deliverables
- Develop a containerized (Docker/Singularity) TREDNet pipeline capable of scaling across multiple GPU nodes in a cloud environment
- Produce a comprehensive functional map of the T2T reference genome, identifying regulatory motifs in previously unresolved regions
- Develop comparative models between human and mouse cell lines to identify conserved regulatory mechanisms
Benefits and Salary
We attract the best people in the business with our competitive benefits package, which includes medical, dental, and vision coverage; a 401(k) plan with employer contribution; paid holidays and vacation; and tuition reimbursement.
We offer a competitive salary commensurate with experience and location. The targeted range for this position is $110,000 - $140,000.
If you enjoy being part of a high-performing, professional, technology-focused organization, please apply today!