Member of Technical Staff, Pretraining

Hark · San Jose

$180K–$450K/yr · Full-time · On-site · 1mo ago

About the role

The Omni team at Hark is building the next generation of AI experiences beyond text, enabling models to understand and generate content across multiple modalities, including text, audio, and vision. Our goal is to create seamless, real-time multimodal intelligence that powers intuitive and immersive user experiences. As part of the Omni team, you will focus on developing large-scale pretraining systems and foundation models. This includes working across the full stack, from data curation and large-scale training infrastructure to model architecture and optimization. You will play a key role in advancing the core capabilities of our models through pretraining at scale.

Responsibilities

Drive research and development in large-scale LLM and multimodal pretraining, focusing on improving model capability through better data, scaling, and architecture.
Develop and optimize data pipelines for pretraining, including large-scale data curation, filtering, deduplication, and synthetic data generation.
Design and implement efficient training strategies for foundation models, including distributed training, scaling laws, and optimization techniques.
Build and improve pretraining infrastructure, including training systems, data pipelines, and compute efficiency.
Develop evaluation frameworks and internal benchmarks to measure pretraining progress and model capability.
Collaborate with research and engineering teams to push the frontier of foundation model performance and scalability.

Requirements

Proven track record of improving large-scale neural network performance through advances in pretraining data, modeling, or training systems.
Strong experience with large-scale distributed training (e.g., Megatron, DeepSpeed, or similar frameworks).
Deep understanding of LLM or multimodal pretraining, including data pipelines, scaling behavior, and optimization.
Experience in data-driven experimentation, systematic analysis, and debugging at scale.
Experience building or working with large-scale training infrastructure and high-performance computing systems.
Strong ownership mindset and ability to operate in fast-paced, research-driven environments.

Bonus Qualifications

Experience with multimodal pretraining (text, audio, vision) is a strong plus.

Benefits

The US base salary range for this full-time position is $180,000 to $450,000 annually.
Vision insurance
Performance bonus

Additional Information

About Hark

Hark is an artificial intelligence company building advanced, personalized intelligence that is proactive, multimodal, and capable of interacting with the world through speech, text, vision, and persistent memory. We're pairing that intelligence with next-generation hardware to create a universal interface between humans and machines. While today's AI largely operates through chat boxes and decade-old devices, Hark is focused on what comes next: agentic systems that interact naturally with people and the real world. To get there, we're developing multimodal models and next-generation AI hardware together, designed from the ground up as a single, unified interface for a new era of intelligent systems.
