Senior Software Engineer

LanceDB · HQ
Full-time · Remote · 8mo ago
Apache · Hadoop · Java · Move · Pandas · PyTorch

About the role

We're looking for a Senior Software Engineer to help expand the reach of Lance and LanceDB within the broader data infrastructure ecosystem. You'll work at the intersection of high-performance computing, big data, and open-source systems. You will contribute scale and performance improvements, integrations with the wider data and AI ecosystem, simpler distributed operations, and usability and maintainability enhancements.

You'll be responsible for:

  • Designing and maintaining efficient distributed Lance dataset operations
  • Building efficient indices to enable predicate pushdown and accelerate queries in Spark, Ray, or Trino
  • Working on table formats, data encodings, and various aspects of the Lance format in Rust
  • Driving open-source community efforts to integrate the Lance format with Spark, Hive Metastore, Presto, Trino, Ray, and other data infrastructure systems
  • Operating and improving internal data processing infrastructure
  • Promoting the Lance format in open-source communities and at Big Data conferences

Requirements

  • 10+ years of experience building high-performance databases, big data systems, or large-scale data services
  • Deep understanding of internals of open-source Big Data or AI training systems (e.g., Hadoop, Spark, Flink, Ray, Iceberg, Delta Lake, Hudi, ClickHouse, Trino, Presto, PyTorch, or JAX)
  • Strong experience with high-performance computing in C++, Java, and/or Scala
  • Experience with Rust (or willingness to learn it)
  • Proven ability to move fast, work independently, and collaborate with a high-caliber team
  • Contributor, committer, or PMC member in Apache or other large open-source projects
  • Experience with Apache Arrow, DataFusion, Parquet, Iceberg, or Delta Lake
  • Track record of driving large features or integrations in distributed systems
  • Strong community presence and passion for open-source collaboration

Benefits

  • A key role shaping an open-source project with real production usage
  • Remote-first team with flexible hours
  • Competitive compensation, equity, and benefits
  • Generous learning budget and support for open-source contributions

Why Join Us

You'll join a world-class team of open-source builders, including co-authors of pandas, and contributors to HDFS, Arrow, Iceberg, and HBase. You'll collaborate on systems that power next-generation AI workloads while shaping how LanceDB operates and scales production environments.

Perks: remote work options, flexible schedule, equity / stock options

Additional Information

About LanceDB

LanceDB is the preeminent data platform for multimodal AI use cases. From hyper-scalable vector search to advanced retrieval for RAG, from streaming training data to interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application, and powers some of the most groundbreaking applications and challenging requirements today.


Interested in this role?

Apply on the company's website.
