Python Developer (Databricks, Medallion Architecture)
About the role
CACI is seeking a highly skilled and motivated Python Developer with extensive experience in Databricks and a strong understanding of medallion architecture principles to join our BEAGLE (Border Enforcement Applications for Government Leading-Edge Information Technology) Agile Solution Factory (ASF) Team supporting our Customs and Border Protection (CBP) client located in Northern Virginia! In this hands-on role, you will be instrumental in building, optimizing, and maintaining our modern data platform. You will leverage Databricks' powerful capabilities to construct scalable, high-performance data pipelines that power critical business insights and drive data-informed decision-making across the organization. Join this passionate team of industry-leading individuals supporting best practices in agile software development for the Department of Homeland Security (DHS). You will support the men and women charged with safeguarding the American people and enhancing the nation's safety and security.
Responsibilities
- Design, develop, and implement robust, scalable, and performant ETL/ELT pipelines within the Databricks environment using Python and PySpark.
- Build and manage data layers (Bronze, Silver, Gold) adhering to best practices of the medallion architecture, ensuring data quality, reliability, and discoverability.
- Leverage Databricks features extensively, including Spark, Delta Lake, SQL Analytics, and Unity Catalog, to construct efficient and maintainable data solutions.
- Collaborate closely with data scientists, analysts, and business stakeholders to understand data requirements and translate them into actionable data engineering solutions.
- Implement comprehensive data quality checks, validation rules, and lineage tracking mechanisms within the Databricks ecosystem.
- Optimize data pipelines and Spark jobs for performance, cost-efficiency, and scalability, utilizing Databricks and Delta Lake best practices.
- Write clean, well-documented, and testable Python code, adhering to coding standards and promoting code quality through rigorous code reviews.
- Troubleshoot and resolve complex issues related to data pipelines, Databricks jobs, and data integrity across development, staging, and production environments.
- Contribute to the design and implementation of efficient data models optimized for query performance and data governance.
- Stay abreast of emerging technologies and trends in data engineering and Databricks, and champion their adoption where appropriate to enhance our data platform.
- Collaborate within an agile development framework, actively participating in team ceremonies and contributing to a culture of continuous improvement.
Requirements
- Required:
- Candidate must be available to work a hybrid schedule in Ashburn, VA.
- Must be a U.S. Citizen with the ability to pass a CBP background investigation. Criteria include, but are not limited to:
  - 3-year check for felony convictions
  - 1-year check for illegal drug use
  - 1-year check for misconduct such as theft or fraud
- College degree (B.S.) in Computer Science, Software Engineering, Information Management Systems or a related discipline. Equivalent professional experience will be considered in lieu of degree.
- Professional Experience: at least seven (7) years of related technical experience.
- Extensive hands-on experience with Databricks and its core components (Spark, Delta Lake).
- Proven understanding and practical application of the Medallion Architecture (Bronze, Silver, Gold layers) and its benefits for data management.
- Proficiency in Python for data manipulation, processing, and ETL development (e.g., using Pandas, PySpark).
- Extensive experience with Spark SQL and PySpark for distributed data processing.
- Deep understanding of Delta Lake features, including ACID transactions, schema evolution, time travel, and performance optimizations.
- Experience with data warehousing concepts and best practices.
- Familiarity with SQL for querying and data manipulation.
- Experience with source code control systems and concurrent development workflows (Git preferred).
- Strong analytical and problem-solving skills with the ability to troubleshoot complex data issues.
- Excellent communication and interpersonal skills, with the ability to explain technical concepts clearly.
- Strong ability to analyze complex project-related problems and create innovative solutions.
- Desired:
- Experience with Databricks Unity Catalog for data governance, security, and discovery.
- Familiarity with cloud platforms such as AWS, Azure, or GCP.
- Experience with orchestration tools like Apache Airflow or Databricks Workflows.
- Knowledge of CI/CD practices and tools (e.g., Jenkins, GitLab CI, GitHub Actions) for automated
Additional Information
Job Title: Python Developer (Databricks, Medallion Architecture)
Job Category: Information Technology
Time Type: Full time
Minimum Clearance Required to Start: None
Employee Type: Regular
Percentage of Travel Required: Up to 10%
Type of Travel: Local