Data Engineer, Data Platform

VTEX · Brazil

Full-time · On-site · 1d ago

Apache · AWS · Azure · Data Modeling · ETL · GCP

About the role

As a Data Engineer on the Data Platform team, you'll help design, build, and evolve the data infrastructure that powers analytics, AI, and machine learning across VTEX. This is a hands-on, mid-level role: you'll own features end-to-end - from ingestion and processing to storage and consumption - taking on problems that come with real ambiguity, and delivering them with growing independence.

We're not looking for someone who arrives knowing everything. We're looking for someone with a strong engineering foundation and a high ceiling: a fast learner with sharp problem-solving instincts who is energized by a data platform going through a deep transformation of its architecture and the way it's built.

HOW WE WORK

The way we build has changed completely in the last year, and this role is defined by it. On this team, you don't measure yourself by lines of code written - you measure yourself by the value and quality of what you ship. AI tools are part of every step of our work: we direct them, review their output critically, and are accountable for the result. That makes two things essential:

Leveraging AI to scale your impact. We expect you to use AI extensively and well - designing workflows where agents handle the "known-knowns" (standard ETL, repetitive modeling) so your judgment goes to the hard, ambiguous problems. The goal is genuine leverage: producing at a level that simply wasn't possible a year ago.

Clarity as a contract. The quality of your written word is the quality of your AI output. If you can't describe a transformation clearly in a spec or doc, the code isn't ready to be written - by you or an agent. Strong technical writing and spec-driven thinking aren't "nice to have" here; they're how the work gets done and how it scales to teammates and agents.

Data Platform is the team inside VTEX's Data & Analytics org that builds the foundation the rest of the company's data products are built on. We're in the middle of a deep transformation: migrating from a Redshift-centric warehouse to a multi-engine Data Lakehouse on Apache Iceberg, moving our infrastructure to EKS, and re-architecting ingestion for scale and cost. We run a lot of initiatives in parallel, with high autonomy and a high quality bar.

You won't be siloed on one thing. You'll typically own at least one larger initiative while moving smaller improvements forward alongside it. Depending on where you fit best, your work could span:

Data ingestion - migrating high-volume pipelines from Kinesis/Firehose to our new Kafka/AutoMQ-on-EKS stack.
Lakehouse migration & evolution - moving workloads onto Iceberg, Spark (EMR-on-EKS) processing, and new maintenance and consumption tooling.
Platform infrastructure - Kubernetes/EKS, compute efficiency, reliability.
Query & consumption - engines like Trino, Cube, DuckDB, and Athena, and smart routing across them.
Platform tech debt, observability, and engineering-process improvements that keep the platform fast, reliable, and cost-efficient.

You'll also have the opportunity to join the team's on-call rotation, helping keep a platform that ingests billions of events a day healthy.

Requirements

You think like a platform builder. You're motivated by designing the architecture, services, and capabilities that other teams, products, and AI agents build on top of - not just by shipping a single pipeline. You care about the foundation as a product.
You can own a piece of the platform from design through to production with minimal supervision, taking on problems that carry real ambiguity.
You're comfortable with modern data architectures - data warehouses, data lakes, and data lakehouses - and understand the trade-offs that shape a platform others depend on.
You understand what makes a platform genuinely usable for its consumers: clean data modeling, well-designed data services and APIs, reliability, and performance at scale (relational and/or NoSQL).
You're proficient in SQL and in Python and its data-processing ecosystem, and you produce work that is idempotent, reproducible, and documented - whether you wrote it or an agent did.
You've worked on a cloud data platform (AWS preferred; GCP/Azure welcome) and understand the realities of large datasets and performance optimization.
You have a data-driven mindset. You define how the impact of your work will be measured before you build it, and you validate outcomes against real-world results rather than assuming success.
You're proficient with AI assistants and code-generation tools, and you design solutions that thoughtfully consider when and how AI should be used to enhance value delivery - directing, reviewing, and critiquing AI-generated work rather than just accepting it.
You communicate clearly in writing - specs, docs, and design notes that other people (and agents) can act on - and you collaborate well across engineering and non-engineering peers.
Real-time / streaming data processing (Kafka, A

Benefits

Health insurance
Vision insurance
