AI Evaluation Engineer

Yes Energy · Bucharest, Romania

Full-time · On-site · 2d ago

Prompt Engineering · Python · RAG · Snowflake · Vector Databases

About the role

Yes Energy delivers real-time market data and electric powe

Requirements

Bachelor's degree in Computer Science, Data Science, Mathematics, or related field
5+ years of software engineering experience, with significant focus on testing, evaluation, quality systems, or AI/ML tooling
Strong Python skills; experience building production-grade evaluation or testing pipelines
Solid understanding of LLM evaluation concepts: benchmarking, hallucination detection, retrieval-augmented generation quality, and semantic similarity
Experience designing and owning quality frameworks in ambiguous, fast-moving environments
Ability to work closely with non-engineer subject matter experts to define and codify what "good" looks like
Excellent judgment - knows when an eval result signals a real problem vs. noise
Strong communicator; able to present quality findings clearly to technical and non-technical audiences

Key Competencies / Preferred Qualifications

Technical Execution & Engineering Tech: Hands-on experience with eval frameworks (RAGAS, LangSmith, PromptFlow) and proficiency in building production data pipelines using RAG or vector databases
Problem Solving: Applies structured thinking and sound judgment to determine when an eval result signals a real problem versus noise in complex data sets
Curious: Demonstrates an innate curiosity and a strong desire to learn the energy industry or apply experience from equities/commodities trading environments
Stakeholder Focus: Rooted in listening to technical and non-technical business partners to define and translate deep operational requirements into codified test criteria
Data Management & Quality: Background in data-intensive or highly regulated environments where answer accuracy is high-stakes and strict data governance is required
Effective Communication: Strong written and verbal communication skills, adjusting delivery style seamlessly based on the technical depth of the audience

At Yes Energy, we value connecting directly with candidates. We kindly ask that third-party recruiters and agencies not submit resumes, as we are not open to external recruiting partnerships.

ABOUT YES ENERGY

Benefits

Vision insurance

Additional Information

Join the Market Leader in Electric Power Data and Analytics Solutions

The electrical grid is the largest and most complicated machine ever built. Yes Energy's industry-leading electric power trading analytics software provides real-time visibility into the massive amount of data generated by the North American electrical grid daily. Our unique and innovative view of the data informs real-time trading decisions and mid-to-long-term investment decisions that keep utility prices low, support the energy transition, and keep the grid running. It's both challenging work and work with a purpose. Be a part of our successful, growing business during international transformation.

Position Summary

In the Engineering group, we are accelerating the product vision by building solutions in a fun, collaborative, efficient, and adaptive team. As an AI Evaluation Engineer, you will own the quality bar for Yes AI - defining what "correct" looks like when an AI answers a complex power market question, and building the frameworks to measure it at scale. This is a foundational role at the intersection of LLM engineering, domain expertise, and rigorous testing. You will also work directly on projects to integrate systems and serve energy data to customers. We are curious people who take pride in our work, value high-quality communication, and strive to improve continuously.

Position Details

Salary range: Net 16.000 - 19.000 RON/month
Location: Bucharest, Romania
Full-time
Hybrid - Will be required to work in the office 2-3 days a week
Reporting to: Engineering Manager

Primary Responsibilities

Own the evaluation strategy for Yes AI end-to-end - design, build, and continuously improve frameworks that measure the quality, accuracy, and reliability of LLM-powered outputs across natural language answers, data retrieval, and semantic context responses
Develop benchmarks and automated test suites to detect hallucination, factual drift, and retrieval failures; set the quality bar for what "correct" means in power market AI
Partner directly with domain experts and senior engineers to translate deep power market knowledge into rigorous, repeatable evaluation criteria
Build and maintain scalable eval pipelines that run continuously against prompt changes, model updates, and Snowflake data changes - surfacing regressions before they reach customers
Lead prompt engineering and RAG experimentation efforts, using evaluation results to drive measurable improvements in answer quality
Influence engineering and product decisions by making AI quality visible and quantifiable - advocate for reliability as a first-class product requirement
Help define the discipline of AI evaluation at Yes Energy; this is a foundational role with room to build process and tooling from scratch

Interested in this role?

Apply on the company's website.
