Senior Product Manager, Conversational AI Chatbot & Agent Quality

OKX · Singapore, Singapore

Full-time · On-site · 2w ago

Blockchain · Observability · Routing · SAFe

About the role

At OKX, we believe that crypto will reshape the future and ultimately contribute to every individual's freedom. OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves.

Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er. OKX is part of OKG, a group that brings the value of Blockchain to users around the world through our leading products: OKX, OKX Wallet, OKLink and more.

We are looking for an execution-focused Product Manager who has built and improved conversational AI products in production - and has business results to prove it. A strong plus is hands-on experience with agent evaluation harnesses or internal agent platform product design: you've defined the systems that test, score, and operate agents at scale, not just shipped the agents themselves. You work in logs and specs, not just decks. You know what a bad retrieval chunk looks like, you've personally written labeling guidelines, and you can point to a quarter where your work moved resolution rate by double digits.

Responsibilities

Chatbot Operations
Knowledge base ownership: structure, content quality, retrieval coverage, freshness governance
SOP & dialogue flow design: business process → agent flow → edge case handling → escalation paths
Labeling pipeline: annotation specs, annotator QA, training batch impact tracking
Daily quality work: log review, failure triage, weekly knowledge/flow update cadence
Metrics ownership: resolution rate, fallback rate, per-intent accuracy, CSAT
Agent Harness & Platform Product
Define and maintain agent evaluation frameworks : te

Requirements

Knowledge Base & Data Quality - knowledge base architecture, retrieval quality tuning, content governance, labeling pipelines, annotation guidelines, training data impact tracking, and dataset freshness management
Agent Evaluation & Quality Assurance - evaluation harness design, test case schemas, automated scoring rubrics (correctness, groundedness, tool-use accuracy), LLM-as-judge evaluation, regression testing for non-deterministic systems, and feedback-driven improvement loops
Chatbot Operations & Dialogue Design - SOP-to-agent-flow translation, edge case handling, escalation path design, log-based failure triage, and metrics ownership (resolution rate, fallback rate, per-intent accuracy, CSAT)
Agent Runtime & Observability Platforms - agent runtime product requirements, tool permission models, task configuration interfaces, developer-facing observability dashboards, failure alerting logic, and debugging workflows
Human-in-the-Loop Workflows - low-confidence case routing, reviewer task interface design, correction data capture, and feedback loop integration back into training or knowledge pipelines
Chatbot & Knowledge Base (Core)
Built or rebuilt a knowledge base - defined structure, wrote/reviewed content, fixed retrieval quality, saw metrics improve
Designed SOPs that became agent flows - mapped real business processes, handled edge cases, shipped as working dialogue flows
Owned a labeling pipeline - wrote annotation guidelines, QA'd batches, tracked whether labeled data moved production metrics
Moved a metric that mattered - resolution rate, fallback rate, CSAT - and can explain exactly what changed
Agent Harness & Platform Product (Strong Plus)
Designed an agent evaluation harness: defined test case schemas, scoring rubrics, and spec'd automated evaluation pipelines with engineering
Product-designed an internal agent platform: defined requirements for agent runtime - tool permission models, task configuration interfaces, developer-facing observability dashboards, and failure debugging workflows; owned the roadmap and shipped iteratively
Closed the eval-to-improvement loop: used harness output to prioritize knowledge fixes, prompt revisions, or flow changes - not just reported scores but drove action from them
Designed human-in-the-loop review workflows: low-confidence case routing, reviewer task interfaces, correction data capture and feedback loop back into training or knowledge pipelines

Benefits

Vision insurance
Paid time off

Additional Information

OKX will be prioritising applicants who have a current right to work in Singapore, and do not require OKX's sponsorship of a visa.
