Perplexity serves tens of millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and specialized data sources. The Answer Quality team ensures that our prompts, tools, search, and specialized datasets, combined with both frontier and in-house models, create the best possible experience for our users. As our product evolves, our evaluations must remain fast, accurate, and actionable. In this role, you will build the data flywheel that serves teams across Perplexity.

Responsibilities

Build the systems and pipelines that enable Search, Product, and other teams to independently access and utilize reliable eval verdicts without bottlenecks

Take ownership of the "evals-to-product" loop, autonomously determining the best way to turn raw signals into durable datasets that power decision-making across the company

Build a robust simulator pipeline capable of replaying user interactions with the product in formats legible to LLMs and VLMs, reflecting product changes as they are shipped

Maintain data trust by implementing monitoring, lineage, and quality checks, ensuring downstream consumers can rely on the results implicitly

Operate in a small, high