Research Engineer, Reward Models Platform

Remote-Friendly (Travel-Required) | San Francisco, CA | Seattle, WA | New York City, NY
about 2 months ago
New | Engineering | full-time | mid | Aggregator

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the role

You will deeply understand the research workflows of our Finetuning teams and automate the high-friction parts – turning days of manual experimentation into hours. You’ll build the tools and infrastructure that enable researchers across the organization to develop, evaluate, and optimize reward signals for training our models. Your scalable platforms will make it easy to experiment with different reward methodologies, assess their robustness, and iterate rapidly on improvements to help the rest of Anthropic train our reward models.

This is a role for someone who wants to stay close to the science while having outsized leverage. You'll
