AIML - Sr Machine Learning Engineer, Evaluation

Apple
Cupertino, California, United States
Posted yesterday

About the role

We are seeking a highly skilled and experienced machine learning engineer to join AIML Evaluation to build the systems that evaluate and refine Apple's foundation models and agents. As a key member of the team, you will help design and develop benchmarks, evaluators, simulation environments, and prompt and context optimization pipelines that drive quality improvements across Apple's AI experiences. You will collaborate with product teams and the foundation model team to close the loop between observation and improvement, contributing datasets, environments, and reward signals that drive model and agent quality.

Responsibilities

Our team builds the benchmarks, environments, and tooling that power model and agent refinement, and turns observations into actionable opportunities for the next model and agent iteration. We work across the full spectrum of evaluation: offline benchmarks, device-in-the-loop simulation, and on-device observation in production.

In this role, you will:

  • Design and develop evaluation and refinement infrastructure that supports a broad range of AI products at Apple
  • Work on agent and model evaluation across offline, device-in-the-loop, and on-device settings
  • Build automated prompt and context optimization pipelines
  • Partner with product and research teams to translate failure analysis into measurable model and agent improvements
  • Develop LLM-as-judge evaluators, train reward models calibrated against human feedback, optimize prompts and context for agents, and contribute targeted datasets and reward signals to foundation model post-training
  • Engage with product teams across Apple and contribute to advancements in large language models and agentic systems that will reach millions of users

Minimum Qualifications

  • Strong background in machine learning and distributed systems
  • Experience building and maintaining ML infrastructure for evaluation, training, or deployment
  • Ability to work effectively across multiple codebases, teams, and organizations
  • 8+ years of professional experience as a software engineer, preferably in machine learning or a related field
  • Bachelor's or Master's degree in Computer Science or a related field

Preferred Qualifications

  • Experience with LLM evaluation, LLM-as-judge, or reward modeling
  • Experience with prompt optimization, agent harness development, or post-training (SFT, DPO, RLHF)
  • Proficiency in Python and ML frameworks such as PyTorch
  • Experience with agentic systems, simulation environments, or trajectory-based data generation
  • Familiarity with on-device or privacy-preserving ML
  • A proactive, determined approach to problem solving
  • Excellent communication skills

About Apple

Apple Inc. is a technology company that designs and sells consumer electronics, software, and services. Its core product lines are the iPhone line of smartphones, the iPad line of tablet computers, and the Mac line of personal computers, and it offers its products online and through a chain of retail stores known as Apple Stores. Other products include Apple Watch, Apple TV, and AirPods, along with services and platforms such as iOS, macOS, the App Store, and Apple TV.

Industry
Technology / Consumer electronics and software
Head office
Cupertino, California, United States
Company size
Approximately 164,000 full-time employees worldwide (as of late September 2024)
Founded
1976
  • iPhone smartphones
  • Mac personal computers
  • iPad tablets
  • Apple Watch
  • Apple TV
  • Software and services (iOS, macOS, App Store, Apple TV)
  • Apple-designed silicon
  • Speech, audio, and conversational AI / machine learning

