Join the Siri team at Apple to build production automation for foundation model training. You'll own the model lifecycle, design agent-based pipelines, and develop LLM-native tooling for evaluation and release.
Responsibilities
Own the end-to-end model lifecycle: build model pipelines and integrate with other Apple frameworks to enable rapid model iteration, staging promotion, production rollout, and deprecation.
Design and operate agent-based automation pipelines for ML models where agents own decision logic at each gate and humans approve only at defined escalation points.
Develop multi-agent workflows using LLM-native tooling for on-device evaluation, regression triage, release readiness decisions, and automated root cause analysis.
Own the launch tooling: build and improve the shell scripts and CLI commands that turn a config name and a dataset into a running training job across the SFT, LoRA adapter, and RL phases.
Requirements
Strong software engineering fundamentals; comfortable in Python and Bash.
5+ years of experience in Machine Learning Operations.
Production experience with one or more cloud ML platforms (GCP TPU, AWS GPU clusters, Kubernetes-backed training infrastructure).
Familiarity with the ML training lifecycle: data preprocessing pipelines, distributed training, checkpoint formats, multi-slice / multi-region considerations.
Apple Inc. is an American multinational technology company that designs, develops, and sells consumer electronics, computer software, and online services. Its core products include the iPhone, iPad, Mac computers, Apple Watch, and various digital services.