Join Apple's Cloud AI Platform team to build the data systems and infrastructure that power Apple's next-generation intelligent products. You will work on ML data pipelines, feature platforms, and scalable compute for generative AI, enabling ML engineers to build and ship models at Apple's quality and privacy standards.
Responsibilities
Design and build the data platform behind Apple's largest model builds, including ingestion, versioning, lineage, and governance at petabyte scale.
Develop Python SDKs and core data libraries for ML engineers to access and transform datasets.
Build high-throughput data access and loading primitives to feed GPU fleets.
Operate distributed data pipelines using Spark, Daft, and Rust-based systems.
Optimize platform components for tight integration with PyTorch, JAX, and TensorFlow.
Partner with research teams to onboard new data sources for GenAI workloads.
Ensure data governance, including compliance with legal terms, privacy controls, and data lineage tracking.
Drive efficiency and automation across the data plane and control plane.
Support next-generation workloads such as foundation models, multimodal data, and retrieval-augmented systems.
Requirements
Proficiency with one or more modern ML frameworks (PyTorch, JAX, or TensorFlow), especially data loading and dataset access layers.