Unity Technologies is seeking a Staff Machine Learning Engineer to lead the development of next-generation AI-driven game experiences, focusing on computer vision and multi-modal models. The role involves bringing state-of-the-art models, including transformers, diffusion networks, and vision-language models, from research to production. You will mentor a team and drive architectural decisions across the ML stack.
Responsibilities
Set technical vision and roadmap for computer vision and multi-modal AI models
Drive design and implementation of models for image/video understanding, generation, segmentation, detection, and dense prediction
Own the path from research prototype to production: training, fine-tuning, distillation, export, and serving
Collaborate with research scientists to translate novel architectures into deployable implementations
Design scalable systems for multi-modal inference
Lead and mentor a team of ML engineers; define engineering best practices and evaluation methodology
Requirements
6+ years in ML engineering with significant depth in computer vision and/or multi-modal modeling
Proven production experience with transformer-based and diffusion-based vision models (e.g., ViT, CLIP/SigLIP, Stable Diffusion, DETR/SAM)
Strong command of the full model lifecycle: data curation, training, evaluation, and serving at scale
Unity Technologies (Unity Software Inc.) provides the leading real-time 3D development platform for creating, deploying, and growing interactive games and industrial experiences. Its tools support the entire development lifecycle, from prototyping and rendering to monetization and user acquisition across mobile, PC, console, and extended reality (XR) platforms.