Garner Health is seeking a Senior MLOps Engineer to join the Platform Engineering team, building and operating the production ML systems that power its healthcare products. The role involves ensuring the reliability, performance, and cost-efficiency of those systems, building ML platform components, implementing CI/CD pipelines, and establishing drift monitoring. The position is based in NYC with a hybrid schedule.
Responsibilities
Help ensure the reliability, performance, functionality, and cost-efficiency of Garner's production ML systems, contributing to SLOs, observability, and on-call responsibilities.
Build key components of Garner's ML platform, including data infrastructure (feature store, model registry, CI/CD for models) and standardized service patterns.
Implement ML-specific CI/CD pipelines, transitioning from manual notebook hand-offs to automated PR-driven workflows with data quality checks and statistical model validation.
Drive down cost and latency through improved architecture, hardware choices, and model optimization.
Contribute to workflows, standards, and KPIs for a growing MLOps function.
Help establish drift monitoring by designing and implementing automated systems that detect data drift and concept drift.
Requirements
5+ years of software engineering experience, with meaningful time spent operating ML or data-intensive systems in production.