Senior Principal Software Engineer leading LLM and GNN model serving solutions at JPMorganChase. Responsible for MLOps/LLMOps strategy, optimizing inference, and building scalable platforms on AWS. Collaborates with data science and SRE teams to productionize AI models at enterprise scale.
Responsibilities
Advise on and lead the strategy for LLM and GNN model serving solutions across cloud and on-premises environments.
Define and implement MLOps and LLMOps strategies for end-to-end model lifecycle management.
Drive optimization of model inference using quantization, model parallelism, intelligent batching, and hardware acceleration.
Create durable, reusable software and platform frameworks to standardize ML Engineering services.
Establish best practices for automation, CI/CD, and infrastructure-as-code using containerization and orchestration.
Partner with data science, platform engineering, and SRE teams to productionize models on AWS.
Lead model deployment and optimization with Triton Inference Server and vLLM for high-throughput, low-latency serving.
Oversee production operations for AI workloads, including monitoring, incident response, security, and compliance.
Requirements
10+ years of applied software engineering experience.
8+ years of AI/ML engineering experience with LLMs, GNNs, and other model architectures (e.g., GPT, Llama, Falcon, Mistral).