LLM Inference Engineer at Hippocratic AI (Closed)

PALO ALTO · US

Full-time

SENIOR

About the Role

Hippocratic AI is seeking an LLM Inference Engineer to optimize large language model serving infrastructure. The role involves designing multi-node serving architectures, applying quantization techniques, and implementing speculative decoding. Candidates need expertise in Python, C++, CUDA, and GPU optimization.

Responsibilities

Design and implement multi-node serving architectures for distributed LLM inference. Optimize multi-LoRA serving systems. Apply advanced quantization techniques (FP4/FP6). Implement speculative decoding and other latency optimization strategies. Develop disaggregated serving solutions with optimized caching strategies. Continuously benchmark and improve system performance.

Requirements

Experience optimizing LLM inference systems at scale. Proven expertise with distributed serving architectures. Hands-on experience implementing quantization techniques. Strong understanding of modern inference optimization methods, including speculative decoding. Proficiency in Python and C++. Experience with CUDA programming and GPU optimization.

Nice to Have

Contributions to open-source inference frameworks such as vLLM, SGLang, or TensorRT-LLM. Experience with custom CUDA kernels. Track record of deploying inference systems in production environments. Deep understanding of performance optimization systems.

Hippocratic AI

Palo Alto · US · 200+ employees

Hippocratic AI is a healthcare technology company focused on developing safe, patient-facing large language models (LLMs) to improve healthcare accessibility. The company prioritizes safety and ethical AI development to address global healthcare worker shortages.
