Hippocratic AI is seeking an LLM Inference Engineer to optimize large language model serving infrastructure. The role involves designing multi-node serving architectures, applying quantization techniques, and implementing speculative decoding. Candidates need expertise in Python, C++, CUDA, and GPU optimization.

Palo Alto · US
Snowflake