Senior Machine Learning Engineer - Optimization
Apply Now: Please email your resume to careers@ellipsishealth.com
Ellipsis Health is creating cutting-edge AI/ML products that address healthcare staffing issues and administrative burdens, using conversational AI and our patented voice biomarker technology to deliver better healthcare for everyone. We are headquartered in Silicon Valley and are funded and supported by some of the preeminent venture capital teams.
We are currently looking for a Senior Machine Learning Engineer with a strong background in optimizing conversational AI systems. The ideal candidate will have hands-on experience in deploying and fine-tuning large language models, building scalable NLP pipelines, and improving the efficiency of conversational models in production environments. This role will be instrumental in ensuring our platform’s AI-driven interactions are both fast and reliable for users, including healthcare providers and patients.
Ellipsis Health is located in the San Francisco Bay Area, but we are open to remote candidates for this role.
RESPONSIBILITIES
- Identify and optimize performance bottlenecks in large language models and conversational AI pipelines, focusing on speed, parallelization, and memory usage.
- Design, refine, and maintain scalable components for conversational AI agents that ensure efficient performance in production environments.
- Conduct thorough profiling, load/stress testing, and capacity planning to support large-scale data and real-time interactions.
- Implement advanced optimization techniques (e.g. model pruning, quantization, distillation) to reduce memory usage and improve latency without compromising accuracy.
- Collaborate with cross-functional teams (engineering, product management, data science) to define optimization priorities and deliver performance improvements aligned with business goals.
REQUIREMENTS
- Degree in Computer Science, Machine Learning, Data Science, or a related field, or equivalent experience.
- 5+ years of ML engineering experience, with a strong track record in performance optimization for complex ML systems.
- Proven ability to analyze and optimize bottlenecks in ML systems, particularly for real-time inference with generative AI models.
- Strong programming skills in Python, C++, and ML frameworks (e.g., PyTorch, TensorFlow).
- Proficiency in generative AI models, including hands-on fine-tuning for production use, as well as advanced optimization techniques (model distillation, pruning, quantization) to optimize large language models.
- Familiarity with distributed computing methods, cloud clusters, GPU-accelerated or multiprocessor environments for large-scale training and inference.
- Excellent problem-solving and debugging skills, evidenced by measurable performance improvements in production.
- Good communication and documentation skills to effectively articulate complex technical findings and recommendations to both technical and non-technical stakeholders.
Ellipsis Health is an inclusive company where diversity is celebrated. We are an equal opportunity workplace and affirmative action employer. We are committed to equal opportunity regardless of race, color, religion, sex, gender identity, ancestry, citizenship, age, physical or mental disability, military or veteran status, marital status, domestic partner status or any other basis protected by local, state or federal laws.
BENEFITS
- Competitive salary
- Meaningful stock options
- Generous PTO policy
- Health insurance (medical, dental, vision)
- 401(k)