HPC/AI Platform Engineering
Company: Eli Lilly and Company
Location: Indianapolis
Posted on: January 2, 2026
Job Description:
At Lilly, we unite caring with discovery to make life better for
people around the world. We are a global healthcare leader
headquartered in Indianapolis, Indiana. Our employees around the
world work to discover and bring life-changing medicines to those
who need them, improve the understanding and management of disease,
and give back to our communities through philanthropy and
volunteerism. We give our best effort to our work, and we put
people first. We’re looking for people who are determined to make
life better for people around the world.

Come help us unlock the
power of HPC and AI based POGPU and Accelerated Compute
infrastructure! The Cloud and Connectivity organization is seeking
experts and leaders in AI, High-Performance Computing (HPC), and
Nvidia DGX server management. This role will also focus on DGX
server management, Spectrum X networking technologies, and Weka
storage integration to support cutting-edge AI/ML workloads.

What You’ll Be Doing

You will be driving the engineering and operations of
advanced Linux platforms supporting AI and HPC workloads, managing
Nvidia DGX systems using Mission Control, Base Command and Run:AI,
and optimizing Spectrum X networking and WEKA storage for AI/ML
applications. You will play a crucial role in boosting productivity
for our Advanced Intelligence and Data science teams through
implementing advancements across our AI/HPC infrastructure tooling
and operational excellence. You will work in our Infrastructure
Hosting Platform area, leading the strategy, engineering, and
development of advanced Linux computing capabilities for AI/ML.
Additionally, you will consult with our senior Linux platform
engineer directing the global Linux strategy for on-premises
private cloud and public IaaS Linux services.

How You’ll Succeed

Be Bold - You will bring high learning agility and infrastructure
availability and reliability engineering skills to help us enable the
Lilly Technology strategy, identify tech opportunities, and
accelerate our cloud journey.

Be Fast - You will accelerate initiatives in areas such as AI/ML
acceleration, infrastructure AI Ops automation, HPC management, and
infrastructure as code to enable critical business projects.

Be Proactive - You will have groundbreaking opportunities to build
secure, resilient, and reliable hybrid cloud services using
proactive, predictive, and automated capabilities.

Be Your Best - You will learn about new technologies, including
AI/ML-based HPC, large-scale GPU clustering, Infrastructure as Code,
enterprise-scale hyper cloud providers, and agile ways of working,
and you will bring a willingness to become an expert.

What You Should Bring
Expertise in Linux system administration, HPC environments, and
Nvidia DGX server management. Experience with Spectrum X networking
and parallel file systems is essential. Strong scripting skills and
familiarity with containerization and automation tools are highly
valued.

6 years of demonstrated experience in AI/ML and HPC workloads and
infrastructure.

Hands-on experience in using or operating High Performance Computing
(HPC) grade infrastructure, as well as in-depth knowledge of
accelerated computing (e.g., GPU), storage (e.g., Weka), scheduling
and orchestration (e.g., Slurm, Kubernetes, LSF), high-speed
networking (e.g., Ultra Ethernet, RoCE), and container technologies
(e.g., Docker).

Passion for continual learning and keeping abreast of new
technologies and effective approaches in the AI/ML infrastructure
field.

Expertise in running and optimizing large-scale distributed training
workloads using PyTorch (DDP, FSDP), NeMo, or JAX, along with a deep
understanding of AI/ML workflows, encompassing data processing,
model training, and inference pipelines.

Some proficiency in at least one scripting language such as Bash,
Python, or equivalent.

Basic Qualifications

Bachelor’s degree in Computer Science, Information Technology, or a
related technical field.

7 years’ experience as a Linux OS/Platform Engineer.

Demonstrated experience leading a global, large-scale infrastructure
project.

Additional Information:

Hybrid role located in Indianapolis, IN (relocation required).

Organization Overview

Lilly IT builds and
maintains capabilities using cutting-edge technologies, like the most
prominent tech companies. What differentiates Lilly IT is that we
redefine what’s possible through tech, such as data-driven drug
discovery and connected clinical trials, to advance our purpose:
creating medicines that make life better for people around the
world. We hire the best technology professionals from a variety of
backgrounds, so they can bring an assortment of knowledge, skills,
and diverse thinking to deliver innovative solutions in every area
of our business.

The Global Information and Services Tech team is
at the forefront of digitalization to enable and advance the entire
company, with increased productivity and best-in-class customer
experiences. This team provides a robust and sustainable
infrastructure of hardware, software, and services that are critical
to enabling our global workforce and business to operate and
transform. As leaders in technology and in understanding business
requirements and challenges, this team defines and leads the
overall company technology strategy.

Lilly is dedicated to helping
individuals with disabilities to actively engage in the workforce,
ensuring equal opportunities when vying for positions. If you
require accommodation to submit a resume for a position at Lilly,
please complete the accommodation request form
(https://careers.lilly.com/us/en/workplace-accommodation) for
further assistance. Please note this is for individuals to request
an accommodation as part of the application process, and any other
correspondence will not receive a response.

Lilly is proud to be an
EEO Employer and does not discriminate on the basis of age, race,
color, religion, gender identity, sex, gender expression, sexual
orientation, genetic information, ancestry, national origin,
protected veteran status, disability, or any other legally
protected status.

Our employee resource groups (ERGs) offer strong
support networks for their members and are open to all employees.
Our current groups include: Africa, Middle East, Central Asia
Network; Black Employees at Lilly; Chinese Culture Network;
Japanese International Leadership Network (JILN); Lilly India
Network; Organization of Latinx at Lilly (OLA); PRIDE (LGBTQ
Allies); Veterans Leadership Network (VLN); Women’s Initiative for
Leading at Lilly (WILL); and enAble (for people with disabilities).
Learn more about all of our groups.

Actual compensation will depend
on a candidate’s education, experience, skills, and geographic
location. The anticipated wage for this position is $135,000 -
$213,400. Full-time equivalent employees also will be eligible for a
company bonus (depending, in part, on company and individual
performance). In addition, Lilly offers a comprehensive benefit
program to eligible employees, including eligibility to participate
in a company-sponsored 401(k); pension; vacation benefits;
eligibility for medical, dental, vision and prescription drug
benefits; flexible benefits (e.g., healthcare and/or dependent day
care flexible spending accounts); life insurance and death
benefits; certain time off and leave of absence benefits; and
well-being benefits (e.g., employee assistance program, fitness
benefits, and employee clubs and activities). Lilly reserves the
right to amend, modify, or terminate its compensation and benefit
programs in its sole discretion, and Lilly’s compensation practices
and guidelines will apply regarding the details of any promotion or
transfer of Lilly employees.

WeAreLilly