OpenShift AI Ops Resident (SRE / MLOps)
Pennington, NJ (Onsite 3 days/week – non-negotiable)
⏳ Contract (6 months, strong extension potential)
Overview
We’re seeking a senior AI Platform SRE / MLOps engineer to support and stabilize a production Generative AI platform running on Red Hat OpenShift.
This is a hands-on, high-impact role focused on operational excellence, reliability engineering, and performance tuning of GPU-accelerated AI workloads in a regulated enterprise environment.
You will act as a key technical resource within Dell’s delivery team, helping bring structure, stability, and scalability to an evolving GenAI platform.
What You’ll Do
+ Own day-to-day operations of a production GenAI platform running on OpenShift/Kubernetes
+ Diagnose and resolve performance, stability, and scaling issues across AI workloads
+ Optimize GPU-based inference pipelines using tools like:
+ NVIDIA Triton Inference Server
+ ...