About OXMIQ
OXMIQ designs GPU and AI silicon for large-scale model inference and training, and is developing an infrastructure and AI service orchestration platform that runs on heterogeneous accelerator hardware.
The Role
The Architect, AI Cloud Platform, owns the inference-serving architecture of OXMIQ's infrastructure and AI service orchestration platform, the layer through which customer workloads are served from accelerator fleets at scale. The role is responsible for the end-to-end serving path: how a model is loaded, scheduled, batched, cached, dispatched, and routed across heterogeneous hardware to deliver competitive latency, throughput, and tokens-per-dollar economics.
The Architect must also have a working understanding of the broader platform layers on which inference serving depends — Kubernetes-based orchestration, multi-tenant isolation, observabi...