Position Overview
System Software Engineers in this role operate at the intersection of LLM inference optimization and novel hardware bring‑up, co‑designing software abstractions with the hardware architecture team. You will extend leading open‑source inference engines to support CXL‑aware memory management, and build an open software layer that lets any host server use a CXL‑attached KV‑cache accelerator, including cryptographic acceleration for confidential LLM inference on sensitive enterprise workloads.
Key Responsibilities
▸ Extend advanced attention mechanisms in leading inference engines for CXL‑based block‑level KV‑cache offloading, enabling seamless hot/cold tiering between local high‑bandwidth memory and CXL‑attached DDR5 pools on the target hardware platform.
▸ Design and implement the Open KV Connector (OKC) protocol stack, including host‑side drivers and device‑side firmware, so that inference engines c...