Today, my teammate Carlos Camacho published a blog post that continues the work I started on NVIDIA MIG GPUs:

In part one, about fractional GPUs, we talked about time slicing as “carpooling” for your GPU – getting more people (processes) into the same car (GPU) to use it more efficiently. In this second strategy, called multi-instance GPU (MIG) partitioning, imagine that the same carpool now has numbered, sized seats for each person, so everyone knows where to sit and where they fit. This approach divides a GPU into isolated, static instances that different applications can use concurrently.
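To make the "numbered and sized seats" picture concrete, here is a minimal sketch in Python. It is not taken from the original post. It assumes an NVIDIA A100 40GB, which exposes 7 compute slices and 8 memory slices to MIG, and uses the profile names from NVIDIA's MIG documentation. The names `MigProfile`, `A100_40GB_PROFILES`, and `fits` are hypothetical helpers for illustration. The check only compares slice budgets; real hardware also enforces placement rules, so a layout that passes here is not guaranteed to be creatable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigProfile:
    """One 'seat size': how many compute and memory slices it occupies."""
    name: str
    compute_slices: int
    memory_slices: int


# Assumed slice budget for an A100 40GB (illustrative, see lead-in).
TOTAL_COMPUTE_SLICES = 7
TOTAL_MEMORY_SLICES = 8

# Hypothetical lookup table of A100 40GB MIG profiles.
A100_40GB_PROFILES = {
    p.name: p
    for p in (
        MigProfile("1g.5gb", 1, 1),
        MigProfile("2g.10gb", 2, 2),
        MigProfile("3g.20gb", 3, 4),
        MigProfile("4g.20gb", 4, 4),
        MigProfile("7g.40gb", 7, 8),
    )
}


def fits(requested: list[str]) -> bool:
    """Return True if the requested profiles stay within the slice budget.

    This is a simplified budget check only; it ignores the placement
    constraints that real MIG hardware applies.
    """
    profiles = [A100_40GB_PROFILES[name] for name in requested]
    compute = sum(p.compute_slices for p in profiles)
    memory = sum(p.memory_slices for p in profiles)
    return compute <= TOTAL_COMPUTE_SLICES and memory <= TOTAL_MEMORY_SLICES


if __name__ == "__main__":
    # Four "passengers" with different seat sizes sharing one GPU.
    print(fits(["3g.20gb", "2g.10gb", "1g.5gb", "1g.5gb"]))
    # Two large seats exceed the seven available compute slices.
    print(fits(["4g.20gb", "4g.20gb"]))
```

Unlike time slicing, where processes take turns on the whole GPU, each entry in the request list here claims a fixed share of compute and memory for itself, which is what makes the instances static and isolated.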

RHOAI MIG Sharing