VOSYS September Newsletter

in virtualization | newsletter

The Virtual Open Systems September newsletter was published today, with two articles about my work:

Mixed-critical virtualization: VOSySmcs, mixed-criticality graphic support

VOSySmcs consists of a full-fledged software stack to support a modern generation of car virtual cockpits, where the In-Vehicle Infotainment (IVI) system and the Instrument Digital Cluster are consolidated and interact on a single platform. Indeed, traditional gauges and lamps are replaced by digital screens offering opportunities for new functions and interactivity. Vehicle information, entertainment, navigation, camera/video and device connectivity are being combined into displays. However, these different kinds of information do not have the same level of criticality, and the consolidation of mixed-critical applications represents a real challenge.

In this context, VOSySmcs includes a mixed-criticality graphic support that enables the integration of safety-critical and non-critical information on a single display, while providing rendering guarantees for the safety-critical output. In addition, VOSySmcs supports GPU virtualization in order to provide hardware acceleration capacity for the Virtual Machines running in the non-critical partition such as Linux, Android, etc.

VosysMCS

Computation acceleration: OpenCL inside VMs and containers

As part of the ExaNoDe H2020 research project, Virtual Open Systems develops a software API remoting solution for OpenCL. OpenCL is an open standard maintained by the Khronos Group, used for offloading computation tasks into accelerators, such as GPUs and FPGAs.

Software API remoting is a para-virtualization technique that allows accessing a host native library from inside a virtual machine. It operates by intercepting API function calls from the application in the guest system, and forwarding them to a helper process on the host through the use of shared memory pages. API remoting for containers can be achieved similarly, by replacing the host-to-VM communication layer (based on Virtio) with Linux inter-process communication mechanisms.
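The intercept-and-forward idea can be illustrated with a small Python sketch. This is only a toy model, not the VOSYS implementation: `HostLibrary`, `HostHelper` and `GuestStub` are hypothetical names, and a plain function call with JSON messages stands in for the Virtio/shared-memory transport.

```python
import json


class HostLibrary:
    """Stands in for the native library that only exists on the host."""

    def vector_add(self, a, b):
        return [x + y for x, y in zip(a, b)]


class HostHelper:
    """Host-side helper: decodes a forwarded call and runs it natively."""

    def __init__(self, library):
        self.library = library

    def handle(self, message: bytes) -> bytes:
        request = json.loads(message)
        func = getattr(self.library, request["name"])
        result = func(*request["args"])
        return json.dumps({"result": result}).encode()


class GuestStub:
    """Guest-side stub: intercepts any API call and forwards it."""

    def __init__(self, transport):
        # transport is any callable taking request bytes and returning reply bytes
        self._transport = transport

    def __getattr__(self, name):
        # Called only for attributes the stub does not define itself,
        # i.e. for every function of the remoted API.
        def remote_call(*args):
            message = json.dumps({"name": name, "args": args}).encode()
            reply = self._transport(message)
            return json.loads(reply)["result"]

        return remote_call


helper = HostHelper(HostLibrary())
stub = GuestStub(helper.handle)
print(stub.vector_add([1, 2], [3, 4]))
```

Swapping the `transport` callable is what would distinguish the VM case from the container case in this model: the stub and the helper stay the same, only the communication layer changes.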

To comply with the high performance requirements of OpenCL usage, it is important to reduce the overhead of the API remoting layer as much as possible. Hence, the work has focused on passing the data buffers (which may account for several gigabytes of memory) with zero copies, thanks to guest physical page lookup and remapping.
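As a loose analogy for zero-copy buffer passing, the Python sketch below hands a view of a buffer to a consumer instead of a copy. It does not model guest physical page lookup or remapping; `guest_buffer` and `host_kernel` are hypothetical names used only for illustration.

```python
# Stands in for the guest memory pages holding an OpenCL data buffer.
guest_buffer = bytearray(16)


def host_kernel(view: memoryview) -> None:
    """Hypothetical host-side computation writing its results in place."""
    for i in range(len(view)):
        view[i] = (i * 2) % 256


# A memoryview exposes the same underlying memory: no bytes are copied.
shared = memoryview(guest_buffer)

# Slicing a memoryview does not copy either, so the helper works
# directly on bytes 4..7 of the guest buffer.
host_kernel(shared[4:8])

print(list(guest_buffer[4:8]))
print(shared.obj is guest_buffer)
```

The point of the analogy is that the consumer's writes are immediately visible to the owner of the buffer, without any transfer step whose cost grows with the buffer size.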

OpenCL

GPU virtualization solutions for HPC @ Compas'18

in Presentation | intro

Compas

On the 4th of July, I was in Toulouse, France, to present our work on GPU virtualization solutions for HPC at the French COMPAS conference on parallelism, architectures and systems. The presentation was about OpenCL accelerator API remoting for HPC computing and GPU hardware-assisted pass-through. The poster was about live and incremental virtual machine checkpointing.

OpenCL API Remoting and Qemu live and incremental checkpointing are part of our ExaNoDe activities.

GPU virtualization solutions for HPC

Kevin Pouget, Alvise Rigo, Daniel Raho (Virtual Open Systems)

  • OpenCL API Remoting
  • GPU Hardware-Assisted Pass-through

ExaNoDe/ExaNeSt @ eXdci'18 ExascaleHPC workshop

in Presentation | talk

eXdci

Today, my colleague Radoslav Dimitrov is in Ljubljana, Slovenia, at the eXdci European HPC Summit Week to present our work on virtualisation at the second ExascaleHPC joint workshop between the ExaNoDe, ExaNeSt, ECOSCALE and EuroEXA projects.

The talk is entitled:

Virtualization technologies in modern HPC systems

It presents two aspects of our virtualization work:

  • Software switches
  • API Remoting in OpenCL and MPI

VOSYS March Newsletter

in virtualization | newsletter

The Virtual Open Systems March newsletter was published today, with an article about my work:

Checkpointing for HPC: High performance live checkpointing

At the 2018 HiPEAC ExascaleHPC workshop organized in the context of the ExaNoDe EC project, Virtual Open Systems presented the progress of its implementation of live and incremental checkpointing for Qemu-KVM.

The live aspect of this work reduces the virtual machine (VM) downtime to a few milliseconds, while the RAM is copied to disk in the background; the incremental aspect allows checkpointing only the pages actually modified since the previous checkpoint. Periodic virtual machine checkpointing improves the reliability of HPC and cloud-computing environments, as it prevents the loss of volatile data in case of hardware failure. The live aspect makes it virtually transparent for the user, whose VM keeps running unaltered during the checkpointing. The incremental aspect further reduces the checkpoint impact on the system, as only part of the RAM is saved, and also reduces the footprint of the checkpoints on the disk.
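The incremental aspect can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Qemu implementation: `GuestMemory`, `checkpoint` and `restore` are hypothetical names, pages are tiny byte strings, and dirty tracking is done by the `write` method itself rather than by the hypervisor.

```python
PAGE_SIZE = 4  # unrealistically small, to keep the example readable


class GuestMemory:
    """Toy guest RAM that records which pages were written."""

    def __init__(self, num_pages):
        self.pages = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        # Before the first checkpoint, every page must be saved.
        self.dirty = set(range(num_pages))

    def write(self, page_index, data):
        self.pages[page_index] = data
        self.dirty.add(page_index)


def checkpoint(memory):
    """Save only the pages modified since the previous checkpoint."""
    delta = {i: memory.pages[i] for i in sorted(memory.dirty)}
    memory.dirty.clear()
    return delta


def restore(checkpoints, num_pages):
    """Rebuild the RAM by replaying the checkpoints in order."""
    pages = [bytes(PAGE_SIZE)] * num_pages
    for delta in checkpoints:
        for i, data in delta.items():
            pages[i] = data
    return pages


mem = GuestMemory(8)
full = checkpoint(mem)          # first checkpoint: all 8 pages
mem.write(2, b"abcd")
mem.write(5, b"wxyz")
incremental = checkpoint(mem)   # second checkpoint: 2 pages only

print(len(full), len(incremental))
print(restore([full, incremental], 8) == mem.pages)
```

In this model the second checkpoint stores two pages instead of eight, which is where both the shorter save time and the smaller on-disk footprint come from.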

The challenges behind both aspects of the checkpointing are related to the tracking and handling of the memory pages being modified by the guest system. Virtual Open Systems developed a novel approach to track these changes in Qemu, which guarantees the consistency of every checkpoint, regardless of the activity of the guest system.

Qemu checkpointing

ExaNoDe @ DSD 2017

in Paper

Today, I presented the ExaNoDe positioning paper at the Euromicro DSD conference in Vienna. VOSYS is leading the dissemination work package of ExaNoDe, and coordinated the writing of the paper.

Vienna

The paper is entitled Paving the way towards a highly energy-efficient and highly integrated compute node for the Exascale revolution: the ExaNoDe approach:

Power consumption and high compute density are the key factors to be considered when building a compute node for the upcoming Exascale revolution. Current architectural design and manufacturing technologies are not able to provide the requested level of density and power efficiency to realise an operational Exascale machine. A disruptive change in the hardware design and integration process is needed in order to cope with the requirements of this forthcoming computing target. This paper presents the ExaNoDe H2020 research project aiming to design a highly energy efficient and highly integrated heterogeneous compute node targeting Exascale level computing, mixing low-power processors, heterogeneous co-processors and using advanced hardware integration technologies with the novel UNIMEM Global Address Space memory system.


BOAST: A metaprogramming framework to produce portable and efficient computing kernels for HPC applications

in Paper

The journal article I co-authored with B. Videau within the Mont-Blanc project is now published in The International Journal of High Performance Computing Applications.

The parts I wrote relate to the porting of SPECFEM3D to OpenCL, and to non-regression testing using debugging traces.

  • 5.3 Porting SPECFEM3D application kernels: From CUDA to OpenCL using BOAST
  • 4.3 Non-regression testing using trace debugging

Mont-Blanc

The portability of real high-performance computing (HPC) applications on new platforms is an open and very delicate problem. Especially, the performance portability of the underlying computing kernels is problematic as they need to be tuned for each and every platform the application encounters. This article presents BOAST, a metaprogramming framework dedicated to computing kernels. BOAST allows the description of a kernel and its possible optimizations using a domain-specific language. BOAST runtime will then compare the different versions' performance as well as verify their exactness. BOAST is applied to three use cases: a Laplace kernel in OpenCL and two HPC applications BigDFT (electronic density computation) and SPECFEM3D (seismic and wave propagation).
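The "compare the versions' performance and verify their exactness" loop can be illustrated with a small Python sketch. It does not use BOAST or its domain-specific language: the kernel variants and the `tune` function are hypothetical, and plain Python functions stand in for generated kernels.

```python
import timeit


def reference(values):
    """Straightforward version, used as the correctness baseline."""
    total = 0
    for v in values:
        total += v * v
    return total


def variant_generator(values):
    return sum(v * v for v in values)


def variant_map(values):
    return sum(map(lambda v: v * v, values))


def variant_wrong(values):
    # Deliberately incorrect "optimization": it must be rejected.
    return sum(values)


def tune(variants, reference, data, repeat=3):
    """Time each variant that matches the reference, return the fastest."""
    expected = reference(data)
    timings = {}
    for variant in variants:
        if variant(data) != expected:
            continue  # exactness check failed, discard this variant
        runs = timeit.repeat(lambda: variant(data), number=10, repeat=repeat)
        timings[variant.__name__] = min(runs)
    best = min(timings, key=timings.get)
    return best, timings


data = list(range(2, 1000))
best, timings = tune([variant_generator, variant_map, variant_wrong], reference, data)
print("fastest correct variant:", best)
```

Which variant wins depends on the machine and interpreter, which is the reason such tuning has to be repeated on every platform.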


Interactive Runtime Verification - When Interactive Debugging Meets Runtime Verification

in Paper

The article I co-authored with R. Jakse got accepted at the 28th ISSRE conference. It will take place in Toulouse, France, on October 23-26th, 2017.

ISSRE

Monitoring is the study of a system at runtime, looking for input and output events to discover, check or enforce behavioral properties. Interactive debugging is the study of a system at runtime in order to discover and understand its bugs and fix them, inspecting interactively its internal state.

Interactive Runtime Verification (i-RV) combines monitoring and interactive debugging. We define an efficient and convenient way to check behavioral properties automatically on a program using a debugger. We aim at helping bug discovery while keeping the classical debugging techniques and interactivity, which allow understanding and fixing bugs.
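To give a feel for the idea, here is a minimal Python sketch that checks a behavioral property on function-call events captured through the interpreter's tracing hook, the same hook Python debuggers rely on. It is an illustration only, not the tooling described in the article: the `Monitor` class, the property and the `*_resource` functions are hypothetical.

```python
import sys


class Monitor:
    """Finite-state monitor for the property: no use after close."""

    def __init__(self):
        self.state = "closed"
        self.violations = []

    def on_call(self, name):
        if name == "open_resource":
            self.state = "open"
        elif name == "close_resource":
            self.state = "closed"
        elif name == "use_resource" and self.state != "open":
            # An interactive tool would suspend the program here.
            self.violations.append(name)


def open_resource():
    pass


def use_resource():
    pass


def close_resource():
    pass


def program():
    open_resource()
    use_resource()
    close_resource()
    use_resource()  # bug: the resource is used after being closed


def run_monitored(func, monitor):
    """Run func while feeding every function-call event to the monitor."""

    def tracer(frame, event, arg):
        if event == "call":
            monitor.on_call(frame.f_code.co_name)
        return None  # no line-level tracing needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)


monitor = Monitor()
run_monitored(program, monitor)
print(monitor.violations)
```

The monitor only records the violation here; the appeal of combining it with a debugger is that execution could stop at that exact call, with the program state available for inspection.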